In one of my recent courses, we talked about Java 5 annotations.
I told my students that before Java 5, one had to use a marker interface instead:
an interface without any method.
Then, I showed the Serializable interface as an example.
I started to explain it, then realized I would need a lot of time to fully cover it.
This post is an attempt at that.
Serialization is the process of transforming an existing in-memory Java object to a stream of bytes. That stream can then be transferred over the network, or written to a file.
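As a minimal sketch of that definition, the following round trip turns an object into a byte array and back. The Greeting and RoundTrip classes are hypothetical names made up for this illustration; only the java.io types are part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class, used only for illustration.
class Greeting implements Serializable {
    final String text;

    Greeting(String text) {
        this.text = text;
    }
}

class RoundTrip {

    // Transforms an in-memory object into a stream of bytes.
    static byte[] toBytes(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from the bytes produced by toBytes().
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new Greeting("hello"));
        Greeting copy = (Greeting) fromBytes(data);
        System.out.println(copy.text); // prints "hello"
    }
}
```

The byte array could just as well be sent over a socket or written to a file; the in-memory streams only keep the sketch self-contained.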
Use-case(s)
I’ve no proof to back that up, but I believe that serialization was initially meant to transfer objects from one running JVM instance to another one. For example, a long time ago, EJBs were meant to cross JVM boundaries.
In order for an EJB to move from one JVM to another, it had to be serialized.
Hence, the first version of EJBs had to be Serializable.
It’s not the case anymore.
The only current serialization use-case I know about is the storage of session data between runs in the Tomcat servlet/JSP container: when Tomcat stops, a shutdown hook writes session data to disk. When it starts again, session data is read from disk, so that users still have access to their session after restart. In that regard, the process is pretty similar to the Windows hibernate feature.
Serializability is obviously not enforced by the API, as HttpSession.setAttribute() accepts a value of type Object.
However, it’s recommended that objects stored in the session implement Serializable to cover all bases.
Requirements
As stated above, Serializable is a marker interface:
it has no methods.
However, there are requirements, even if they are not related to interface implementation:
- An attribute of a Serializable class must either be:
  - A primitive type
  - A Serializable type
  - Marked transient, so that it won’t be serialized
- The first non-serializable class in a Serializable class hierarchy must offer a no-arg constructor. It will be used during the deserialization process.
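The requirements above can be sketched as follows. Vehicle, Car and RequirementsDemo are hypothetical classes made up for this illustration: Vehicle is deliberately not Serializable, so its no-arg constructor runs during deserialization and its own state is not restored from the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical parent that is NOT Serializable: its fields are not written,
// and its no-arg constructor runs during deserialization.
class Vehicle {
    String brand;

    Vehicle() {
        this.brand = "unknown";
    }

    Vehicle(String brand) {
        this.brand = brand;
    }
}

class Car extends Vehicle implements Serializable {
    int seats;                // primitive type: serialized
    String model;             // String is Serializable: serialized
    transient Thread worker;  // Thread is not Serializable: must be transient

    Car(String brand, String model, int seats) {
        super(brand);
        this.model = model;
        this.seats = seats;
        this.worker = new Thread(() -> { });
    }
}

class RequirementsDemo {

    // Serializes the car to a byte array, then reads it back.
    static Car roundTrip(Car car) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(car);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Car) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Car copy = roundTrip(new Car("Acme", "Roadster", 2));
        System.out.println(copy.model);  // prints "Roadster"
        System.out.println(copy.seats);  // prints "2"
        System.out.println(copy.brand);  // prints "unknown": set by Vehicle()
        System.out.println(copy.worker); // prints "null": transient, not restored
    }
}
```

Note that the Car constructor is not called during deserialization: only the no-arg constructor of Vehicle runs, and the Car fields are then filled in from the stream.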
Customizing serialization
Sometimes, an object has to be serialized, but its class cannot satisfy the attribute requirement above:
an attribute is of a type that is not Serializable and outside the developer’s control, e.g. InitialContext.
To overcome that issue, Java allows customizing the serialization process via 3 methods:
private void readObject(java.io.ObjectInputStream stream) throws IOException, ClassNotFoundException
private void writeObject(java.io.ObjectOutputStream stream) throws IOException
private void readObjectNoData() throws ObjectStreamException
While the first 2 methods are pretty self-explanatory, the last one deserves a description:
The readObjectNoData method is responsible for initializing the state of the object for its particular class in the event that the serialization stream does not list the given class as a superclass of the object being deserialized. This may occur in cases where the receiving party uses a different version of the deserialized instance’s class than the sending party, and the receiver’s version extends classes that are not extended by the sender’s version. This may also occur if the serialization stream has been tampered; hence, readObjectNoData is useful for initializing deserialized objects properly despite a "hostile" or incomplete source stream.
https://docs.oracle.com/javase/9/docs/api/java/io/ObjectInputStream.html
By design, all previous methods must have the private modifier,
so that they may not be overridden in sub-classes.
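Here is a minimal sketch of the first two methods. Endpoint stands in for a third-party type that is not Serializable, and Client and CustomDemo are equally hypothetical; defaultWriteObject() and defaultReadObject() are the regular ObjectOutputStream and ObjectInputStream methods that handle the non-transient fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a type that is not Serializable
// and outside the developer's control.
class Endpoint {
    final String host;
    final int port;

    Endpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

class Client implements Serializable {
    private final String name;
    private transient Endpoint endpoint; // skipped by default serialization

    Client(String name, Endpoint endpoint) {
        this.name = name;
        this.endpoint = endpoint;
    }

    // Writes the regular fields first, then the endpoint's parts by hand.
    private void writeObject(ObjectOutputStream stream) throws IOException {
        stream.defaultWriteObject();
        stream.writeUTF(endpoint.host);
        stream.writeInt(endpoint.port);
    }

    // Reads everything back in the exact order it was written.
    private void readObject(ObjectInputStream stream) throws IOException, ClassNotFoundException {
        stream.defaultReadObject();
        String host = stream.readUTF();
        int port = stream.readInt();
        endpoint = new Endpoint(host, port);
    }

    String describe() {
        return name + "@" + endpoint.host + ":" + endpoint.port;
    }
}

class CustomDemo {

    // Serializes the client to a byte array, then reads it back.
    static Client roundTrip(Client client) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(client);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Client) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Client copy = roundTrip(new Client("batch", new Endpoint("example.org", 8080)));
        System.out.println(copy.describe()); // prints "batch@example.org:8080"
    }
}
```

Without the two private methods, the endpoint of the deserialized Client would simply be null, because the field is transient.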
Externalizable
Externalizable is a specialization of Serializable that relies on interface implementation to customize serialization.
The de/serialization process will check if a Serializable is also an Externalizable.
In the latter case, it will call the interface’s writeExternal() and readExternal() methods.
If not, it will fall back to default serialization.
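A minimal sketch of an Externalizable class could look like the following. Coordinates and ExternalizableDemo are hypothetical names; the public no-arg constructor is required, because deserialization instantiates the class through it before calling readExternal().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class, used only for illustration.
class Coordinates implements Externalizable {
    private int x;
    private int y;

    // Required: called by the deserialization process before readExternal().
    public Coordinates() {
    }

    Coordinates(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // The class is fully responsible for what gets written...
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    // ...and for reading it back in the same order.
    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        x = in.readInt();
        y = in.readInt();
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}

class ExternalizableDemo {

    // Serializes the coordinates to a byte array, then reads them back.
    static Coordinates roundTrip(Coordinates coordinates) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(coordinates);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Coordinates) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Coordinates(3, 4))); // prints "(3, 4)"
    }
}
```

Compared to readObject() and writeObject(), the methods here are public and part of an interface, so sub-classes can override them.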
Class version
In the Tomcat session serialization scenario above, I made an implicit assumption:
that the Class of the object being serialized will be the same as the Class of the one being deserialized.
Although not frequent, that might not always be the case.
For example, Tomcat might have been stopped to update the webapp, and the class changed in the process.
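To handle that case, every Serializable class carries a version, held in a static final long serialVersionUID field. When the class doesn’t declare that field, the serialization runtime computes a default value from the structure of the class.
Any access modifier is allowed, but private should be preferred.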
The serialization process will write the serialVersionUID value along with the object.
During deserialization, the value will be compared to the one of the class currently on the classpath.
If both are different, deserialization will fail with an InvalidClassException.
There’s no guarantee that an unchanged class will get the same generated serialVersionUID across compiler versions and over time.
Hence, it’s recommended to write that value yourself, and change it only for incompatible class changes, e.g.:
- adding an instance method is considered compatible
- removing an attribute is not
Given that information, one may now understand why generating a random value with the IDE will fix the IDE warning but is utterly useless.
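To check which version will actually be used for a class, one can ask ObjectStreamClass. The sketch below assumes two hypothetical classes, Versioned and Unversioned, that differ only by the explicit field.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class that sets its version explicitly.
class Versioned implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
}

// Hypothetical class that relies on the computed default.
class Unversioned implements Serializable {
    String name;
}

class VersionDemo {

    // Returns the version that serialization associates with the given class.
    static long versionOf(Class<? extends Serializable> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(versionOf(Versioned.class));   // prints "1"
        // Computed from the class structure: the value is not meaningful by itself.
        System.out.println(versionOf(Unversioned.class));
    }
}
```

Starting at 1L and incrementing on each incompatible change keeps the value readable, whereas a random number carries no information about the history of the class.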
Conclusion
Java serialization is seldom necessary. However, when it is, it’s important to know about its finer points.