Is Serialization Slow? Debunk another Java urban myth with JBoss Serialization
Well, if the past is any guide in the short history of J2EE, conventional wisdom is often wrong. This one is no exception. As I will show with JBoss Serialization, you can in fact make object copying a lot faster (more than 70% faster, to be precise) and hence drastically boost the performance of call-by-value operations in your applications.
First, I must note that object serialization is not just about I/O. Serialization is about transformation: converting objects into bytes for easy transportation. The only thing that makes ObjectOutputStream an I/O operation is the fact that it extends an I/O class. Everything happening inside ObjectOutputStream is about reflection, and about converting objects to bytes and back again.
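To make the transformation concrete, here is a minimal sketch of the conventional call-by-value copy that relies only on the standard java.io classes. The class name SerializationCopy and its copy method are hypothetical helpers written for this illustration; they are not part of JBoss Serialization. Note that no real I/O takes place: the object is turned into a byte array in memory and rebuilt from it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of JBoss Serialization.
class SerializationCopy {

    // Deep-copies a Serializable object by writing it to an in-memory
    // byte array and reading a new instance back from those bytes.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>(Arrays.asList("a", "b"));
        ArrayList<String> copied = copy(names);
        // The copy has equal content but is a distinct object.
        System.out.println(copied.equals(names) && copied != names);
    }
}
```

Every field of every object in the graph goes through the byte array twice in this approach, which is the cost the rest of this article is concerned with.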
Now, to improve serialization performance, the interesting questions are: What if you could have intermediate metadata to do more than just send bytes across the stream? What if you only wanted to copy an object between different class loaders (or different applications inside the same VM)?
I had that need, and created such metadata as a way to copy objects between different applications without actually converting them to byte arrays. Doing this saved a lot of CPU time on call-by-value operations. The new high-performance Java object serialization framework is known as JBoss Serialization. It is now used inside the JBoss application server to support the JBoss EJB3 container implementation.
The core concept behind JBoss Serialization is "smart cloning": a reflection layer that captures properties into a DataContainer (the metadata), using every aspect of the Serialization specification except one, the wire protocol, since there is no wire in that case. It’s much faster to get a reference to each final and immutable object (Strings, for example) than to convert them individually to byte arrays. For example, this is what happens every time you send an integer over the wire (or into the temporary byte array used to do a copy):
Byte1 = (someInteger >>> 24) & 0xFF
Byte2 = (someInteger >>> 16) & 0xFF
Byte3 = (someInteger >>> 8) & 0xFF
Byte4 = (someInteger >>> 0) & 0xFF
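As a runnable sketch of the shifts above, the following splits an int into four bytes, most significant byte first, and reassembles it. This is the same byte order that java.io.DataOutputStream.writeInt produces. The class and method names (IntBytes, toBytes, fromBytes) are hypothetical and exist only for this illustration.

```java
// Hypothetical helper for illustration only.
class IntBytes {

    // Splits an int into four bytes, most significant byte first.
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) ((value >>> 24) & 0xFF),
            (byte) ((value >>> 16) & 0xFF),
            (byte) ((value >>> 8) & 0xFF),
            (byte) (value & 0xFF)
        };
    }

    // Rebuilds the int; each byte is masked so its sign bit does not leak
    // into the higher bits when it is widened to int.
    static int fromBytes(byte[] bytes) {
        return ((bytes[0] & 0xFF) << 24)
             | ((bytes[1] & 0xFF) << 16)
             | ((bytes[2] & 0xFF) << 8)
             | (bytes[3] & 0xFF);
    }

    public static void main(String[] args) {
        int original = 0x12345678;
        int restored = fromBytes(toBytes(original));
        System.out.println(restored == original);
    }
}
```

Four shifts and masks on the way out and four more on the way in is cheap for a single integer, but the work is repeated for every primitive field of every object being copied.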
Smart cloning instead uses a DataContainer (through java.io interfaces like DataOutput and DataInput), and reuses the entire integer between two different classes or class loaders. It also uses other parts of Serialization like Externalization, writeReplace, and private methods such as writeObject and readObject (all part of the Serialization Specification).
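The idea can be sketched with a toy container that stores values in slots rather than bytes. Everything here is an assumption made for illustration: SketchContainer and LabeledCount are hypothetical names, the container does not implement DataOutput or DataInput, and it uses no reflection, so it is far simpler than the real DataContainer. It only shows the two points made above: the integer is kept whole, and an immutable String is passed along by reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the DataContainer from JBoss Serialization.
class SketchContainer {
    private final List<Object> slots = new ArrayList<>();
    private int readIndex = 0;

    // The int is stored whole (boxed); it is never split into four bytes.
    void writeInt(int value) { slots.add(value); }

    // Strings are immutable, so the reference itself is stored and reused.
    void writeString(String value) { slots.add(value); }

    int readInt() { return (Integer) slots.get(readIndex++); }

    String readString() { return (String) slots.get(readIndex++); }
}

// Hypothetical value class that writes and reads its own fields in order.
class LabeledCount {
    final int count;
    final String label;

    LabeledCount(int count, String label) {
        this.count = count;
        this.label = label;
    }

    void writeTo(SketchContainer out) {
        out.writeInt(count);
        out.writeString(label);
    }

    static LabeledCount readFrom(SketchContainer in) {
        return new LabeledCount(in.readInt(), in.readString());
    }

    // Copies the object through the container without creating a byte array.
    static LabeledCount cloneViaContainer(LabeledCount original) {
        SketchContainer container = new SketchContainer();
        original.writeTo(container);
        return readFrom(container);
    }
}
```

In this sketch the copy is a new LabeledCount instance, while its label field points at the very same String object as the original, since nothing can modify that String.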
Originally, I started JBoss Serialization as a way to smart clone objects, but then I realized that I could easily save the DataContainer to a regular stream (like saving the actual state of a transformation). I expected this to be as expensive as serialization due to the data transformations, but to my surprise, in most cases JBoss Serialization was faster than Java Serialization (about 15% faster on serialization over the wire, and 70% faster on call-by-value operations). The result is a really nice project that can be used to copy objects and to serialize objects (even non-serializable objects now, as it can ignore the tag interface if told to do so).
JBoss Serialization also avoids synchronization bottlenecks on the metadata, which speeds up metadata discovery significantly.
So, if you need to copy an object, don’t use ObjectOutputStream any more. Use the JBossObjectOutputStream.smartClone method instead.
Or, if you want to serialize something non-serializable, you might also want to try JBoss Serialization. All you have to do is:
ObjectOutputStream out = new JBossObjectOutputStream(someOutput);
ObjectInputStream inp = new JBossObjectInputStream(someInput);
Increasing call-by-value performance by 70% with a simple refactor. That is the power of JBoss Serialization.
To learn more, you can refer to the JBoss Serialization project page: