12 Replies Latest reply on Feb 7, 2006 12:12 PM by starksm64

    Binary Incompatibilities between 1.2.4 and 1.2.3

    brian.stansberry

      There are some wire-format incompatibilities between the 1.2.4/1.2.4SP1 releases of JBossCache and the earlier releases. The releases are API-compatible, so the later releases can be dropped in as replacements for earlier releases, but earlier and later releases cannot interoperate.

      Some details re: the binary incompatibilities so everyone is on the same page if we want to consider restoring compatibility.

      There are four known incompatibilities between the versions (please advise if you know more):

      1) Serialization of in-memory node state for state transfer. In 1.2.3, the root node of the cache, an instance of *class* Node, was serialized, with all children thereby being recursively serialized. In 1.2.4 Node became an interface, and the actual class was TreeNode. However, the two classes' internal data structures remain the same and serialization was done via read/writeExternal. It should be possible to restore compatibility if we were willing to rename classes so TreeNode becomes the interface and Node once again is the implementation class. I see no reference to the Node interface in any public API, so doing this won't break API compatibility. This is obviously ugly and would break any compatibility with the existing 1.2.4 releases, but it is an option.

      2) Overall assembly of state for state transfer. This is done differently in 1.2.4, but the code already provides for inter-version compatibility via a "StateTransferVersion" config attribute. If problem #1 were solved we could add versions of StateTransferGenerator/StateTransferIntegrator that work with the 1.2.3 binary format.

      3) Change in the read/writeExternal implementation of Fqn. This was just an optimization (I'm pretty sure a minor one) that can be rolled back.

      4) serialVersionUID incompatibilities. These can all be fixed.
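
      To illustrate points 3 and 4, here is a minimal sketch of a class that pins its serialVersionUID and controls its own wire format through read/writeExternal. PathKey is a hypothetical stand-in, not the real Fqn, and the byte layout shown (a short element count followed by UTF strings) is an assumption made for the example, not the format used by any JBossCache release.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a path-like key; NOT the real JBossCache Fqn.
class PathKey implements Externalizable {
    // Pinned explicitly so that recompiling or refactoring the class does not
    // change the stream identity that peers running other versions expect.
    private static final long serialVersionUID = 1L;

    private List<String> elements = new ArrayList<>();

    // Externalizable requires a public no-arg constructor.
    public PathKey() {
    }

    PathKey(List<String> elements) {
        this.elements = new ArrayList<>(elements);
    }

    List<String> elements() {
        return elements;
    }

    // Assumed layout: element count as a short, then each element as UTF.
    // Changing this layout is exactly the kind of change that breaks peers.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeShort(elements.size());
        for (String element : elements) {
            out.writeUTF(element);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        int count = in.readShort();
        elements = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            elements.add(in.readUTF());
        }
    }

    // Serializes a key to bytes using standard Java object serialization.
    static byte[] toBytes(PathKey key) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(key);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a key back from bytes produced by toBytes.
    static PathKey fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (PathKey) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      Two builds can exchange such an object only if both the serialVersionUID and the read/writeExternal layout agree, which is why either change alone is enough to break interoperability.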

        • 1. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
          manik

          The problem is that the read/writeExternal change in the TreeNode is a major optimisation to prevent a recursive writing of the entire tree structure when only specific nodes needed writing.

          • 2. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
            brian.stansberry

            Yes, the old Externalizable implementation was very inefficient. I wouldn't advocate going back to it as the default way of doing state transfer. But the StateTransferVersion config option and the related factory pattern for generating/integrating transferred state makes that unnecessary. If a > 1.2.3 instance of TreeCache wants to do a state transfer, I think it should by default use the more efficient approach. Only if the cache has its StateTransferVersion set to 123 will it use the old inefficient externalization technique. This would be something users would have to specifically configure to allow interoperability, at the cost of performance.

            I looked at the current TreeNode, and it doesn't implement Externalizable (or even Serializable). So, there is no other code in JBossCache that is depending on a particular serialization format for this class. If the old inefficient read/writeExternal were restored, the only use of it would be for 1.2.3-compatible state transfer. (But we should comment the hell out of the method to ensure no other usage creeps in!).
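
            A rough sketch of the version switch described above, assuming nothing about the real StateTransferGenerator/StateTransferIntegrator signatures. The StateGenerator interface, the format names and the factory method are all hypothetical and exist only to show the selection logic: the legacy format is used only when the configured version is exactly 123.

```java
// Hypothetical sketch of a version-keyed factory; none of these types are
// the actual JBossCache classes.
class StateTransferSketch {

    // Stand-in for whatever produces the transferred state.
    interface StateGenerator {
        String formatName();
    }

    // The value a user would have to configure explicitly to get 1.2.3 interop.
    static final short LEGACY_VERSION = 123;

    static StateGenerator generatorFor(short stateTransferVersion) {
        if (stateTransferVersion == LEGACY_VERSION) {
            // Old, inefficient recursive externalization, kept only for
            // 1.2.3-compatible state transfer.
            return () -> "recursive-externalizable";
        }
        // Default for everything else: the more efficient newer format.
        return () -> "streamlined";
    }
}
```

            For example, StateTransferSketch.generatorFor((short) 124) picks the efficient format, so the performance cost is paid only by caches explicitly configured with 123.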

            • 3. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
              brian.stansberry

              My purpose in starting this thread was to document the technical issues related to version incompatibility in order to facilitate policy discussions about what to do about it. But I'll go ahead and post a "policy" related comment :)

              I initially wasn't particularly in favor of trying to restore compatibility, since the fix is fairly ugly and doing it would leave the 1.2.4 releases "marooned", incompatible with the versions both before and after.

              But, if the 4.0 series is like 3.2, it could be up to another couple of years before we stop cutting releases on that branch. Having JBossCache stuck at version 1.2.3 in those releases would IMHO be a very bad thing for the health of the project and for sure will drive up support costs. If it was just a matter of waiting another few months until 5.0 comes out and then the issue goes away, I'd feel differently.

              • 4. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                starksm64

                We have to support 3.2.x for 2 years after the release of 5.0. The first step is to get a wire format that will allow for evolution while maintaining backward compatibility. If this can be done and restore backward compatibility, great. If it cannot, we can consider a one-time incompatibility to allow for improved behavior and supportability.

                • 5. Re: Binary Incompatibilities between 1.2.4 and 1.2.3

                  Scott, when you talk about this one-time incompatibility deal, you are not referring specifically to 3.2.x only, right? Just need clarification, since we are not planning to update 3.2.x with a new JBossCache release.

                  • 6. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                    manik

                    Perhaps a 'one-time' incompatibility to break the readExternal/writeExternal wire formats in 1.2.3? And since this would break anyway, stick with the Node-being-an-interface change as well? I don't foresee either of these changing again for the foreseeable future, but perhaps this is what we should discuss here as well.

                    • 7. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                      belaban

                      I have added this as an item to be discussed during our Neuchatel meeting in 2 weeks. I will schedule a conf call after the meeting to discuss our findings and suggest a strategy going forward.

                      • 8. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                        brian.stansberry

                        I was able to check out 1.2.4SP1 and resolve the four issues listed in my first post. Ran the "all-functionaltests" target from the test suite and saw no regressions. I then did some (successful) manual interoperability testing in JBossAS as follows:

                        1) Start an instance of 4.0.3SP1; it uses 1.2.3.1.
                        2) Drop the new jboss-cache.jar in /server/all/lib in my Branch_4_0 build module
                        3) Update the build module's tc5-cluster-service.xml to set attribute "StateTransferVersion" to "123".
                        4) Run the http session replication unit tests in the test suite with REPL_ASYNC. This launches 2 4.0.4RC1 instances, which saw and formed a cluster with the 4.0.3SP1 instance. Tests passed, saw no errors in the logs of the 4.0.3SP1 server.
                        5) Reconfigured and re-ran with REPL_SYNC. All OK.
                        6) 4.0.3SP1 server was still running, so started another 4.0.4RC1 instance. From the 4.0.3SP1 server it successfully received the left-over state from the unit tests.
                        7) Restarted the 4.0.3SP1 server. It successfully received state from the 4.0.4 instance.

                        This isn't a full, formal interoperability test, but it shows the basic functions working fine between the versions.

                        • 9. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                          belaban

                          Great ! When you're done can we get doco on this (in docbook and wiki format) ?
                          Thanks !

                          • 10. Re: Binary Incompatibilities between 1.2.4 and 1.2.3

                            This is great!

                            1. Have you tried to run the AS compatibility JUnit test?

                            2. Regarding the flag: since it also covers the Fqn read/writeExternal, can we use a more generic name, say, "ReplicationVersion"?

                            -Ben

                            • 11. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                              brian.stansberry

                              1) Not yet, although I expect no problems there.

                              2) Good idea, although the flag already exists in 1.2.4SP1. But I guess there is no harm in renaming it. Right now values are string versions of shorts: "123", "124", "1241", "130" etc. Could replace w/ something more string-like, "1_2_3", "1_2_4_SP1" etc., and convert to short internally. Let me know if you want that.
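
                              A quick sketch of what the internal conversion could look like, assuming single-digit version components and an optional "SP<n>" suffix that contributes one extra digit. VersionCodecSketch and its parsing rules are hypothetical; the only things taken from the existing values are the pairings "1_2_3" to 123 and "1_2_4_SP1" to 1241.

```java
// Hypothetical helper; the token rules here are assumptions for illustration.
class VersionCodecSketch {

    // Compacts a string such as "1_2_4_SP1" into a short such as 1241.
    static short toShort(String version) {
        StringBuilder digits = new StringBuilder();
        for (String token : version.split("_")) {
            // An "SP1" token contributes just its number; others are used as-is.
            String number = token.startsWith("SP") ? token.substring(2) : token;
            // Only single digits are supported, so each component maps to
            // exactly one decimal place in the compacted value.
            if (!number.matches("\\d")) {
                throw new IllegalArgumentException("Unsupported version token: " + token);
            }
            digits.append(number);
        }
        return Short.parseShort(digits.toString());
    }
}
```

                              With these rules "1_3_0" becomes 130, matching the current short values, while anything with a multi-digit component is rejected rather than silently mis-encoded.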

                              • 12. Re: Binary Incompatibilities between 1.2.4 and 1.2.3
                                starksm64

                                See the updated versioning conventions:
                                http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossProductVersioning

                                We should be using a string that is compatible with these conventions so that any version manipulation utilities can be applied. How a version string gets compacted to a short is one such utility function that needs consistent handling.