24 Replies Latest reply on Jun 9, 2006 8:35 AM by galder.zamarreno

    Improving marshalling in JBossCache

    manik

      This is related to JBCACHE-198, JBCACHE-501, JBCACHE-505.

      The goal of this was to drastically reduce the message size when performing RPC in JBoss Cache. The current mechanism is to use JGroups MethodCalls and RpcDispatcher blocks.

      My prototype uses my own implementation of an rpc dispatcher (called RpcEngine for want of a better name), which extends JGroups' MessageDispatcher.

      This in turn would call TreeCacheMarshaller to marshall the MethodCall into a byte stream and this is where we can work on size efficiency.

      1. Instead of serializing MethodCall, use 1 byte for method id (maps to a list of known Methods in TreeCache which may be called remotely), 1 int representing numArgs, and the arg array, serialized individually using TreeCacheMarshaller again.

      2. Fqns, GlobalTransactions and other *known* types to use a single byte magic number to represent type, and contents efficiently packed into bytes.

      3. Simple Java types (Integers, Strings) to use a similar technique of magic numbers to represent type followed by content.

      4. User defined types, generic objects to be serialized using JBossSerialization (pending Clebert's JBossSerialization byte array size fixes)

      To provide some degree of backward compatibility, I propose a 'compat' mode in the configs or perhaps a JVM environment variable (the way we do with Fqn externalization) which will result in simply reverting to the original RpcDispatcher code rather than the RpcEngine.
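      A rough sketch of how points 1 to 3 could look on the wire. This is illustrative only and is not the prototype code: the class name CompactMarshallerSketch, the magic number values and the Call holder are all assumptions, and unknown types are simply rejected here rather than handed to JBossSerialization as in point 4.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Illustrative only: a hypothetical compact wire format, not the real TreeCacheMarshaller. */
final class CompactMarshallerSketch {
    // Magic numbers for argument types; the values are arbitrary assumptions.
    static final byte TYPE_NULL = 0;
    static final byte TYPE_INTEGER = 1;
    static final byte TYPE_STRING = 2;
    static final byte TYPE_BOOLEAN = 3;

    /** Decoded call: the method id plus its arguments. */
    static final class Call {
        final byte methodId;
        final Object[] args;
        Call(byte methodId, Object[] args) { this.methodId = methodId; this.args = args; }
    }

    /** Layout: 1 byte method id, 1 int numArgs, then each arg as magic byte + content. */
    static byte[] marshall(byte methodId, Object... args) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(methodId);
            out.writeInt(args.length);
            for (Object arg : args) {
                writeArg(out, arg);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeArg(DataOutputStream out, Object arg) throws IOException {
        if (arg == null) {
            out.writeByte(TYPE_NULL);
        } else if (arg instanceof Integer) {
            out.writeByte(TYPE_INTEGER);
            out.writeInt((Integer) arg);
        } else if (arg instanceof String) {
            out.writeByte(TYPE_STRING);
            out.writeUTF((String) arg);
        } else if (arg instanceof Boolean) {
            out.writeByte(TYPE_BOOLEAN);
            out.writeBoolean((Boolean) arg);
        } else {
            // The proposal would pass unknown types to JBossSerialization; this sketch rejects them.
            throw new IOException("Unsupported type in this sketch: " + arg.getClass().getName());
        }
    }

    /** Reads back exactly what marshall() wrote. */
    static Call unmarshall(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            byte methodId = in.readByte();
            Object[] args = new Object[in.readInt()];
            for (int i = 0; i < args.length; i++) {
                byte type = in.readByte();
                switch (type) {
                    case TYPE_NULL:    args[i] = null; break;
                    case TYPE_INTEGER: args[i] = in.readInt(); break;
                    case TYPE_STRING:  args[i] = in.readUTF(); break;
                    case TYPE_BOOLEAN: args[i] = in.readBoolean(); break;
                    default: throw new IOException("Unknown type magic number: " + type);
                }
            }
            return new Call(methodId, args);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] argv) {
        byte[] wire = marshall((byte) 3, "/a/b", 42);
        System.out.println(wire.length + " bytes on the wire"); // 17 bytes on the wire
    }
}
```

      In this sketch a call with one short String and one Integer argument comes to 17 bytes: 1 for the method id, 4 for numArgs, 7 for the String (magic byte, 2-byte length, 4 characters) and 5 for the Integer.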


        • 1. Re: Improving marshalling in JBossCache
          manik

          For even better serialization efficiency (at the cost of CPU cycles), we could introduce some reference counting logic. This needs design first - Ben, I believe you had some ideas here from a PojoCache perspective.

          • 2. Re: Improving marshalling in JBossCache
            timfox

             

            "manik.surtani@jboss.com" wrote:
            This is related to JBCACHE-198, JBCACHE-501, JBCACHE-505.

            The goal of this was to drastically reduce the message size when performing RPC in JBoss Cache. The current mechanism is to use JGroups MethodCalls and RpcDispatcher blocks.

            My prototype uses my own implementation of an rpc dispatcher (called RpcEngine for want of a better name), which extends JGroups' MessageDispatcher.

            This in turn would call TreeCacheMarshaller to marshall the MethodCall into a byte stream and this is where we can work on size efficiency.

            1. Instead of serializing MethodCall, use 1 byte for method id (maps to a list of known Methods in TreeCache which may be called remotely), 1 int representing numArgs, and the arg array, serialized individually using TreeCacheMarshaller again.

            2. Fqns, GlobalTransactions and other *known* types to use a single byte magic number to represent type, and contents efficiently packed into bytes.

            3. Simple Java types (Integers, Strings) to use a similar technique of magic numbers to represent type followed by content.

            4. User defined types, generic objects to be serialized using JBossSerialization (pending Clebert's JBossSerialization byte array size fixes)

            To provide some degree of backward compatibility, I propose a 'compat' mode in the configs or perhaps a JVM environment variable (the way we do with Fqn externalization) which will result in simply reverting to the original RpcDispatcher code rather than the RpcEngine.




            FWIW This is basically what we do in JBoss Messaging for RPC from client to server.

            Not using serialization gave us a massive performance boost (about 500%) since we're not passing about 150 bytes of unnecessary rubbish (class version information etc) on every call.

            Sending 10 or 20 bytes as opposed to 150 bytes when you're making 10000 invocations per sec makes a massive difference.

            Now we only use serialization (jboss serialization) for user defined objects not known at compile time.

            We also add a version byte at the beginning of the stream so we can provide a compatibility guarantee between different versions of clients and servers. :)
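            A minimal sketch of the version byte idea, for illustration only. This is not the JBoss Messaging code: the class name VersionedStreamSketch, the CURRENT_VERSION value and the rule for which versions are accepted are all assumptions.

```java
import java.util.Arrays;

/** Illustrative only: prefixes each message with a single wire-format version byte. */
final class VersionedStreamSketch {
    // Hypothetical: the newest wire format this node understands.
    static final byte CURRENT_VERSION = 2;

    /** Puts the version byte in front of the payload. */
    static byte[] wrap(byte version, byte[] payload) {
        byte[] framed = new byte[payload.length + 1];
        framed[0] = version;
        System.arraycopy(payload, 0, framed, 1, payload.length);
        return framed;
    }

    /** Checks the leading version byte and returns the payload that follows it. */
    static byte[] unwrap(byte[] framed) {
        if (framed.length == 0) {
            throw new IllegalArgumentException("Empty message");
        }
        byte version = framed[0];
        // Assumption: a node can read its own format and every older one.
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IllegalArgumentException("Unsupported wire version " + version);
        }
        return Arrays.copyOfRange(framed, 1, framed.length);
    }
}
```

            The cost is one extra byte per message, and a receiver can reject (or specially handle) a format it does not know before it tries to decode anything else.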

            • 3. Re: Improving marshalling in JBossCache
              brian.stansberry

               

              "manik.surtani@jboss.com" wrote:

              To provide some degree of backward compatibility, I propose a 'compat' mode in the configs or perhaps a JVM environment variable (the way we do with Fqn externalization) which will result in simply reverting to the original RpcDispatcher code rather than the RpcEngine.


              TreeCache already exposes a ReplicationVersion property that was meant for controlling these kinds of compatibility issues.

              The exposed property is a String, but internally it's carried as a short, with conversion via the Version class.

              This short can be included at the start of replicated byte arrays (that's what we do in state transfer), although I'm starting to question the purpose of that for the JBC use case. Including it in the message allows a recipient configured for replication version 1.4.0 to understand a message sent by a sender configured for 1.3.0. But if the 1.4.0 cache sends a message, it is going to be in 1.4.0 format, which the 1.3.0 server won't understand. So true two-way communication between servers configured for different replication versions won't work anyway. Therefore it makes sense for servers to assume all messages are encoded using the version for which they themselves are configured, and there is no need to include the version short in each message.
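              To illustrate the String-to-short conversion, here is one way a version such as "1.4.0" could be packed into a short. This is a sketch under stated assumptions and not the actual Version class: the class name ReplicationVersionSketch, the method names and the bit layout are all hypothetical, and the real encoding may differ.

```java
/** Illustrative only: a hypothetical packing of "major.minor.patch" into a short. */
final class ReplicationVersionSketch {

    /** Assumed layout: 4 bits major, 5 bits minor, 6 bits patch. */
    static short toShort(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.patch but got " + version);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);
        if (major < 0 || major > 15 || minor < 0 || minor > 31 || patch < 0 || patch > 63) {
            throw new IllegalArgumentException("Version component out of range: " + version);
        }
        return (short) ((major << 11) | (minor << 6) | patch);
    }

    /** Reverses toShort(). */
    static String fromShort(short packed) {
        int major = (packed >> 11) & 0x0F;
        int minor = (packed >> 6) & 0x1F;
        int patch = packed & 0x3F;
        return major + "." + minor + "." + patch;
    }

    public static void main(String[] args) {
        System.out.println(toShort("1.4.0")); // 2304
        System.out.println(toShort("1.3.0")); // 2240
    }
}
```

              With a layout like this, later versions compare as larger shorts, so a node could compare a version read from a stream (as in state transfer) against its own configured version with a plain integer comparison.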



              • 4. Re: Improving marshalling in JBossCache

                1. In TreeCacheMarshaller, in addition to serializing the original MethodCall, we also extract the Fqn for use in determining the corresponding classloader. This is not needed unless you need a scoped classloader. Do you need to have a special flag inside TreeCacheMarshaller?

                2. As for PojoCache, I will wait for your prototype check-in; I will then need to come up with a marshalling scheme sitting on top (overriding TreeCacheMarshaller).

                • 5. Re: Improving marshalling in JBossCache
                  brian.stansberry

                  For a flag in TreeCacheMarshaller, we have the existing UseMarshalling flag. Currently it controls whether we use TreeCacheMarshaller at all, but I guess from now on we always will. So the flag could be used to control whether we drop into the Region-based logic.

                  Pro: existing flag continues to function with its current fundamental purpose -- triggering unmarshalling w/ a region-based classloader.

                  Con: UseMarshalling as an attribute name isn't really accurate.

                  • 6. Re: Improving marshalling in JBossCache
                    manik

                     


                    Con: UseMarshalling as an attribute name isn't really accurate.


                    Change the name of the flag?

                    Also, I've noticed that the TreeCacheMarshaller converts Fqns to Strings before marshalling - surely this adds constraints that the Fqn can only contain String type components?



                    • 7. Re: Improving marshalling in JBossCache

                      This is kind of a bootstrapping issue. Before we do any marshalling, we need to know the corresponding classloader first. And we get the classloader from the fqn string corresponding to that region. If fqn is not a primitive type, it would be loaded by the system classloader. This is not right.

                      • 8. Re: Improving marshalling in JBossCache
                        brian.stansberry

                         

                        "manik.surtani@jboss.com" wrote:
                        Change the name of the flag?


                        Sounds good, although we should probably deprecate the old flag and keep it around -- say 'til 2.0 when we change lots of stuff anyway? Not a long deprecation period, but the flag has only been around since 1.2.4.

                        • 9. Re: Improving marshalling in JBossCache
                          manik

                          Yeah - ok with me

                          • 10. Re: Improving marshalling in JBossCache
                            manik

                            What shall we call this flag? UseRegionBasedMarshalling? UseMarshalling will be a deprecated synonym for the same thing, and we will always use the EnhancedTreeCacheMarshaller (I may roll this code into the existing TreeCacheMarshaller, since there is no point having 2 marshallers except for comparison/POC).
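                            A sketch of how the deprecated synonym could be wired up, assuming a plain JavaBean-style configuration class. MarshallingConfigSketch is a hypothetical name for illustration; the real attribute lives on TreeCache and may be implemented differently.

```java
/** Illustrative only: old flag kept as a deprecated alias for the new one. */
final class MarshallingConfigSketch {
    private boolean useRegionBasedMarshalling;

    public boolean getUseRegionBasedMarshalling() {
        return useRegionBasedMarshalling;
    }

    public void setUseRegionBasedMarshalling(boolean useRegionBasedMarshalling) {
        this.useRegionBasedMarshalling = useRegionBasedMarshalling;
    }

    /** @deprecated use {@link #getUseRegionBasedMarshalling()} instead. */
    @Deprecated
    public boolean getUseMarshalling() {
        return getUseRegionBasedMarshalling();
    }

    /** @deprecated use {@link #setUseRegionBasedMarshalling(boolean)} instead. */
    @Deprecated
    public void setUseMarshalling(boolean useMarshalling) {
        // Both names share one field, so existing configs keep working unchanged.
        setUseRegionBasedMarshalling(useMarshalling);
    }
}
```

                            Because both attribute names read and write the same field, an old config that sets UseMarshalling behaves exactly like a new one that sets UseRegionBasedMarshalling.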

                            • 11. Re: Improving marshalling in JBossCache
                              brian.stansberry

                              Sounds good.

                              • 12. Re: Improving marshalling in JBossCache
                                manik

                                 


                                This is kind of a bootstrapping issue. Before we do any marshalling, we need to know the corresponding classloader first. And we get the classloader from the fqn string corresponding to that region. If fqn is not a primitive type, it would be loaded by the system classloader. This is not right.


                                Doesn't this cause problems for non-String based Fqns?



                                • 13. Re: Improving marshalling in JBossCache

                                  No, why? We only use fqn.toString() as an identifier for the region but fqn itself can be an object. Of course, toString() would need to be unique.
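                                  To make this concrete, here is a sketch of a region lookup keyed on the Fqn's string form. It is illustrative only: RegionClassLoaderSketch and its methods are hypothetical names, the Fqn is modelled as any Object with a path-like toString(), and walking up to parent paths is an assumption about how a region would be matched.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: maps region identifiers (fqn.toString()) to classloaders. */
final class RegionClassLoaderSketch {
    private final Map<String, ClassLoader> regions = new HashMap<>();

    /** Registers a classloader for a region; only the string form of the Fqn is used as the key. */
    void registerClassLoader(Object regionFqn, ClassLoader loader) {
        regions.put(regionFqn.toString(), loader);
    }

    /** Finds the classloader of the closest enclosing region, or null if there is none. */
    ClassLoader findClassLoader(Object fqn) {
        String path = fqn.toString();
        while (!path.isEmpty()) {
            ClassLoader loader = regions.get(path);
            if (loader != null) {
                return loader;
            }
            // Assumption: '/' separates Fqn components, so drop the last one and retry.
            int slash = path.lastIndexOf('/');
            path = slash <= 0 ? "" : path.substring(0, slash);
        }
        return null;
    }
}
```

                                  Since only the string form is used for the lookup, it can be read from the stream before any user classes are loaded, and the Fqn components themselves can then be unmarshalled with the region's classloader.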

                                  • 14. Re: Improving marshalling in JBossCache
                                    manik

                                    True, good point.
