7 Replies Latest reply on Apr 9, 2018 5:44 AM by rajiv.shivane Branched from an earlier discussion.

    Could we optimize the Helper.linkMap API?

    adinn

      Hi Rajiv,

       

      rajiv.shivane  wrote:

      Now that you mention switching from a HashMap to field, it reminds me of a feature request. I will start a new thread about it, but here is roughly what I was thinking:

      In our usage, there are many times a RuleHelper wants to store some properties against an object that was instrumented. For example when a user does DataSource.getConnection(), we want to store the Datasource.name onto the connection, so that later when we enter/exit methods on the Connection object the RuleHelper has access to the DataSource name. Today we keep these properties against the object in a WeakHashMap. It would be a lot more convenient if we added a HashMap field to all the classes we do bytecode transformation and the RuleHelpers had access to this HashMap. That way it would be simpler, performant, less prone to GC errors etc

      Have you looked at the linkMap API in the standard Helper? I think this ought to do what you want. The basic API is

       

      public void link(Object key, Object value)  adds a link from key to value in the default LinkMap --- equivalent to calling link("default", key, value)

      public Object linked(Object key) retrieves the value linked by key in the default LinkMap or null if no such key exists --- equivalent to calling linked("default", key)

      public void link(Object linkMapId, Object key, Object value) inserts a link from key to value in the LinkMap identified by linkMapId, creating the LinkMap if needed

      public Object linked(Object linkMapId, Object key) retrieves the value linked by key in the LinkMap identified by linkMapId or null if no such LinkMap or key is found

       

      So, if I am understanding your requirement correctly you could use the following rule to establish the link:

       

      RULE label Connection with DataSource name
      CLASS DataSource
      METHOD getConnection
      AT RETURN
      DO link($!, $0.name)
      ENDRULE

       

      Other rules which have a handle on the connection (say it is parameter or local var $connection) would use a rule (BIND) var to downcast the retrieved link to a String

       

      RULE use connection data source name
      . . .
      BIND name : String = linked($connection)
      DO . . .
      ENDRULE

       

      The default LinkMap is often all you need. However, occasionally you need to create multiple links from the same object and that is where alternative LinkMaps become useful. Imagine you sometimes want to retrieve the dataSource and other times just want the name. In that case you would install the two values in two different named LinkMaps

       

      . . .
      DO link("dataSourceName", $!, $0.name);
         link("dataSource", $!, $0);
      ENDRULE

       

      You would then pass the appropriate LinkMap id at the point of retrieval

       

      Rule use connection data source name
      . . .
      BIND name : String = linked("dataSourceName", $connection)
      DO . . .
      ENDRULE

       

      or

       

      Rule use connection data source 
      . . .
      BIND dataSource : org.my.package.DataSource = linked("dataSource", $connection)
      DO . . .
      ENDRULE

       

      I hope that is what you need. If not perhaps you could explain your use case in a bit more detail.

       

      regards,

       

       

      Andrew Dinn

        • 1. Re: Could we optimize the Helper.linkMap API?
          rajiv.shivane

          Hi Andrew, you nailed the use-case.

          The issue is that Helper.link has a synchronized on a static final field. There is lot of thread contention on this, especially since we are instrumenting a lot of methods and there is a lot of multi-threading going on in the app server.

           

          For performance reasons, we have a variation of your linking so instead of link(propertyName, targetObject, propertyValue) we have a very similar link(targetObject, propertyName, propertyValue). This reduces the contention (there could be multiple threads looking for the same property name, but there is mostly a single thread working on a targetObject).

           

          However, the ideally when we instrumented targetClass, if we added a field called _bytemanProperties to the class, then the Helper could just lookup the property value in this field instead of maintaining a Map in the Helper with a VM wide synchronized scope.

           

          What do you think?

           

          Thanks,

          Rajiv

          • 2. Re: Could we optimize the Helper.linkMap API?
            adinn

            Hi Rajiv,

            rajiv.shivane  wrote:

             

            Hi Andrew, you nailed the use-case.

            The issue is that Helper.link has a synchronized on a static final field. There is lot of thread contention on this, especially since we are instrumenting a lot of methods and there is a lot of multi-threading going on in the app server.

             

            For performance reasons, we have a variation of your linking so instead of link(propertyName, targetObject, propertyValue) we have a very similar link(targetObject, propertyName, propertyValue). This reduces the contention (there could be multiple threads looking for the same property name, but there is mostly a single thread working on a targetObject).

             

            Yes, of course, these two alternatives are completely equivalent in terms of function -- a double indirection can be linked and traversed in either order to the same destination -- but the performance may well be significantly worse when the key for the primary indirection vector table is shared between threads.

             

            Unfortunately, some form of locking really is needed to ensure that linkMaps and links are created and removed atomically. It would probably be better if the HashMap in field linkMaps, the primary linkMap lookup table, was replaced with a ConcurrentHashMap. That would avoid the need to synchronize on this field value when doing the map lookup. The striping in ConcurrentHashMap should do a better job of dealing with any contention here. Of course, inserting a new LinkMap would need to be changed to use putIfAbsent but that is quite doable. Perhaps this might help you.

             

            Perhaps you could prototype this by tweaking the code in Helper (or you could maybe write your own subclass of helper which provides an alternative version of the link API using slightly different names)

             

            rajiv.shivane  wrote:

             

            However, the ideally when we instrumented targetClass, if we added a field called _bytemanProperties to the class, then the Helper could just lookup the property value in this field instead of maintaining a Map in the Helper with a VM wide synchronized scope.

             

            What do you think?

             

            Byteman is very careful not to modify target class structure. That's because it is only an option when instrumenting a class that is being loaded for the first time. If you try to retrasform an existing class and change the structure the retransform wil fail with an exception. So, if Byteman had gone down the route of relying on stuctural changes to target classes then

             

            1. rules provided at startup would inject into app classes successfully but they might well fail to work when the agent was loaded dynamically after app startup
            2. rules might well fail to work when injected into classes loaded via the bootstrap or system classpath that were already loaded before the Byteman agent was installed  (n.b. this is a problem even when the agent load happens from the command line -- many classes like Thread, String, List etc are already present).

             

            So, there is no mechanism in Byteman to specify how to transform target classes and I don't really see how to implement one.

            1 of 1 people found this helpful
            • 3. Re: Could we optimize the Helper.linkMap API?
              rajiv.shivane

              adinn  wrote:

              Yes, of course, these two alternatives are completely equivalent in terms of function -- a double indirection can be linked and traversed in either order to the same destination -- but the performance may well be significantly worse when the key for the primary indirection vector table is shared between threads.

              True, choosing the primary key is critical to reduce thread content on the lock. In the example you listed, every worker thread in the server is likely to do a datasource.getConnection. So in our case every thread is likely to do "link" on "datasourceName". Hence our choice to use the datasource object as the primary key. Only one worker thread accesses a datasource object at a time, the data source instances are not shared among threads. ie we are using link($!, "dataSourceName", $0.name); instead of link("dataSourceName", $!, $0.name);

               

              adinn  wrote:

              Perhaps you could prototype this by tweaking the code in Helper (or you could maybe write your own subclass of helper which provides an alternative version of the link API using slightly different names)

              We are using custom Rule Helper with a synchronizedMap. That reduces the amount of code within the synchronized block like so:

               

              private static final Map<Object, HashMap<Object, Object>> linkMaps = Collections.synchronizedMap(new HashMap<>());

              public Object link(Object mapName, Object name, Object value)

              {

                HashMap<Object, Object> map = linkMaps.computeIfAbsent(mapName, k -> new HashMap<>());

                 return map.put(name, value);

              }

               

              adinn  wrote:

              Byteman is very careful not to modify target class structure. That's because it is only an option when instrumenting a class that is being loaded for the first time. If you try to retrasform an existing class and change the structure the retransform wil fail with an exception. So, if Byteman had gone down the route of relying on stuctural changes to target classes then

               

              1. rules provided at startup would inject into app classes successfully but they might well fail to work when the agent was loaded dynamically after app startup
              2. rules might well fail to work when injected into classes loaded via the bootstrap or system classpath that were already loaded before the Byteman agent was installed  (n.b. this is a problem even when the agent load happens from the command line -- many classes like Thread, String, List etc are already present).

               

              So, there is no mechanism in Byteman to specify how to transform target classes and I don't really see how to implement one.

              I was hoping we could reduce the contention further (eliminate the global linkMaps and make it an Object field (on the Connection object in our example) .. it would still have to be synchronized map, but number of threads accessing the field on the connection object is always 1.

               

              I see your point about supporting dynamic transformation after app startup.

               

              Thanks,

              Rajiv

              • 4. Re: Could we optimize the Helper.linkMap API?
                adinn

                Hi Rajiv,

                We are using custom Rule Helper with a synchronizedMap. That reduces the amount of code within the synchronized block like so:

                 

                private static final Map<Object, HashMap<Object, Object>> linkMaps = Collections.synchronizedMap(new HashMap<>());

                public Object link(Object mapName, Object name, Object value)

                {

                  HashMap<Object, Object> map = linkMaps.computeIfAbsent(mapName, k -> new HashMap<>());

                   return map.put(name, value);

                }

                 

                I'm sorry but I cannot adopt this directly as is because it employs an API only introduced in JDK8 (method computeIfAbsent and interface Function). Byteman's builtin Helper class still has to function on JDK6 and JDK7.

                 

                What I can do is retype field Helper.linkMaps to be a ConcurrentHashMap, avoiding synchronization for accesses to the outer map (even in JDK6 I can rely on ConcurrentHashMap.putIfAbsent when I need to add a new LinkMap). I will still have to synchronize on the nested maps when doing a get, put, etc (well, I could replace them with ConcurrentHashMap too, but that might actually have some unexpected costs in space and get/put times so I'd prefer to leave them as HashMap until I find it is better to switch them). This change implies a slightly different exclusion strategy but it should not mean that any previously unobtainable interleavings will occur or be barred.

                 

                I have a test version of this ready to push as a SNAPSHOT. Would you like to try comparing it against your own version?

                 

                regards,

                 

                 

                Andrew Dinn

                • 5. Re: Could we optimize the Helper.linkMap API?
                  rajiv.shivane

                  Hi Andrew,

                  adinn  wrote:

                   

                  What I can do is retype field Helper.linkMaps to be a ConcurrentHashMap, avoiding synchronization for accesses to the outer map (even in JDK6 I can rely on ConcurrentHashMap.putIfAbsent when I need to add a new LinkMap). I will still have to synchronize on the nested maps when doing a get, put, etc (well, I could replace them with ConcurrentHashMap too, but that might actually have some unexpected costs in space and get/put times so I'd prefer to leave them as HashMap until I find it is better to switch them). This change implies a slightly different exclusion strategy but it should not mean that any previously unobtainable interleavings will occur or be barred.

                  I agree, you still need to synchronize access to the nested maps to be consistent with current behavior. The nested map is synchronized even in our case, though most of the times only one thread access it. It helps reduce contention as the synchronized blocks are much smaller.

                   

                  Turns out it will need some work on our side to try out this change. The outer-map in our case is a WeakHashMap (so that all the properties like dataSourceName are automatically GC'ed when the Connection object is GC'ed). See: JInsight/RuleHelper.java at c1d1cf90faac888a0d7289b23e95e43df1ced319 · ApptuitAI/JInsight · GitHub

                   

                  If you can go ahead and make your release, I will try it as part of our next round of performance related changes.

                   

                  Thanks,

                  Rajiv

                  • 6. Re: Could we optimize the Helper.linkMap API?
                    adinn

                    Hi Rajiv,

                    rajiv.shivane  wrote:

                     

                    Turns out it will need some work on our side to try out this change. The outer-map in our case is a WeakHashMap (so that all the properties like dataSourceName are automatically GC'ed when the Connection object is GC'ed). See: JInsight/RuleHelper.java at c1d1cf90faac888a0d7289b23e95e43df1ced319 · ApptuitAI/JInsight · GitHub

                     

                    If you can go ahead and make your release, I will try it as part of our next round of performance related changes.

                     

                    Ok, I think I'll push the change to use ConcurrentHashMap anyway. I think I probably ought to consider doing the same for some of the other maps managed by Helper (for Counters, CountDowns etc). I believe there may be other cases where the map is synchronized on while operating on the Byteman objects stored in the map.

                     

                    I don't think it I want to go down the route of using a WeakHashMap. Although I do use one for the load cache I am not really very happy about relying on weak reference processing and would prefer to minimise its use where I can (especially when the references might be to application objects whose lifetimes and memory footprint may well be opaque to the JVM).

                     

                    One of the recent changes I made was to ensure that all the maps maintained by Helper get cleared when there are no rules using it as their helper (whether directly or via a subclass of Helper). This guarantees that all link keys and values are dropped when rules are removed, providing one way of managing the reference problem (even though it not one which will work for your case).

                     

                    With that out of the way I hope to release 4.0.2 and 3.0.12 this week.

                     

                    regards,

                     

                     

                    Andrew Dinn

                    • 7. Re: Could we optimize the Helper.linkMap API?
                      rajiv.shivane

                      Hi Andrew,

                      adinn  wrote:

                      I don't think it I want to go down the route of using a WeakHashMap. Although I do use one for the load cache I am not really very happy about relying on weak reference processing and would prefer to minimise its use where I can (especially when the references might be to application objects whose lifetimes and memory footprint may well be opaque to the JVM).

                      Given Byteman is a generic framework, it makes sense to me that you wouldn't want to rely on WeakHashMap. Instead, where needed, users of Byteman (viz JInsight) could use custom Helpers to implement linkMaps using WeakHashMaps/Unsynchronized-Maps etc that is optimized for their use-case/need.

                      adinn  wrote:

                      One of the recent changes I made was to ensure that all the maps maintained by Helper get cleared when there are no rules using it as their helper (whether directly or via a subclass of Helper). This guarantees that all link keys and values are dropped when rules are removed, providing one way of managing the reference problem (even though it not one which will work for your case).

                      Initially, I started with a similar aspiration that any linked properties that my Helper adds, should be cleared by my Helper. For example, I would delete the "dataSourceName" property when user calls Connection.close(). However, it turns out that calling Connection.close is optional (though recommended). There are many applications that do not actually close the Connection. So saving these properties in a WeakHashMap is a fallback for such situations.

                      adinn  wrote:

                      With that out of the way I hope to release 4.0.2 and 3.0.12 this week.

                      That is great news. We will release a new version of JInsight with 4.0.2!

                       

                      Thanks,

                      Rajiv