1 Reply Latest reply on Mar 25, 2019 5:11 AM by adinn

    Out of memory due to strong ClassLoader references

    rajiv.shivane

      Hi adinn, Hope you are doing well.

       

      We have a customer application where the app constantly creates/destroys ClassLoaders at run time. We try to instrument the classes loaded using these ClassLoaders using Byteman. We are noticing that there are static Maps in Rule.java and RuleScript.java that hold strong references to these ClassLoaders and the application eventually runs out of memory.

       

      Here is the reference graph in memory analyzer:

      I wanted to check if this is a limitation of the current design/implementation, or if this is a side effect of incorrect/unsupported usage of Byteman?

       

      If this is a limitation, any suggestions/ideas on how to tackle this?

       

      Thanks,

      Rajiv

        • 1. Re: Out of memory due to strong ClassLoader references
          adinn

          Hi Rajiv,

           

          Nice to hear from you, and thanks for reporting this issue.

           

          I am not 100% sure why the rules are still present. However, I can think of several reasons why this might happen. Let me explain some of what is going on before I get on to the question of whether you can avoid this issue (I think the answer is not really).

           

          Firstly, the obvious thing going on here is that Byteman maintains strong references to the classes it has injected a rule into, and also to their loaders (although a reference to the class alone would be enough to keep its loader alive). It does so from several global roots.

           

          The obvious root is the static HashMap in class Rule which associates a Rule instance with an injection site. The key for this hashmap is a unique String that gets injected into the transformed code. This key is needed to locate the Rule that needs to be executed at the injection point. The injected code only marshals values used by the rule and, possibly, reassigns values after the rule has fired. It relies on a call to the Rule's execute method to run the rule code itself.
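
          To make the pinning effect concrete, here is a minimal sketch. It is not Byteman's actual code: the names SketchRule and SketchRuleIndex are made up for illustration, and the only assumption taken from the description above is that a static map keyed by a String holds an object which in turn holds a loader.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a rule: it keeps a strong reference to the
// loader that provides its type context.
final class SketchRule {
    final String name;
    final ClassLoader loader;

    SketchRule(String name, ClassLoader loader) {
        this.name = name;
        this.loader = loader;
    }

    String execute() {
        return "fired " + name;
    }
}

// Hypothetical index: the static map stays reachable for as long as this
// class is loaded, so everything it references stays reachable too.
final class SketchRuleIndex {
    private static final Map<String, SketchRule> RULES = new HashMap<>();

    static void register(String key, SketchRule rule) {
        RULES.put(key, rule);
    }

    // What injected code would do: look the rule up by key and run it.
    static String execute(String key) {
        SketchRule rule = RULES.get(key);
        return rule == null ? null : rule.execute();
    }

    static boolean remove(String key) {
        return RULES.remove(key) != null;
    }

    // Registers a rule against a throwaway loader and hands back only a
    // weak handle, so the map entry is the sole strong path to the loader.
    static WeakReference<ClassLoader> registerWithThrowawayLoader(String key) {
        ClassLoader loader = new ClassLoader() { };
        register(key, new SketchRule("demo", loader));
        return new WeakReference<>(loader);
    }
}
```

          While the entry is in the map the weak handle can never be cleared, however often the GC runs. Removing the entry only makes the loader eligible for collection; it does not say when that collection will happen.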

           

          Note that an instance of class Rule represents a specific rule script but it is constructed and initialized with a type context that can be used to type elements of the script wrt the types in scope at a specific injection point. The primary component of that type context is, of course, a class loader. Also, note that a single script for some given rule text can actually give rise to more than one Rule instance. Any given rule can be injected in one or more locations in one or more methods of one or more classes.

           

          So, this association between injection point and rule is critical to Byteman being able to type check and execute rule code. However, since the Rule instance retains a strong reference to its classloader it is going to keep the loader and all its classes alive unless one of two things happens. If its rule script is unloaded then the map entries for its current injection points are identified and deleted. The same is supposed to happen when a rule script is updated - the agent removes map entries for current injection points before retransforming the affected members of the current class base. The retransform then installs new entries for any newly injected code.
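
          The unload housekeeping can be sketched along the following lines. Again this is illustrative only: InjectionIndex, index and unindexScript are hypothetical names, and the layout (injection key mapped to owning script name) is an assumption, not the agent's real data structure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical index from injected key to the name of the owning script.
final class InjectionIndex {
    private final Map<String, String> scriptByKey = new HashMap<>();

    // Called when a rule from the named script is injected at some site.
    void index(String key, String scriptName) {
        scriptByKey.put(key, scriptName);
    }

    // Called when a script is unloaded or about to be replaced: drop every
    // entry that belongs to it and report how many were removed.
    int unindexScript(String scriptName) {
        int before = scriptByKey.size();
        scriptByKey.values().removeIf(scriptName::equals);
        return before - scriptByKey.size();
    }

    int size() {
        return scriptByKey.size();
    }
}
```

          Entries for other scripts are left untouched, which is why a loader stays pinned for as long as any loaded script still has an injection point in one of its classes.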

           

          This is one point where things can go wrong when you are reloading classes (although this may not necessarily be what is happening in your case). The biggest problem is that an agent like Byteman does not get told that a class needs to be let go of when you want to unload it. Assume you have reloaded a class A to a new version A' and dropped all references to the old loader. If some loaded rule is injected into A then A will still be hanging around when the new class is loaded because the injected Byteman rule is keeping it alive. Indeed if you use bmsubmit to list details of injected rules you will see both A and A' at this point.

           

          If you upload new versions of the rules then the old entries will be unindexed and the retransform will inject the rule into all classes that match, which will still be both A and A'. If you were very lucky and a GC occurred somewhere between the unindex and the retransform and that GC also triggered unloading of classes ... well, the loader for A might just get removed in time to avoid reinjecting into A. However, that is not exactly likely.

           

          The same thing can actually happen if a rule that references A is simply unloaded and then at some later date reloaded after you have tried to drop A and load A'. Unloading the rule first means that Byteman will drop its reference to A. However, there is no guarantee that A gets GCed straight away. If you are unlucky then, after you have defined A' and reloaded the rule, A and A' might still both be in the JVM and Byteman will match them both.

           

          So, the problem here is not just that Byteman is retaining the rules because it is holding strong references. The latter two outcomes are typical of the issues that arise from the lack of timely GC. The JVM is not in a position to tell the agent that a given loaded class is no longer referenced. So, the agent cannot determine whether it is appropriate 1) to transform the class, retaining a reference needed to track the transform and thereby keeping the class alive, or 2) to ignore it, allowing it to be reclaimed. n.b. this timing issue is why I don't think there is a satisfactory way of fixing this even if Byteman were to do its best to retain weak references to loaders, classes, rules etc -- but see below for more on that story.

           

          The other important 'global' root is the agent's ScriptRepository. It's not strictly global as it is referenced from an instance field of the ClassFileTransformer (instance of Transformer) registered by Byteman. However, it is effectively a global root since the agent never unregisters the transformer.

           

          Each currently loaded rule correlates 1-1 with a RuleScript in the ScriptRepository. The RuleScript includes a List<TransformSet> indexed by classloader. Each TransformSet contains a List<Transform> indexed by target class which records all the important details of an injection attempt (most importantly, the trigger class name, the trigger method name including signature and the Rule instance). These details are precisely what is needed in order to do the housekeeping described above when a rule is unloaded or reloaded. Once again, it might be possible to use a combination of weak references to automatically drop some of these details but it's not going to get round the timely GC issue.
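
          As a small illustration of why weak references on their own are not a quick fix, consider the sketch below. It uses made-up names (TransformRecord, WeakLoaderIndex) and assumes only what was said earlier, namely that the recorded details hold a strong reference back to the loader. A WeakHashMap keyed by loader does not help in that case, because the map holds its values strongly and the value keeps the key reachable.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical record of an injection attempt. Like a Rule, it keeps a
// strong reference to the loader it was created for.
final class TransformRecord {
    final String triggerClass;
    final ClassLoader loader;

    TransformRecord(String triggerClass, ClassLoader loader) {
        this.triggerClass = triggerClass;
        this.loader = loader;
    }
}

// Hypothetical index keyed weakly by loader.
final class WeakLoaderIndex {
    private final Map<ClassLoader, TransformRecord> byLoader = new WeakHashMap<>();

    void record(ClassLoader loader, String triggerClass) {
        // The value refers back to the key, so the weak key never becomes
        // weakly reachable while the entry is in the map.
        byLoader.put(loader, new TransformRecord(triggerClass, loader));
    }

    int size() {
        return byLoader.size();
    }

    // Builds an index whose only loader is referenced from nowhere else.
    static WeakLoaderIndex withThrowawayLoader() {
        WeakLoaderIndex index = new WeakLoaderIndex();
        index.record(new ClassLoader() { }, "A");
        return index;
    }
}
```

          So any weak-reference scheme would have to break every strong path from the recorded details back to the loader, and even then it would still be subject to the GC timing problem described above.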

           

          I hope that explains what Byteman is doing and why clearly enough for you to understand the problem that you are facing here. It may be that some liberal sprinkling of weak references will help here but I very much doubt it (I'll be happy to consider any recommendations you want to make, of course :-). At root, the problem is that agents have to identify and transform loaded classes using the list of loaded classes provided to them by the Instrumentation API methods getAllLoadedClasses() and getInitiatedClasses(ClassLoader). Those APIs are always going to be able to list classes for a loader you have poisoned and replaced before the JVM has been able to garbage collect unreferenced classes. So, an agent is not going to be able to make an informed choice as to whether or not to create a reference to such classes.
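
          A rough sketch of the matching step shows why the agent cannot tell A from A'. LoadedClassMatcher and matchByName are hypothetical names; the array argument stands in for whatever Instrumentation.getAllLoadedClasses() returns inside a real agent.

```java
import java.util.ArrayList;
import java.util.List;

final class LoadedClassMatcher {
    // 'loaded' stands in for the result of Instrumentation.getAllLoadedClasses().
    // Matching is by name only, so two classes with the same name defined by
    // different loaders both match. Nothing in the array says whether a class
    // is still referenced by the application.
    static List<Class<?>> matchByName(Class<?>[] loaded, String targetName) {
        List<Class<?>> matches = new ArrayList<>();
        for (Class<?> candidate : loaded) {
            if (candidate.getName().equals(targetName)) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

          Every class returned here would be a candidate for injection, and recording the injection is exactly what creates the strong reference that keeps a stale class and its loader alive.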

           

          regards,

           

           

          Andrew Dinn