18 Replies Latest reply on Sep 8, 2008 4:06 PM by brian.stansberry

    Classloading problem in proxy factories

    brian.stansberry

      I'm testing proxy-clustered and am seeing a failure that looks like a problem in the core proxy module.

      Test is doing a lookup of an SLSB; call to ProxyFactory to create the proxy fails:

      javax.naming.NamingException: Could not dereference object [Root exception is java.lang.RuntimeException: Could not create the EJB3 Business Proxy implementing "org.jboss.ejb3.test.clusteredsession.ClusteredStatelessRemote" for clusteredStateless]
       at org.jnp.interfaces.NamingContext.getObjectInstanceWrapFailure(NamingContext.java:1340)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:765)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:629)
       at javax.naming.InitialContext.lookup(InitialContext.java:351)
       at org.jboss.ejb3.test.clusteredsession.unit.StatelessUnitTestCase.testLoadbalance(StatelessUnitTestCase.java:72)
       at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
       at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
       at junit.extensions.TestSetup.run(TestSetup.java:27)
      Caused by: java.lang.RuntimeException: Could not create the EJB3 Business Proxy implementing "org.jboss.ejb3.test.clusteredsession.ClusteredStatelessRemote" for clusteredStateless
       at org.jboss.ejb3.proxy.factory.session.SessionProxyFactoryBase.createProxyBusiness(SessionProxyFactoryBase.java:234)
       at org.jboss.aop.Dispatcher.invoke(Dispatcher.java:121)
       at org.jboss.aspects.remoting.AOPRemotingInvocationHandler.invoke(AOPRemotingInvocationHandler.java:82)
       at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:908)
       at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:742)
       at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:695)
       at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:522)
       at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:230)
      Caused by: java.lang.LinkageError: loader constraints violated when linking org/jboss/ejb3/test/clusteredsession/NodeAnswer class
       at java.lang.Class.getDeclaredConstructors0(Native Method)
       at java.lang.Class.privateGetDeclaredConstructors(Class.java:2357)
       at java.lang.Class.getConstructor0(Class.java:2671)
       at java.lang.Class.getConstructor(Class.java:1629)
       at org.jboss.ejb3.proxy.factory.ProxyFactoryBase.createProxyConstructor(ProxyFactoryBase.java:124)
       at org.jboss.ejb3.proxy.factory.session.SessionProxyFactoryBase.createProxyBusiness(SessionProxyFactoryBase.java:212)
       at org.jboss.aop.Dispatcher.invoke(Dispatcher.java:121)
       at org.jboss.aspects.remoting.AOPRemotingInvocationHandler.invoke(AOPRemotingInvocationHandler.java:82)
       at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:908)
       at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:742)
       at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:695)
       at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:522)
       at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:230)
       at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:206)
       at org.jboss.remoting.Client.invoke(Client.java:1708)
       at org.jboss.remoting.Client.invoke(Client.java:612)
       at org.jboss.aspects.remoting.InvokeRemoteInterceptor.invoke(InvokeRemoteInterceptor.java:60)
       at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
       at org.jboss.aspects.remoting.ClusterChooserInterceptor.invoke(ClusterChooserInterceptor.java:84)
       at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
       at org.jboss.ejb3.proxy.remoting.IsLocalProxyFactoryInterceptor.invoke(IsLocalProxyFactoryInterceptor.java:72)
       at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)
       at org.jboss.aspects.remoting.ClusteredPojiProxy.invoke(ClusteredPojiProxy.java:79)
       at $Proxy2.createProxyBusiness(Unknown Source)
       at org.jboss.ejb3.proxy.objectfactory.session.SessionProxyObjectFactory.createProxy(SessionProxyObjectFactory.java:129)
       at org.jboss.ejb3.proxy.clustered.objectfactory.session.SessionClusteredProxyObjectFactory.getProxy(SessionClusteredProxyObjectFactory.java:76)
       at org.jboss.ejb3.proxy.objectfactory.ProxyObjectFactory.getObjectInstance(ProxyObjectFactory.java:146)
       at javax.naming.spi.NamingManager.getObjectInstance(NamingManager.java:304)
       at org.jnp.interfaces.NamingContext.getObjectInstance(NamingContext.java:1315)
       at org.jnp.interfaces.NamingContext.getObjectInstanceWrapFailure(NamingContext.java:1332)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:765)
       at org.jnp.interfaces.NamingContext.lookup(NamingContext.java:629)
       at javax.naming.InitialContext.lookup(InitialContext.java:351)
       at org.jboss.ejb3.test.clusteredsession.unit.StatelessUnitTestCase.testLoadbalance(StatelessUnitTestCase.java:72)
       at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
       at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
       at junit.extensions.TestSetup.run(TestSetup.java:27)
      


      Failure is coming in the SessionProxyFactoryBase code that attempts to replace the cached proxy constructor with one created using the TCCL:

       // Obtain the correct business proxy constructor
       Constructor<?> constructor = this.getConstructorsProxySpecificBusinessInterface().get(
             businessInterfaceName.trim());

       /*
        * In place for web injection (isolated CL)
        */
       ClassLoader tcl = Thread.currentThread().getContextClassLoader();
       try
       {
          // See if we can get at the bean class from the TCL
          Class<?> businessInterfaceClass = Class.forName(businessInterfaceName, false, tcl);

          // If so, use the TCL to generate the Proxy class, not the Container CL
          Set<Class<?>> businessInterfaces = new HashSet<Class<?>>();
          businessInterfaces.add(businessInterfaceClass);
          constructor = this.createProxyConstructor(businessInterfaces, tcl);
       }
       catch (ClassNotFoundException cce)
       {
          // Ignore
       }
      


      The business interface is simple:

      public interface ClusteredStatelessRemote
      {
       NodeAnswer getNodeState();
      }
      


      It's odd to me that Class.forName() is able to load the ClusteredStatelessRemote class using the TCCL, and a nested call to Proxy.getProxyClass(...) passing the classloader and the interface is able to generate a proxy class, yet there is a LinkageError trying to find the constructor.

        • 1. Re: Classloading problem in proxy factories
          brian.stansberry

          Perhaps this is a sign of a classloader leak? Proxy.getProxyClass(...) is maintaining a cache of proxy classes, keyed by a List of the interface names that define the class. Separate cache for each classloader. In this case the classloader is the default one used by the Remoting connector.

          I'm pretty sure we'll leak the proxy class (and thus its business interfaces) to that data structure. There are other tests that deploy the same jar, so this code will find a proxy class from an earlier deployment, but will then fail to find the constructor because the classes currently on the classpath are different.

          Seems like generating a Proxy class using something other than the container classloader is a classloader leak.
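For anyone following along, here is a minimal sketch of the per-classloader proxy cache behaviour described above. It is not the EJB3 code: `ProxyCachePerLoaderDemo`, `proxyClassFor` and `newLoader` are made-up names, `Runnable` stands in for a business interface, and `Proxy.newProxyInstance(...).getClass()` is used in place of `Proxy.getProxyClass(...)`. Asking twice with the same loader yields the same generated class, while a different loader gets its own.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; none of these names come from the EJB3 code base.
final class ProxyCachePerLoaderDemo {

    // Handler that does nothing; only the generated proxy class matters here.
    private static final InvocationHandler NOOP = (proxy, method, args) -> null;

    /** Returns the proxy class the JDK hands out for Runnable under the given loader. */
    static Class<?> proxyClassFor(ClassLoader loader) {
        return Proxy.newProxyInstance(loader, new Class<?>[] { Runnable.class }, NOOP).getClass();
    }

    /** Creates an empty loader, standing in for e.g. a connector or deployment loader. */
    static ClassLoader newLoader() {
        return new URLClassLoader(new URL[0], ProxyCachePerLoaderDemo.class.getClassLoader());
    }
}
```

The generated class is defined by the loader that was passed in, so it stays reachable for as long as that loader does.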

          • 2. Re: Classloading problem in proxy factories
            brian.stansberry
            • 3. Re: Classloading problem in proxy factories
              alrubinger

              This whole TCL thing was put in place because of some failures that became present in web injection.

              I'll remove it locally, look for affected areas, and see if there's a better way to solve the issue.

              S,
              ALR

              • 4. Re: Classloading problem in proxy factories
                alrubinger

                To explain the reason that TCL code is in place...

                Without it, when performing web injection (which, by servlet spec, uses an isolated CL), we get:

                java.lang.IllegalArgumentException: failed to set value Proxy to jboss.j2ee:ear=tx_stateful_web.ear,jar=tx_stateful_web_ejb.jar,name=StatefulTestBean,service=EJB3 implementing [interface org.jboss.ejb3.proxy.intf.StatefulSessionProxy, interface org.jboss.ejb3.proxy.intf.SessionProxy, interface org.jboss.ejb3.proxy.intf.EjbProxy, interface somepackage.RemoteIF] on field private somepackage.RemoteIF somepackage.TxServlet.remoteBean


                So even though "somepackage.RemoteIF" is implemented by the Proxy, it clashes with the destination target because the CLs are not equal.
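To illustrate the clash, a rough sketch (not the EJB3 code) of two loaders each defining their own copy of an interface with the same name. Everything here is hypothetical: `IsolatedLoaderDemo`, `DefiningLoader` and the name `demo.RemoteIF` are invented for the example, and the interface bytes are generated by hand so the snippet needs no extra files on disk. A proxy built for one copy is not an instance of the other copy, which is the situation that makes the field assignment fail.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Proxy;

// Hypothetical demo class; names are for illustration only.
final class IsolatedLoaderDemo {

    /** Loader that defines a class itself, the way an isolated webapp loader would. */
    static final class DefiningLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    /** Emits a minimal class file for "public interface <name> {}" (class file version 49). */
    static byte[] emptyInterface(String binaryName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);   // magic
        out.writeShort(0);          // minor_version
        out.writeShort(49);         // major_version
        out.writeShort(5);          // constant_pool_count = entries + 1
        out.writeByte(1); out.writeUTF(binaryName.replace('.', '/')); // #1 Utf8: this class name
        out.writeByte(7); out.writeShort(1);                          // #2 Class -> #1
        out.writeByte(1); out.writeUTF("java/lang/Object");           // #3 Utf8: super class name
        out.writeByte(7); out.writeShort(3);                          // #4 Class -> #3
        out.writeShort(0x0601);     // ACC_PUBLIC | ACC_INTERFACE | ACC_ABSTRACT
        out.writeShort(2);          // this_class
        out.writeShort(4);          // super_class
        out.writeShort(0);          // interfaces_count
        out.writeShort(0);          // fields_count
        out.writeShort(0);          // methods_count
        out.writeShort(0);          // attributes_count
        return bytes.toByteArray();
    }

    /** Defines a fresh copy of the named interface in a brand-new loader. */
    static Class<?> loadIsolated(String binaryName) {
        try {
            return new DefiningLoader().define(binaryName, emptyInterface(binaryName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if a proxy built for one copy of the interface is usable as the other copy. */
    static boolean proxyFits(Class<?> proxied, Class<?> target) {
        Object proxy = Proxy.newProxyInstance(proxied.getClassLoader(),
                new Class<?>[] { proxied }, (p, m, a) -> null);
        return target.isInstance(proxy);
    }
}
```

With two copies obtained from `loadIsolated("demo.RemoteIF")`, the names are equal but the `Class` objects are not, and `proxyFits` is false across the two copies.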

                "bstansberry" wrote:
                Seems like generating a Proxy class using something other than the container classloader is a classloader leak.


                The interface target in a Servlet has a different CL than that of the Container, hence the different CL.

                "bstansberry" wrote:
                Perhaps this is a sign of a classloader leak? Proxy.getProxyClass(...) is maintaining a cache of proxy classes


                I'm not sure this is true, if I'm reading the source and comments from Proxy.getProxyClass correctly:

                /*
                 * Note that we need not worry about reaping the cache for
                 * entries with cleared weak references because if a proxy class
                 * has been garbage collected, its class loader will have been
                 * garbage collected as well, so the entire cache will be reaped
                 * from the loaderToCache map.
                 */


                To move forward, how about we also catch and ignore the LinkageError (to get the proxy-clustered tests passing), and I'll have to proceed with EJBTHREE-1442 to see if this is indeed the source of a CL leak.

                S,
                ALR






                • 5. Re: Classloading problem in proxy factories
                  brian.stansberry

                   

                  "ALRubinger" wrote:
                  To explain the reason that TCL code is in place...

                  Without it, when performing web injection (which, by servlet spec, uses an isolated CL), we get:

                  java.lang.IllegalArgumentException: failed to set value Proxy to jboss.j2ee:ear=tx_stateful_web.ear,jar=tx_stateful_web_ejb.jar,name=StatefulTestBean,service=EJB3 implementing [interface org.jboss.ejb3.proxy.intf.StatefulSessionProxy, interface org.jboss.ejb3.proxy.intf.SessionProxy, interface org.jboss.ejb3.proxy.intf.EjbProxy, interface somepackage.RemoteIF] on field private somepackage.RemoteIF somepackage.TxServlet.remoteBean


                  So even though "somepackage.RemoteIF" is implemented by the Proxy, it clashes with the destination target because the CLs are not equal.


                  Hmm, seems to me it should clash. I guess the idea is if it's a remote interface, treat the webapp as if it's a remote client? Yeah, I guess that makes sense.

                  "ALRubinger" wrote:

                  "bstansberry" wrote:
                  Perhaps this is a sign of a classloader leak? Proxy.getProxyClass(...) is maintaining a cache of proxy classes


                  I'm not sure this is true, if I'm reading the source and comments from Proxy.getProxyClass correctly:

                  /*
                   * Note that we need not worry about reaping the cache for
                   * entries with cleared weak references because if a proxy class
                   * has been garbage collected, its class loader will have been
                   * garbage collected as well, so the entire cache will be reaped
                   * from the loaderToCache map.
                   */



                  Yeah, I think I was wrong. I saw that but got tangled up in thinking about classloaders. Hmm, as I think more, I'm getting more tangled -- so, I'll continue thinking before I comment more.


                  To move forward, how about we also catch and ignore the LinkageError (to get the proxy-clustered tests passing), and I'll have to proceed with EJBTHREE-1442 to see if this is indeed the source of a CL leak.


                  I think catching the error is good. How about also trying to limit this code path? E.g. something like:

                   /*
                    * In place for web injection (isolated CL)
                    */
                   ClassLoader tcl = Thread.currentThread().getContextClassLoader();
                   try
                   {
                      // See if we can get at the bean class from the TCL
                      Class<?> businessInterfaceClass = Class.forName(businessInterfaceName, false, tcl);
                      Class<?> ourBusinessInterfaceClass = this.getClassLoader().loadClass(businessInterfaceName);

                      if (!businessInterfaceClass.equals(ourBusinessInterfaceClass))
                      {
                         // If so, use the TCL to generate the Proxy class, not the Container CL
                         Set<Class<?>> businessInterfaces = new HashSet<Class<?>>();
                         businessInterfaces.add(businessInterfaceClass);
                         constructor = this.createProxyConstructor(businessInterfaces, tcl);
                      }
                   }
                   catch (ClassNotFoundException cce)
                   {
                      // Ignore
                   }
                   catch (LinkageError le)
                   {
                      // Ignore
                   }
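The same idea as a standalone sketch that compiles outside the container, in case it helps to experiment with it. This is only an approximation under stated assumptions: `GuardedProxyConstructor`, `resolve` and `proxyConstructor` are hypothetical names rather than the SessionProxyFactoryBase API, the loaders are passed in explicitly, and the proxy class is obtained via `Proxy.newProxyInstance(...).getClass()`.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical helper; not part of the EJB3 proxy module.
final class GuardedProxyConstructor {

    /**
     * Builds a proxy constructor against the TCL only when the TCL sees a different
     * copy of the interface than the container loader does. In every other case,
     * including lookup or linkage failures, the cached constructor is returned as-is.
     */
    static Constructor<?> resolve(String interfaceName, ClassLoader tcl,
                                  ClassLoader containerLoader, Constructor<?> cached) {
        try {
            Class<?> fromTcl = Class.forName(interfaceName, false, tcl);
            Class<?> fromContainer = Class.forName(interfaceName, false, containerLoader);
            if (fromTcl == fromContainer) {
                return cached; // same Class object, so the cached proxy is already compatible
            }
            return proxyConstructor(fromTcl, tcl);
        } catch (ClassNotFoundException | LinkageError e) {
            return cached; // fall back rather than fail the lookup
        }
    }

    /** Returns the InvocationHandler constructor of a proxy class for the interface. */
    static Constructor<?> proxyConstructor(Class<?> iface, ClassLoader loader) {
        Object proxy = Proxy.newProxyInstance(loader, new Class<?>[] { iface }, (p, m, a) -> null);
        try {
            return proxy.getClass().getConstructor(InvocationHandler.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("proxy class lacks the standard constructor", e);
        }
    }
}
```

When both loaders resolve the interface to the same `Class`, the TCL path is skipped entirely, so no extra proxy class is generated in the TCL's cache.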


                  • 6. Re: Classloading problem in proxy factories
                    alrubinger

                     

                    "bstansberry" wrote:
                    I think catching the error is good. How about also trying to limit this code path?


                    Okay, good.

                    Turns out EJBTHREE-1473 was the source of most (if not all) of the CL leaks. There's still one test failure remaining, "testSimpleEjb", where it looks like a reference to a Proxy is held by an InvocationResponse, which in turn is held by a remoting ServerThread...not sure what I can do about this one just yet.

                    S,
                    ALR

                    • 7. Re: Classloading problem in proxy factories
                      brian.stansberry

                      Ping Ron Sigal. That sounds vaguely familiar; i.e. something that got fixed [1]. This test is the same as what's in Branch_4_2, and it passes there, so remoting must be cleaning up there. Perhaps something got dropped between remoting versions.

                      This kind of thing is the big weakness in the classloader leak tests -- temporary leaks until some short-term-cached resource gets cleaned up. But, so far we haven't hit any of those kinds of things that couldn't be fixed, and the tests have caught about 7-10 classloader leaks already, so...


                      [1] My vague memory goes to some last minute remoting SP fix before a 4.x release.....

                      • 8. Re: Classloading problem in proxy factories
                        alrubinger

                        We're not going to be able to do that extra check like:

                        if (!businessInterfaceClass.equals(ourBusinessInterfaceClass))
                        


                        ..else we'll get the LinkageError elsewhere, where I can't catch it:

                        java.lang.LinkageError: Class somepackage/ThreeLocal1IF violates loader constraints
                        20:27:33,746 ERROR [STDERR] at java.lang.Class.getDeclaredMethods0(Native Method)
                        20:27:33,746 ERROR [STDERR] at java.lang.Class.privateGetDeclaredMethods(Class.java:2395)
                        20:27:33,747 ERROR [STDERR] at java.lang.Class.getDeclaredMethod(Class.java:1907)
                        20:27:33,747 ERROR [STDERR] at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1354)
                        20:27:33,747 ERROR [STDERR] at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:52)
                        20:27:33,747 ERROR [STDERR] at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:421)
                        20:27:33,748 ERROR [STDERR] at java.security.AccessController.doPrivileged(Native Method)
                        20:27:33,748 ERROR [STDERR] at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:400)
                        20:27:33,748 ERROR [STDERR] at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:297)
                        20:27:33,748 ERROR [STDERR] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1035)
                        20:27:33,749 ERROR [STDERR] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:302)
                        20:27:33,749 ERROR [STDERR] at org.jboss.aop.joinpoint.InvocationResponse.writeExternal(InvocationResponse.java:100)
                        20:27:33,749 ERROR [STDERR] at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1310)
                        20:27:33,749 ERROR [STDERR] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1288)
                        20:27:33,749 ERROR [STDERR] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1079)
                        20:27:33,750 ERROR [STDERR] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1375)
                        20:27:33,750 ERROR [STDERR] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1347)
                        20:27:33,750 ERROR [STDERR] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1290)
                        20:27:33,750 ERROR [STDERR] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1079)
                        20:27:33,751 ERROR [STDERR] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:302)
                        20:27:33,751 ERROR [STDERR] at org.jboss.remoting.serialization.impl.java.JavaSerializationManager.sendObjectVersion2_2(JavaSerializationManager.java:120)
                        20:27:33,751 ERROR [STDERR] at org.jboss.remoting.serialization.impl.java.JavaSerializationManager.sendObject(JavaSerializationManager.java:95)
                        20:27:33,751 ERROR [STDERR] at org.jboss.remoting.marshal.serializable.SerializableMarshaller.write(SerializableMarshaller.java:120)
                        20:27:33,752 ERROR [STDERR] at org.jboss.remoting.transport.socket.ServerThread.versionedWrite(ServerThread.java:998)
                        20:27:33,752 ERROR [STDERR] at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:781)
                        20:27:33,752 ERROR [STDERR] at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:695)
                        20:27:33,752 ERROR [STDERR] at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:522)
                        20:27:33,752 ERROR [STDERR] at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:230)


                        S,
                        ALR

                        • 9. Re: Classloading problem in proxy factories
                          alrubinger

                           

                          "Ron Sigal" wrote:
                          Andrew, I've attached an updated jboss-remoting.jar, if you want to give it a spin. It's not fully tested, but if it solves the problem, I'll push ahead.
                          ...
                          Ah, do you need the jar in a repository to run your test?


                          Great, that should be perfect. Nope, don't need it in a repo, I'll drop it in manually and see what happens tomorrow. :)

                          S,
                          ALR

                          • 10. Re: Classloading problem in proxy factories
                            alrubinger

                            With the JAR attached to EJBTHREE-1442, I see this test pass completely about 1/2 the time; transient failures likely related to indeterminate time for GC.

                            Profiling reveals that all EJB3 Proxy objects are cleaned up, however, and this was previously not the case. So I say these changes are good.

                            Ron: Would like to request a 2.4.0SP2 release of jboss-remoting as you've sent it here.

                            Brian: Anything we can do to make the tests more bulletproof? I know we can't force a GC, but maybe put in a request for GC followed by Thread.sleep(5000); after everything runs?

                            S,
                            ALR

                            • 11. Re: Classloading problem in proxy factories
                              brian.stansberry

                              Re: transient failures, before failing, the test calls System.gc() several times mixed in with two attempts to fill the heap with garbage in order to force out SoftReferences. (This is why the tests run in their own target against their own AS instances). I can add a 5 sec sleep in there somewhere, if you think it will be helpful.
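A minimal sketch of the "request GC, pause, re-check" loop being discussed, for reference. It is not the actual leak test: `GcProbe` and `awaitCleared` are hypothetical names, and the heap-flooding step used to push out SoftReferences is deliberately left out. Since System.gc() is only a hint, a false result means "not observed collected within the attempts", not proof of a leak.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper; the real classloader leak tests are more involved.
final class GcProbe {

    /**
     * Requests GC up to 'attempts' times, sleeping between requests, and reports
     * whether the weakly referenced object was observed to be cleared.
     */
    static boolean awaitCleared(WeakReference<?> ref, int attempts, long pauseMillis) {
        for (int i = 0; i < attempts; i++) {
            if (ref.get() == null) {
                return true;
            }
            System.gc(); // a request only; the JVM may ignore it
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return ref.get() == null;
    }
}
```

A test would wrap the deployment's classloader in a WeakReference, undeploy, and then call something like `GcProbe.awaitCleared(ref, 10, 500)` before deciding to fail.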

                              • 12. Re: Classloading problem in proxy factories
                                brian.stansberry

                                Re: closing EJBTHREE-1471: I think there's still something to be understood here. For sure to get this to work you had to catch LinkageError, and that tells us there are two different classloaders in place. The concern I have is that if the TCL is longer-lasting than the EJB CL, the business interface class it loads leaks.

                                Ah, of course the classloader leak tests pass -- they test for the EJB CL leaking, not the business interface class loaded via a different CL. In this case the classloader associated with the remoting connector.

                                OK, I think I'm clear on this now. :) EJBTHREE-1471 doesn't leak the EJB CL. It does pollute the proxy cache associated with the remoting connector's CL with a proxy associated with version A of the business interface. If you redeploy the EJB, the remoting CL's proxy cache is now corrupt. Your LinkageError catch makes the proxy creation usable, but you've still got a corrupt proxy cache. Maybe that's acceptable, but if something like the check I suggested a few posts back can be added, it would be cleaner. May be a bit slower (or a bit faster) before a redeploy, but faster after, since it avoids trying to create a proxy that's just going to fail w/ LinkageError.

                                • 13. Re: Classloading problem in proxy factories
                                  ron_sigal

                                  Are you guys still ok with Remoting as updated? If so, I'm ready to tag it.

                                  • 14. Re: Classloading problem in proxy factories
                                    alrubinger

                                    Hard to say.

                                    The new JAR attached to the JIRA definitely cleans up the held reference in the ObjectOutputStream, as verified by Profiling the heap.
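As background on why a stream can pin an object at all, here is a small sketch using only the JDK. It is not the Remoting fix, which isn't shown in this thread: `StreamHandleDemo` and its methods are hypothetical, and an ArrayList stands in for the proxy. An ObjectOutputStream remembers every object it has written (for back-references) until `reset()` is called, so a written object stays strongly reachable for as long as the stream does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.util.ArrayList;

// Hypothetical demo class; it only exercises standard java.io behaviour.
final class StreamHandleDemo {

    private ObjectOutputStream out;

    StreamHandleDemo() {
        try {
            out = new ObjectOutputStream(new ByteArrayOutputStream());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a fresh payload and hands back only a weak reference to it. */
    WeakReference<Object> writePayload() {
        Object payload = new ArrayList<String>(); // serializable stand-in for a proxy
        try {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new WeakReference<Object>(payload);
    }

    /** Makes the stream forget the objects it has written so far. */
    void reset() {
        try {
            out.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Until `reset()` runs, the weak reference returned by `writePayload()` cannot be cleared while the stream is alive; afterwards the payload is eligible for collection whenever the GC gets around to it, which would fit a lag between the response being written and the reference disappearing.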

                                    However, I can't nail down the cause of the transient failure, which suggests that either the test isn't foolproof (though it looks pretty well-written - there's lots of GC requests followed by memory flooding to push out soft references), or that the reference isn't getting cleared immediately (is there some lag in place?).

                                    I also tried running the test on my Win32 partition using JBossProfiler, which can force a GC via its JVMTIInterface. However, this resulted in lots of:

                                    [JVMTIInterface] 4069000 references received


                                    ...followed by OutOfMemoryErrors.

                                    Long story short, I'd feel better about a tag if there were no transient failures. Is there a chance that "clear" on the OOS isn't getting called upon return of the InvocationResponse, or does it wait a bit? Maybe there's a race between that clear and the client returning, triggering the test to check for cleared CL?

                                    S,
                                    ALR
