1 2 Previous Next 15 Replies Latest reply on Feb 22, 2012 10:13 AM by dmalechek

    Remote Client.addCallbackListener hang

    dmalechek

      Hi,

      I'm having an issue where a remote client call to createQueueConnection that hangs and never returns until a socket timeout.

      We are using JBoss 4.2.3GA, Messaging 1.4.4, Remoting 2.2.3SP3.  The system involves 500+ remote client machines sending 500K+ messages a day to a Queue with a message listener attached.  There are a few other minor JMS queue activities as well.

      It runs well until about once a day, usually during heavy load, all the clients hang, requiring a restart of the server.  The CPU runs at 20% with plenty of heap and a strong Oracle box.

       

      At this point I'm trying to figure out if I have a remoting configuration mistake, a client code mistake, a possible remoting bug, or just a limitation of the architecture during a high stress point.  In the client, I don't re-use any of the JMS objects from one send to the next.  I haven't been able to reproduce in a development environment yet.

       

      I did try to back out the change made in JBREM-1081 to ServerInvoker in 2.2.3 to look like the 2.5.4 version of getCallbackHandler.  It did removed the BLOCKED threads I show below, but it still fails somewhere in the ServerInvokerCallbackHandler.connect() call, probably in BisocketClientInvoker.handleConnect().  My client is reluctant to turn on trace mode in production so I'll continue to add custom logging until I track this down. My next configuration change will be trying a smaller pingFrequency.

       

      The remote client side Thread dump looks like this.

      3XMTHREADINFO      "SEQMsgHandling" (TID:0x094A4800, sys_thread_t:0x094C9FB0, state:R, native ID:0x00007B14) prio=5

      4XESTACKTRACE          at java/net/SocketInputStream.socketRead0(Native Method)

      4XESTACKTRACE          at java/net/SocketInputStream.read(SocketInputStream.java:155(Compiled Code))

      4XESTACKTRACE          at java/io/BufferedInputStream.fill(BufferedInputStream.java:229(Compiled Code))

      4XESTACKTRACE          at java/io/BufferedInputStream.read(BufferedInputStream.java:246(Compiled Code))

      4XESTACKTRACE          at java/io/FilterInputStream.read(FilterInputStream.java:89(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/transport/socket/MicroSocketClientInvoker.readVersion(MicroSocketClientInvoker.java:1077(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/transport/socket/MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:682(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/transport/bisocket/BisocketClientInvoker.transport(BisocketClientInvoker.java:458(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:141(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/Client.invoke(Client.java:1925(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/Client.addCallbackListener(Client.java:1994(Compiled Code))

      4XESTACKTRACE          at org/jboss/remoting/Client.addListener(Client.java:1158(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/remoting/JMSRemotingConnection.addInvokerCallbackHandler(JMSRemotingConnection.java:259(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/remoting/JMSRemotingConnection.start(JMSRemotingConnection.java:388(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/delegate/ClientConnectionFactoryDelegate.org$jboss$jms$client$delegate$ClientConnectionFactoryDelegate$createConnectionDelegate$aop(ClientConnectionFactoryDelegate.java:165(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/delegate/ClientConnectionFactoryDelegate$createConnectionDelegate_N3019492359065420858.invokeNext(ClientConnectionFactoryDelegate$createConnectionDelegate_N3019492359065420858.java(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/container/StateCreationAspect.handleCreateConnectionDelegate(StateCreationAspect.java:80(Compiled Code))

      4XESTACKTRACE          at org/jboss/aop/advice/org/jboss/jms/client/container/StateCreationAspect0.invoke(StateCreationAspect0.java(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/delegate/ClientConnectionFactoryDelegate$createConnectionDelegate_N3019492359065420858.invokeNext(ClientConnectionFactoryDelegate$createConnectionDelegate_N3019492359065420858.java(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/delegate/ClientConnectionFactoryDelegate.createConnectionDelegate(ClientConnectionFactoryDelegate.java(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/JBossConnectionFactory.createConnectionInternal(JBossConnectionFactory.java:205(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/JBossConnectionFactory.createQueueConnection(JBossConnectionFactory.java:101(Compiled Code))

      4XESTACKTRACE          at org/jboss/jms/client/JBossConnectionFactory.createQueueConnection(JBossConnectionFactory.java:95(Compiled Code))

      ...

       

      The server side thread dump has all the ServerThread like on of these two:

      Thread: WorkerThread#0[10.31.48.175:46194] : priority:5, demon:false, threadId:494, threadState:BLOCKED, lockName:java.util.HashMap@18acff0

       

          org.jboss.remoting.ServerInvoker.removeCallbackHandler(ServerInvoker.java:1949)

          org.jboss.remoting.ServerInvoker.handleInternalInvocation(ServerInvoker.java:1514)

          org.jboss.remoting.transport.bisocket.BisocketServerInvoker.handleInternalInvocation(BisocketServerInvoker.java:648)

          org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:849)

          org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:611)

          org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:423)

          org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:176)

       

      Thread: WorkerThread#4[10.26.68.175:57044] : priority:5, demon:false, threadId:598, threadState:BLOCKED, lockName:java.util.HashMap@18acff0

       

          org.jboss.remoting.ServerInvoker.getCallbackHandler(ServerInvoker.java:1927)

          org.jboss.remoting.ServerInvoker.handleInternalInvocation(ServerInvoker.java:1501)

          org.jboss.remoting.transport.bisocket.BisocketServerInvoker.handleInternalInvocation(BisocketServerInvoker.java:648)

          org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:849)

          org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:611)

          org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:423)

          org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:176)

       

       

      My Server side remoting-bisocket-service.xml contains:

         <mbean code="org.jboss.remoting.transport.Connector"

                name="jboss.messaging:service=Connector,transport=bisocket"

                display-name="Bisocket Transport Connector">

            <attribute name="Configuration">

               <config>

                  <invoker transport="bisocket">

                       <!-- There should be no reason to change these parameters - warning!

                          Changing them may stop JBoss Messaging working correctly -->           

                     <attribute name="marshaller" isParam="true">org.jboss.jms.wireformat.JMSWireFormat</attribute>

                     <attribute name="unmarshaller" isParam="true">org.jboss.jms.wireformat.JMSWireFormat</attribute>

                     <attribute name="dataType" isParam="true">jms</attribute>

                     <attribute name="socket.check_connection" isParam="true">false</attribute>              

                     <attribute name="serverBindAddress">${jboss.bind.address}</attribute>

                     <attribute name="serverBindPort">14457</attribute>

                     <attribute name="clientSocketClass" isParam="true">org.jboss.jms.client.remoting.ClientSocketWrapper</attribute>

                     <attribute name="serverSocketClass">org.jboss.jms.server.remoting.ServerSocketWrapper</attribute>

       

                     <attribute name="numberOfCallRetries" isParam="true">1</attribute>

                     <attribute name="pingFrequency" isParam="true">214748364</attribute>

                     <attribute name="pingWindowFactor" isParam="true">10</attribute>

                     <attribute name="onewayThreadPool">org.jboss.jms.server.remoting.DirectThreadPool</attribute>              

                     <!-- End immutable parameters -->

       

                     <attribute name="stopLeaseOnFailure" isParam="true">true</attribute>

       

                     <!-- Periodicity of client pings. Server window by default is twice this figure -->                              

                     <attribute name="clientLeasePeriod" isParam="true">10000</attribute>

                     <attribute name="validatorPingPeriod" isParam="true">10000</attribute>

                     <attribute name="validatorPingTimeout" isParam="true">5000</attribute>

       

                     <attribute name="failureDisconnectTimeout" isParam="true">0</attribute>

                     <attribute name="callbackErrorsAllowed">1</attribute>

                     <attribute name="registerCallbackListener">false</attribute>

                     <attribute name="useClientConnectionIdentity" isParam="true">true</attribute>

       

                      <attribute name="timeout" isParam="true">120000</attribute>

       

                     <!-- Number of seconds to wait for a connection in the client pool to become free -->

                     <attribute name="numberOfRetries" isParam="true">10</attribute>

       

                     <!-- Max Number of connections in client pool. This should be significantly higher than

                          the max number of sessions/consumers you expect -->

                     <attribute name="JBM_clientMaxPoolSize" isParam="true">200</attribute>

       

                     <!-- The maximum time to wait before timing out on trying to write a message to socket for delivery -->

                     <attribute name="callbackTimeout">10000</attribute>

       

                     <!-- Use these parameters to specify values for binding and connecting control connections to

                          work with your firewall/NAT configuration

                     <attribute name="secondaryBindPort">xyz</attribute>                          

                     <attribute name="secondaryConnectPort">abc</attribute>              

                     -->

                 </invoker>

                  <handlers>

                     <handler subsystem="JMS">org.jboss.jms.server.remoting.JMSServerInvocationHandler</handler>

                  </handlers>

               </config>

            </attribute>

         </mbean>

       

      Thanks in advance for any insights.

      David Malechek

       

      4/21 update:

      I think the root cause of the blocking is a single thread like this:

      Thread: WorkerThread#1662[10.51.28.175:59116] : priority:5, demon:false, threadId:80903, threadState:WAITING, lockName:java.util.HashSet@18e0b77

      java.lang.Object.wait(Native Method)

      org.jboss.remoting.transport.bisocket.BisocketClientInvoker.handleConnect(BisocketClientInvoker.java:295)          org.jboss.remoting.MicroRemoteClientInvoker.connect(MicroRemoteClientInvoker.java:273)org.jboss.remoting.Client.connect(Client.java:1800)

      org.jboss.remoting.Client.connect(Client.java:655)

      org.jboss.remoting.Client.connect(Client.java:602)

      org.jboss.remoting.callback.ServerInvokerCallbackHandler.connect(ServerInvokerCallbackHandler.java:163)

      org.jboss.remoting.ServerInvoker.getCallbackHandler(ServerInvoker.java:1941)

      org.jboss.remoting.ServerInvoker.handleInternalInvocation(ServerInvoker.java:1506)

      org.jboss.remoting.transport.bisocket.BisocketServerInvoker.handleInternalInvocation(BisocketServerInvoker.java:648)

      org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:850)

      org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:634)

      org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:434)

      org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:176)

       

      And possibly the root is this, the client is not responding after the accept()?

      Thread: secondaryServerSocketThread</b> : priority:5, demon:true, threadId:42, threadState:RUNNABLE, lockName:null

      java.net.SocketInputStream.socketRead0(Native Method)<br>java.net.SocketInputStream.read(SocketInputStream.java:129)

      java.net.SocketInputStream.read(SocketInputStream.java:182)<br>java.io.FilterInputStream.read(FilterInputStream.java:66)

      org.jboss.remoting.transport.bisocket.BisocketServerInvoker$SecondaryServerSocketThread.run(BisocketServerInvoker.java:885)

       

      I peeked into the 2.5.4 code and saw that Remoting has a timeout on the socket read for the SecondaryServerSocketThread so my next thing to try is adding that to keep Central Server from crashing.  I can then get the thread dumps from the client side that in timing out to see what is going on there.

        • 1. Remote Client.addCallbackListener hang
          ron_sigal

          Hi David,

           

          I think you're on the right track.  BisocketClientInvoker.handleConnect() is waiting for a socket to use for the "control" connection.  In fact, it should already be there in the sockets Set.  When JBossMessaging sets up a connection, it calls org.jboss.remoting.Client.addListener(), which results in a call to BisocketClientInvoker.transport():

           

             protected Object transport(String sessionId, Object invocation, Map metadata,

                                        Marshaller marshaller, UnMarshaller unmarshaller)

             throws IOException, ConnectionFailedException, ClassNotFoundException

             {

                String listenerId = null;

                if (invocation instanceof InvocationRequest)

                {

                   InvocationRequest ir = (InvocationRequest) invocation;

                   Object o = ir.getParameter();

                   if (o instanceof InternalInvocation)

                   {

                      InternalInvocation ii = (InternalInvocation) o;

                      if (InternalInvocation.ADDLISTENER.equals(ii.getMethodName())

                          && ir.getLocator() != null) // getLocator() == null for pull callbacks

                      {

                         Map requestPayload = ir.getRequestPayload();

                         listenerId = (String) requestPayload.get(Client.LISTENER_ID_KEY);

                         listenerIdToClientInvokerMap.put(listenerId, this);

                         BisocketServerInvoker callbackServerInvoker;

                         callbackServerInvoker = BisocketServerInvoker.getBisocketServerInvoker(listenerId);

                         callbackServerInvoker.createControlConnection(listenerId, true);

                      }

                     

                      // Rather than handle the REMOVELISTENER case symmetrically, it is

                      // handled when a REMOVECLIENTLISTENER message is received by

                      // BisocketServerInvoker.handleInternalInvocation().  The reason is that

                      // if the Client executes removeListener() with disconnectTimeout == 0,

                      // no REMOVELISTENER message will be sent.

                   }

                }

           

                return super.transport(sessionId, invocation, metadata, marshaller, unmarshaller);

             }

           

          In particular, the call to callbackServerInvoker.createControlConnection() creates the socket on the client side whose counterpart on the server side should land in the sockets Set.  For some reason, that isn't happening.  If you can nail that down, then you've got it. 

           

          Thank you for letting me know about this issue.  Please let me know how your detective work goes, and let me know if you have any questions.

           

          -Ron

          1 of 1 people found this helpful
          • 2. Remote Client.addCallbackListener hang
            dmalechek

            Thanks Ron.

            I noticed that SecondaryServerSocketThread in 2.2.4 also has the timeout I need.  Do you have a target date for when it is GA for download? 

            Given I have a high profile customer needing some stability while I find the root cause, do you feel the svn 2.2.4 is ready for a production environment or would I be better off just tweaking 2.2.3SP3?

             

            One more thing in case you have some insight, the customer is running SLES 9 on the server and clients.  The clients are running the big blue's JVM and the server is Sun(Oracle).  I believe big blue has admitted to a input stream issue with the read method and available bytes.

             

            I think I'm getting close on this one, I'll keep you updated.

            Thanks again,

            David

            • 3. Remote Client.addCallbackListener hang
              ron_sigal

              Hi David,

               

              In fact, release 2.2.4 is tagged and ready to go.  I just haven't done the release steps because I'm waiting for EAP 4.3.CP10 to upgrade to 2.2.4 to verify there aren't any regressions.  I don't know when that will happen, but I don't suppose it will be too far in the future.  At that point I'll finish the release process.  In the meantime, you can check out

               

                http://anonsvn.jboss.org/repos/jbossremoting/remoting2/tags/2.2.4

               

              and build output/lib/jboss-remoting.jar by running

               

                ant jars

               

              from the root directory.

               

              Don't know anything about SLES 9.  Novell, hmm, rings a bell.  Don't they do netware? 

               

              -Ron

              • 4. Remote Client.addCallbackListener hang
                dmalechek

                Ron,

                 

                JBREM-1261 looks like a pretty good fit for what we have been seeing.  The part "Also, the timeout value of each socket is set to 30 seconds so that an attempt to hang up socket processing threads by withholding expected bytes will fail in a timely manner." is is the fix we need to get past whatever is the root cause.  I believe I'm seeing proof that we are timing out and reconvering where we weren't before after I added a timeout like you have in 2.2.4.

                I'll build your 2.2.4 and we get started testing it.

                 

                Thanks for you time, I'm crossing my fingers that this discussion may be done.  Now to get our customers to upgrade to 5.0!  :-)

                -David

                • 5. Remote Client.addCallbackListener hang
                  ron_sigal

                  That's great news, David.  Good luck.

                  • 6. Re: Remote Client.addCallbackListener hang
                    dmalechek

                    Ron,

                    Unfortunately we had an incident with the 2.2.4 jar that locked up all remote client activity.

                    I tried my hand at analyzing the thread dump, let me know if you think I'm wrong.  I think it may be related to JBREM-1242 but too involved for me to figure out.

                    Assuming that Object@383d87 is connectionValidatorLock and Object@8ecb42 is clientLeaseLock, the threads below show that Timer-9954 and Thread-390047 block each other.  Which would only be slightly bad, but then Timer-2 blocks and then all the WorkerThreads like WorkerThread#1664 is blocked since handleClientFailure is synchronized.  There was actually a thread Timer-9953 exactly like Timer-9954 as well.

                     

                     

                    Thread: Timer-9954 : priority:5, demon:true, threadId:782301, threadState:BLOCKED, lockName:java.lang.Object@383d87

                        org.jboss.remoting.Client.notifyListeners(Client.java:1781)

                        ...

                     

                    Thread: Thread-390047 : priority:5, demon:true, threadId:778140, threadState:BLOCKED, lockName:java.lang.Object@8ecb42

                        org.jboss.remoting.MicroRemoteClientInvoker.establishLease(MicroRemoteClientInvoker.java:458)

                        ...

                     

                    Thread: Timer-2 : priority:5, demon:true, threadId:39, threadState:BLOCKED, lockName:java.lang.Object@383d87

                        org.jboss.remoting.Client.disconnect(Client.java:735)

                        org.jboss.remoting.callback.ServerInvokerCallbackHandler.destroy(ServerInvokerCallbackHandler.java:1016)

                        org.jboss.remoting.callback.ServerInvokerCallbackHandler.shutdown(ServerInvokerCallbackHandler.java:1028)

                        org.jboss.jms.server.endpoint.ServerConnectionEndpoint.closeCallbackClient(ServerConnectionEndpoint.java:474)

                        org.jboss.jms.server.connectionmanager.SimpleConnectionManager.cleanupForSessionID(SimpleConnectionManager.java:468)

                        org.jboss.jms.server.connectionmanager.SimpleConnectionManager.handleClientFailure(SimpleConnectionManager.java:201)

                        org.jboss.jms.server.connectionmanager.SimpleConnectionManager.handleConnectionException(SimpleConnectionManager.java:241)

                        ...

                     

                    Thread: WorkerThread#1664[10.48.16.175:56059] : priority:5, demon:false, threadId:786648, threadState:BLOCKED, lockName:org.jboss.jms.server.connectionmanager.SimpleConnectionManager@2cf63

                        org.jboss.jms.server.connectionmanager.SimpleConnectionManager.handleClientFailure(SimpleConnectionManager.java:201)

                        ...

                     

                    Thanks,

                    David

                    • 7. Remote Client.addCallbackListener hang
                      ron_sigal

                      Hi David,

                       

                      Have you made any more progress?

                       

                      Do you have the rest of the stack dump?  It would help to know which threads are holding the locks that these threads are blocked on.

                       

                      -Ron

                      • 8. Re: Remote Client.addCallbackListener hang
                        dmalechek

                        OK, I've attached the dump file to the previous post.

                        -David

                        • 9. Re: Remote Client.addCallbackListener hang
                          ron_sigal

                          Hi David,

                           

                          I'm sorry I've been absent for so long.

                           

                          Are you still seeing the problem?  Any progress?

                           

                          -Ron

                          • 10. Re: Remote Client.addCallbackListener hang
                            dmalechek

                            Ron,

                            No change in status on this.  The patch I made to apply a 15 second socket timeout on the secondaryServerSocket appears to keep us afloat by releasing the blockage before things get out of control.

                            If there is ever a Remoting 2.2.3 patch for this, we'd love to have it, but we've actually start planning our next steps assuming there won't be one.  In the short term we will look towards trying to maintain longer term connections to stay away from the contention on the callback collection.  In the long term we are looking into going to RMI delivery.  We need to get to load balanced clustering at some point and your documentation says that excludes the JMS services.

                            Any ideas you have are appreciated.

                            Thanks,

                            David

                            • 11. Re: Remote Client.addCallbackListener hang
                              ron_sigal

                              Hi David,

                               

                              Note that the changes for JBREM-1292 "ServerInvoker should use a ConcurrentHashMap to store callback handlers"  might help.  Those changes are currently committed to the 2.2 branch.

                               

                              As for the dump, it looks like you created it using the AS console, right?  The reason I say that is that there doesn't seem to be an indication of which threads are holding the locks, and, if I recall correctly, that's a shortcoming of creating dumps from the AS console.  If you could create a dump using kill -3, that would be a lot more useful.

                               

                              Thanks,

                              Ron

                              • 12. Re: Remote Client.addCallbackListener hang
                                ron_sigal

                                Hey David,

                                 

                                I just got around to looking at JBREM-1297 "Deadlock between MicroRemoteClientInvoker.establishLease and Client.notifyListeners", and I think that's what you're seeing in the thread dump.  You want to take a look?  I built a jboss-remoting.jar with the fix and attached it to JBREM-1297, if you want to give it a spin.

                                 

                                -Ron

                                1 of 1 people found this helpful
                                • 13. Re: Remote Client.addCallbackListener hang
                                  dmalechek

                                  Thanks Ron.

                                  I took a look at JBREM-1292 and it does look promising as fix to my issue with the 2.2.4 jar.  I was running revision 6304.  Unfortunately I'm locked out of trying anything with my clients production environment unless we run into a string of server crashes that we can't seem to tune our way out of, then I can suggest 2.2.4 again.  If it happens, I'll get you an update, but so far they're in a stable situation.

                                   

                                  Thanks for the research.

                                  -David

                                  • 14. Re: Remote Client.addCallbackListener hang
                                    mgrossmann

                                    Hi David,

                                     

                                    recently i found your thread concerning the problem above. I have a similar problem with a similar constellation looking here https://community.jboss.org/thread/176906.

                                     

                                    Jboss 4.2.3, JBM 1.4.5, Remoting 2.2.3.SP1

                                     

                                    My question is, after you upgraded to remoting 2.2.4, did you get into the situation again?

                                     

                                    Thanks in advance, Regards

                                    Matthias

                                    1 2 Previous Next