4 Replies Latest reply on Apr 2, 2013 7:37 PM by ron_sigal

    SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)

    toddler

      Hi ...

       

      We are using EAP5.1.1 (remoting-2-4-5.SP2) and experience a problem when using

      SSL and there is a network hard failure (e.g. ethernet cable pulled), a socketWrite0() hangs

      while holding a  Messaging Post Office lock (readLock()) - this hangs JMS...(many threads

      need PostOffice).  Thread dump below.  It is difficult to reproduce.

       

      A Solaris 10 environment.

       

      This problem has only been observed when SSL is used.

       

      In our remoting-bisocket-service.xml file, we use:

      <attribute isParam="true" name="writeTimeout">30000</attribute>

       

      Have been told "writeTimeout" is broken when SSL is being used.

       

      It has been suggested, for SSL, we change the configuration to turn on the check_connection parameter:

      <attribute isParam="true" name="socket.check_connection">true</attribute>

       

      But my concern is:

      * results in an extra round-trip I/O (not good for a variety of reasons - throughput, remote clients with high latency connecitons, etc.).

      * potential to hit other problems with check_connections enabled.   e.g. found this:

      https://community.jboss.org/message/369442

       

      Has anyone experienced this problem and/or know a solution (other then using

      "socket.check_connection") might be?

      Searched and found a number of remoting issues on hard-network failure, but not this particular one...

       

      Potentially Related Issue:

      https://issues.jboss.org/browse/JBREM-1307

       

      Thead Dump below:

       

      2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#1[10.143.109.23:60321]" Id=74 RUNNABLE (in native)

          at java.net.SocketOutputStream.socketWrite0(Native Method)

          at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)

          at java.net.SocketOutputStream.write(SocketOutputStream.java:136)

          at com.sun.net.ssl.internal.ssl.OutputRecord.writeBuffer(OutputRecord.java:297)

          at com.sun.net.ssl.internal.ssl.OutputRecord.write(OutputRecord.java:286)

          at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:743)

          at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:731)

          at com.sun.net.ssl.internal.ssl.AppOutputStream.write(AppOutputStream.java:59)

          -  locked com.sun.net.ssl.internal.ssl.AppOutputStream@ac9224

          at org.jboss.remoting.transport.socket.TimedOutputStream.write(TimedOutputStream.java:119)

          at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)

          -  locked java.io.BufferedOutputStream@1aea5f0

          at java.io.DataOutputStream.write(DataOutputStream.java:90)

          -  locked java.io.DataOutputStream@16e88df

          at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

          at org.jboss.messaging.core.impl.message.MessageSupport.write(MessageSupport.java:387)

          at org.jboss.jms.message.JBossMessage.write(JBossMessage.java:1069)

          at org.jboss.jms.wireformat.ClientDelivery.write(ClientDelivery.java:91)

          at org.jboss.jms.wireformat.JMSWireFormat.write(JMSWireFormat.java:237)

          at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedWrite(MicroSocketClientInvoker.java:1325)

          at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:861)

          at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:470)

          at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:169)

          at org.jboss.remoting.Client.invoke(Client.java:2070)

          at org.jboss.remoting.Client.invoke(Client.java:879)

          at org.jboss.remoting.Client.invokeOneway(Client.java:928)

          at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:835)

          at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:708)

          at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1600)

          at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1512)

          -  locked org.jboss.jms.server.endpoint.ServerSessionEndpoint@1105e96

          at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.java:353)

          -  locked java.lang.Object@1b1dde3

          at org.jboss.messaging.core.impl.RoundRobinDistributor.handle(RoundRobinDistributor.java:119)

          at org.jboss.messaging.core.impl.MessagingQueue$DistributorWrapper.handle(MessagingQueue.java:617)

          at org.jboss.messaging.core.impl.ClusterRoundRobinDistributor.handle(ClusterRoundRobinDistributor.java:79)

          at org.jboss.messaging.core.impl.ChannelSupport.deliverInternal(ChannelSupport.java:681)

          at org.jboss.messaging.core.impl.MessagingQueue.deliverInternal(MessagingQueue.java:540)

          at org.jboss.messaging.core.impl.ChannelSupport.handle(ChannelSupport.java:251)

          -  locked java.lang.Object@5e7463

          at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:3133)

          at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.route(MessagingPostOffice.java:956)

          at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.sendMessage(ServerConnectionEndpoint.java:794)

          at org.jboss.jms.server.endpoint.ServerSessionEndpoint.send(ServerSessionEndpoint.java:435)

          at org.jboss.jms.server.endpoint.advised.SessionAdvised.org$jboss$jms$server$endpoint$advised$SessionAdvised$send$aop(SessionAdvised.java:87)

          at org.jboss.jms.server.endpoint.advised.SessionAdvised$send_7280680627620114891.invokeTarget(SessionAdvised$send_7280680627620114891.java)

          at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:111)

          at org.jboss.jms.server.container.SecurityAspect.handleSend(SecurityAspect.java:158)

          at sun.reflect.GeneratedMethodAccessor224.invoke(Unknown Source)

          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

          at java.lang.reflect.Method.invoke(Method.java:597)

          at org.jboss.aop.advice.PerInstanceAdvice.invoke(PerInstanceAdvice.java:122)

          at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)

          at com.timetra.nms.server.j2ee.jms.messaging.JBossMessagingServerInterceptor.handleSessionSend(JBossMessagingServerInterceptor.java:346)

          at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)

          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

          at java.lang.reflect.Method.invoke(Method.java:597)

          at org.jboss.aop.advice.PerInstanceAdvice.invoke(PerInstanceAdvice.java:122)

          at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)

          at org.jboss.jms.server.endpoint.advised.SessionAdvised.send(SessionAdvised.java)

          at org.jboss.jms.wireformat.SessionSendRequest.serverInvoke(SessionSendRequest.java:95)

          at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:165)

          at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967)

          at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)

          -  locked org.jboss.remoting.transport.socket.ServerThread@465cc

          at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)

          at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:586)

          at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)

       

          Number of locked synchronizers = 1

          - java.util.concurrent.locks.ReentrantLock$NonfairSync@12b4a3e

        • 1. Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)
          toddler

          Ooops ...Forgot to specify, the lock is held here:

           

          at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.routeInternal(MessagingPostOffice.java:3133)

           

          Note sure whey the thread-dump does not display that it is holding this lock...?

           

           

          Many/many other threads then waiting for this lock ...e.g;

           

          2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#122[10.217.178.53:3868]" Id=8486 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Native Method)

              -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Object.java:485)

              at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)

              at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)

              at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.localClose(ServerConsumerEndpoint.java:568)

              at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.close(ServerConsumerEndpoint.java:414)

              at org.jboss.jms.server.endpoint.advised.ConsumerAdvised.org$jboss$jms$server$endpoint$advised$ConsumerAdvised$close$aop(ConsumerAdvised.java:59)

              at org.jboss.jms.server.endpoint.advised.ConsumerAdvised$close_N4742752445160157748.invokeTarget(ConsumerAdvised$close_N4742752445160157748.java)

              at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:111)

              at org.jboss.jms.server.container.ServerLogInterceptor.invoke(ServerLogInterceptor.java:105)

              at org.jboss.aop.joinpoint.MethodInvocation.invokeNext(MethodInvocation.java:102)

              at org.jboss.jms.server.endpoint.advised.ConsumerAdvised.close(ConsumerAdvised.java)

              at org.jboss.jms.wireformat.CloseRequest.serverInvoke(CloseRequest.java:66)

              at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:165)

              at org.jboss.remoting.ServerInvoker.invoke(ServerInvoker.java:967)

              at org.jboss.remoting.transport.socket.ServerThread.completeInvocation(ServerThread.java:791)

              -  locked org.jboss.remoting.transport.socket.ServerThread@16b5e33

              at org.jboss.remoting.transport.socket.ServerThread.processInvocation(ServerThread.java:744)

              at org.jboss.remoting.transport.socket.ServerThread.dorun(ServerThread.java:586)

              at org.jboss.remoting.transport.socket.ServerThread.run(ServerThread.java:234)

           

          ..

           

          2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#64[10.143.109.23:41474]" Id=1414 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Native Method)

              -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Object.java:485)

              at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)

              at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)

          ...

           

          2012.07.26 13:23:38 -0500 INFO  [STDOUT] "WorkerThread#122[10.217.178.53:3868]" Id=8486 WAITING on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Native Method)

              -  waiting on EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock@1956aca

              at java.lang.Object.wait(Object.java:485)

              at EDU.oswego.cs.dl.util.concurrent.WriterPreferenceReadWriteLock$ReaderLock.acquire(WriterPreferenceReadWriteLock.java:163)

              at org.jboss.messaging.core.impl.postoffice.MessagingPostOffice.getBindingForQueueName(MessagingPostOffice.java:1033)

          ...

           

          Cheers

          • 2. Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)
            ron_sigal

            Hi Todd,

             

            You've done you're homework.

             

            As you see in JBREM-1307, there is a bug in Java's SSLSocket that can cause a deadlock, and the suggested workaround (I haven't tried it) is to turn on socket.check_connection.  As for your concerns about doing that:

             

              * results in an extra round-trip I/O (not good for a variety of reasons - throughput, remote clients with high latency connecitons, etc.).

             

            This is true. 

             

              * potential to hit other problems with check_connections enabled.   e.g. found this: https://community.jboss.org/message/369442

             

            Note that the problem in https://community.jboss.org/message/369442#369442 is based on a failure to set Remoting's socket timeout parameter.   I don't think it's really a deadlock.  The default client side timeout in Remoting is 30 minutes, so it just looks like a deadlock.  I'm not aware of any problems with the connect timeout.  If you're looking at the code, the connect timeout is enforced in org.jboss.remoting.transport.socket.SocketClientInvoker.connect().

             

            Hope that helps.

             

            -Ron

            • 3. Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)
              toddler

              I found a workaround to this SSL defect which is to set the socket linger option in OutputTimerTask.run()

              before the close() is performed.

               

              I've never contributed to the community source, but below are the

              diffs, hopefully someone can get this into the community version to tell me how I go about contributing

              this change into the community version.

               

               

              File to change: (e.g. for tag 2.5.4.SP3)

               

              org.jboss.remoting.transport.socket.ClientSocketWrapper.java

              org.jboss.remoting.transport.socket.TimedOutputStream.java

               

               

              diff ClientSocketWrapper.java ../ClientSocketWrapper.java

              223c223,224

              <          os = new TimedOutputStream(os, writeTimeout);

              ---

              >           Socket mySkt = super.getSocket();

              >           os = new TimedOutputStream(os, writeTimeout, mySkt);

               

               

               

               

               

              diff TimedOutputStream.java ../TimedOutputStream.java

              26a27

              > import java.net.Socket;

              47a49,60

              >    private Socket mySkt;

              >

              >     public Socket getMySkt() {

              >         return mySkt;

              >     }

              >

              >     public TimedOutputStream(OutputStream os, int outputTimeout, Socket pmySkt)

              >     {

              >         this.os = os;

              >         this.outputTimeout = outputTimeout;

              >         this.mySkt = pmySkt;

              >     }

              134c147

              <      

              ---

              >

              142,151c155,192

              <          try

              <          {

              <             log.debug(this + " closing: " + tos);

              <             tos.close();

              <             tos = null;

              <          }

              <          catch (IOException e)

              <          {

              <             log.debug("unable to close " + tos);

              <          }

              ---

              >           try

              >           {

              >

              >               Socket mySocket = tos.getMySkt();

              >               if (mySocket != null)

              >               {

              >                   // set the linger on with duration 0, results RST being sent

              >                   // and socket send and receive buffers being discarded (does not

              >                   // wait of the existing write() to complete].  There is no use in

              >                   // setting the soLingerDuration longer then 0 (thus waiting for the write to

              >                   // complete), since we've already waited "writeTimeout" seconds on this write.

              >                   boolean on = true;

              >                   int soLingerDuration = 0;

              >                   log.debug(this + " socketWrite TIMEOUT: set soLinger(ON,0): " + tos);

              >                   mySocket.setSoLinger(on, soLingerDuration);

              >

              >               }

              >           }

              >           catch (IOException e)

              >           {

              >               log.debug(" socketWrite TIMEOUT: Attempting to set linger on socket stream [" + tos +

              >                       "], Exception [" + e.toString() + "]");

              >           }

              >           finally

              >           {

              >               log.debug(this + " socketWrite TIMEOUT: close(): " + tos);

              >               try

              >               {

              >                   tos.close();

              >               }

              >               catch (Exception e)

              >               {

              >                   // ignore - log log error

              >                   log.error("Failed to close() socket stream Exception: ", e);

              >               }

              >               log.debug(this + " socketWrite TIMEOUT: close() completed.");

              >               tos = null;

              >           }

              153c194

              <      

              ---

              >

              • 4. Re: SSL bisocket socketWrite0() hangs, while holding Messaging Post Office lock (on a network failure)
                ron_sigal

                Hi  Todd,

                 

                Thank you for working on this issue.  I've put a link to this thread on JBREM-1307.  I don't have any projection for a next release of Remoting 2, but when the time comes, I will certainly look at patch.

                 

                And give you all the credit. 

                 

                -Ron