6 Replies Latest reply on Aug 3, 2004 7:13 PM by fgoldfain

    How to recover from failed connection

    richieb

      We have a client that uses the UIL2 IL layer. When there is a connection error (Ping Timeout in our case), the client tries to close the session and reconnect. However, due to network problems the close hangs, because session.close() tries to send an unsubscribe message.

      As a result we get subscriptions with dead listeners and messages pile up in the server.

      What is the proper way to close a connection after a network error, so that the server will clean up?

      BTW, we are using JBoss 3.2.3. We cannot use TTL on our messages because of TTL bugs in 3.2.3 (too many threads get created).

      Any suggestions would be greatly appreciated...

      ....richie

        • 1. Re: How to recover from failed connection
          starksm64

          Close the session/connection in a background thread so you can resume. What is the problem with the TTL approach in more detail?
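A minimal sketch of the background-close idea, assuming the caller only wants to wait a bounded time for the close before moving on. The class name `BackgroundCloser`, the method `closeWithTimeout`, and the thread name are hypothetical and not part of JBossMQ. The close itself is passed in as a plain `Runnable`, so the sketch does not depend on any JMS classes.

```java
// Hypothetical helper, not part of JBossMQ or the JMS API.
final class BackgroundCloser {
    private BackgroundCloser() {}

    /**
     * Runs closeAction on a daemon thread and waits at most timeoutMillis for it.
     * Pass a value greater than zero: Thread.join(0) would wait forever.
     *
     * @return true if the close finished in time, false if it is still blocked
     *         (or if the calling thread was interrupted while waiting)
     */
    static boolean closeWithTimeout(Runnable closeAction, long timeoutMillis) {
        Thread closer = new Thread(closeAction, "jms-closer");
        closer.setDaemon(true); // a hung close must not keep the JVM alive
        closer.start();
        try {
            closer.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        return !closer.isAlive();
    }
}
```

In a JMS client the `Runnable` would typically wrap `connection.close()` in a try/catch for `JMSException`. A `false` return only means the caller stopped waiting: the closer thread may stay blocked, and this does not by itself make the server clean up the subscription.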

          • 2. Re: How to recover from failed connection
            richieb

             

            "scott.stark@jboss.org" wrote:
            Close the session/connection in a background thread so you can resume. What is the problem with the TTL approach in more detail?


            See bug 890030. There is a thread created for each subscription that uses TTL messages and the thread never exits.

            When we tried closing the connection in another thread, the "session.close()" call never completed, and there seemed to be no timeout for sending the unsubscribe message.

            ...richie

            • 3. Re: How to recover from failed connection
              richieb

              Here is more detail on how things look when the problem occurs.

              First we get an exception saying "Ping Timed Out". When we try to shut down the JMS connection, the threads hang. Here is the relevant thread dump:

              "UIL2.SocketManager.WriteTask#142" daemon prio=1 tid=0x082268b0 nid=0x75d5 in Object.wait() [4da52000..4da528c8]
               at java.lang.Object.wait(Native Method)
               at java.lang.Object.wait(Object.java:429)
               at EDU.oswego.cs.dl.util.concurrent.LinkedQueue.take(LinkedQueue.java:122)
               - locked <0x44d7ba30> (a java.lang.Object)
               at org.jboss.mq.il.uil2.SocketManager$WriteTask.run(SocketManager.java:473)
               at java.lang.Thread.run(Thread.java:534)
              
              "UIL2.SocketManager.ReadTask#141" daemon prio=1 tid=0x082266b0 nid=0x75d4 runnable [4d9d1000..4d9d18c8]
               at java.net.SocketInputStream.socketRead0(Native Method)
               at java.net.SocketInputStream.read(SocketInputStream.java:129)
               at java.io.BufferedInputStream.fill(BufferedInputStream.java:183)
               at java.io.BufferedInputStream.read(BufferedInputStream.java:201)
               - locked <0x44d7bfa8> (a org.jboss.util.stream.NotifyingBufferedInputStream)
               at org.jboss.util.stream.NotifyingBufferedInputStream.read(NotifyingBufferedInputStream.java:67)
               at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2133)
               at java.io.ObjectInputStream$BlockDataInputStream.readBlockHeader(ObjectInputStream.java:2313)
               at java.io.ObjectInputStream$BlockDataInputStream.refill(ObjectInputStream.java:2380)
               at java.io.ObjectInputStream$BlockDataInputStream.read(ObjectInputStream.java:2452)
               at java.io.ObjectInputStream$BlockDataInputStream.readByte(ObjectInputStream.java:2601)
               at java.io.ObjectInputStream.readByte(ObjectInputStream.java:845)
               at org.jboss.mq.il.uil2.SocketManager$ReadTask.run(SocketManager.java:278)
               at java.lang.Thread.run(Thread.java:534)
              
              "Heartbeater" prio=1 tid=0x08247268 nid=0x6e5f in Object.wait() [4d7cd000..4d7cd8c8]
               at java.lang.Object.wait(Native Method)
               - waiting on <0x4475ca20> (a org.jboss.mq.il.uil2.msgs.UnsubscribeMsg)
               at java.lang.Object.wait(Object.java:429)
               at org.jboss.mq.il.uil2.SocketManager.internalSendMessage(SocketManager.java:240)
               - locked <0x4475ca20> (a org.jboss.mq.il.uil2.msgs.UnsubscribeMsg)
               at org.jboss.mq.il.uil2.SocketManager.sendMessage(SocketManager.java:189)
               at org.jboss.mq.il.uil2.UILServerIL.unsubscribe(UILServerIL.java:493)
               at org.jboss.mq.Connection.removeConsumer(Connection.java:1198)
               at org.jboss.mq.SpySession.removeConsumer(SpySession.java:766)
               at org.jboss.mq.SpyMessageConsumer.close(SpyMessageConsumer.java:411)
               at com.javtech.myapp.jms.JMSInterface.a(myappAgent:330)
               - locked <0x489daa00> (a java.lang.Class)
               at com.javtech.myapp.jms.JMSInterface.a(myappAgent:262)
               - locked <0x489daa00> (a java.lang.Class)
               at com.javtech.myapp.jms.JMSInterface.a(myappAgent:155)
               at com.javtech.myapp.jms.JMSInterface.publish(myappAgent:582)
               at com.javtech.myapp.agent.e.b(myappAgent:95)
               at com.javtech.myapp.agent.e.run(myappAgent:72)
               at java.lang.Thread.run(Thread.java:534)
              
              
              

              The Heartbeater thread tries to do "session.close()" and this just sits there. The socket read in ReadTask doesn't seem to time out.

              Should I skip the session.close() in this case and just do connection.close()?

              ...richie



              • 4. Re: How to recover from failed connection

                Closing the connection will close the session.

                Your fundamental problem is that the server is not responding to your requests.
                The connection isn't broken, the server just isn't responding in time.
                Have a look at what the server is doing.

                • 5. Re: How to recover from failed connection
                  richieb

                   

                  "adrian@jboss.org" wrote:
                  Closing the connection will close the session.

                  Your fundamental problem is that the server is not responding to your requests.
                  The connection isn't broken, the server just isn't responding in time.
                  Have a look at what the server is doing.


                  We have an unreliable network connection (client is on the other side of the world :) ). That's why the server does not respond.

                  While the JMS connection is in this "half-dead" state, the client can do the wrong things. We'd like to drop the connection and try reconnecting from scratch.
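A rough sketch of what "reconnecting from scratch" could look like as a retry loop with a doubling delay. Everything here is an assumption for illustration: the class `Reconnector`, the method `connectWithRetry`, and the 60-second cap are hypothetical. The actual connect step is passed in as a `Callable`, so the loop itself has no JMS dependency.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of JBossMQ or the JMS API.
final class Reconnector {
    private Reconnector() {}

    /**
     * Calls connect until it succeeds or maxAttempts is reached, sleeping
     * between attempts and doubling the delay each time (capped at 60 s).
     */
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call(); // success: hand back the fresh connection
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to reconnect", ie);
                }
                delay = Math.min(delay * 2, 60000L);
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

The `Callable` would be the place where the application builds a brand-new connection, session, and subscriptions rather than reusing any object from the half-dead connection.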

                  I'll try just closing the connection and see what happens. Will report the result.

                  Thanks!

                  ...richie

                  • 6. Re: How to recover from failed connection
                    fgoldfain

                    Hey Richie,

                    I have exactly the same problem as you.
                    Have you found an answer or a workaround?

                    Thanks,
                    Francois