-
1. Re: Messaging blocked by long time-out
thammoud Mar 13, 2009 3:24 PM (in response to rtm333)
We are experiencing the exact same problem on Linux. It is very easy to reproduce: run two clients against a server and pull the network cable from one of the clients to simulate a kernel death. A few seconds later, the server logs:
13:07:15,064 WARN [SimpleConnectionManager] A problem has been detected with the connection to remote client 5c4o12v-svccoz-fs969gkr-1-fs96isi3-1z, jmsClientID=b1-uvp969sf-1-rkg969sf-zoccvs-v21o4c5. It is possible the client has exited without closing its connection(s) or the network has failed. All associated connection resources will be cleaned up.
If you dump the stack trace you will see:
"Timer-87" Id=954 RUNNABLE (in native)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
- locked java.io.BufferedOutputStream@4773d99f
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at org.jboss.jms.wireformat.ClientDelivery.write(ClientDelivery.java:93)
at org.jboss.jms.wireformat.JMSWireFormat.write(JMSWireFormat.java:237)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedWrite(MicroSocketClientInvoker.java:971)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:606)
at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:422)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:133)
at org.jboss.remoting.Client.invoke(Client.java:1645)
at org.jboss.remoting.Client.invoke(Client.java:559)
at org.jboss.remoting.Client.invokeOneway(Client.java:609)
at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(ServerInvokerCallbackHandler.java:826)
at org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallbackOneway(ServerInvokerCallbackHandler.java:697)
at org.jboss.jms.server.endpoint.ServerSessionEndpoint.performDelivery(ServerSessionEndpoint.java:1452)
at org.jboss.jms.server.endpoint.ServerSessionEndpoint.handleDelivery(ServerSessionEndpoint.java:1364)
- locked org.jboss.jms.server.endpoint.ServerSessionEndpoint@6d47a5f
at org.jboss.jms.server.endpoint.ServerConsumerEndpoint.handle(ServerConsumerEndpoint.ja
After the timeout, the server dumps:
13:22:51,975 ERROR [SocketClientInvoker] Got marshalling exception, exiting
java.io.IOException: No route to host
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at org.jboss.jms.wireformat.ClientDelivery.write(ClientDelivery.java:93)
at org.jboss.jms.wireformat.JMSWireFormat.write(JMSWireFormat.java:237)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedWrite(MicroSocketClientInvoker.java:971)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:606)
at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:422)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:133)
at org.jboss.remoting.Client.invoke(Client.java:1645)
at org.jboss.remoting.Client.invoke(Client.java:559)
and other clients start getting ticks.
This will not come back until the write socket times out. I believe that is a tcp keep alive issue and on Linux it defaults to minutes.
Is there any configuration that we can use? We do have:
<attribute name="callbackTimeout">10000</attribute>
set in our remoting-bisocket-service.xml file.
We are dead in the water in our migration effort from ActiveMQ to JBOSS messaging because of this. Thank you for any help. -
2. Re: Messaging blocked by long time-out
ron_sigal Mar 19, 2009 1:40 AM (in response to rtm333)
Hi guys,
First, a little background. JBossMessaging sends messages to a consumer by calling org.jboss.remoting.callback.ServerInvokerCallbackHandler.handleCallback(), which, in the case of the bisocket transport, results in a call to org.jboss.remoting.Client.invoke(), which eventually leads to a write on a java.net.Socket. Now, it's true that there is a parameter, "callbackTimeout", that can be used to configure the callback Client and, ultimately, the Sockets used by the Client. But it's important to note that the Socket timeout value, set by Socket.setSoTimeout(), affects blocking reads on the Socket's SocketInputStream. It doesn't affect blocking writes. See, for example,
http://java.sun.com/j2se/1.4.2/docs/api/java/net/Socket.html#setSoTimeout(int).
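The read-only scope of the socket timeout can be seen in a small stand-alone sketch. The class and method names here are invented for illustration and are not part of Remoting. The local server never sends anything, so the read blocks until SO_TIMEOUT fires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of JBoss Remoting.
final class SoTimeoutReadDemo {

    /**
     * Connects to a loopback server that never writes, then attempts a read
     * with SO_TIMEOUT set. Returns true if the read gave up with a
     * SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis); // bounds blocking reads only
            InputStream in = client.getInputStream();
            try {
                in.read(); // the peer stays silent, so this blocks
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

java.net.Socket has no matching setter for writes, which is why a blocked write() can outlive "callbackTimeout".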
See also
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4031100,
where someone asked (in 1997!) for a write timeout for Sockets. The request was rejected since they anticipated NIO would eliminate the problem.
"thammoud" wrote:
I believe that is a tcp keep alive issue
Yes, I think you can use TCP to discover that the connection has failed. Indeed, it's probably a keepalive failure that's causing the exceptions you're both seeing. Fortunately, it is possible to configure the keepalive parameters so that the failure is detected more quickly. Unfortunately, as far as I know, you have to set the parameters at the TCP level rather than the Java level. For Linux configuration, see, for example,
http://www.linux.org/docs/ldp/howto/TCP-Keepalive-HOWTO/usingkeepalive.html
section 3.1, in particular, and for Windows, see, for example,
http://msdn.microsoft.com/en-us/library/ms819735.aspx
On the other hand, there is an org.jboss.remoting.Lease on the server side which, if it doesn't receive pings from the client in a timely fashion, will declare that a connection is broken and inform any registered listeners. Now JBossMessaging registers a listener, so it should be informed about the broken connection, and it should attempt to shut it down. Remoting won't try to kill a Socket in the middle of a read() or write(), but I'm a little surprised that JBossMessaging doesn't clean things up enough to prevent all the other clients from receiving messages.
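As a rough illustration of the lease idea, here is a minimal sketch, assuming each client pings periodically and a server-side check expires any client whose last ping is older than the lease period. All names (LeaseSketch, ping, checkLeases) are hypothetical; this is not the org.jboss.remoting.Lease implementation, and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of ping-based lease expiry; not the Remoting API.
final class LeaseSketch {
    private final long leasePeriodMillis;
    private final Map<String, Long> lastPing = new HashMap<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    LeaseSketch(long leasePeriodMillis) {
        this.leasePeriodMillis = leasePeriodMillis;
    }

    /** Registers a callback that receives the session id of each expired lease. */
    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Records a ping from a client at the given time. */
    void ping(String clientSessionId, long nowMillis) {
        lastPing.put(clientSessionId, nowMillis);
    }

    /** Expires leases whose last ping is older than the lease period and notifies listeners. */
    List<String> checkLeases(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastPing.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > leasePeriodMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        for (String id : expired) {
            for (Consumer<String> listener : listeners) {
                listener.accept(id);
            }
        }
        return expired;
    }
}
```

Passing the clock value in explicitly keeps the sketch deterministic. A listener registered this way is the point at which a messaging layer would release the resources tied to the dead client.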
Are you seeing the output from
log.trace("Notified connection listener of lease expired due to lost connection from client (client session id = " + clientHolder.getSessionId());
in your server logs?
-Ron -
3. Re: Messaging blocked by long time-out
lanceliao1 Mar 23, 2009 5:18 AM (in response to rtm333)
Hi all,
I faced the same problem. Can Remoting provide a fix for this problem like the one in AMQ?
http://issues.apache.org/activemq/browse/AMQ-1993 -
4. Re: Messaging blocked by long time-out
swany Mar 25, 2009 2:39 PM (in response to rtm333)
Java does allow setting the keep-alive parameter programmatically on a socket. It appears from the Javadocs, however, that this value is binary (set to 2 hours or nothing).
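For reference, the on/off switch mentioned above is Socket.setKeepAlive(boolean). A minimal sketch follows, using a throwaway loopback connection and an invented class name. The call only enables or disables keepalive; the probe timing comes from the operating system's TCP settings.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class showing the boolean SO_KEEPALIVE switch.
final class KeepAliveFlagDemo {

    /** Opens a loopback connection, enables SO_KEEPALIVE, and reports the resulting flag. */
    static boolean enableKeepAlive() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket(server.getInetAddress(), server.getLocalPort())) {
            socket.setKeepAlive(true); // on or off only; no interval argument
            return socket.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + enableKeepAlive());
    }
}
```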
-
5. Re: Messaging blocked by long time-out
thammoud Mar 30, 2009 7:13 PM (in response to rtm333)
ActiveMQ's solution was above and beyond the TTL. This is a serious issue that needs to be fixed by the JBOSS folks asap. No respectable messaging middleware should ever be held hostage to a rogue client.
Unfortunately, we had to abandon our migration to JBM from ActiveMQ because of this issue (and a couple of other anomalies that have to do with reconnects). While we love the clustering capabilities of the system, the reliability leaves much to be desired under stressful conditions. For now, our MDBs no longer use jbossmq and now use JBM, but our client interaction with our servers will stay with ActiveMQ until these issues are resolved. -
6. Re: Messaging blocked by long time-out
lanceliao1 Mar 31, 2009 5:33 AM (in response to rtm333)
Yes, this is a serious problem.
JBM is not reliable under stressful conditions. We have to consider choosing another MOM product. -
7. Re: Messaging blocked by long time-out
timfox Apr 5, 2009 7:43 AM (in response to rtm333)
+1
I believe this is a serious issue and remoting should apply a fix as per the AMQ fix.
Ron, do you have a JIRA and ETA for this? -
8. Re: Messaging blocked by long time-out
ron_sigal Apr 16, 2009 3:00 AM (in response to rtm333)
"timfox" wrote:
Ron, do you have a JIRA and ETA for this?
JBREM-1120 "Add a socket write timeout facility".
I was hoping to get something into Remoting 2.5.1, which I just released, but time didn't allow. I've scheduled JBREM-1120 for releases 2.2.2.SP12 and 2.5.2.
-Ron -
9. Re: Messaging blocked by long time-out
jlaemthonglang Jun 2, 2009 12:16 PM (in response to rtm333)
does this bug affect JBM queue too? or topics only?
-
10. Re: Messaging blocked by long time-out
waylandc Jun 17, 2009 6:10 AM (in response to rtm333)
Any ETA on 2.2.2SP12?
-
11. Re: Messaging blocked by long time-out
ron_sigal Jul 17, 2009 2:23 PM (in response to rtm333)
"jlaemthonglang" wrote:
does this bug affect JBM queue too? or topics only?
As far as I know, it should affect both. I'm sure someone from JBossMessaging could be more definite. -
12. Re: Messaging blocked by long time-out
ron_sigal Jul 17, 2009 2:25 PM (in response to rtm333)
"waylandc" wrote:
Any ETA on 2.2.2SP12?
Release 2.2.2.SP12 morphed into 2.2.3, which was released in May. The next release, 2.2.3.SP1, should come out in early September, and I'll aim to get the write timeout facility in that release. -
13. Re: Messaging blocked by long time-out
ron_sigal Aug 19, 2009 3:23 PM (in response to rtm333)
I've just attached to JBREM-1120 two versions of jboss-remoting.jar with the write timeout facility implemented:
* 2.2.3.SP1 preview (874 KB): writes "Remoting version: 2.2.3.SP1-preview: 8/19/2009 - 14:54" when loaded
* 2.5.2 preview (1.07 MB): writes "JBossRemoting Version 2.5.2 (Flounder) preview: 8/19/09-14:59" when loaded
Note that release 2.2.3.SP1 is scheduled for September 14, 2009. Right now I don't have a release date for 2.5.2.
-Ron -
14. Re: Messaging blocked by long time-out
ron_sigal Aug 19, 2009 4:07 PM (in response to rtm333)
"Chul Yoon" wrote:
I assume I can just set the write timeout attribute for JBM in remoting-bisocket-service.xml
Something like this?:
<attribute name="writeTimeout">200000</attribute>
That will set "writeTimeout" on the server side, which will affect (1) writing responses to the client and (2) sending messages to the client. If you want to configure the client side as well, then add the "isParam" attribute, which will put "writeTimeout" in the InvokerLocator which gets sent to the client:
<attribute name="writeTimeout" isParam="true">200000</attribute>
-Ron
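For readers wondering what a socket write timeout can look like in plain Java, one generic approach is to run the write on a helper thread and close the socket if it has not finished in time, which unblocks the stuck write with an exception. The sketch below illustrates that idea only. It is an assumption-laden example with invented names (WriteTimeoutSketch, writeWithTimeout), not necessarily how JBREM-1120 implements "writeTimeout".

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of a write timeout; not the Remoting implementation.
final class WriteTimeoutSketch {

    /**
     * Writes the chunk the given number of times on a helper thread. Returns true
     * if the write finished in time; otherwise closes the socket and returns false.
     */
    static boolean writeWithTimeout(Socket socket, byte[] chunk, int repeats, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<Void> pending = worker.submit(() -> {
            OutputStream out = socket.getOutputStream();
            for (int i = 0; i < repeats; i++) {
                out.write(chunk); // blocks once the TCP buffers are full
            }
            out.flush();
            return null;
        });
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            closeQuietly(socket); // closing the socket unblocks a stuck write
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            closeQuietly(socket);
            return false;
        } finally {
            worker.shutdownNow();
        }
    }

    private static void closeQuietly(Socket socket) {
        try {
            socket.close();
        } catch (IOException ignored) {
            // nothing useful to do in a sketch
        }
    }

    /** Writes to a loopback peer that accepts the connection but never reads. */
    static boolean demo(int chunkBytes, int repeats, long timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sender = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) {
            return writeWithTimeout(sender, new byte[chunkBytes], repeats, timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A small write such as demo(16, 1, 2000) fits in the socket buffers and completes, while a large one such as demo(64 * 1024, 2048, 500) stalls against the silent peer and is cut off after the timeout. The cost of this approach is that the connection is discarded on timeout, so the caller has to treat the client as gone.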