5 Replies Latest reply on Nov 28, 2012 7:55 AM by kliczko

    Connection timeout issues - Connection failure has been detected

    tomjenkinson

      Hi,


      I was reading http://community.jboss.org/thread/158100 with much interest, as the user appears to have issues similar to mine: connections timing out on a Windows VM. However, unlike that user, I am deploying HornetQ 2.1.2.Final into JBoss AS 5.1.0.GA.


      All my connections are established using the In-VM transport and as such the connection should never break (being in-memory calls). However I initially started seeing error messages like:

      [hornetq-failure-check-thread] WARN   impl.RemotingConnectionImpl  - Connection failure has been detected: Did  not receive ping from invm:0. It is likely the client has exited or  crashed without closing its connection, or the network between the  server and client has failed. The connection will now be closed.  [code=3]


      I then reviewed the available documentation on this issue (http://hornetq.sourceforge.net/docs/hornetq-2.1.2.Final/user-manual/en/html/connection-ttl.html) and determined that the best course of action would be to configure the InVMConnectionFactory in server/all-with-hornetq/deploy/hornetq.sar/hornetq-jms.xml with a connection-ttl of -1. I thought that this would mean that any connection created from this connection factory would automatically be created with a TTL of -1.


      Unfortunately this is not strictly true. Looking through the HornetQ source code for connection-ttl, I found that connections are initially created with a hardcoded TTL of HornetQClient.DEFAULT_CONNECTION_TTL (CoreProtocolManager:72), and it is actually the first ping from the client (FailoverManagerImpl) that sets the TTL correctly. This means that you cannot disable the clientFailureCheckPeriod, as I will now explain.


      After changing connection-ttl to -1 I no longer receive errors on the server complaining that the connection has been idle for too long. However, I now get a warning on the client side:

      WARN  [org.hornetq.core.protocol.core.impl.RemotingConnectionImpl] (Thread-72 (group:HornetQ-client-global-threads-30973565)) Connection failure has been detected: Did not receive data from server for org.hornetq.core.remoting.impl.invm.InVMConnection@ac58f6 [code=3]


      This is because the client has not received data from the server for an extended period of time. That should be harmless for my in-memory application, and the delay is probably due to thread scheduling in the Windows XP VM, which is under considerable load.


      I looked into whether I could set the clientFailureCheckPeriod to -1 (so that the client would not check for a dead server), but doing so also disables the client pinging (FailoverManagerImpl:1014), which means the server is never told to set the connectionTTL to -1 (FailoverManagerImpl:1222 and CoreProtocolManager:94).
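
To make the sequence concrete, here is a toy model of the behaviour as I understand it from reading the source. It is only a sketch: none of these types are HornetQ classes, the names (TtlNegotiationSketch, ServerConnection, connect) are made up for illustration, and the 30000 ms check period is just an example value. The only assumptions are the ones described above: the server starts every connection at the default TTL, a client ping carries the configured TTL, and a check period of -1 means no pings are ever sent.

```java
// Toy model of the behaviour described in the post. None of these types are
// HornetQ classes; all names are hypothetical and exist only for illustration.
class TtlNegotiationSketch {

    // The post gives HornetQClient.DEFAULT_CONNECTION_TTL as 60 seconds.
    static final long DEFAULT_CONNECTION_TTL_MS = 60000L;

    // Server-side view of a single connection.
    static final class ServerConnection {
        // Starts at the hardcoded default, whatever the factory was configured with.
        private long ttlMs = DEFAULT_CONNECTION_TTL_MS;

        // A ping from the client carries the client's configured TTL.
        void onPing(long clientTtlMs) {
            ttlMs = clientTtlMs;
        }

        long ttlMs() {
            return ttlMs;
        }

        // A TTL of -1 means the server never times the connection out.
        boolean timesOutAfter(long idleMs) {
            return ttlMs != -1 && idleMs > ttlMs;
        }
    }

    // Client side: pings are only sent when the failure check is enabled.
    static ServerConnection connect(long connectionTtlMs, long clientFailureCheckPeriodMs) {
        ServerConnection server = new ServerConnection();
        boolean pingerRuns = clientFailureCheckPeriodMs != -1;
        if (pingerRuns) {
            server.onPing(connectionTtlMs);
        }
        return server;
    }

    public static void main(String[] args) {
        // TTL disabled, failure check left on: the ping delivers -1 to the server.
        System.out.println(connect(-1L, 30000L).ttlMs());
        // Both disabled: no ping is sent, so the server keeps the 60 second default.
        System.out.println(connect(-1L, -1L).ttlMs());
        // With both disabled, the server still times out an idle connection.
        System.out.println(connect(-1L, -1L).timesOutAfter(61000L));
    }
}
```

In this model, disabling only the TTL works on the server side, while disabling both settings leaves the server on its 60 second default, which matches what I am seeing.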


      It would be great to hear your feedback on this, and in particular whether it would be possible to update CoreProtocolManager to override the TTL at connection creation time without requiring the client to ping back with its connectionTTL. Note that it can be overridden by the connection-ttl-setting (CoreProtocolManager:76), but -1 is not a valid value for this. If the connection-ttl setting from hornetq-jms.xml were automatically applied to the connection, then I would be able to get away with configuring the InVMConnectionFactory with:

      <connection-ttl>-1</connection-ttl>

      <client-failure-check-period>-1</client-failure-check-period>

      As I say, though, with these settings the connection stays in the default state, with a hardcoded timeout of HornetQClient.DEFAULT_CONNECTION_TTL (60 seconds), because the client never pings back to set the TTL to -1.


      At the moment I am left to experiment with setting connection-ttl and client-failure-check-period to very high numbers to see whether this masks the issue...

       

      Thanks!

      Tom