Could you please suggest what is the best way to solve this problem?
If my consumers are not actually connected but merely believe they are, the number of stored messages can grow excessively (I am using DeliveryMode.PERSISTENT).
I have one more scenario regarding a missing onException callback:
1. Start cluster node0.
2. Start a standalone consumer.
3. Start a standalone producer - the consumer receives the messages.
4. Start cluster node1.
5. Kill cluster node0.
6. Start a standalone producer - the consumer receives the messages.
7. Kill cluster node1 - no exception callback on the consumer.
Again the consumer thinks it is connected, but it is not, because the whole cluster is shut down.
The console log of cluster node1:
10:53:11,569 INFO  [TreeCache] viewAccepted(): [10.58.100.162:32864|2] [10.58.100.162:32864]
10:53:11,570 INFO  [SEPMessageCluster] New cluster view for partition SEPMessageCluster (id: 2, delta: -1) : [10.58.100.162:1299]
10:53:11,572 ERROR [SocketClientInvoker] Got marshalling exception, exiting
java.net.SocketException: end of file
        at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:604)
        at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:418)
        at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:122)
        at org.jboss.remoting.ConnectionValidator.doCheckConnection(ConnectionValidator.java:133)
        at org.jboss.remoting.ConnectionValidator.run(ConnectionValidator.java:308)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
10:53:11,581 INFO  [SEPMessageCluster] I am (10.58.100.162:1299) received membershipChanged event:
10:53:11,581 INFO  [SEPMessageCluster] Dead members: 1 ([10.58.100.162:1199])
10:53:11,581 INFO  [SEPMessageCluster] New Members : 0 ()
10:53:11,581 INFO  [SEPMessageCluster] All Members : 1 ([10.58.100.162:1299])
10:53:11,586 INFO  [MessagingPostOffice] JBoss Messaging is failing over for failed node 0. If there are many messages to reload this may take some time...
10:53:11,742 WARN  [Client] unable to remove remote callback handler: Can not get connection to server.
Problem establishing socket connection for InvokerLocator [bisocket://blade8.localdomain:4557/?JBM_clientMaxPoolSize=200&clientLeasePeriod=10000&clientSocketClass=org.jboss.jms.client.remoting.ClientSocketWrapper&dataType=jms&marshaller=org.jboss.jms.wireformat.JMSWireFormat&numberOfCallRetries=1&numberOfRetries=10&pingFrequency=214748364&pingWindowFactor=10&socket.check_connection=false&timeout=0&unmarshaller=org.jboss.jms.wireformat.JMSWireFormat]
10:53:11,743 WARN  [LeasePinger] LeasePinger[SocketClientInvoker[1e42d5a, bisocket://blade8.localdomain:4557](a1m2s4i-vj7cpa-fdpndt5y-1-fdpnee9y-a)] failed sending disconnect for client lease for client with session ID a1m2s4i-vj7cpa-fdpndt5y-1-fdpnee9w-8
10:53:11,744 ERROR [MicroRemoteClientInvoker] error shutting down lease pinger
10:53:11,748 INFO  [MessagingPostOffice] JBoss Messaging failover completed
10:53:30,089 WARN  [SimpleConnectionManager] ConnectionManager[f38b42] cannot look up remoting session ID a1m2s4i-k5q34x-fdpnbj60-1-fdpnef10-10
10:53:30,089 WARN  [SimpleConnectionManager] A problem has been detected with the connection to remote client a1m2s4i-k5q34x-fdpnbj60-1-fdpnef10-10, jmsClientID=null. It is possible the client has exited without closing its connection(s) or the network has failed. All connection resources corresponding to that client process will now be removed.
[root@blade8 jboss-4.2.2.GA-JBM-1.4.0.SP3]#
I have an idea for a workaround:
Create a topic for ping messages, e.g. /topic/ping, and have each system that consumes messages set up a topic subscriber on /topic/ping. The subscriber should run a timer/timeout thread that checks whether the last ping message was received longer than TIMEOUT ago.
But I thought the concept of the ExceptionListener was supposed to free the developer from such things - that is, from having to use the messaging system itself to verify the connection to it.
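The watchdog part of that workaround could be sketched as a small class like the following. All names here are illustrative (not from any JBoss API); the subscriber on /topic/ping would call pingReceived() from its MessageListener's onMessage(), and a timer thread would poll isStale() periodically:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of the ping-timeout check; the JMS subscriber
// wiring on /topic/ping is assumed and not shown.
public class PingWatchdog {
    private final long timeoutMillis;
    private final AtomicLong lastPing = new AtomicLong(System.currentTimeMillis());

    public PingWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call this from the /topic/ping MessageListener's onMessage(). */
    public void pingReceived() {
        lastPing.set(System.currentTimeMillis());
    }

    /** True if no ping has arrived within the timeout window as of 'now'. */
    public boolean isStale(long now) {
        return now - lastPing.get() > timeoutMillis;
    }
}
```

A timer thread that finds isStale() returning true would treat the connection as dead, even though the JMS client still believes it is connected.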
If you use a connection factory with "enableFailover" set to true, then when an exception occurs the client will automatically try to reconnect to another server, so you won't get the exception unless it cannot reconnect.
See the user guide.
If you don't want automatic failover, set it to false and you will receive all exceptions.
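For reference, registering the ExceptionListener discussed in this thread looks roughly like this. This is a minimal sketch, not a complete client: it assumes a running JBoss Messaging server, its client JARs on the classpath, and a jndi.properties pointing at the server; the "/ClusteredConnectionFactory" JNDI name matches the binding shown in the config later in the thread.

```java
// Sketch only: requires a running JBoss Messaging server and client JARs.
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.ExceptionListener;
import javax.jms.JMSException;
import javax.naming.InitialContext;

public class ListenerSetup {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf =
            (ConnectionFactory) ctx.lookup("/ClusteredConnectionFactory");
        Connection conn = cf.createConnection();
        // With failover enabled, this should only fire if reconnection to
        // another cluster node fails; with failover disabled, it should
        // fire on any connection failure.
        conn.setExceptionListener(new ExceptionListener() {
            public void onException(JMSException e) {
                System.err.println("Lost connection to messaging server: " + e);
            }
        });
        conn.start();
    }
}
```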
There is also a bug in JBoss Remoting 2.2.2.SP4 which means you sometimes won't get exceptions on failure. See JIRA for more details. This bug is fixed in 2.2.2.SP5, which is available in the repository.
I have updated to that version.
Here is the config of my ClusteredConnectionFactory in connection-factories-service.xml:
...
<mbean code="org.jboss.jms.server.connectionfactory.ConnectionFactory"
       name="jboss.messaging.connectionfactory:service=ClusteredConnectionFactory"
       xmbean-dd="xmdesc/ConnectionFactory-xmbean.xml">
  <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
  <depends optional-attribute-name="Connector">jboss.messaging:service=Connector,transport=bisocket</depends>
  <depends>jboss.messaging:service=PostOffice</depends>
  <attribute name="JNDIBindings">
    <bindings>
      <binding>/ClusteredConnectionFactory</binding>
      <binding>/ClusteredXAConnectionFactory</binding>
      <binding>java:/ClusteredConnectionFactory</binding>
      <binding>java:/ClusteredXAConnectionFactory</binding>
    </bindings>
  </attribute>
  <attribute name="SupportsFailover">true</attribute>
  <attribute name="SupportsLoadBalancing">false</attribute>
</mbean>
...
Unfortunately, I am able to reproduce the above scenario every time I try.
At steps 5/6 I always get a dump like the following on cluster node1:
[root@blade8 jboss-4.2.2.GA-JBM-1.4.0.SP3]#
14:35:44,883 WARN  [SimpleConnectionManager] ConnectionManager[de3c87] cannot look up remoting session ID a1m2s4i-enks3t-fdpvfj55-1-fdpvhdf8-1i
14:35:44,884 WARN  [SimpleConnectionManager] A problem has been detected with the connection to remote client a1m2s4i-enks3t-fdpvfj55-1-fdpvhdf8-1i, jmsClientID=null. It is possible the client has exited without closing its connection(s) or the network has failed. All connection resources corresponding to that client process will now be removed.
After this, the consumer receives the messages if I start the producer, but when I kill cluster node1, the consumer does not get any exception callback.