14 Replies Latest reply on Jul 30, 2009 2:42 PM by Abdel Dridi

    JBM2 cluster fails under heavy load

    Abdel Dridi Newbie

      I have set up a cluster of 2 live nodes, each with its own backup.
      Each pair (live/backup) is installed on a 64-bit Linux box.
      The configuration of all 4 nodes is the same, except that "backup" is set to false on each live node.

      Each node has 102 distributed queues. A producer sends messages to an InBoundQueue on each node; a consumer
      reads from the InBoundQueue and distributes the messages over 100 queues depending on the message content. Each of those 100 queues
      has a consumer that consumes messages and copies them to a common distributed outBoundQueue.
      I have one consumer and one producer per queue, except for the outBoundQueue, where I have a pool of 100 producers and 1 consumer.
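
      To make the routing step concrete, here is a minimal sketch of how a message could be mapped to one of the 100 queues. It is an illustration only: the hash-based rule, the routing key, and the "WorkQueue" name prefix are assumptions, since the actual content-based rule is not shown here.

```java
// Hypothetical sketch: the real content-based routing rule is not shown in this post.
final class QueueRouter {
    // The 100 intermediate queues that sit between InBoundQueue and outBoundQueue.
    static final int WORK_QUEUE_COUNT = 100;

    // Maps a routing key taken from the message content to a queue index in [0, 100).
    static int queueIndexFor(String routingKey) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(routingKey.hashCode(), WORK_QUEUE_COUNT);
    }

    // "WorkQueue" is a made-up name prefix, used only for illustration.
    static String queueNameFor(String routingKey) {
        return "WorkQueue" + queueIndexFor(routingKey);
    }

    public static void main(String[] args) {
        // The key below is an arbitrary example value.
        System.out.println(queueNameFor("15551230000"));
    }
}
```

      With a rule like this the same key always lands on the same queue, so each of the 100 consumers sees a stable share of the traffic.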

      The InBoundQueue producer on each node runs at 500 msg/s, which gives 1000 msg/s for the cluster.
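
      As a rough back-of-the-envelope check of the load, the sketch below works out the cluster-wide rates. It assumes every message makes exactly three hops (InBoundQueue, one of the 100 queues, outBoundQueue) with no redelivery, and it does not count replication traffic to the backups; the class and method names are made up for the example.

```java
// Rough load estimate under the assumptions stated above; not a measurement.
final class LoadEstimate {
    // Each message is sent three times: to InBoundQueue, to one of the
    // 100 queues, and finally to outBoundQueue.
    static final int HOPS_PER_MESSAGE = 3;

    // Messages entering the cluster per second across all live nodes.
    static long clusterInboundRate(long perNodeRate, int liveNodes) {
        return perNodeRate * liveNodes;
    }

    // Total queue sends per second across the cluster (one send per hop).
    static long clusterSendRate(long perNodeRate, int liveNodes) {
        return clusterInboundRate(perNodeRate, liveNodes) * HOPS_PER_MESSAGE;
    }

    public static void main(String[] args) {
        System.out.println(clusterInboundRate(500, 2)); // prints 1000
        System.out.println(clusterSendRate(500, 2));    // prints 3000
    }
}
```

      So 1000 msg/s entering the cluster means roughly 3000 sends/s and as many consumes/s overall under these assumptions.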

      After 30 min of running, I had the following error:

      Jul 22, 2009 6:37:11 PM org.jboss.messaging.core.logging.Logger warn
      WARNING: Connection failure has been detected Did not receive data from server (or ping).:3
      18:37:42,055 ERROR @Thread-12 (group:JBM-client-global-threads-621631806) [SmppQueueListener] Exception in onMessage():
      javax.jms.JMSException: Timed out waiting for response when sending packet 43
       at org.jboss.messaging.core.remoting.impl.RemotingConnectionImpl$ChannelImpl.sendBlocking(RemotingConnectionImpl.java:1155)
       at org.jboss.messaging.core.client.impl.ClientSessionImpl.commit(ClientSessionImpl.java:420)
       at org.jboss.messaging.jms.client.JBossMessage.acknowledge(JBossMessage.java:969)
       at com.clairmail.test.happypath.SmppQueueListener.onMessage(SmppQueueListener.java:56)
       at org.jboss.messaging.jms.client.JMSMessageListenerWrapper.onMessage(JMSMessageListenerWrapper.java:97)
       at org.jboss.messaging.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:670)
       at org.jboss.messaging.core.client.impl.ClientConsumerImpl.access$100(ClientConsumerImpl.java:41)
       at org.jboss.messaging.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:787)
       at org.jboss.messaging.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:105)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
      Caused by: MessagingException[errorCode=3 message=Timed out waiting for response when sending packet 43]

      Then failover took over for 5 min or so, after which all connections were destroyed.

      I followed Tim's docs (Ch. 36 and 37) to set up the cluster and the backup nodes.
      Do you think it's the network switch that's causing the problem?
      Though I have a 1G switch.