4 Replies Latest reply on Feb 23, 2009 4:51 PM by ramneekh

    Jboss node failover for Jbpm is not working in cluster HA mo

    ramneekh


      We have 2 servers in a clustered environment. Server1 starts ingestion, and the steps/tasks of the BPM workflow are executed partly on Server1 and partly on Server2.
      If one server/node fails in the middle of a BPM task execution, that task is not propagated to the other node. However, when the crashed server restarts, it executes the task it left off.


      Here is the exception I see on one server when I kill the server on the other node. Again, the server that is still running does not pick up where the second server left off.

      09:21:48,151 INFO [STDOUT] ==========> entering node: Decision(Content mapped to category?)
      09:21:48,190 INFO [STDOUT] ==========> entering node: Node(Convert to XML)
      09:21:51,742 ERROR [SocketClientInvoker] Got marshalling exception, exiting
      java.net.SocketException: end of file
      at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:624)
      at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:418)
      at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:122)
      at org.jboss.remoting.ConnectionValidator.doCheckConnectionWithLease(ConnectionValidator.java:522)
      at org.jboss.remoting.ConnectionValidator.run(ConnectionValidator.java:301)
      at java.util.TimerThread.mainLoop(Timer.java:512)
      at java.util.TimerThread.run(Timer.java:462)
      09:21:51,763 WARN [Client] unable to remove remote callback handler: Can not get connection to server. Problem establishing socket connection for InvokerLocator [bisocket://usbox004.bo.us.am.ericsson.se:4457//?JBM_clientMaxPoolSize=200&clientLeasePeriod=10000&clientSocketClass=org.jboss.jms.client.remoting.ClientSocketWrapper&dataType=jms&marshaller=org.jboss.jms.wireformat.JMSWireFormat&numberOfCallRetries=1&numberOfRetries=10&pingFrequency=214748364&pingWindowFactor=10&socket.check_connection=false&stopLeaseOnFailure=true&timeout=0&unmarshaller=org.jboss.jms.wireformat.JMSWireFormat&validatorPingPeriod=10000&validatorPingTimeout=5000]
      09:21:52,886 INFO [DefaultPartition] Suspected member: 138.85.222.114:59471



      Your help is much appreciated.