1 Reply Latest reply on Aug 15, 2006 8:40 PM by josey

    Two servers assuming the role of master node (JMS)?

      I have two clustered JMS servers running JBoss [Zion] 4.0.4.GA (build: CVSTag=JBoss_4_0_4_GA date=200605151000) on Linux (Debian). JMS data is persisted to a PostgreSQL 8.1 database, and both servers have the same configuration (same JMS destinations, same datasource for the JMS persistence manager, etc.).

      I have been testing that when the master node (node1) goes down, the other node (node2) assumes the role of master, and that processing continues without problems when node1 comes back up. When node1 is brought back up, I see a stream of exceptions that lead me to believe both servers are trying to act as master. I am fairly sure this should not be possible (when I check the HASingletonDeployer in the JMX console on each server, only one has assumed the role of MasterNode at any given moment), so I am hoping there is a better explanation for what I am seeing.

      Here are the steps that I have carried out:
      1. Bring up two clustered JBoss servers; each one has an identical JMS configuration and the same ear file deployed
      2. Start a client that continuously produces messages for a Topic; messages begin to be persisted; note that the client uses HA-JNDI (it has a list of both servers)
      3. There are 3 MDBs; two have durable subscriptions; one sleeps for a couple of seconds during processing
      4. Kill the master node (node1) [e.g., stop the JBoss server]; the other node (node2) assumes the responsibility of master; the client recognizes this change and starts to connect to the new master (node2)
      5. Bring up node1; exceptions start to fly on node1 (none on node2) and do not stop; the client sees no exceptions
      Here is an example of the exception(s):

      2006-08-15 17:14:13,343 ERROR [org.jboss.jms.asf.StdServerSession] failed to commit/rollback
      org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=dingle/535, BranchQual=, localId=535] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.mq.SpyXAException: - nested throwable: (org.jboss.mq.SpyTransactionRolledBackException: Transaction was rolled back.; - nested throwable: (org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows)))
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:372)
       at org.jboss.tm.TxManager.commit(TxManager.java:240)
       at org.jboss.jms.asf.StdServerSession.onMessage(StdServerSession.java:351)
       at org.jboss.mq.SpyMessageConsumer.sessionConsumerProcessMessage(SpyMessageConsumer.java:902)
       at org.jboss.mq.SpyMessageConsumer.addMessage(SpyMessageConsumer.java:170)
       at org.jboss.mq.SpySession.run(SpySession.java:323)
       at org.jboss.jms.asf.StdServerSession.run(StdServerSession.java:194)
       at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
       at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.mq.SpyXAException: - nested throwable: (org.jboss.mq.SpyTransactionRolledBackException: Transaction was rolled back.; - nested throwable: (org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows))
       at org.jboss.mq.SpyXAResource.commit(SpyXAResource.java:102)
       at org.jboss.tm.TransactionImpl$Resource.commit(TransactionImpl.java:2253)
       at org.jboss.tm.TransactionImpl.commitResources(TransactionImpl.java:1784)
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:340)
       ... 8 more
      Caused by: org.jboss.mq.SpyTransactionRolledBackException: Transaction was rolled back.; - nested throwable: (org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows)
       at org.jboss.mq.server.JMSDestinationManager.transact(JMSDestinationManager.java:435)
       at org.jboss.mq.server.JMSServerInterceptorSupport.transact(JMSServerInterceptorSupport.java:200)
       at org.jboss.mq.security.ServerSecurityInterceptor.transact(ServerSecurityInterceptor.java:197)
       at org.jboss.mq.server.TracingInterceptor.transact(TracingInterceptor.java:422)
       at org.jboss.mq.server.JMSServerInvoker.transact(JMSServerInvoker.java:201)
       at org.jboss.mq.il.jvm.JVMServerIL.transact(JVMServerIL.java:342)
       at org.jboss.mq.Connection.send(Connection.java:1110)
       at org.jboss.mq.SpyXAResourceManager.commit(SpyXAResourceManager.java:164)
       at org.jboss.mq.SpyXAResource.commit(SpyXAResource.java:98)
       ... 11 more
      Caused by: org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows
       at org.jboss.mq.pm.jdbc2.PersistenceManager.remove(PersistenceManager.java:1204)
       at org.jboss.mq.server.BasicQueue.acknowledge(BasicQueue.java:578)
       at org.jboss.mq.server.JMSTopic.acknowledge(JMSTopic.java:348)
       at org.jboss.mq.server.ClientConsumer.acknowledge(ClientConsumer.java:334)
       at org.jboss.mq.server.JMSDestinationManager.acknowledge(JMSDestinationManager.java:483)
       at org.jboss.mq.server.JMSDestinationManager.transact(JMSDestinationManager.java:427)
       ... 19 more
      2006-08-15 17:14:13,348 WARN [org.jboss.tm.TransactionImpl] XAException: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=dingle/536, BranchQual=, localId=536] errorCode=XAER_RMERR
      org.jboss.mq.SpyXAException: - nested throwable: (org.jboss.mq.SpyTransactionRolledBackException: Transaction was rolled back.; - nested throwable: (org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows))
       at org.jboss.mq.SpyXAResource.commit(SpyXAResource.java:102)
       at org.jboss.tm.TransactionImpl$Resource.commit(TransactionImpl.java:2253)
       at org.jboss.tm.TransactionImpl.commitResources(TransactionImpl.java:1784)
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:340)
       at org.jboss.tm.TxManager.commit(TxManager.java:240)
       at org.jboss.jms.asf.StdServerSession.onMessage(StdServerSession.java:351)
       at org.jboss.mq.SpyMessageConsumer.sessionConsumerProcessMessage(SpyMessageConsumer.java:902)
       at org.jboss.mq.SpyMessageConsumer.addMessage(SpyMessageConsumer.java:170)
       at org.jboss.mq.SpySession.run(SpySession.java:323)
       at org.jboss.jms.asf.StdServerSession.run(StdServerSession.java:194)
       at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
       at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.mq.SpyTransactionRolledBackException: Transaction was rolled back.; - nested throwable: (org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows)
       at org.jboss.mq.server.JMSDestinationManager.transact(JMSDestinationManager.java:435)
       at org.jboss.mq.server.JMSServerInterceptorSupport.transact(JMSServerInterceptorSupport.java:200)
       at org.jboss.mq.security.ServerSecurityInterceptor.transact(ServerSecurityInterceptor.java:197)
       at org.jboss.mq.server.TracingInterceptor.transact(TracingInterceptor.java:422)
       at org.jboss.mq.server.JMSServerInvoker.transact(JMSServerInvoker.java:201)
       at org.jboss.mq.il.jvm.JVMServerIL.transact(JVMServerIL.java:342)
       at org.jboss.mq.Connection.send(Connection.java:1110)
       at org.jboss.mq.SpyXAResourceManager.commit(SpyXAResourceManager.java:164)
       at org.jboss.mq.SpyXAResource.commit(SpyXAResource.java:98)
       ... 11 more
      Caused by: org.jboss.mq.SpyJMSException: Could not mark the message as deleted in the database: update affected 0 rows
       at org.jboss.mq.pm.jdbc2.PersistenceManager.remove(PersistenceManager.java:1204)
       at org.jboss.mq.server.BasicQueue.acknowledge(BasicQueue.java:578)
       at org.jboss.mq.server.JMSTopic.acknowledge(JMSTopic.java:348)
       at org.jboss.mq.server.ClientConsumer.acknowledge(ClientConsumer.java:334)
       at org.jboss.mq.server.JMSDestinationManager.acknowledge(JMSDestinationManager.java:483)
       at org.jboss.mq.server.JMSDestinationManager.transact(JMSDestinationManager.java:427)
       ... 19 more
      


      6. Kill node2; the exceptions on node1 disappear after a short time
      7. Bring node2 back up; the same exceptions as pasted above now start to fly on node2 (and do not stop)
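
      For anyone trying to reproduce this, here is a minimal sketch of the kind of HA-JNDI client environment step 2 refers to. It is not copied from my actual client: the class and method names are made up for illustration, the host names simply reuse node1/node2, and port 1100 is assumed to be the HA-JNDI port (the JBoss 4 default, as far as I know).

```java
import java.util.Properties;

public class HaJndiClientSketch {

    /**
     * Builds a JNDI environment that lists several HA-JNDI endpoints.
     * The naming client is expected to try the next entry in the
     * comma-separated provider URL when one of them is unreachable.
     */
    static Properties haJndiEnvironment(String... hostPorts) {
        Properties env = new Properties();
        env.setProperty("java.naming.factory.initial",
                "org.jnp.interfaces.NamingContextFactory");
        env.setProperty("java.naming.factory.url.pkgs",
                "org.jboss.naming:org.jnp.interfaces");
        env.setProperty("java.naming.provider.url",
                String.join(",", hostPorts));
        return env;
    }

    public static void main(String[] args) {
        // Hypothetical endpoints; adjust to the real cluster.
        Properties env = haJndiEnvironment("node1:1100", "node2:1100");
        System.out.println(env.getProperty("java.naming.provider.url"));
        // With the JBoss client jars on the classpath, the environment
        // would be passed on as: new javax.naming.InitialContext(env)
    }
}
```

      The sketch only builds and prints the settings, so it runs without JBoss on the classpath.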

      It looks as though both servers attempt to act as the master node when node1 is restarted (step 5), which would explain the "update affected 0 rows" database error. However, when I check the HASingletonDeployer in the JMX console on each server, only one of them is the MasterNode at any point in time.

      Any thoughts on this? It appears that both servers are attempting to access the same message(s).
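
      To make that suspicion concrete, here is a minimal sketch of the behaviour I think I am seeing. It is a simplified model only, not JBossMQ code: the MessageTable class, its methods, and the message id are all hypothetical stand-ins for the shared persistence table, and I am assuming (not asserting) that the real persistence manager complains when its "mark as deleted" update matches no row.

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleAckSketch {

    /** Stand-in for the shared message table: message id -> deleted flag. */
    static final class MessageTable {
        private final Map<Long, Boolean> deleted = new HashMap<>();

        synchronized void insert(long messageId) {
            deleted.put(messageId, false);
        }

        /**
         * Models an update that marks one message as deleted and
         * returns the number of rows it affected (1 or 0).
         */
        synchronized int markDeleted(long messageId) {
            Boolean flag = deleted.get(messageId);
            if (flag == null || flag) {
                return 0; // unknown message, or already marked by someone else
            }
            deleted.put(messageId, true);
            return 1;
        }
    }

    /** One node acknowledging a message against the shared table. */
    static String acknowledge(MessageTable table, String node, long messageId) {
        int rows = table.markDeleted(messageId);
        return rows == 1
                ? node + ": acknowledged message " + messageId
                : node + ": update affected 0 rows for message " + messageId;
    }

    public static void main(String[] args) {
        MessageTable shared = new MessageTable();
        shared.insert(42L);

        // Both nodes believe they own the same message.
        System.out.println(acknowledge(shared, "node2", 42L));
        System.out.println(acknowledge(shared, "node1", 42L));
    }
}
```

      In this model the first acknowledgement succeeds and the second one reports 0 affected rows, which matches the pattern in the traces above where only the restarted node logs errors.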

      Thanks for any help.