4 Replies Latest reply on Mar 14, 2006 8:03 AM by marklittle

    OBJECT_NOT_EXIST during recovery (Phorum)

    marklittle

      Author: burdeasa2
      Date: 09-28-04 20:23

      I already had JBoss 3.2.6 RC1 installed on my PC and working with our internal resource adapter. I added a sleep to our resource adapter code at the point where the Application Server should have logged that the transaction is prepared, but where we have not sent out the commit order to the remote system yet. I attempted two-phase commit transactions and killed the JBoss process. Then I restarted JBoss to test recovery.

      Since the default JBoss TM does not do logging, this test failed.

      So I installed Arjuna+JBoss 1.1.2 and repeated the same test.

      Now when I restart JBoss, I see the following in the server.log:

      2004-09-28 13:47:51,573 WARN [jacorb.poa] POA RootPOA rid: 2 opname: _is_a _invoke: object key not previously generated!
      2004-09-28 13:47:51,573 DEBUG [org.jboss.mx.modelmbean.ModelMBeanInvoker] No persistence-manager descriptor found, null persistence will be used
      2004-09-28 13:47:51,573 DEBUG [org.jboss.jmx.adaptor.snmp.agent.SnmpAgentService] It's for me: javax.management.MBeanServerNotification: notificationType=JMX.mbean.registered source=JMImplementation:type=MBeanServerDelegate seq-no=183 time=1096393671573 message=null objectName=jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB userData=null, handback:2147483647
      2004-09-28 13:47:51,573 ERROR [jacorb] org.omg.CORBA.OBJECT_NOT_EXIST: unknown oid vmcid: 0x0 minor code: 0 completed: No
      at org.jacorb.orb.giop.ServerRequestListener.deliverRequest(Unknown Source)
      at org.jacorb.orb.giop.ServerRequestListener.requestReceived(Unknown Source)
      at org.jacorb.orb.giop.GIOPConnection.receiveMessages(Unknown Source)
      at org.jacorb.orb.giop.MessageReceptor.doWork(Unknown Source)
      at org.jacorb.util.threadpool.ConsumerTie.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:536)

      2004-09-28 13:47:51,573 DEBUG [org.jboss.system.ServiceController] Creating service jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB
      2004-09-28 13:47:51,573 DEBUG [org.jboss.ejb.plugins.EntityInstancePool] Creating jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB
      2004-09-28 13:47:51,573 DEBUG [org.jboss.ejb.plugins.EntityInstancePool] Created jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB
      2004-09-28 13:47:51,573 DEBUG [org.jboss.management.j2ee.LocalJBossServerDomain] handleNotification: javax.management.Notification[source=jboss.system:service=ServiceController,type= org.jboss.system.ServiceMBean.create,sequenceNumber=127,timeStamp=1096393671573,message=null,userData=jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB]
      2004-09-28 13:47:51,573 DEBUG [org.jboss.management.j2ee.factory.DefaultManagedObjectFactoryMap] Failed to find factory for event: javax.management.Notification[source=jboss.system:service=ServiceController,type= org.jboss.system.ServiceMBean.create,sequenceNumber=127,timeStamp=1096393671573,message=null,userData=jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB]
      2004-09-28 13:47:51,573 DEBUG [org.jboss.system.ServiceController] Creating dependent components for: jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB dependents are: []
      2004-09-28 13:47:51,589 DEBUG [org.jboss.proxy.ejb.ProxyFactory] Proxy Factory for clustering/HTTPSession initialized
      2004-09-28 13:47:51,620 WARN [jacorb.poa] POA RootPOA rid: 0 opname: getCurrentStatus _invoke: object key not previously generated!
      2004-09-28 13:47:51,620 DEBUG [org.jboss.mx.modelmbean.ModelMBeanInvoker] No persistence-manager descriptor found, null persistence will be used
      2004-09-28 13:47:51,620 ERROR [jacorb] org.omg.CORBA.OBJECT_NOT_EXIST: unknown oid vmcid: 0x0 minor code: 0 completed: No
      at org.jacorb.orb.giop.ServerRequestListener.deliverRequest(Unknown Source)
      at org.jacorb.orb.giop.ServerRequestListener.requestReceived(Unknown Source)
      at org.jacorb.orb.giop.GIOPConnection.receiveMessages(Unknown Source)
      at org.jacorb.orb.giop.MessageReceptor.doWork(Unknown Source)
      at org.jacorb.util.threadpool.ConsumerTie.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:536)


      Is there something I should be configuring? I'm just using the default configuration set up by the installer.

      I looked through the documentation but did not see anything that solved my problem.

      In the Failure Recovery Guide, I saw the following but I don't know if this is something I need to configure:

      Lifespan policy - specifies the lifespan of the objects implemented in the POA. The
      lifespan policy can have the following values:
      * TRANSIENT (Default) Objects implemented in the POA cannot outlive the
      process in which they are first created. Once the POA is deactivated, an
      OBJECT_NOT_EXIST exception occurs when attempting to use any object
      references generated by the POA.
      * PERSISTENT Objects implemented in the POA can outlive the process in which
      they are first created.


      I have attached the complete server.log if that helps.

        • 1. Re:  OBJECT_NOT_EXIST during recovery (Phorum)
          marklittle

          Author: Mark Little
          Date: 09-28-04 21:53

          > I already had JBoss 3.2.6 RC1 installed on my PC and working
          > with our internal resource adapter. I added a sleep to our
          > resource adapter code at the point where the Application Server
          > should have logged that the transaction is prepared, but where
          > we have not sent out the commit order to the remote system yet.

          Presumably your resource adapter is wrapped in an XAResource? If so, the sleep occurs where? In the commit?

          > I attempted two-phase commit transactions and killed the JBoss
          > process.

          Hi. We haven't qualified A+J against the latest version of JBoss; the last version we qualify with is the GA release of 3.2.5. Although I don't believe there have been any significant changes to JBoss that should affect your results, would it be possible for you to revert to 3.2.5 for now?

          > Then I restarted JBoss to test recovery.
          >
          > Since the default JBoss TM does not do logging, this test
          > failed.
          >
          > So I installed Arjuna+JBoss 1.1.2 and repeated the same
          > test.
          >
          > Now when I restart JBoss, I see the following in the
          > server.log:
          >
          > 2004-09-28 13:47:51,573 WARN [jacorb.poa] POA RootPOA rid: 2
          > opname: _is_a _invoke: object key not previously generated!
          > 2004-09-28 13:47:51,573 DEBUG
          > [org.jboss.mx.modelmbean.ModelMBeanInvoker] No
          > persistence-manager descriptor found, null persistence will be
          > used
          > 2004-09-28 13:47:51,573 DEBUG
          > [org.jboss.jmx.adaptor.snmp.agent.SnmpAgentService] It's for
          > me: javax.management.MBeanServerNotification:
          > notificationType=JMX.mbean.registered
          > source=JMImplementation:type=MBeanServerDelegate seq-no=183
          > time=1096393671573 message=null
          > objectName=jboss.j2ee:jndiName=clustering/HTTPSession,plugin=pool,service=EJB
          > userData=null, handback:2147483647
          > 2004-09-28 13:47:51,573 ERROR [jacorb]
          > org.omg.CORBA.OBJECT_NOT_EXIST: unknown oid vmcid: 0x0 minor
          > code: 0 completed: No
          > at

          [Stuff deleted]

          You don't say which version of JTA you are using, but I assume it is the distributed one, i.e., the implementation based on the JTS. Otherwise I wouldn't expect OBJECT_NOT_EXIST.

          > Is there something I should be configuring? I'm just using
          > the default configuration setup by the installer.

          It's obviously difficult to check without a stand-alone use case, but there are failure scenarios where we would expect to see this exception. For example, because the JTS implementation is being used, each transaction participant is actually an OTS Resource (CosTransactions::Resource interface). These participants aren't recoverable in that an IOR for one isn't useable after the entity it points to fails. However, it is only IORs that the transaction coordinator has in its log when it decides to commit.

          If the participant VM fails, then obviously commit will fail and it's up to the recovery subsystem to take over. Recovery is driven top-down (coordinator-to-participant) and bottom-up (participant-to-coordinator). In the top-down case, the transaction log contains references that aren't useable initially, but the recovery system doesn't know this in this case - the end points could still be valid. So, it attempts to make an invocation on the Resource, which fails and JacORB throws the OBJECT_NOT_EXIST exception back. When bottom-up recovery runs, the newly recovered participant uses the CosTransactions::RecoveryCoordinator reference it obtained when it registered with the transaction originally. This allows it to pass a new reference to the transaction to replace the out-of-date reference. Recovery then happens.
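The ordering described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: ParticipantRef, CoordinatorLog and RecoveryOrderDemo are hypothetical names, not Arjuna or JacORB classes, and a stale IOR is modelled with a simple boolean flag rather than a real CORBA reference.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a CosTransactions::Resource reference (an IOR).
final class ParticipantRef {
    final String name;
    final boolean stale; // true if the process that created the reference has died

    ParticipantRef(String name, boolean stale) {
        this.name = name;
        this.stale = stale;
    }

    // Mimics the ORB rejecting a call on a reference it never generated.
    void commit() {
        if (stale) {
            throw new IllegalStateException("OBJECT_NOT_EXIST: " + name);
        }
    }
}

// Hypothetical coordinator log: transaction id -> participant reference.
final class CoordinatorLog {
    private final Map<String, ParticipantRef> entries = new HashMap<>();

    void record(String txId, ParticipantRef ref) {
        entries.put(txId, ref);
    }

    // Top-down pass: the coordinator tries whatever reference it has on file.
    boolean topDown(String txId) {
        try {
            entries.get(txId).commit();
            entries.remove(txId);
            return true;
        } catch (IllegalStateException e) {
            System.out.println("top-down failed: " + e.getMessage());
            return false;
        }
    }

    // Bottom-up pass: the recovered participant supplies a fresh reference,
    // the role replay_completion plays on the RecoveryCoordinator.
    void replayCompletion(String txId, ParticipantRef fresh) {
        entries.put(txId, fresh);
    }
}

class RecoveryOrderDemo {
    public static void main(String[] args) {
        CoordinatorLog log = new CoordinatorLog();
        log.record("tx-1", new ParticipantRef("resource@old-vm", true));

        System.out.println(log.topDown("tx-1")); // stale reference is rejected
        log.replayCompletion("tx-1", new ParticipantRef("resource@new-vm", false));
        System.out.println(log.topDown("tx-1")); // fresh reference is accepted
    }
}
```

The point of the sketch is that the first failure is expected: the coordinator cannot tell a dead end point from a live one until it tries, and it can only finish the transaction after the participant has replaced the old reference.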

          We have a stand-alone test case that illustrates this. When I'm in the office tomorrow I'll see if I can upload it to the forum.

          >
          > I looked through the documentation but did not see anything
          > that solved my problem.
          >
          > In the Failure Recovery Guide, I saw the following but I
          > don't know if this is something I need to configure:
          >
          > Lifespan policy - specifies the lifespan of the objects
          > implemented in the POA. The
          > lifespan policy can have the following values:
          > * TRANSIENT (Default) Objects implemented in the POA cannot
          > outlive the
          > process in which they are first created. Once the POA is
          > deactivated, an
          > OBJECT_NOT_EXIST exception occurs when attempting to use any
          > object
          > references generated by the POA.
          > * PERSISTENT Objects implemented in the POA can outlive the
          > process in which
          > they are first created.

          You shouldn't have to configure anything.

          > I have attached the complete server.log if that helps.

          Mark.

          • 2. Re:  OBJECT_NOT_EXIST during recovery (Phorum)
            marklittle

            Author: burdeasa2
            Date: 09-29-04 14:27

            Thanks for your quick reply!

            Let me try to clarify what I did as best as I can.

            First, I reran the test with JBoss 3.2.5 with the same result, as we both suspected.


            The scenario is that we have a client which uses a connection through one instance of the RA named "dtpra" to one EIS and a second connection through a second instance of the RA named "dtpra1" to a second EIS.

            The XAResource.commit method is called for "dtpra", but before it returns, I kill the entire JBoss process. So, the XAResource.commit for "dtpra1" is never called.

            Since the entire JBoss process is killed, the coordinator and both RA instances are gone.

            When I restart JBoss, I expected our XAResource.recover method to be called (using the top-down recovery you mentioned), but it is never called. Instead, I get the OBJECT_NOT_EXIST errors, so I assumed that this was preventing the recovery from happening.

            Since I saw the reference that said that the default "Lifespan Policy" for a POA is TRANSIENT, and the entire JBoss process went away, I thought that might be the cause. Therefore, I thought that maybe I needed to configure this to be "PERSISTENT".

            Of course, I don't know anything about POAs, so I could easily be way off base.

            The bottom line is that our XAResource.recover method is not called upon restart and I don't know why.


            Thanks again for your help.

            • 3. Re:  OBJECT_NOT_EXIST during recovery (Phorum)
              marklittle

              Author: Mark Little
              Date: 09-29-04 16:35

              Hi again.

              > Thanks for your quick reply!

              No problem.

              >
              > Let me try to clarify what I did as best as I can.
              >
              > First, I reran the test with JBoss 3.2.5 with the same
              > result, as we both suspected.

              That's good at least.

              >
              > The scenario is that we have a client which uses a connection
              > through one instance of the RA named "dtpra" to one EIS and a
              > second connection through a second instance of the RA named
              > "dtpra1" to a second EIS.
              >
              > The XAResource.commit method is called for "dtpra", but
              > before it returns, I kill the entire JBoss process. So, the
              > XAResource.commit for "dtpra1" is never called.
              >
              > Since the entire JBoss process is killed, the coordinator and
              > both RA instances are gone.
              >
              > When I restart JBoss, I expected our XAResource.recover
              > method to be called (using the top-down recovery you
              > mentioned), but it is never called. Instead, I get the
              > OBJECT_NOT_EXIST errors, so I assumed that this was preventing
              > the recovery from happening.
              >
              > Since I saw the reference that said that the default
              > "Lifespan Policy" for a POA is TRANSIENT, and the entire JBoss
              > process went away, I thought that may be the cause. Therefore,
              > I thought that maybe I need to configure this to be
              > "PERSISTENT".
              >
              > Of course, I don't know anything about POAs, so I could
              > easily be way off base.
              >
              > The bottom line is that our XAResource.recover method is not
              > called upon restart and I don't know why.

              I'll upload the test scenario I mentioned later today and include a description of how the recovery works. Most of this comes from the failure recovery guide, but it's always easier with a worked example.

              Cheers,

              Mark.

              • 4. Re:  OBJECT_NOT_EXIST during recovery (Phorum)
                marklittle

                Author: Mark Little
                Date: 09-29-04 20:47
                Attachment: EXAMPLE.ZIP (15k)

                Here's that example I promised, along with some text putting it into context (hopefully).

                Firstly, XAResource instances that are enrolled with a transaction can be either serializable or non-serializable. If they are serializable, it is assumed that their state is saved during commit and that the deserialized state may be used during recovery for that instance.

                If the state isn't serializable then you must provide an instance of XAConnectionRecovery that will be used to return an instance of XAResource (via XAConnection) that recovery can use to obtain a new reference for that RM.

                Based on customer requirements when we were part of
                Hewlett-Packard, there is a distinction made between the XAResource used
                during deserialization and that used for the recover method. I can't go into
                specific details, but certain customers wanted to make this distinction. So,
                the XAResource that you deserialize your state into will not have recover
                called on it.

                Secondly, and as a result of the first point, in order to get recover called
                you need to provide an XAConnectionRecovery instance. This returns an
                XAConnection and from that the recovery subsystem will obtain a new
                XAResource upon which it will call recover.

                The recovery system takes a few snapshots of the Xids that are in the back-end system. From these snapshots it eventually builds up a list of Xids that it believes it can recover, and it leaves the others alone.
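One way to picture the snapshot idea is as a filter over repeated scans. The sketch below is an assumption made for illustration, not the actual rule the recovery module applies: it treats an Xid as a recovery candidate only if it appears in every snapshot, on the reasoning that an Xid which disappears between scans belonged to a transaction that was still in flight. XidSnapshotDemo and the string ids are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class XidSnapshotDemo {
    // Keeps only the ids that are present in every snapshot.
    // Assumption: persistence across scans marks an id as in doubt.
    static Set<String> presentInAll(List<Set<String>> snapshots) {
        if (snapshots.isEmpty()) {
            return new HashSet<>();
        }
        Set<String> candidates = new HashSet<>(snapshots.get(0));
        for (Set<String> snapshot : snapshots.subList(1, snapshots.size())) {
            candidates.retainAll(snapshot);
        }
        return candidates;
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("xid-A", "xid-B"));
        Set<String> second = new HashSet<>(Arrays.asList("xid-A", "xid-C"));
        // xid-B finished between scans and xid-C is new, so only xid-A remains.
        System.out.println(presentInAll(Arrays.asList(first, second)));
    }
}
```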

                In no specific order, the example consists of:

                ExampleXAConnection - this is the XAConnection I mentioned before. In this
                example it returns an instance of the same XAResource that is used for
                deserialization; this is purely for the convenience of the example.

                ExampleXAConnectionRecovery - this is the class that the recovery subsystem
                uses to get a handle on an XAResource to recover. This is encapsulated by
                the ExampleXAConnection class (above). Although each
                ExampleXAConnectionRecovery instance can return multiple XAConnections, in
                this example we only need one. Hence the count variable. However, because
                recovery takes several passes over the Xids to determine which ones it can
                recover and which ones are just in flight, the instance of this class will
                be used many times. So, after hasMoreConnections returns false to end one
                phase, we reset the count so it can be used again.
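The count-and-reset behaviour described above can be sketched roughly as follows. ConnectionSource is a hypothetical interface modelled on this description; it is not the Arjuna XAConnectionRecovery interface, whose exact signatures are not shown in this thread, and a String stands in for the XAConnection.

```java
// Hypothetical interface, loosely modelled on the description in this post.
interface ConnectionSource<C> {
    boolean hasMoreConnections();
    C getConnection();
}

final class SingleConnectionSource implements ConnectionSource<String> {
    private static final int TOTAL = 1; // this sketch only ever offers one connection
    private int count = 0;

    @Override
    public boolean hasMoreConnections() {
        if (count < TOTAL) {
            return true;
        }
        count = 0; // end of this pass: reset so the next pass starts over
        return false;
    }

    @Override
    public String getConnection() {
        count++;
        return "connection-" + count;
    }
}

class CountResetDemo {
    // One recovery pass: drain the source until it reports no more connections.
    static int drain(ConnectionSource<String> source) {
        int seen = 0;
        while (source.hasMoreConnections()) {
            source.getConnection();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        ConnectionSource<String> source = new SingleConnectionSource();
        System.out.println(drain(source)); // first pass
        System.out.println(drain(source)); // second pass sees the connection again
    }
}
```

Without the reset, the second pass would see no connections at all, and recover would never be called again on later scans.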

                ExampleXAResource - the recover operation returns a list of Xids. In order
                to present to the recovery system an Xid to recover, we need to make sure it is in the list every time recover returns something. You can ignore the
                AtomicAction class - this is just a quick way for the example to get at some
                valid-looking Xids. But the toRecover instance is the Xid we expect to see
                rollback called on eventually.
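A minimal sketch of that recover behaviour, using only the standard javax.transaction.xa types, might look like this. SketchXid and SketchXAResource are hypothetical and are not the classes in the attachment; the Xid values are made up, and every operation other than recover and rollback is a no-op.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Minimal Xid implementation; the identifier values are made up.
final class SketchXid implements Xid {
    private final byte[] gtrid;
    private final byte[] bqual;

    SketchXid(String gtrid, String bqual) {
        this.gtrid = gtrid.getBytes(StandardCharsets.UTF_8);
        this.bqual = bqual.getBytes(StandardCharsets.UTF_8);
    }

    @Override public int getFormatId() { return 1; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return bqual.clone(); }
}

// Hypothetical resource, not the ExampleXAResource from the attachment.
final class SketchXAResource implements XAResource {
    private final Xid toRecover = new SketchXid("gtrid-1", "branch-1");
    private boolean resolved = false;

    // Reports the in-doubt Xid on every scan until it has been rolled back.
    @Override
    public Xid[] recover(int flag) {
        return resolved ? new Xid[0] : new Xid[] { toRecover };
    }

    @Override
    public void rollback(Xid xid) {
        if (sameXid(xid, toRecover)) {
            resolved = true;
        }
    }

    private static boolean sameXid(Xid a, Xid b) {
        return a.getFormatId() == b.getFormatId()
                && Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId())
                && Arrays.equals(a.getBranchQualifier(), b.getBranchQualifier());
    }

    // The remaining XAResource operations are no-ops in this sketch.
    @Override public void start(Xid xid, int flags) { }
    @Override public void end(Xid xid, int flags) { }
    @Override public int prepare(Xid xid) { return XA_OK; }
    @Override public void commit(Xid xid, boolean onePhase) { }
    @Override public void forget(Xid xid) { }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }
}
```

Because recovery scans several times before acting, a resource that reported the Xid only once would look as if the transaction had completed on its own.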

                If you run this example normally (java Test) and kill the VM while it
                sleeps, you'll see two states in the object store. Now if you
                re-run with the -recover option you'll see these states get recovered and
                then the ExampleXAConnectionRecovery instance is used to get the
                XAConnection and then the XAResource. You'll then see recover called on that
                a few times before rollback is eventually called.

                During recovery you should see the OBJECT_NOT_EXIST exception being reported. As I mentioned, this is because the top-down recovery is run and tries to use the stale CosTransactions::Resource reference. However, once bottom-up recovery runs and the replay_completion operation on the CosTransactions::RecoveryCoordinator is executed, top-down recovery can run again and this time will terminate the transaction correctly.

                BTW, when you run this example, you should make sure that you are using the JTS implementation of the JTA (the default) and remove the arjunacore version of the XARecoveryModule from your Arjuna properties file, so you are left with only the jts version.

                I hope this helps,

                Mark.