2 Replies Latest reply on Sep 11, 2004 11:44 PM by acoliver

    Logging/recovery and transaction distribution

    acoliver

      It is my opinion that we need to add transaction logging and recovery to the JB transaction manager.

      My understanding of transaction logging and recovery:

      The transaction manager would follow the two-phase commit protocol (we seem to kind of do that, but not completely). The transaction manager should log when an XAResource is enlisted, prepared, or committed in a transaction. If the node dies, the recovery process should read the log, instantiate all of the XAResources involved in the transaction, and tell them to commit or roll back the transaction. If no prepares were issued, everything rolls back. If some prepares were issued but not all, everything rolls back. If all prepares were issued or some commits were issued, it should commit.
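
      The decision rule above can be sketched as follows. This is only an illustration of the rule as stated in this post, not code from the JBoss transaction manager: the class name RecoveryDecision, the State and Outcome enums, and the idea of passing in "the last logged state of each XAResource" are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch: none of these names exist in JBoss.
final class RecoveryDecision {

    // Last state the (assumed) log recorded for one XAResource in a transaction.
    enum State { ENLISTED, PREPARED, COMMITTED }

    enum Outcome { COMMIT, ROLLBACK }

    // Applies the rule from the post:
    //   no prepares, or only some prepares   -> roll back
    //   all prepares, or at least one commit -> commit
    static Outcome decide(Collection<State> lastLoggedStatePerResource) {
        boolean anyCommitted = false;
        // An empty transaction has nothing prepared, so it rolls back.
        boolean allPrepared = !lastLoggedStatePerResource.isEmpty();
        for (State s : lastLoggedStatePerResource) {
            if (s == State.COMMITTED) {
                anyCommitted = true;
            }
            if (s == State.ENLISTED) {
                // This resource never got as far as prepare.
                allPrepared = false;
            }
        }
        return (anyCommitted || allPrepared) ? Outcome.COMMIT : Outcome.ROLLBACK;
    }

    public static void main(String[] args) {
        System.out.println(decide(Arrays.asList(State.ENLISTED, State.ENLISTED)));
        System.out.println(decide(Arrays.asList(State.PREPARED, State.ENLISTED)));
        System.out.println(decide(Arrays.asList(State.PREPARED, State.PREPARED)));
        System.out.println(decide(Arrays.asList(State.COMMITTED, State.PREPARED)));
    }
}
```

      A resource whose last logged state is COMMITTED is counted as having been prepared, since a commit can only follow a prepare in two-phase commit.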

      My understanding of distributed transactions:

      Again, we must follow the two-phase commit protocol. When a transaction is passed from one JBoss node to the next, the transaction managers need to communicate, and one transaction manager should be the slave, the other the master. (For now, let's assume nothing more complex than A talking to B.) This could complicate recovery, but for now we should just always roll back.

      What am I missing?

      I started prototyping this a while back and ran into a couple of relatively minor snags. First, our transaction manager needs to use the full XID instead of only half of it. A server ID needs to be based on the cluster ID in the event of a cluster. Some general refactoring may need to take place to deal with the division of labor between the classes (IIRC, things that needed to be logged seemed to be spread across a couple of classes).

        • 1. Re: Here's a tutorial I wrote for Jboss-IDE
          acoliver

          Hi,

          I think there is a little misunderstanding. Hans, you think that poinsarx starts jboss from eclipse, but I think he starts it totally by hand on a command line.

          I tried to start JBoss in debug mode from Eclipse M5, using a run configuration with the following properties line:

          /opt/j2sdk1.4.1_01/bin/java -DJBOSS_HOME=<absolute path to my JBoss area>/jboss-src/jboss-3.2/build/output/jboss-3.2.0RC3 -Dprogram.name=run.sh -Xbootclasspath:/opt/j2sdk1.4.1_01/jre/lib/rt.jar:/opt/j2sdk1.4.1_01/jre/lib/sunrsasign.jar:/opt/j2sdk1.4.1_01/jre/lib/jsse.jar:/opt/j2sdk1.4.1_01/jre/lib/jce.jar:/opt/j2sdk1.4.1_01/jre/lib/charsets.jar:/opt/j2sdk1.4.1_01/jre/lib/ext/sunjce_provider.jar:/opt/j2sdk1.4.1_01/jre/lib/ext/dnsns.jar:/opt/j2sdk1.4.1_01/jre/lib/ext/localedata.jar:/opt/j2sdk1.4.1_01/jre/lib/ext/ldapsec.jar -classpath <path to my JBoss area>/jboss-src/jboss-3.2/build/output/jboss-3.2.0RC3/bin/run.jar:/opt/j2sdk1.4.1_01/lib/tools.jar -Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,suspend=y,address=localhost:7301 org.jboss.Main

          With a run configuration that generates this startup line, JBoss starts up and runs the 'default' config in debug mode, and I was able to request the jmx-console successfully.

          I hope this helps.

          Bernd

          • 2. Re: Logging/recovery and transaction distribution

            Your misunderstanding is that we don't need to log each XAResource (branch Xid)
            in each transaction.

            All we need to do is log the global XIDs that have been prepared and those that
            have been committed.

            We need a separate list of 'XAResources'.

            When we recover, we ask each XAResource what XIDs it knows about and compare (ignoring the branch).
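
            A minimal sketch of that comparison, assuming the log has already been read into two sets of global ids. Xid is the standard javax.transaction.xa interface, and the Xid[] argument stands for whatever a call to XAResource.recover(int) returned. Everything else (GlobalXidMatcher, GlobalId, SimpleXid, LogStatus) is a hypothetical name made up for the example, and what to do with each status is left to the caller.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.transaction.xa.Xid;

// Hypothetical sketch: matches Xids reported by a resource against logged global ids.
final class GlobalXidMatcher {

    enum LogStatus { LOGGED_COMMITTED, LOGGED_PREPARED, NOT_LOGGED }

    // Key made of format id + global transaction id; the branch qualifier is left out on purpose.
    static final class GlobalId {
        private final int formatId;
        private final byte[] gtrid;

        GlobalId(int formatId, byte[] gtrid) {
            this.formatId = formatId;
            this.gtrid = gtrid.clone();
        }

        static GlobalId of(Xid xid) {
            return new GlobalId(xid.getFormatId(), xid.getGlobalTransactionId());
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof GlobalId)) return false;
            GlobalId other = (GlobalId) o;
            return formatId == other.formatId && Arrays.equals(gtrid, other.gtrid);
        }

        @Override public int hashCode() {
            return 31 * formatId + Arrays.hashCode(gtrid);
        }
    }

    // Bare-bones Xid used only to exercise the sketch.
    static final class SimpleXid implements Xid {
        private final int formatId;
        private final byte[] gtrid;
        private final byte[] bqual;

        SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
            this.formatId = formatId;
            this.gtrid = gtrid.clone();
            this.bqual = bqual.clone();
        }

        @Override public int getFormatId() { return formatId; }
        @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        @Override public byte[] getBranchQualifier() { return bqual.clone(); }
    }

    // For each Xid a resource reported, look up its global id in the logged sets.
    static Map<Xid, LogStatus> classify(Xid[] reported,
                                        Set<GlobalId> loggedPrepared,
                                        Set<GlobalId> loggedCommitted) {
        Map<Xid, LogStatus> result = new LinkedHashMap<>();
        for (Xid xid : reported) {
            GlobalId id = GlobalId.of(xid);
            if (loggedCommitted.contains(id)) {
                result.put(xid, LogStatus.LOGGED_COMMITTED);
            } else if (loggedPrepared.contains(id)) {
                result.put(xid, LogStatus.LOGGED_PREPARED);
            } else {
                result.put(xid, LogStatus.NOT_LOGGED);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<GlobalId> prepared = new HashSet<>();
        Set<GlobalId> committed = new HashSet<>();
        prepared.add(new GlobalId(1, new byte[] {1}));
        committed.add(new GlobalId(1, new byte[] {1}));
        // Two branches of the same global transaction, as two resources might report them.
        Xid[] reported = {
            new SimpleXid(1, new byte[] {1}, new byte[] {7}),
            new SimpleXid(1, new byte[] {1}, new byte[] {8})
        };
        System.out.println(classify(reported, prepared, committed).values());
    }
}
```

            Because GlobalId ignores the branch qualifier, two branches of the same global transaction get the same status no matter which XAResource reported them.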

            For our own XAResources, we can just log the ObjectName of the
            ManagedConnectionFactory provided by jca.
            For other people's XAResources (e.g. DTM), we need a handle/proxy
            (serialized in a persistent location) that lets us reconnect to that server.

            The downside to this approach, however, is that we cannot do a partial recovery.
            We must have all XAResources available (including remote servers
            or an alternate server with the failed server's recovery log) at recovery time.

            With your approach, it would be possible to do a partial recovery without all
            XAResources available. If an XAResource that is not available didn't take part in a
            transaction we could recover this transaction.

            The downside of your approach is that you do a lot more logging.

            In practice, with either approach, a failure to reconnect to an XAResource
            probably means you'll have to fall back to heuristics, unless you can afford not
            to reboot the server.