6 Replies Latest reply on Feb 22, 2005 11:22 AM by sheckler

    HAJMS and Oracle TAF

    sheckler

      Hi all,

      I am running HAJMS (with oracle-jdbc2-service) on JBoss 3.2.6. The Oracle server is in fact a Real Application Cluster (RAC) with Transparent Application Failover (TAF).

      During a TAF failover there may be a short period in which no Oracle instance is available, and errors such as

      ORA-01089: immediate shutdown in progress - no operations are permitted
      ORA-03113: end-of-file on communication channel

      may occur.

      Write transactions are normally rolled back once the failover has finished or the JBoss transaction timeout expires. Read transactions simply continue (this is handled by the Oracle OCI driver).

      How will HAJMS (e.g. a MDB) behave in this case? Will it enter a reconnection loop until oracle is available again?


      Can I configure a separate transaction timeout for JMS?

      Are there special settings in oracle-ds.xml when it is used by JMS?


      Thank You all
      Stefan



      PS: I am using Oracle 9.2 and ojdbc14.jar driver


      oracle-jms-ds.xml

      <datasources>
      <local-tx-datasource>
      <jndi-name>OracleJMSDS</jndi-name>
      <connection-url>jdbc:oracle:oci8:@TNSNAMES-ENTRY</connection-url>
      <driver-class>oracle.jdbc.driver.OracleDriver</driver-class>
      <security-domain>OracleDbRealm</security-domain>

      <!-- Uses the pingDatabase method to check a connection is still valid before handing it out from the pool -->
      <!--valid-connection-checker-class-name>org.jboss.resource.adapter.jdbc.vendor.OracleValidConnectionChecker</valid-connection-checker-class-name-->

      <!-- sql statement that is executed before it is checked out from the pool to make sure it is still valid. If the sql fails,
      the connection is closed and new ones created. -->
      <check-valid-connection-sql>select table_name from all_tables where owner = 'RWCO_B'</check-valid-connection-sql>

      <!-- an sql statement that is executed against each new connection -->
      <new-connection-sql>select table_name from all_tables where owner = 'RWCO_B'</new-connection-sql>

      <!-- a class that looks at vendor specific message to determine whether sql errors are fatal -->
      <exception-sorter-class-name>org.jboss.resource.adapter.jdbc.vendor.OracleExceptionSorter</exception-sorter-class-name>

      <!-- whether to monitor for unclosed Statements and ResultSets and issue warnings when the user forgets to close them (default false)
      From 3.2.6 track-statements has a new option <track-statements>nowarn</track-statements> which closes Statements and ResultSets
      without a warning. It is also the new default value.
      -->
      <track-statements>nowarn</track-statements>

      <!-- whether to enable query timeout based on the length of time remaining until the transaction times out (default false) -->
      <set-tx-query-timeout>false</set-tx-query-timeout>

      <!-- the number of prepared statements per connection to be kept open and reused in subsequent requests.
      They are stored in a LRU cache. The default is 0 (zero), meaning no cache. -->
      <!--prepared-statement-cache-size>100</prepared-statement-cache-size-->

      <!--pooling parameters-->
      <min-pool-size>1</min-pool-size>
      <max-pool-size>10</max-pool-size>
      <blocking-timeout-millis>30000</blocking-timeout-millis>
      <idle-timeout-minutes>10</idle-timeout-minutes>

      <!--the default transaction isolation of the connection (unspecified means use the default provided by the database):
      TRANSACTION_READ_UNCOMMITTED
      TRANSACTION_READ_COMMITTED
      TRANSACTION_REPEATABLE_READ
      TRANSACTION_SERIALIZABLE
      TRANSACTION_NONE
      -->
      <!--transaction-isolation> </transaction-isolation-->

      <!--metadata/typemapping> - a pointer to the type mapping in conf/standardjbosscmp.xml (from JBoss4) -->


      </local-tx-datasource>
      </datasources>
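
      A possible lighter variant of the connection checks above, as a sketch only. The two queries against all_tables run before every checkout and on every new connection; a trivial query such as select 1 from dual is a common choice for this purpose. That query is an assumption and is not taken from the configuration above. All element names are the ones already used there.

      ```xml
      <datasources>
        <local-tx-datasource>
          <jndi-name>OracleJMSDS</jndi-name>
          <connection-url>jdbc:oracle:oci8:@TNSNAMES-ENTRY</connection-url>
          <driver-class>oracle.jdbc.driver.OracleDriver</driver-class>
          <security-domain>OracleDbRealm</security-domain>

          <!-- assumption: a trivial query instead of the all_tables lookup -->
          <check-valid-connection-sql>select 1 from dual</check-valid-connection-sql>
          <new-connection-sql>select 1 from dual</new-connection-sql>

          <!-- unchanged: lets the pool discard connections after fatal ORA errors -->
          <exception-sorter-class-name>org.jboss.resource.adapter.jdbc.vendor.OracleExceptionSorter</exception-sorter-class-name>
        </local-tx-datasource>
      </datasources>
      ```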


        • 1. Re: HAJMS and Oracle TAF

          As far as JBossMQ is concerned (HAJMS considerations are irrelevant to your question)
          any send or receive that requires persistence/transactions will fail during the DB failover.

          Senders will receive an exception, receivers will have the message redelivered.

          MDBs will not reconnect since there is no connection failure between the client (MDB)
          and server (JBossMQ), only between the server and the DB.

          JBossMQ does not support a transaction timeout beyond that provided by JTA.

          The answers to your other questions can be found in the JCA FAQ.
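
          For reference, the JTA timeout mentioned above is a server-wide setting. A minimal sketch, assuming the stock conf/jboss-service.xml layout of JBoss 3.2.x; the value 300 is only an illustration:

          ```xml
          <mbean code="org.jboss.tm.TransactionManagerService"
                 name="jboss:service=TransactionManager">
            <!-- default timeout in seconds for JTA transactions; JMS work
                 done inside such a transaction is covered by the same value -->
            <attribute name="TransactionTimeout">300</attribute>
          </mbean>
          ```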

          • 2. Re: HAJMS and Oracle TAF
            sheckler

            Thank You Adrian,

            To verify this I tried the following:
            running JBoss (3.2.7 by now) on my PC with Oracle on a remote server and the following configuration:

            datasource for JMS
            ..
            <blocking-timeout-millis>300000</blocking-timeout-millis>
            ..



            jboss-service.xml

            ..
            <attribute name="TransactionTimeout">600</attribute>
            ..


            after disconnecting my pc from the lan for about one minute I saw the following
            message several times:
            10:56:07,359 INFO [JMSContainerInvoker] Reconnected to JMS provider
            and everything recovers.
            The message comes from MDBs, doesn't it?

            If the database connection is away for longer than the configured datasource blocking-timeout for JMS, JBoss doesn't come back.

            If the fault tolerance of JMS (meaning: the server runs consistently and stably again after a database failover) depends on the blocking-timeout of the datasource JMS uses (so the blocking-timeout should be at least as long as the database failover can last), that is fine for me.

            Stefan



            • 3. Re: HAJMS and Oracle TAF

              The blocking timeout is just how long the
              DataSource.getConnection() waits when all database connections are in use.
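
              As a sketch of where that setting lives, using the pool elements from the datasource posted at the top of the thread (the values are illustrative):

              ```xml
              <local-tx-datasource>
                <jndi-name>OracleJMSDS</jndi-name>
                <connection-url>jdbc:oracle:oci8:@TNSNAMES-ENTRY</connection-url>
                <driver-class>oracle.jdbc.driver.OracleDriver</driver-class>

                <!-- at most 10 connections are handed out at once -->
                <max-pool-size>10</max-pool-size>
                <!-- a caller of getConnection() waits up to 30 s for a free
                     connection when all 10 are in use, then gets an exception;
                     this does not bound how long a database call may hang -->
                <blocking-timeout-millis>30000</blocking-timeout-millis>
              </local-tx-datasource>
              ```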

              What does this mean?


              If the database connection is away for longer than the configured datasource blocking-timeout for JMS, JBoss doesn't come back.


              I don't see the relevance to database failover.

              If JBossMQ cannot get a JDBC connection
              it will propagate the exception to the client, in one of two ways:
              1) Cannot send message - in this case the message was never sent and the client
              knows this
              2) Cannot acknowledge message - in this case the message was never received;
              the message is not lost, because only a successful acknowledge deletes the message,
              and it will be redelivered
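
              How often a message is redelivered to an MDB before it is moved aside is configurable on the MDB container. A sketch, assuming the MDBConfig section of the message-driven-bean invoker-proxy-binding in a stock standardjboss.xml; the element names and values are quoted from memory and should be checked against your own file:

              ```xml
              <MDBConfig>
                <!-- seconds between attempts of the MDB container to
                     reconnect to the JMS provider after a connection failure -->
                <ReconnectIntervalSec>10</ReconnectIntervalSec>
                <DLQConfig>
                  <!-- messages redelivered more than MaxTimesRedelivered times
                       are sent to this queue instead of being delivered again -->
                  <DestinationQueue>queue/DLQ</DestinationQueue>
                  <MaxTimesRedelivered>10</MaxTimesRedelivered>
                  <TimeToLive>0</TimeToLive>
                </DLQConfig>
              </MDBConfig>
              ```

              With settings like these, a long database outage could push repeatedly redelivered messages to the DLQ, so the limit may need to match the expected failover time.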

              If you are seeing something else show me the logging (READ THIS FIRST)
              and describe the behaviour.

              I am usually not very tolerant of posts like this that say "IT DOES NOT WORK".
              What it usually means is the poster "DOES NOT KNOW HOW IT IS SUPPOSED TO WORK".
              But in this case I sniff a bug report if I correctly understand your "JBoss doesn't come back". Although what this "JBoss" is that doesn't come back I can only guess?
              Where did this "JBoss" go in the first place? Do you mean the MDB, the JMS server,
              the sender, the database connection or what?

              • 4. Re: HAJMS and Oracle TAF
                sheckler

                Hello Adrian,
                thanks for Your reply.

                It is quite difficult to explain in a few words what (I think) is going on. We are developing a workforce management system that communicates with an external network control system (electricity).

                We need high availability.

                Therefore we use JBoss clustering and HAJMS and oracle RAC/TAF.

                As an example: we can interchange information with mobile units (cars) over GPRS or GSM via the network control system.

                Car --> network control system -> JBoss

                This interface between the network control system and JBoss is implemented as an MBean using NIO and HAJMS to get information from the car to the application server. This MBean service is a cluster singleton service. In a test environment, an update from the car arrives via socket every 10 seconds and is put into an HA queue. An MDB processes this information and triggers some business logic. I can see the log message every 10 seconds. If the message cannot be put into the queue, the send is retried several times.

                And now I come to the point. "JBoss comes back" means I can see these log messages again and JBoss is ready for clients to connect (after the database was unplugged for about one minute). No exceptions are seen afterwards.

                "JBoss doesn't come back" means I don't see the log messages any longer, clients cannot connect (the JBoss console is dead) and I can see exceptions (after the database was unplugged for more than 10 minutes, which is the blocking timeout). The last exception is

                10:15:09,218 WARN [LocalManagedConnectionFactory] Destroying connection that is not valid, due to the following exception:
                java.sql.SQLException: ORA-12571: TNS: Fehler beim Paket-Schreiber
                (ORA-12571 in English: "TNS:packet writer failure".) Then nothing more happens. The server does not react to a shutdown request.


                I thought I had empirically found a relation between the blocking timeout of the JMS datasource and how long I can cut off the database connection without killing the server completely.

                I saw HAJMS failover messages from the container as well as from my own classes while the database was unplugged.

                Please excuse my English
                Stefan








                • 5. Re: HAJMS and Oracle TAF

                  I don't want to know what you "*think* is going on".
                  I want to see what *is* going on from the logging.

                  "JBoss console is dead" .
                  If you mean the jmx console, this is just Tomcat with no interaction with the database.
                  If I understand you correctly, this probably means your JVM has crashed
                  (assuming no funny network interactions).

                  I'd guess (and it is a guess) that you are using OCI which is native code.
                  And that this native code has caused the jvm to crash due to a bug????

                  Trying to take a thread dump as explained in "READ THIS FIRST"
                  will tell you whether the jvm is still active and what it is doing.

                  If I don't see something more concrete about the problem in your future posts
                  I will be ignoring this thread.

                  • 6. Re: HAJMS and Oracle TAF
                    sheckler

                    Yes, I am using OCI.
                    Since I changed to the newest version of the OCI driver (from 9.2.0.2 to 9.2.0.5 of Oracle 9i), I have not been able to reproduce any of the behaviour described.
                    The differences in the test results are definitely caused by the OCI driver.