4 Replies Latest reply on Aug 30, 2013 3:40 PM by jobame

    Problem with analysing transaction locks

    jobame

      Hello

       

      We use JBoss-as-7.1.1.Final and do have transaction errors. It starts with the below log messages, continues through

      • ARJUNA016063: The transaction is not active!
      • java.sql.SQLException: pingDatabase failed status=-2
      • ARJUNA016051: thread is already associated with a transaction!
      • Lock owned during cleanup: java.lang.Throwable: Lock owned during cleanup
      • Connection is not associated with a managed connection.org.jboss.jca.adapters.jdbc.jdk6.WrappedConnectionJDK6@363f4446
      • java.lang.OutOfMemoryError: Java heap space

      and ends up with a "ORA-01591: lock held by in-doubt distributed transaction 9.17.233860" which is pretty bad. See attached log for full details.

       

      How is it technically possibly to analyse where the error occurs? At the start of the problem log statements there is no link to the code of the application. How can we work on it?

       

      22.08.2013 18:35:01,063 WARN  [TransactionReaper.java:check:148] ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f in state  RUN

      22.08.2013 18:35:02,789 WARN  [TransactionReaper.java:check:148] ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f in state  CANCEL

      22.08.2013 18:35:04,430 WARN  [TransactionReaper.java:check:148] ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f in state  CANCEL_INTERRUPTED

      22.08.2013 18:35:04,434 WARN  [TransactionReaper.java:check:254] ARJUNA012120: TransactionReaper::check worker Thread[Transaction Reaper Worker 0,5,main] not responding to interrupt when cancelling TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f -- worker marked as zombie and TX scheduled for mark-as-rollback

      22.08.2013 18:35:04,435 WARN  [TransactionReaper.java:check:270] ARJUNA012110: TransactionReaper::check successfuly marked TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f as rollback only

      22.08.2013 18:35:04,436 WARN  [TransactionReaper.java:doCancellations:406] ARJUNA012113: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] missed interrupt when cancelling TX 0:ffff0a045150:-5eacabf6:5214c882:51f15f -- exiting as zombie (zombie count decremented to 0)

      22.08.2013 18:35:04,435 INFO  [ClusterGuard.java:ensureOneProcessInCluster:49] db - entry created ClusterControl [id=34275, owner=DebitPositionCalculation, startDate=22.08.2013 18:35:02, hostname=chetan001.unix.live.local]

      22.08.2013 18:35:04,456 INFO  [DebitPositionRunTrigger.java:triggerDebitPositions:63] ... starting DPC

      22.08.2013 18:35:04,458 INFO  [DebitPositionRunTrigger.java:calculateDebitPositions:90] Retrieved 0 AccountState candidates for calculation of debit positions from database.

      22.08.2013 18:36:35,725 WARN  [BasicAction.java:Abort:1588] ARJUNA012077: Abort called on already aborted atomic action 0:ffff0a045150:-5eacabf6:5214c882:51f15f

      22.08.2013 18:36:35,728 ERROR [TimerTask.java:run:137] JBAS014120: Error invoking timeout for timer: [id=1b8444f9-0dd2-4bf0-b96c-49f5330256fe timedObjectId=ab23TurnerEAR.ab23TurnerEJB.AccountStateTimer auto-timer?:true persistent?:false timerService=org.jboss.as.ejb3.timerservice.TimerServiceImpl@12efc0ee initialExpiration=Wed Aug 21 00:00:00 CEST 2013 intervalDuration(in milli sec)=0 nextExpiration=Thu Aug 22 18:40:00 CEST 2013 timerState=IN_TIMEOUT: javax.ejb.EJBTransactionRolledbackException: Transaction rolled back

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleEndTransactionException(CMTTxInterceptor.java:115) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.endTransaction(CMTTxInterceptor.java:95) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.tx.TimerCMTTxInterceptor.endTransaction(TimerCMTTxInterceptor.java:66) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInOurTx(CMTTxInterceptor.java:232) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.required(CMTTxInterceptor.java:304) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:190) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.as.ejb3.component.interceptors.AdditionalSetupInterceptor.processInvocation(AdditionalSetupInterceptor.java:43) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]

        at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:101) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.timerservice.task.CalendarTimerTask.callTimeout(CalendarTimerTask.java:60) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at org.jboss.as.ejb3.timerservice.task.TimerTask.run(TimerTask.java:132) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_37]

        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_37]

        at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_37]

        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_37]

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_37]

        at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_37]

        at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.0.0.GA.jar:2.0.0.GA]

      Caused by: javax.transaction.RollbackException: ARJUNA016063: The transaction is not active!

        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1155) [jbossjts-4.16.2.Final.jar:]

        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:117) [jbossjts-4.16.2.Final.jar:]

        at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)

        at org.jboss.as.ejb3.tx.CMTTxInterceptor.endTransaction(CMTTxInterceptor.java:92) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]

        ... 24 more

       

      Edit: since we don't know when the problem does occur we can't set the log level to TRACE for a long time. The application was running without problems for weeks.

      We use two timer triggered methods but we make sure only one of them can run at a time. So if one starts and the other has not finished yet then the other cannot start.

       

      Regards,

       

      Dominik

        • 1. Re: Problem with analysing transaction locks
          smarlow

          "ARJUNA016051: thread is already associated with a transaction" and some related bugs are fixed upstream (in WildFly 8) as well as in EAP 6.1.  If your already running your application in production, EAP is best for that.


          How is it technically possibly to analyse where the error occurs? At the start of the problem log statements there is no link to the code of the application. How can we work on it?

           

          While it is possible for application logic to cause some Exceptions to be discarded (which makes it harder to analyse), I think that the problem is mostly not caused by your application. 

          • 2. Re: Problem with analysing transaction locks
            sfcoy

            Scott Marlow may well be right, but my first instinct says that you have a bunch of long running transactions, possibly of the Select N+1 Problem – How to Decrease Your ORM Performance variety. The fact that you are also running out of memory contributes to my opinion that you have a bunch of transactions running that are all loading a significant chunk of your database into memory. Chances are your database has grown over the last few weeks and it has reached the point where it takes too long for your queries to complete.

             

            Unfortunately, the log you provided is not a standard jboss server log, which includes the thread id in each log message. Having this information in the log makes it easier to work out what is going on.

            • 3. Re: Problem with analysing transaction locks
              smarlow

              scoy is definitely correct in that you are hitting an (application) error situation first and if you avoid that, that would also work around the problem (so that the JTA transaction doesn't have to timeout). 

               

              WFLY-1346 is one the jiras that are fixed upstream in WildFly 8 and could be made available to you if you are/become an EAP customer.

              • 4. Re: Problem with analysing transaction locks
                jobame

                Many thanks for the additional hints. Next thing on Monday will be to include the thread into the logging statements.

                 

                The application basically is triggered by a timer service. When it is triggered, it fetches with a single statement (with no joins in the query at all) a number of entites and for each found entity it sends a message into a queue. A message driven bean then retrieves and processes each message. Each message runs in its own transaction and the queries being used should not cause a N+1 problem (especially based on the small amount of data - see below).

                 

                Regarding the database: there are only around 600 "customers" and the load was almost not changing during the last 3 months. Considering the relationships of the domain model and the size of tupels resulting from one-to-many relationship queries (based on the concrete data) it is very unrealistic that a N+1 problem could result. At the time of the exception being thrown there was almost nothing to do by the application/database/host. Anyways, I will further analyse the application apart from changing the logging.

                 

                Is there a possibilty that the database for some reason did not react during a certain amount of time which caused a transaction to time out or that some other activity on database side can cause such a problem? Would the server not be able to handle a transaction timeout? Is it possible to use JMS without the driver for distributed transactions in JBoss?