20 Replies. Latest reply on Oct 4, 2011 2:44 PM by clebert.suconic

Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 22, 2011 2:55 PM

Hi!

We have a standalone HornetQ 2.2.5 installation on Ubuntu Server: one active server and one backup, with a static connection between them. The shared store is a SAN mounted on Ubuntu as ocfs2. NIO is used for connections, and the journal is AIO.

Two instances of the same application (servlet + ssb on clustered JBoss via JCA) send 1-2K messages to one paged queue. Both connection pools grow to 50 connections each.

The queue has an MDB consumer (10 sessions) and also a non-exclusive divert to another queue (the second queue's paging policy is drop). There are no consumers on the second queue yet.

The producers send messages at 300-700 msg/sec, and the MDB consumes them. Everything seems fine, but:

In the logs we see a lot of messages like this:

[Thread-3 (group:HornetQ-scheduled-threads-22262475)] 19:21:18,874 WARNING [org.hornetq.core.transaction.impl.ResourceManagerImpl] transaction with xid XidImpl (1973652184 bq:55.102.48.48.48.48.48.49.58.57.53.100.52.58.52.101.55.98.49.100.101.55.58.52.99.49.49.100.52 formatID:131075 gtxid:49.45.55.102.48.48.48.48.48.49.58.57.53.100.52.58.52.101.55.98.49.100.101.55.58.52.99.49.49.56.55 timed out

Sometimes messages like this:

[Thread-3 (group:HornetQ-scheduled-threads-22262475)] 19:21:18,875 SEVERE [org.hornetq.core.transaction.impl.ResourceManagerImpl] failed to timeout transaction, xid:XidImpl (2021427911 bq:55.102.48.48.48.48.48.49.58.57.53.100.52.58.52.101.55.98.49.100.101.55.58.52.99.49.49.52.97 formatID:131075 gtxid:49.45.55.102.48.48.48.48.48.49.58.57.53.100.52.58.52.101.55.98.49.100.101.55.58.52.99.48.102.50.51

java.lang.IllegalStateException: Transaction is in invalid state SUSPENDED
          at org.hornetq.core.transaction.impl.TransactionImpl.rollback(TransactionImpl.java:345)
          at org.hornetq.core.transaction.impl.ResourceManagerImpl$TxTimeoutHandler.run(ResourceManagerImpl.java:228)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
          at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)

And sometimes this:

[New I/O server worker #1-34] 19:30:53,650 WARNING [org.hornetq.core.protocol.core.impl.HornetQPacketHandler] Reattach request from /192.168.168.197:53634 failed as there is no confirmationWindowSize configured, which may be ok for your system

But the confirmation window size is configured on the ConnectionFactory in hornetq-jms.xml (3 MB).

After several hours, or 1-2 days, the server gets stuck, and all consumers and producers get stuck as well. There is no failover, and a restart doesn't help. But if I stop the server, clear the shared store (remove the journal and pages) and start it again, consumers and producers are able to reconnect and the system starts working again.

Could you help? What could the problem be?
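
A side note on reading the XIDs in the log lines above: the bq and gtxid fields look like dot-separated decimal byte values, and in this thread they appear to be plain ASCII. The sketch below decodes one of them, on the assumption that every byte is printable ASCII (which may not hold for other transaction managers). The class name XidDecoder is made up for illustration and is not part of HornetQ.

```java
import java.nio.charset.StandardCharsets;

public class XidDecoder {

    /** Converts a dotted list of decimal byte values, e.g. "49.45.55.102", to text. */
    public static String decode(String dotted) {
        String[] parts = dotted.trim().split("\\.");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Each part is one byte printed as a decimal number.
            bytes[i] = (byte) Integer.parseInt(parts[i]);
        }
        // Assumption: the bytes are ASCII characters.
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // gtxid copied from the first "timed out" warning in this thread.
        String gtxid = "49.45.55.102.48.48.48.48.48.49.58.57.53.100.52.58.52.101."
                + "55.98.49.100.101.55.58.52.99.49.49.56.55";
        System.out.println(decode(gtxid)); // prints 1-7f000001:95d4:4e7b1de7:4c1187
    }
}
```

The decoded form may be easier to match against the transaction manager's own logs on the JBoss side than the raw byte list.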

1. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 22, 2011 9:23 PM (in response to didka)

You should configure recovery so that it looks up XIDs on the remote standalone server.

You should add the recovery configuration on the JBoss Application Server.
2. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 23, 2011 12:57 AM (in response to clebert.suconic)

I added the transaction property to all JBoss instances working with HornetQ, as described in the documentation:

<properties depends="arjuna" name="jta">
        <property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.JBMESSAGING1"
                  value="org.jboss.jms.server.recovery.MessagingXAResourceRecovery;java:/DefaultJMSProvider"/>
        <property name="com.arjuna.ats.jta.supportSubtransactions" value="NO"/>
        <property name="com.arjuna.ats.jta.jtaTMImplementation" value="com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionManagerImple"/>
        <property name="com.arjuna.ats.jta.jtaUTImplementation" value="com.arjuna.ats.internal.jta.transaction.arjunacore.UserTransactionImple"/>
        <property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.HORNETQ1"
                  value="org.hornetq.jms.server.recovery.HornetQXAResourceRecovery;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.16,port=5445;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.17,port=5445"/>
      </properties>

This warning is still present:

[Thread-4 (group:HornetQ-scheduled-threads-404150953)] 10:41:32,407 WARNING [org.hornetq.core.transaction.impl.ResourceManagerImpl] transaction with xid XidImpl (538595391 bq:55.102.48.48.48.48.48.49.58.56.55.57.48.58.52.101.55.99.48.56.55.100.58.50.50.98.100.51 formatID:131075 gtxid:49.45.55.102.48.48.48.48.48.49.58.56.55.57.48.58.52.101.55.99.48.56.55.100.58.50.50.98.99.102 timed out

And this error as well:

[Thread-0 (group:HornetQ-scheduled-threads-404150953)] 10:44:37,402 SEVERE [org.hornetq.core.transaction.impl.ResourceManagerImpl] failed to timeout transaction, xid:XidImpl (954968496 bq:55.102.48.48.48.48.48.49.58.56.55.57.48.58.52.101.55.99.48.56.55.100.58.52.101.50.49.53 formatID:131075 gtxid:49.45.55.102.48.48.48.48.48.49.58.56.55.57.48.58.52.101.55.99.48.56.55.100.58.52.101.50.49.50
java.lang.IllegalStateException: Transaction is in invalid state SUSPENDED
          at org.hornetq.core.transaction.impl.TransactionImpl.rollback(TransactionImpl.java:345)
          at org.hornetq.core.transaction.impl.ResourceManagerImpl$TxTimeoutHandler.run(ResourceManagerImpl.java:228)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
          at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:662)

Did I miss something?
3. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 23, 2011 12:02 PM (in response to didka)

You are using the recovery module from JBoss Messaging. You should use the one from HornetQ:

org.hornetq.jms.server.recovery.HornetQXAResourceRecovery

I will ask Andy Taylor to give some insight here.
4. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 23, 2011 11:47 PM (in response to clebert.suconic)

First of all, thanks for the support!

Actually, I have both properties, for JBoss Messaging and for HornetQ, in the transaction properties:

<properties depends="arjuna" name="jta">
        <property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.JBMESSAGING1"
                  value="org.jboss.jms.server.recovery.MessagingXAResourceRecovery;java:/DefaultJMSProvider"/>
        <property name="com.arjuna.ats.jta.supportSubtransactions" value="NO"/>
        <property name="com.arjuna.ats.jta.jtaTMImplementation" value="com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionManagerImple"/>
        <property name="com.arjuna.ats.jta.jtaUTImplementation" value="com.arjuna.ats.internal.jta.transaction.arjunacore.UserTransactionImple"/>
        <property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.HORNETQ1"
                  value="org.hornetq.jms.server.recovery.HornetQXAResourceRecovery;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.16,port=5445;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.17,port=5445"/>
      </properties>

This is because another (quite old) application on this JBoss uses JBoss Messaging. I assumed they would not interfere.
Should I separate them, or is it still possible to use both?
Anyway, I will wait for Andy Taylor's insights.
5. Re: Strange problem with Standalone 2.2.5 journal/pages...

ataylor Sep 26, 2011 5:46 AM (in response to didka)

I don't think it's an issue with recovery. It looks to me as if you have long-running clients using XA; maybe you just need to increase the transaction timeout setting.
6. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 26, 2011 12:55 PM (in response to ataylor)

All transactions take less than 2 seconds (90% take even less than 1 second). The timeout is set to the default of 5 minutes.
Is the config I provided above (for JBoss TS) OK? Is it OK to use both the JBM and HornetQ properties?
I changed NIO to old IO and adjusted the thread pool. The system became more stable and has been working online for about 3 days. But these warnings and errors about XA timeouts still appear in the log.
7. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 28, 2011 12:51 AM (in response to didka)

It's OK if you are using both products (although that is not tested).

It seems you are using the wrong module with HornetQ.
8. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 28, 2011 4:43 AM (in response to clebert.suconic)

Please advise: how do I configure it to use the proper module?
9. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 28, 2011 9:01 AM (in response to didka)

Never mind: I just realized you have this:

<property name="com.arjuna.ats.jta.recovery.XAResourceRecovery.HORNETQ1" value="org.hornetq.jms.server.recovery.HornetQXAResourceRecovery;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.16,port=5445;org.hornetq.core.remoting.impl.netty.NettyConnectorFactory,user,xxxx,host=192.168.189.17,port=5445"/>

What version of the application server do you have on the client, and what version of the Transaction Manager?

A few issues were fixed in the transaction manager when we were doing EAP.
10. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 28, 2011 12:22 PM (in response to clebert.suconic)

We use JBoss EAP 4.3 CP04 and JBoss 5.1.0 GA (JTA version tag: JBOSSTS_4_6_1_GA).
11. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 28, 2011 5:07 PM (in response to didka)

Since you are on EAP, why don't you move to EAP 5.1?

HornetQ is not supported on EAP 4 anyway. There's a tech preview now, and the next upcoming version is planned to have full support.

There was a bug fixed in that version of the TM: if the list of XIDs was returned in a different order than the TM expected, recovery would break.

Anyway, once you have a transaction in a bad state like this, you will have to use the heuristic methods and commit or roll back those transactions manually, both on HornetQ and on the TM.

If this is happening every time in your test, I suggest you upgrade to the latest TM and the latest EAP version.
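
For the manual commit or rollback mentioned above, HornetQ exposes prepared-transaction operations on its server control MBean. What follows is only a rough sketch over plain JMX. It assumes that JMX management is enabled on the standalone server, that the default JMX domain org.hornetq is in use, and that the JMX service URL is passed as the first argument (the URL depends on how the JVM was started). The class name PreparedTxAdmin is hypothetical.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class PreparedTxAdmin {

    // Assumed default name of the HornetQ server control MBean.
    public static final String SERVER_MBEAN = "org.hornetq:module=Core,type=Server";

    public static ObjectName serverName() {
        try {
            return new ObjectName(SERVER_MBEAN);
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Lists transactions currently in the prepared state on the server. */
    public static String[] listPrepared(MBeanServerConnection mbsc) throws Exception {
        return (String[]) mbsc.invoke(serverName(), "listPreparedTransactions",
                new Object[0], new String[0]);
    }

    /** Rolls back one prepared transaction, identified by its base64 XID. */
    public static boolean rollbackPrepared(MBeanServerConnection mbsc, String txBase64)
            throws Exception {
        Object result = mbsc.invoke(serverName(), "rollbackPreparedTransaction",
                new Object[] { txBase64 }, new String[] { String.class.getName() });
        return Boolean.TRUE.equals(result);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: PreparedTxAdmin <jmx-service-url>");
            return;
        }
        JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(args[0]));
        try {
            // Only lists; resolving a transaction is left as a deliberate manual step.
            for (String tx : listPrepared(connector.getMBeanServerConnection())) {
                System.out.println(tx);
            }
        } finally {
            connector.close();
        }
    }
}
```

The main method only lists transactions. The rollback helper is there to show the call shape; a matching commitPreparedTransaction operation exists as well. Either one should be used only after checking what the TM itself recorded for that transaction, so that both sides end up with the same outcome.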
12. Re: Strange problem with Standalone 2.2.5 journal/pages...

didka Sep 28, 2011 10:24 PM (in response to clebert.suconic)

Thanks, Clebert!
The problem is that a legacy application has to run on EAP 4 (a third-party application that is out of our control), and our HornetQ-related module calls it via EJB. EJB calls between JBoss 4 and 5 do not work in our case because of class loading and serialization problems.
I will try to move the HornetQ-related module to JBoss 5.1 and call the application on EAP 4 via WS.
13. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 28, 2011 11:32 PM (in response to didka)

There are some issues with that TM. Maybe you could apply the same patches to the Transaction Manager. Since you're dealing with EAP, you would have to talk to support about it.
14. Re: Strange problem with Standalone 2.2.5 journal/pages...

clebert.suconic Sep 28, 2011 11:33 PM (in response to clebert.suconic)

Clebert Suconic wrote:

There are some issues with that TM. Maybe you could apply the same patches to the Transaction Manager. Since you're dealing with EAP, you would have to talk to support about it.
I mean, if you want to stay on the same path you were on before.