failed tx never expires
mazz Dec 1, 2008 11:57 PM
I think there is a problem with tx recovery expiration.
I have a JBossAS 4.2.1 server that has been running for about a week. Early on I had a database failure but the database is fine now and has been for days. This server does NOT have XA recovery fully configured (I know, I know - but don't get me started on that topic :) - so it got the infamous "Could not find new XAResource to use for recovering non-serializable XAResource" error. But! It's getting this error for days - it is never expiring.
I looked in the logs, and I get this error message about every 2 minutes for days and days - below I copied the first log message and the last one I got (which is up to the time I started writing this forum post) - look at the timestamps of the logs and notice the tx UID is the same:
The first one:
2008-11-25 13:39:26,686 WARN [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] Could not find new XAResource to use for recovering non-serializable XAResource < 131075, 29, 27, 1-a1058dc:d7e4:492c2eb9:18a68a1058dc:d7e4:492c2eb9:18cce^@...>
The latest one (and it's still repeating as I type):
2008-12-01 23:36:17,100 WARN [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] Could not find new XAResource to use for recovering non-serializable XAResource < 131075, 29, 27, 1-a1058dc:d7e4:492c2eb9:18a68a1058dc:d7e4:492c2eb9:18cce^@...>
As I said, this message appears tons of times; these are just the first and last occurrences (and it's still going). That's 6 days straight and counting.
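For anyone who wants to check their own logs for the same pattern, here is a rough sketch of a script that groups these warnings by the XAResource identifier and reports the first occurrence, the last occurrence, and the count. The regular expression is an assumption based only on the two log lines pasted above, and the function name summarize is made up for illustration; adjust both to your log format.

```python
import re
from datetime import datetime

# Assumed pattern, derived from the two log lines quoted in this post:
# a timestamp, the WARN level, and the identifier between "<" and ">".
WARNING = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} WARN .*"
    r"Could not find new XAResource to use for recovering "
    r"non-serializable XAResource < (?P<xid>[^>]*)>"
)


def summarize(lines):
    """Return {identifier: (first_seen, last_seen, count)} for the warning."""
    seen = {}
    for line in lines:
        m = WARNING.match(line)
        if m is None:
            continue  # not the norecoveryxa warning
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
        xid = m.group("xid").strip()
        first, last, count = seen.get(xid, (ts, ts, 0))
        seen[xid] = (min(first, ts), max(last, ts), count + 1)
    return seen
```

To use it, pass an open log file (for example the server log, wherever yours lives) to summarize and print each identifier with last minus first. A single identifier whose span keeps growing past the expiry time is the symptom described here.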
Now, I thought JBossTM would expire tx's that cannot recover after 12 hours - controlled by this configuration (this is taken directly out of my jbossjta-properties.xml from this server):
<!-- Interval, in hours, between running the expiry scanners.
     This can be quite long. The absolute value determines the interval -
     if the value is negative, the scan will NOT be run until after one
     interval has elapsed. If positive the first scan will be immediately
     after startup. Zero will prevent any scanning.
     Default = 12 = run immediately, then every 12 hours. -->
<property name="com.arjuna.ats.arjuna.recovery.expiryScanInterval" value="12"/>

<!-- Age, in hours, for removal of transaction status manager item.
     This should be longer than any ts-using process will remain running.
     Zero = Never removed. Default is 12. -->
<property name="com.arjuna.ats.arjuna.recovery.transactionStatusManagerExpiryTime" value="12"/>
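To spell out how I read the expiryScanInterval comment, here is a small sketch of the sign rules it describes. This is only my interpretation of the comment text, not JBossTM's actual code, and the function name expiry_scan_schedule is hypothetical.

```python
def expiry_scan_schedule(interval_hours):
    """Interpret expiryScanInterval as described in the config comment.

    Returns (first_scan_delay_hours, period_hours), or None when
    scanning is disabled. Illustration only, not JBossTM code.
    """
    if interval_hours == 0:
        return None  # zero prevents any scanning
    period = abs(interval_hours)  # the absolute value is the interval
    # Positive: first scan right after startup.
    # Negative: first scan only after one full interval has elapsed.
    first_delay = 0 if interval_hours > 0 else period
    return (first_delay, period)
```

With my value of 12 this gives a scan at startup and then one every 12 hours, so by this reading the scanner should have run roughly a dozen times over the 6 days.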
My question is - why doesn't this expire? As it stands, I don't think this server will ever come out of its funk - even a restart won't help because I assume the tx-object-store has this tx persisted and will just start back up trying (and failing) to recover. I would have to kill the server and manually delete the tx-object-store directory.
Of course, I'm expecting the answer to be, "well, configure XA recovery properly and you won't get this error". Ignore that for now :) This still shouldn't leave my server in a funk forever - the system should realize after a while that it's never going to recover this tx and expire such tx's after SOME amount of time (where that time should be something less than 6 days :)