Problem with XARecoveryModule.
slawomir.wojtasiak Nov 25, 2011 10:47 AM
After losing all connections to my database, I found a problem with the JTA Recovery Manager. The following stack trace shows that the connection used by the recovery manager has already been closed, so recovery (for PostgreSQL) has not worked since then.
{quote}
15:25:00,641 WARN [com.arjuna.ats.jta:121] ARJUNA-16027 Local XARecoveryModule.xaRecovery got XA exception XAException.XAER_RMERR: org.postgresql.xa.PGXAException: Error during recover
at org.postgresql.xa.PGXAConnection.recover(PGXAConnection.java:358) [:]
at org.jboss.resource.adapter.jdbc.xa.XAManagedConnection.recover(XAManagedConnection.java:294) [:6.0.0.Final]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecovery(XARecoveryModule.java:468) [:6.0.0.Final]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.resourceInitiatedRecoveryForRecoveryHelpers(XARecoveryModule.java:436) [:6.0.0.Final]
at com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:155) [:6.0.0.Final]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789) [:6.0.0.Final]
at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [:6.0.0.Final]
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
at org.postgresql.jdbc2.AbstractJdbc2Connection.checkClosed(AbstractJdbc2Connection.java:714) [:]
at org.postgresql.jdbc3.AbstractJdbc3Connection.createStatement(AbstractJdbc3Connection.java:230) [:]
at org.postgresql.jdbc2.AbstractJdbc2Connection.createStatement(AbstractJdbc2Connection.java:191) [:]
at org.postgresql.xa.PGXAConnection.recover(PGXAConnection.java:331) [:]
... 6 more
{quote}
I did some investigation and found that this connection is cached and reused for recovery until the server (or maybe only the application) is restarted or redeployed.
XARecoveryModule uses xaResourceRecoveryHelper to obtain an XAResource, and then runs recovery by calling the "recover" method on it. The problem is that the returned XAManagedConnection is a cached one and the connection is never sanity-checked. ManagedConnectionFactoryDeployment is responsible for caching it and does nothing to ensure that the connection is still valid:
ManagedConnectionFactoryDeployment
{code}
/**
 * Open a managed connection
 * @param s The subject
 * @return The managed connection
 * @exception ResourceException Thrown in case of an error
 */
private ManagedConnection open(Subject s) throws ResourceException
{
   if (recoverMC == null)
   {
      recoverMC = createManagedConnection(s, null);
   }
   return recoverMC;
}
{code}
This cached XAManagedConnection is what gets returned to XARecoveryModule.
To be fair, there is a check in ManagedConnectionFactoryDeployment that was probably meant to reconnect invalid connections, but it cannot work in this situation: XAManagedConnection performs no sanity check in getXAResource(), so the ResourceException that would trigger the reconnect is never thrown.
ManagedConnectionFactoryDeployment
{code}
try
{
   xaResource = mc.getXAResource();
}
catch (ResourceException reconnect)
{
   close(mc);
   mc = open(subject);
   xaResource = mc.getXAResource();
}
{code}
XAManagedConnection
{code}
public XAResource getXAResource() throws ResourceException
{
   return this;
}
{code}
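To illustrate the kind of sanity check I would have expected in open(), here is a minimal, self-contained sketch. It is not JBoss code: CheckedConnection, ValidatingRecoveryCache and FakeConnection are hypothetical names I made up for the example, and isValid() is an assumed liveness probe (it could, for instance, be backed by java.sql.Connection.isValid(int)).

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a managed connection; NOT the JBoss ManagedConnection API. */
interface CheckedConnection {
    /** Assumed liveness probe, e.g. backed by java.sql.Connection#isValid(int). */
    boolean isValid();

    void close();
}

/** Caches a single recovery connection, but replaces it once it fails the liveness probe. */
final class ValidatingRecoveryCache {
    private final Supplier<CheckedConnection> factory;
    private CheckedConnection recoverMC;

    ValidatingRecoveryCache(Supplier<CheckedConnection> factory) {
        this.factory = factory;
    }

    synchronized CheckedConnection open() {
        if (recoverMC != null && !recoverMC.isValid()) {
            try {
                recoverMC.close(); // best effort; the connection is already broken
            } catch (RuntimeException ignored) {
                // nothing useful to do if closing a dead connection fails
            }
            recoverMC = null;
        }
        if (recoverMC == null) {
            recoverMC = factory.get(); // create a fresh connection on demand
        }
        return recoverMC;
    }
}

/** Test double whose validity can be switched off to simulate a lost database connection. */
final class FakeConnection implements CheckedConnection {
    boolean valid = true;
    boolean closed = false;

    @Override public boolean isValid() { return valid && !closed; }
    @Override public void close() { closed = true; }
}
```

With something like this, a valid connection is still reused between recovery passes, while a dead one is closed and replaced the next time open() is called.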
So XARecoveryModule gets an invalid XAResource instance and tries to run recovery by invoking recover(XAResource.TMSTARTRSCAN).
XARecoveryModule
{code}
try
{
   trans = xares.recover(XAResource.TMSTARTRSCAN);
   if (jtaLogger.logger.isDebugEnabled()) {
      jtaLogger.logger.debug("Found "
            + ((trans != null) ? trans.length : 0)
            + " xids in doubt");
   }
}
catch (XAException e)
{
   jtaLogger.i18NLogger.warn_recovery_xarecovery1(_logName+".xaRecovery", XAHelper.printXAErrorCode(e), e);
   try
   {
      xares.recover(XAResource.TMENDRSCAN);
   }
   catch (Exception e1)
   {
   }
   return false;
}
{code}
This invocation ends up inside PGXAConnection, which does perform a sanity check and throws the appropriate exception. That exception is caught by the "catch" block above, and the scan for prepared transaction branches simply ends there.
As you can see, the invalid XAResource is not invalidated in any way, so the next periodic recovery pass will fail with it as well.
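The behaviour I would expect instead is roughly the following: when recover() fails, the cached resource is dropped so that the next pass starts with a fresh one. This is only a sketch under my own assumptions, not how XARecoveryModule is written. RecoveryScanner and SelfHealingRecoveryPass are hypothetical names; only XAResource, XAException and Xid come from the standard javax.transaction.xa package.

```java
import java.util.function.Supplier;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical narrow view of an XAResource: only the recover() call used here. */
interface RecoveryScanner {
    Xid[] recover(int flag) throws XAException;
}

/** Runs recovery scans and discards the cached resource whenever a scan fails. */
final class SelfHealingRecoveryPass {
    private final Supplier<RecoveryScanner> factory;
    private RecoveryScanner cached;

    SelfHealingRecoveryPass(Supplier<RecoveryScanner> factory) {
        this.factory = factory;
    }

    /** Returns the number of in-doubt Xids found, or -1 if the scan failed. */
    synchronized int runOnce() {
        if (cached == null) {
            cached = factory.get();
        }
        try {
            Xid[] xids = cached.recover(XAResource.TMSTARTRSCAN);
            cached.recover(XAResource.TMENDRSCAN);
            return (xids != null) ? xids.length : 0;
        } catch (XAException e) {
            // Assumption: any XAException from recover() means the resource is unusable,
            // so forget it and let the next pass obtain a new one.
            cached = null;
            return -1;
        }
    }
}
```

In this sketch the first pass after a connection loss still fails, but the following one gets a new resource instead of failing forever.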
Has anyone resolved this? It looks like a bug, but maybe there is a way to work around the problem through configuration alone.
I'm using JBoss AS 6.0.0.
Thanks in advance,
Slawek