-
1. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 12:54 PM (in response to bill.burke)
ResourceAdapter.getXAResources is for message inflow recovery,
see JCA1.5 Section 12.5.2
The activation specs must be persisted for recovery purposes.
For outbound it needs to do ManagedConnection.getXAResource().
For the Subject problem see the special case mentioned in JCA1.5 Section 6.5.3.5
This looks insecure to me (and probably not supported by some XAResources)
so we may need to add configuration for a recovery user/password where
this cannot be determined from the JCA config.
The fundamental problem is that the transaction manager needs to get the list
of ResourceAdapters/ManagedConnectionFactorys to perform recovery.
But this cannot be done using a <depends-list> because the ResourceAdapter
needs a reference to the transaction manager before it can start (chicken and egg).
1) For the XATerminator - RARs could try to use this during start()
2) For the tx-connection-factory/tx datasources - not strictly necessary since we
don't need the connection manager or pool for recovery
So what is required is that there be a separate recovery MBean that has a
<depends-list> of the ResourceAdapters.
This will allow the following startup ordering:
1) Transaction manager
2) Resource Adapters
3) Recovery manager
Care must be taken such that the recovery manager only tries to
recover transactions that were from previous instances.
Using either a mark in the log of when it was restarted or by
recording the jvm id in the log record?
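A rough sketch of the "jvm id in the log record" variant, just to make the idea concrete. The names TxLogRecord and InstanceFilter are made up for illustration and are not existing JBoss classes; using a random UUID as the instance id is also only an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical log record: which server instance wrote it, and for which transaction. */
final class TxLogRecord {
    final String instanceId; // id of the JVM instance that wrote the record
    final String txId;       // opaque transaction identifier

    TxLogRecord(String instanceId, String txId) {
        this.instanceId = instanceId;
        this.txId = txId;
    }
}

final class InstanceFilter {
    /** A fresh id per JVM start; a UUID is just one possible choice. */
    static String newInstanceId() {
        return UUID.randomUUID().toString();
    }

    /** Returns only the records that were written by a previous instance. */
    static List<TxLogRecord> fromPreviousInstances(List<TxLogRecord> log,
                                                   String currentInstanceId) {
        List<TxLogRecord> result = new ArrayList<>();
        for (TxLogRecord record : log) {
            if (!record.instanceId.equals(currentInstanceId)) {
                result.add(record);
            }
        }
        return result;
    }
}
```

The recovery manager would generate the id once at startup and only look at records carrying a different id, so transactions started by the running instance are never touched.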
We also want to fix this tight coupling of the TM and RARs to make this simpler.
We can then delay some processes to make the recovery occur before
services become available
1) Transaction manager
2) Resource Adapters
3) Recovery manager
4) Start WorkManager
5) Bind ConnectionFactorys to JNDI
6) Activate MDB activations
Also bear in mind that JBossMQ and other services(?) use a DataSource
to persist data. i.e. one XAResource uses another one as a delegate.
Although in this case, the delegate XAResource should never be taking
part in two phase commit!? -
2. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 12:57 PM (in response to bill.burke)
On heuristics, I think the best thing to do is what CORBA allows,
i.e. we report all heuristics to a special log/notification mechanism
with some policy on whether it should be automatically rolled back or committed.
Only the administrator can work out what has gone wrong or whether it is
consistent with how he tried to resolve a problem. -
3. Re: Finding XAResources in Recovery step
bill.burke Jan 3, 2005 1:29 PM (in response to bill.burke)
"adrian@jboss.org" wrote:
ResourceAdapter.getXAResources is for message inflow recovery,
see JCA1.5 Section 12.5.2
The activation specs must be persisted for recovery purposes.
We really don't have any component that uses inflow yet do we? I'll do this one on the second iteration of the recovery mechanism.
For outbound it needs to do ManagedConnection.getXAResource().
For the Subject problem see the special case mentioned in JCA1.5 Section 6.5.3.5
This looks insecure to me (and probably not supported by some XAResources)
so we may need to add configuration for a recovery user/password where
this cannot be determined from the JCA config.
Yes, I already read that section...What Subject are you supposed to pass in? It is the ConnectionRequestInfo that is supposed to be null.
The fundamental problem is that the transaction manager needs to get the list
of ResourceAdapters/ManagedConnectionFactorys to perform recovery.
But this cannot be done using a <depends-list> because the ResourceAdapter
needs a reference to the transaction manager before it can start (chicken and egg).
1) For the XATerminator - RARs could try to use this during start()
2) For the tx-connection-factory/tx datasources - not strictly necessary since we
don't need the connection manager or pool for recovery
So what is required is that there be a separate recovery MBean that has a
<depends-list> of the ResourceAdapters.
This will allow the following startup ordering:
1) Transaction manager
2) Resource Adapters
3) Recovery manager
I don't think this needs to be that complicated. If we force/require all ResourceAdapters and ManagedConnectionFactory's to have a specific ObjectName attribute then we can do an MBean query on the MBeanServer to find these MBeans.
From what you're saying, I think we need to require that each XA resource implement an MBean whose sole purpose is to obtain a reference to an XAResource interface.
So, here's what I had in mind:
1. JBoss Boots up. All MBeans are created and started.
2. JBoss ServerImpl broadcasts an MBean Startup Notification. (This is currently already coded).
3. RecoveryManager MBean receives the "JBoss Started" notification and begins recovery.
4. RecoveryManager queries MBeanServer for all "Recovery" MBeans. These "Recovery" MBeans will provide references to their XAResources.
5. RecoveryManager performs recovery.
6. RecoveryManager tells all "Recovery" Mbeans that they are finished with the XAResources.
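Step 4 could look roughly like the sketch below. It only uses the standard JMX query API; the ObjectName pattern (domain jboss.recovery, key type=Recovery) is an invented naming convention for illustration, not something that exists in JBoss today.

```java
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class RecoveryMBeanLookup {
    /** Hypothetical convention: every "Recovery" MBean is registered with type=Recovery. */
    static final String PATTERN = "jboss.recovery:type=Recovery,*";

    /** Returns the names of all MBeans that match the recovery naming pattern. */
    static Set<ObjectName> findRecoveryMBeans(MBeanServer server)
            throws MalformedObjectNameException {
        // A null QueryExp means "no extra filter beyond the name pattern".
        return server.queryNames(new ObjectName(PATTERN), null);
    }
}
```

The RecoveryManager would then ask each returned MBean for its XAResources. Note that the query only finds MBeans that are registered at that moment, so it cannot tell that an expected resource is missing.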
Care must be taken such that the recovery manager only tries to
recover transactions that were from previous instances.
Using either a mark in the log of when it was restarted or by
recording the jvm id in the log record?
The RecoveryManager MBean can do this at start(). Just rename existing log files, or use a timestamp in the logfile name. Anyways...I think it would be better to have a specific logger than a DB. That way, we have ultimate speed, and we can have/control as many log files as we want.
We also want to fix this tight coupling of the TM and RARs to make this simpler.
Based on what I've said above, do we really need to fix the coupling? Problem is...is it ok to do recovery while a live system is running? Seems it would be ok as long as XAResource.recover does not return Xids of existing running transactions.
Also bear in mind that JBossMQ and other services(?) use a DataSource
to persist data. i.e. one XAResource uses another one as a delegate.
Although in this case, the delegate XAResource should never be taking
part in two phase commit!?
This will take more thought when recovery is added to JMS. Isn't the TM required to identify duplicate XAResources?
Bill -
4. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 1:51 PM (in response to bill.burke)
Yes, I already read that section...What Subject are you supposed to pass in? It is the ConnectionRequestInfo that is supposed to be null.
Correct, my memory wasn't working correctly.
Like I said, you can either use the configured user/password (our JCA login
modules have mechanisms to provide a default) or, if there isn't one,
have a separate config for the recovery user/password.
It should be a case of refactoring the getSubject() code in
BaseConnectionManager2. I never liked the way this worked in any case :-) -
5. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 2:00 PM (in response to bill.burke)
I don't think this needs to be that complicated. If we force/require all ResourceAdapters and ManagedConnectionFactory's to have a specific ObjectName attribute then we can do an MBean query on the MBeanServer to find these MBeans.
From what you're saying, I think we need to require that each XA resource implement an MBean whose sole purpose is to obtain a reference to an XAResource interface.
So, here's what I had in mind:
1. JBoss Boots up. All MBeans are created and started.
2. JBoss ServerImpl broadcasts an MBean Startup Notification. (This is currently already coded).
3. RecoveryManager MBean receives the "JBoss Started" notification and begins recovery.
4. RecoveryManager queries MBeanServer for all "Recovery" MBeans. These "Recovery" MBeans will provide references to their XAResources.
5. RecoveryManager performs recovery.
6. RecoveryManager tells all "Recovery" Mbeans that they are finished with the XAResources.
The RecoveryManager MBean can do this at start(). Just rename existing log files, or use a timestamp in the logfile name. Anyways...I think it would be better to have a specific logger than a DB. That way, we have ultimate speed, and we can have/control as many log files as we want.
I like the idea of a RecoverableMBean. That is similar to what OTS provides.
That has a couple of problems:
1) One of the MBeans might have failed to start or is no longer deployed.
The recovery could be incomplete/inaccurate unless there is a predefined list
of expected resources.
2) Renaming log files is bad. It is not a repeatable process. What happens if it fails again
during recovery.
3) Timestamps are bad, especially if you want to move the log to a different
server with an uncorrelated clock.
The reason for using the DB is as follows:
a) It is usually a part of the transaction already
b) It is easy to implement
c) It fixes the final problem for a local db
i.e. when using the last resource gambit, there is no way to know whether
the db commit worked or failed if the AS fails during the DB commit invocation. -
6. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 2:03 PM (in response to bill.burke)
Based on what I've said above, do we really need to fix the coupling? Problem is...is it ok to do recovery while a live system is running? Seems it would be ok as long as XAResource.recover does not return Xids of existing running transactions.
In principle there should be no problem. The TM knows which XIDs are
currently active (they are in the hashmap of active transactions).
In practice, you will probably find Oracle and MSSQL have issues :-) -
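To illustrate the "in principle" part: a minimal sketch that asks a resource for its in-doubt Xids and drops the ones the TM still considers active. XAResource.recover and the TMSTARTRSCAN/TMENDRSCAN flags are standard JTA; SimpleXid and InDoubtScan are illustrative names, and the active list stands in for the TM's hashmap of active transactions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical minimal Xid, only here to keep the sketch self-contained. */
final class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] gtrid;
    private final byte[] bqual;

    SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
        this.formatId = formatId;
        this.gtrid = gtrid;
        this.bqual = bqual;
    }

    public int getFormatId() { return formatId; }
    public byte[] getGlobalTransactionId() { return gtrid; }
    public byte[] getBranchQualifier() { return bqual; }
}

final class InDoubtScan {
    /** Field-by-field comparison; the Xid interface itself does not define equals(). */
    static boolean sameXid(Xid a, Xid b) {
        return a.getFormatId() == b.getFormatId()
            && Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId())
            && Arrays.equals(a.getBranchQualifier(), b.getBranchQualifier());
    }

    /** Returns the Xids reported by the resource that are not active in this TM. */
    static List<Xid> recoverable(XAResource resource, List<Xid> activeXids)
            throws XAException {
        // One call with both flags starts and ends the recovery scan.
        Xid[] reported = resource.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        List<Xid> result = new ArrayList<>();
        if (reported == null) {
            return result;
        }
        for (Xid candidate : reported) {
            boolean active = false;
            for (Xid running : activeXids) {
                if (sameXid(candidate, running)) {
                    active = true;
                    break;
                }
            }
            if (!active) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

There is still a window between taking the snapshot of active transactions and calling recover(), so a real implementation would have to take the snapshot in a way that cannot miss a transaction prepared in between.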
7. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 2:06 PM (in response to bill.burke)
This will take more thought when recovery is added to JMS. Isn't the TM required to identify duplicate XAResources?
It has to do the isSameRM() check. It doesn't need to go down to XID level.
Except of course for Oracle where isSameRM() doesn't work correctly,
although this might only be on the suspend/start(resume)? -
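A minimal sketch of that check, assuming isSameRM() behaves as specified (which, as noted, is not a given for every driver). ResourceDedup is an illustrative name; XAResource.isSameRM is the standard JTA call.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

final class ResourceDedup {
    /**
     * Keeps one XAResource per resource manager, so that recovery is run once
     * per RM rather than once per connection factory.
     */
    static List<XAResource> distinctResourceManagers(List<XAResource> candidates)
            throws XAException {
        List<XAResource> distinct = new ArrayList<>();
        for (XAResource candidate : candidates) {
            boolean alreadyCovered = false;
            for (XAResource kept : distinct) {
                if (kept.isSameRM(candidate)) {
                    alreadyCovered = true;
                    break;
                }
            }
            if (!alreadyCovered) {
                distinct.add(candidate);
            }
        }
        return distinct;
    }
}
```

The comparison is quadratic in the number of resources, which should be fine for the handful of resource managers a server normally has.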
8. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 2:07 PM (in response to bill.burke)
Linked with JIRA: http://www.jboss.org/index.html?module=bb&op=viewtopic&t=58285
-
9. Re: Finding XAResources in Recovery step
bill.burke Jan 3, 2005 2:34 PM (in response to bill.burke)
"adrian@jboss.org" wrote:
I like the idea of a RecoverableMBean. That is similar to what OTS provides.
That has a couple of problems:
1) One of the MBeans might have failed to start or is no longer deployed.
The recovery could be incomplete/inaccurate unless there is a predefined list
of expected resources.
2) Renaming log files is bad. It is not a repeatable process. What happens if it fails again
during recovery.
3) Timestamps are bad, especially if you want to move the log to a different
server with an uncorrelated clock.
Ok, then it would be implemented as follows:
1. TM depends on RecoveryManager
2. Recovery manager records preexisting log files.
3. Recovery Manager creates new log files.
4. Everybody starts up.
5. Recovery Manager receives START notification. Starts recovering on preexisting recorded file list.
The reason for using the DB is as follows:
a) It is usually a part of the transaction already
b) It is easy to implement
c) It fixes the final problem for a local db
i.e. when using the last resource gambit, there is no way to know whether
the db commit worked or failed if the AS fails during the DB commit invocation.
Seems this would only work if the logger was the same DataSource as the Gambitted resource.
Bill -
10. Re: Finding XAResources in Recovery step
bill.burke Jan 3, 2005 2:42 PM (in response to bill.burke)
"bill.burke@jboss.com" wrote:
"adrian@jboss.org" wrote:
I like the idea of a RecoverableMBean. That is similar to what OTS provides.
That has a couple of problems:
1) One of the MBeans might have failed to start or is no longer deployed.
The recovery could be incomplete/inaccurate unless there is a predefined list
Forgot to answer this one...
RecoverableMBean should provide a getId. The RecoveryManager will store this at the beginning of the logfile and not allow recovery unless all RecoverableMBeans are deployed.
Bill -
11. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 2:43 PM (in response to bill.burke)
"bill.burke@jboss.com" wrote:
Ok, then it would be implemented as follows:
1. TM depends on RecoveryManager
2. Recovery manager records preexisting log files.
3. Recovery Manager creates new log files.
4. Everybody starts up.
5. Recovery Manager receives START notification. Starts recovering on preexisting recorded file list.
You still have an unrepeatable operation. It is not a good idea
to dynamically create logs (it might fail - no disk space - at just the wrong time).
Logs should be preallocated space and reused - you rewrite the log(s) from the memory
at checkpoints.
Like I said, the TM knows its currently active transactions. They are in its
memory state.
The reason for using the DB is as follows:
a) It is usually a part of the transaction already
b) It is easy to implement
c) It fixes the final problem for a local db
i.e. when using the last resource gambit, there is no way to know whether
the db commit worked or failed if the AS fails during the DB commit invocation.
Seems this would only work if the logger was the same DataSource as the Gambitted resource.
Bill
Correct. You are only allowed one local resource and it must be the one
recording the transactions. In fact this is required even if it is not the tm log.
You have to have a mechanism to discover whether the transaction committed
for that unlikely occurrence that the AS fails during the localdb.commit().
See the JIRA link. -
12. Re: Finding XAResources in Recovery step
bill.burke Jan 3, 2005 3:02 PM (in response to bill.burke)
"adrian@jboss.org" wrote:
"bill.burke@jboss.com" wrote:
Ok, then it would be implemented as follows:
1. TM depends on RecoveryManager
2. Recovery manager records preexisting log files.
3. Recovery Manager creates new log files.
4. Everybody starts up.
5. Recovery Manager receives START notification. Starts recovering on preexisting recorded file list.
You still have an unrepeatable operation.
No you don't. The files are not renamed. If recovery fails at any point, then the next server reboot will just retry those log files.
It is not a good idea
to dynamically create logs (it might fail - no disk space - at just the wrong time).
The disk might fail, but this does not create a scenario of inconsistent state. Prepared resources will just get rolled back during recovery.
Logs should be preallocated space and reused - you rewrite the log(s) from the memory
at checkpoints.
Personally, I prefer a simpler design than a rolling, preallocated log file. I think requiring enough disk space is a reasonable requirement.
Bill -
13. Re: Finding XAResources in Recovery step
adrian.brock Jan 3, 2005 3:12 PM (in response to bill.burke)
The disk might fail, but this does not create a scenario of inconsistent state. Prepared resources will just get rolled back during recovery.
resource1.prepare(); // vote ok
resource2.prepare(); // vote ok
log.record(); // ooops disk full
Now you have to roll back when it could have been committed.
Worse, you can't do any more work because all transactions fail from now on.
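For comparison, a minimal sketch of the preallocated, reused log mentioned earlier in the thread. PreallocatedLog and its checkpoint format (a length prefix followed by the state bytes) are invented for illustration. It writes real zeros instead of calling setLength(), on the assumption that some file systems would otherwise create a sparse file without reserving the blocks.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical fixed-size log that is rewritten in place at each checkpoint. */
final class PreallocatedLog implements AutoCloseable {
    private final RandomAccessFile file;
    private final long capacity;

    PreallocatedLog(File path, int capacityBytes) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.capacity = capacityBytes;
        long existing = file.length();
        if (existing < capacityBytes) {
            // Reserve the space up front, so "disk full" shows up at startup
            // and not in the middle of a two phase commit.
            file.seek(existing);
            byte[] zeros = new byte[8192];
            long remaining = capacityBytes - existing;
            while (remaining > 0) {
                int chunk = (int) Math.min(zeros.length, remaining);
                file.write(zeros, 0, chunk);
                remaining -= chunk;
            }
            file.getFD().sync();
        }
    }

    /** Rewrites the log from the start with the TM's in-memory state. */
    void checkpoint(byte[] state) throws IOException {
        if (state.length > capacity - 4) {
            throw new IOException("checkpoint larger than preallocated log");
        }
        file.seek(0);
        file.writeInt(state.length);
        file.write(state);
        file.getFD().sync(); // force the bytes to disk before returning
    }

    /** Reads back the last checkpoint written by checkpoint(). */
    byte[] readCheckpoint() throws IOException {
        file.seek(0);
        int length = file.readInt();
        if (length < 0 || length > capacity - 4) {
            throw new IOException("corrupt checkpoint header");
        }
        byte[] state = new byte[length];
        file.readFully(state);
        return state;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

This is not crash-safe as written: a failure in the middle of checkpoint() leaves a torn record. A real log would need checksums and probably two alternating slots, but the sketch shows why log.record() above no longer depends on free disk space at commit time.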