- Local or global transactions?
- Support from Infinispan
- Design Draft
There are several use cases that require HotRod to support transactions:
- potentially supporting the Hibernate API as an access layer to the grid: ISPN-24 (a proof of concept is being done at this time)
- allowing an embedded cache to talk transactionally to a remote Infinispan cluster through RemoteCacheStore
- providing a general means of accessing an Infinispan grid transactionally through HotRod
This is tracked in JIRA by: ISPN-375
Local or global transactions?
The difference between local and global transactions
As an example, local transactions are achieved by setting the JDBC Connection's autoCommit property to false:
Connection conn = getConnection();
conn.setAutoCommit(false);
// do several database updates
...
// this ends the transaction, one-phase commit
conn.commit();
Global transactions might encapsulate transactional resources distributed across multiple JVMs. Global transactions are specified by the Java Transaction API (JTA). Here is an example of a global transaction that spans two RDBMSs:
Connection connToA = getConnectionToDatabaseA();
Connection connToB = getConnectionToDatabaseB();
UserTransaction ut = getUserTransaction(); // most likely from JNDI
ut.begin();
// update things in A
connToA.prepareStatement("update ...").executeUpdate();
// update things in B
connToB.prepareStatement("update ...").executeUpdate();
// after commit, either both A and B are updated or neither is.
// This is achieved through a 2-phase commit protocol.
ut.commit();
Do we need global or local?
Hibernate's transaction demarcation allows a user to choose either JDBC-style transactions (local transactions) or JTA-style transactions. Hibernate encourages people towards JTA and declarative demarcation (more on Hibernate's transaction demarcation). This requires HotRod to be able to participate in global transactions.
Further on, it is possible (and not difficult) to build local transactions on top of global transactions: Infinispan's batching API does just that internally.
// start the local transaction
cache.startBatch();
// do stuff
...
// this commits/rolls back the local transaction
cache.endBatch(true);
So in order to fully support integration with Hibernate, HotRod requires global transactions.
Support from Infinispan
Infinispan has support for global transactions. Internally it achieves this by registering an XAResource implementation with the transaction associated with the calling thread. The current implementation lacks failure recovery: ISPN-272 (this will be addressed in 5.0).
A new (4th) level of intelligence would be added for HotRod clients supporting transactions: transactional clients (intelligence byte 0x04).
This level of intelligence does not require another level of intelligence on the client (i.e. topology-aware or distribution-aware), but can be built on top of a basic client.
The basic design idea behind this is that the HotRod client will bridge all transaction-control operations to Infinispan's XAResource instance.
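As a sketch of that bridging, a client-side XAResource might forward transaction-manager callbacks to the corresponding HotRod operations. The `HotRodTxTransport` interface and its method names below are illustrative assumptions, not part of any real API:

```java
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical client-side transport for the HotRod tx operations;
// the interface and method names are illustrative only.
interface HotRodTxTransport {
    void beginTx(byte[] txId);
    int prepareTx(byte[] txId);   // returns XA_OK or XA_RDONLY
    void commitTx(byte[] txId, boolean onePhase);
    void rollbackTx(byte[] txId);
}

// Sketch of an XAResource that bridges TM callbacks to HotRod operations.
final class HotRodXAResource implements XAResource {
    private final HotRodTxTransport transport;

    HotRodXAResource(HotRodTxTransport transport) { this.transport = transport; }

    @Override public void start(Xid xid, int flags) {
        transport.beginTx(xid.getGlobalTransactionId());
    }
    @Override public void end(Xid xid, int flags) { /* no-op in this sketch */ }
    @Override public int prepare(Xid xid) {
        return transport.prepareTx(xid.getGlobalTransactionId());
    }
    @Override public void commit(Xid xid, boolean onePhase) {
        transport.commitTx(xid.getGlobalTransactionId(), onePhase);
    }
    @Override public void rollback(Xid xid) {
        transport.rollbackTx(xid.getGlobalTransactionId());
    }
    // recovery is not covered here (see ISPN-272 and the recovery TODO below)
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public void forget(Xid xid) { }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }
}
```

The transaction manager drives `start`/`prepare`/`commit`/`rollback`; the bridge only translates each callback into the matching wire operation, keyed by the Xid's global transaction id.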
New operations in HotRod
Transactional clients would implement the following operations:
- BeginTx [tx_id length] [tx_id]
- PrepareTx [tx_id length] [tx_id]
- CommitTx [tx_id length] [tx_id]
- RollbackTx [tx_id length] [tx_id]
- tx_id length [vint] : number of bytes that make up the transaction's id
- tx_id [byte array] : the transaction's id
TODO - add recovery operations
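The [tx_id length] [tx_id] framing above could be encoded as follows. This is a minimal sketch assuming HotRod's vInt uses the common varint scheme (low 7 bits first, high bit set while more bytes follow); the class and method names are illustrative:

```java
import java.io.ByteArrayOutputStream;

// Sketch of framing [tx_id length][tx_id], assuming a varint encoding
// where each byte carries 7 bits and the high bit marks continuation.
final class TxFrame {
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits, more to come
            value >>>= 7;
        }
        out.write(value); // final byte, high bit clear
    }

    static byte[] encodeTxId(byte[] txId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, txId.length);       // tx_id length [vint]
        out.write(txId, 0, txId.length);   // tx_id [byte array]
        return out.toByteArray();
    }
}
```

Each of the four operations would carry this same frame after its opcode, so a non-Java client only needs the varint routine and raw byte handling to speak the transactional subset.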
These HotRod operations are "borrowed" from JTA's XAResource interface. JTA itself is a mapping of the industry-standard X/Open XA specification, which means that HotRod client implementations in languages other than Java could use this set of operations to integrate with specific transaction managers.
Associating existing operations with transactions
The association between a transaction and a HotRod operation is made through the HotRod header, which the client sends to the server on every operation and which contains the transaction id: if it is empty, the operation is not transactional. See HotRod's request header for more details.
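One possible client-side shape for this association is sketched below, assuming a hypothetical `TxContext` helper: the client tracks the current transaction id per thread and copies it into each request header, sending an empty id for non-transactional operations:

```java
// Sketch: a per-thread transaction id that every operation's header reads;
// an empty id marks the operation as non-transactional. Names are illustrative.
final class TxContext {
    private static final ThreadLocal<byte[]> CURRENT = new ThreadLocal<>();

    static void begin(byte[] txId) { CURRENT.set(txId); }

    static void end() { CURRENT.remove(); }

    // Value to place in the request header's tx_id field.
    static byte[] headerTxId() {
        byte[] id = CURRENT.get();
        return id == null ? new byte[0] : id; // empty => non-transactional
    }
}
```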
Operation behavior on the server side
Generally speaking, HotRod operations might be dispatched to multiple nodes. E.g.
// the tx begin operation might go to node A
tx.start();
// this might go to node B
hrClient.put("k", "v");
// this might go to node C
hrClient.put("k2", "v2");
// the tx commit operation might go to node D
tx.commit();
In the previous example, the client sends four operations to the HotRod cluster, each potentially directed to a different server in the cluster. This is something that is not currently supported by Infinispan's transaction infrastructure: once a transaction originates on one node, all further operations within that transaction must be executed against the same node. Possible solutions to this problem are discussed in the following paragraphs.
Solution 1: achieve transaction locality by making the HotRod client connect to the same HotRod server instance for all operations associated with a transaction. The client keeps a map between the transaction id and the server on which the transaction runs: this way it does not need to hold the same TCP connection for the lifetime of a transaction, but can still benefit from the client's connection pooling. If the HotRod server that holds the transaction fails, then any call on that transaction fails as well. In the future it would be possible to enhance the server to also support recovery in such a situation, by replicating transaction information.
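The sticky-server mapping described above might look like the following sketch; the `TxRouter` class, the round-robin choice of server, and string-typed ids are all illustrative assumptions:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: pin each transaction to one server for its lifetime while still
// drawing connections from a pool. Names and types are illustrative.
final class TxRouter {
    private final List<String> servers;
    private final Map<String, String> txToServer = new ConcurrentHashMap<>();
    private int next; // round-robin cursor for new transactions

    TxRouter(List<String> servers) { this.servers = servers; }

    // All operations carrying this tx id go to the same server.
    synchronized String serverFor(String txId) {
        return txToServer.computeIfAbsent(
                txId, id -> servers.get(next++ % servers.size()));
    }

    // Called after commit/rollback so the pin can be dropped.
    void complete(String txId) { txToServer.remove(txId); }
}
```

Because the pin is per transaction rather than per connection, the pool can still multiplex idle connections between transactions; only in-flight transactional operations are constrained.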
Solution 2: run the transaction on a single node identified by consistentHash(tx_id). E.g. in the previous code snippet:
- BeginTx goes to node A. Node A calculates consistentHash(tx_id) -> E and forwards the start-transaction request to that node.
- put("k","v") goes to node B. Node B calculates consistentHash(tx_id) -> E and forwards the transactional put request to that node.
As an optimisation, the HotRod client might pipe all requests for the same transaction directly to the node on which the transaction resides, by calculating its hash itself. If so, the scalability issue from Solution 1 would be solved.
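The hash-based owner selection can be sketched as below. The hash function here is a deliberate stand-in for Infinispan's real consistent hash; the class and method names are assumptions for illustration:

```java
import java.util.Arrays;
import java.util.List;

// Sketch: derive the owning node from the tx id alone, so any receiving
// server (or the client itself) computes the same owner deterministically.
// The hash is a stand-in for Infinispan's actual consistent hash.
final class TxHashRouter {
    private final List<String> servers;

    TxHashRouter(List<String> servers) { this.servers = servers; }

    String ownerOf(byte[] txId) {
        int h = Arrays.hashCode(txId) & 0x7FFFFFFF; // non-negative hash
        return servers.get(h % servers.size());
    }
}
```

Since the owner is a pure function of tx_id, a node receiving a misdirected operation can forward it, and a smart client can skip the extra hop entirely by computing the owner up front.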
Another issue to consider is recovery from failure: what if E crashes before the transaction is committed? This could be solved by replicating E's transaction state to backup nodes: either the owners determined by hash calculation (DIST) or the entire cluster (REPL). This functionality might be added afterwards.
Solution 1 seems better and simpler.
- Java Transaction API (JTA)