I am posting this message for Greg. He had earlier posted it to JBoss-user mailing list.
Can deadlocks occur because of improper connection sharing between transactions?
I have been working on a large project with several coworkers and have encountered a deadlock issue that needs to be resolved. While investigating this problem, I have come up with a fairly simple test that is capable of producing deadlock. This scenario may or may not be the same as what occurs in our large project, but I think they are related.
For my test environment, I'm using JBoss-3.2.1, j2sdk 1.4.1_02-b06, JDBC 3.0 - JSQLConnect 3.30 driver, and SQLServer 2000. The test uses CMP and CMT. The 'transaction-isolation' in xa-ds.xml has been set to 'TRANSACTION_SERIALIZABLE'. Also, all 'trans-attribute' tags in ejb-jar.xml have been set to 'Required'.
My test consists of a standalone java application (APP) that calls a stateless session bean (SLSB), which in turn makes 2 calls to an entity bean (EB). Besides using jndi to look up the various components at the appropriate times, the logic is as follows: (1) APP makes a single call to the SLSB; (2) SLSB calls EBHome.findByPrimaryKey(id); (3) SLSB calls EBRemote.setValue(id, value). Even though this isn't very complex, it can cause deadlock.
On one run where my test deadlocked, SQLServer Profiler revealed the following:
- All spids (if my understanding is correct, they represent a physical connection to SQLServer) for JBoss are created with isolation level serializable.
- Spid 53 is created and a transaction is started on it.
- 'SELECT COUNT(*) FROM my_table WHERE id=1' is issued on spid53. This creates a shared (S) lock on key PK_my_table. According to MS documentation, since the isolation level is serializable, the S lock will be held until the current transaction has finished.
- Spids 54, 55, 56, and 57 are created, but not used.
- Spid 58 is created.
- 'SELECT my_table.id,my_table.value FROM my_table WHERE id = 1' is issued on spid 58. This creates an S lock on key PK_my_table for spid 58. No problems yet, since the S lock on spid 53 and the S lock on spid 58 don't conflict. S locks allow reads to occur on other spids.
- 'UPDATE my_table SET value=100 WHERE id=1' is issued on spid 58. This is where things go wrong. Spid 58 now needs an exclusive (X) lock on key PK_my_table to carry out the update. However, this can't happen because spid 53 still has its S lock on that key. This behavior between S locks and X locks on different spids is documented by MS. Since spid 58 can't get its X lock because spid 53 won't release its S lock (it isn't being used anymore), deadlock occurs.
This test application does not always deadlock. If all of the work is done within one spid, then the shared (S) lock is converted into an exclusive (X) lock successfully, and the update is allowed.
I have seen similar deadlock behavior in our internal semi-production project. However, the deadlock might not occur for several hours. It does seem though that the problem is usually related to transactions and connections (spids) getting mixed up.
I have tried other jdbc drivers, SapDB, and set TrackConnectionByTx to true in the datasource configuration with similar tests as the one described above, but haven't had any luck with them either. They all deadlocked at some point. Does anybody know if TrackConnectionByTx is intended to handle the previously described deadlock scenario?
Any comments are welcome.