Delayed operation optimization in PersistentBag and PersistentList can corrupt 2nd level cache
eugene75 Apr 18, 2013 9:33 AM
The delayed operation optimization in PersistentBag and PersistentList can corrupt the second level cache under specific conditions.
Assume we have two entities, Parent and Child, with a bi-directional one-to-many relationship from Parent to Child. 2nd level caching is enabled for both entities and both sides of the relationship, and the 2nd level cache is empty at the beginning of the transaction. The following transaction flow will result in a corrupt 2nd level cache:
// add a new Child to Parent
// the add operation is added to the delayed queue and the collection is not initialized or marked as dirty
Child newChild = new Child();
newChild.setParent(parent);
parent.getChildren().add(newChild);
// do something that causes an entity manager flush, either explicit call or a query
// this causes newChild to be written to the database
entityManager.flush();
// remove an existing child from parent
// the remove() call will cause the collection to get initialized from the database, which includes the newly added child
// because the collection does not think it is dirty, it sticks the relationship data in the 2nd level cache
Child oldChild = parent.findChild("name");
oldChild.setParent(null);
parent.getChildren().remove(oldChild);
entityManager.remove(oldChild);
// at this point if the transaction fails, the 2nd level cache is not cleaned up and the Parent->Child relationship contains an ID that does not exist in the database
// even if the transaction eventually succeeds, there is a period of time when the 2nd level cache contains uncommitted data not visible to any other transaction
The code in AbstractPersistentCollection.afterInitialize() that controls whether or not a collection is added to the 2nd level cache does check the delayed operation queue and blocks the 2nd level cache put if the queue is non-empty. But it does not account for the fact that a previous flush() within the same transaction may have already cleared the delayed operation queue. It seems to me that once a collection has been modified in the course of a transaction, it should veto the 2nd level cache put for the duration of that transaction.
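To make the gap concrete, here is a small self-contained toy model of the flow above. It is only a sketch and does not use Hibernate: LazyBag, modifiedInTx, vetoWhenModified and the "Parent.children" cache key are all hypothetical names made up for illustration. It compares a cache decision based only on the queue being empty with one that also remembers the collection was modified earlier in the transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hibernate classes.
public class DelayedQueueModel {

    // Simplified stand-in for a lazily initialized bag with a delayed-add queue.
    static class LazyBag {
        final List<String> database;              // rows visible to this transaction
        final Map<String, List<String>> cache;    // stand-in for the 2nd level cache
        final List<String> queuedAdds = new ArrayList<>();
        boolean modifiedInTx = false;             // hypothetical flag (the proposed fix)

        LazyBag(List<String> database, Map<String, List<String>> cache) {
            this.database = database;
            this.cache = cache;
        }

        // add() on the uninitialized collection is only queued.
        void add(String child) {
            modifiedInTx = true;
            queuedAdds.add(child);
        }

        // flush() writes the queued rows (still uncommitted) and empties the queue.
        void flush() {
            database.addAll(queuedAdds);
            queuedAdds.clear();
        }

        // Initialization reads the rows and decides whether to cache them.
        void initialize(boolean vetoWhenModified) {
            boolean cacheable = queuedAdds.isEmpty();   // queue-only check
            if (vetoWhenModified && modifiedInTx) {
                cacheable = false;                      // proposed extra check
            }
            if (cacheable) {
                cache.put("Parent.children", new ArrayList<>(database));
            }
        }
    }

    // Replays the flow from the post; true means the cache holds the uncommitted child.
    static boolean cacheHoldsUncommittedChild(boolean vetoWhenModified) {
        List<String> database = new ArrayList<>();
        database.add("oldChild");
        Map<String, List<String>> cache = new HashMap<>();

        LazyBag children = new LazyBag(database, cache);
        children.add("newChild");               // queued, collection not initialized
        children.flush();                       // row written, queue now empty
        children.initialize(vetoWhenModified);  // what remove() would trigger

        List<String> cached = cache.get("Parent.children");
        return cached != null && cached.contains("newChild");
    }

    public static void main(String[] args) {
        System.out.println("queue check only: " + cacheHoldsUncommittedChild(false));
        System.out.println("with per-transaction veto: " + cacheHoldsUncommittedChild(true));
    }
}
```

In this model the queue-only check returns true (the uncommitted child ends up in the cache), while the variant with the per-transaction flag returns false. How such a flag would be tracked and reset inside Hibernate itself is left open here.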
Our workaround is to provide our own UserCollectionType that either forgoes use of the delayed operation queue or overrides the endRead() method to veto caching.