-
15. Re: is it possible for MVCCInvocationContext to have null mv
dukehoops Mar 12, 2009 2:29 PM (in response to dukehoops)Manik,
Your explanation makes sense, however I wonder whether that means Hibernate / Hib-JBC2 integration project / JBC3 stack is unusable with REPEATABLE_READ then. Let me explain.
Person, Child are cached entities, parent.children is a cached collection (using Hibernate's Transactional concurrency strategy).
Consider this method:public void addChild(parentId){ tx.begin(); Parent p = parentDAO.get(parentId); Set<Child> children = p.getChildren(); Child newChild = new Child(); children.add(newChild); tx.commit(); }
Let's look at what calls hib-jbc integration layer (org.hibernate.cache.jbc2.[collection|entity].TransactionalAccess) will issue for each of the above lines:
1. Parent p = parentDAO.get(parentId);
cache.get(parentIdFqn); //cache miss on very 1st call
//hibernate goes to db to retrieve data and then puts it into cache:
cache.pfer(parentIdFqn, p)
2. Set children = p.getChildren();
//look in collection cache for parent.children with given parentId
cache.get(parentIdCollectionFqn) //miss
//hibernate will look in db, and if db returns nothing, will put an empty collection into cache
cache.pfer(parentIdCollectionFqn, emptyCollection);
3. children.add(newChild);
Hibernate tells 2nd level cache to evict collection cache entry on any collection modification. Specifically above add call will result in removeNode call on tx.commit():
cache.removeNode(parentIdCollectionFqn); //should remove empty Collection from cache to ensure next time step #2 above is executed cache misses and we go to DB.
However, as you point out pfer() will operate in different tx than get() and removeNode(). Therefore, removeNode will not remove the empty collection. Now Node with parentIdCollectionFqn still has emptyCollection as value and so in a subsequent transaction:
tx.begin();
cache.get(parentIdCollectionFqn); //hibernate expects a miss so that new children can be re-read from db
tx.commit();
.. we will actually get a cache hit with stale emptyCollection, one that does not contain newChildId. So as far as Hibernate will be concerned, given parent will have no children. (even though newChild is in db).
So, I understand your explanation and agree that JBC code behaves correctly. But it seems like the problem is that when child is added to a collection in Hibernate, collection cache will see exactly this series of calls:
tx1:
-get()
-pfer()
-removeNode() //expected to ensure cache miss on the subsequent get()
tx2:
-get() //expect a miss
So that means that org.hibernate.cache.jbc2.collection.TransactionalAccess.putFromLoad() should use a Cache method that ensures that a subsequent Cache.removeNode will actually remove, i.e putFromLoad()
- either should use put() instead of pfer()
- or use a new method that is like pfer() but does not suspend TX.
Here's the updated test that I believe issues the same calls that org.hibernate.cache.jbc2.collection.TransactionalAccess would issue with the above service method:package org.jboss.cache.api; import org.jboss.cache.config.Configuration; import org.jboss.cache.config.Configuration.NodeLockingScheme; import org.jboss.cache.factories.UnitTestConfigurationFactory; import org.testng.annotations.Test; import javax.transaction.TransactionManager; import org.jboss.cache.*; import org.jboss.cache.lock.IsolationLevel; /** * Tests that a node that was putFromExternalRead and then removed in TX1 does NOT * get returned in subsequent TX2. * * @author nikita_tovstoles@mba.berkeley.edu * @since 3.0.3.GA */ @Test(groups = {"functional", "pessimistic"}, sequential = true, testName = "api.RemovedNodeResurrectionInSubsequentTxTest") public class RemovedNodeResurrectionInSubsequentTxTest extends AbstractSingleCacheTest { private static final Fqn A_B = Fqn.fromString("/a/b"); private static final Fqn A = Fqn.fromString("/a"); private static final Fqn A_C = Fqn.fromString("/a/c"); private static final String KEY = "key"; private static final String VALUE = "value"; private static final String K2 = "k2"; private static final String V2 = "v2"; protected NodeSPI root; protected TransactionManager txManager; public CacheSPI createCache() { CacheSPI myCache = (CacheSPI<Object, Object>) new UnitTestCacheFactory<Object, Object>().createCache(UnitTestConfigurationFactory.createConfiguration(Configuration.CacheMode.LOCAL, false), false, getClass()); myCache.getConfiguration().setCacheMode(Configuration.CacheMode.LOCAL); myCache.getConfiguration().setCacheLoaderConfig(null); myCache.getConfiguration().setNodeLockingScheme(NodeLockingScheme.MVCC); configure(myCache.getConfiguration()); myCache.start(); root = myCache.getRoot(); txManager = myCache.getTransactionManager(); return myCache; } protected void configure(Configuration c) { c.setIsolationLevel(IsolationLevel.REPEATABLE_READ); } public void testPferAndEvictInSameTx() throws Exception { Object val = null; cache.getRoot().addChild(A); txManager.begin(); val = cache.get(A_B, KEY); //N1 IF THIS LINE IS EXECUTED, TEST FAILS BELOW AT N2 //put in cache cache.putForExternalRead(A_B, KEY, VALUE); //val = cache.get(A_B, KEY); //assert val != null : "get() after pfer() returned " + val; //N2 THIS WILL FAIL IF LINE N1 (ABOVE) IS EXECUTED (NOT COMMENTED OUT) //evict from cache cache.removeNode(A_B); //sometimes MVCCInvocationContext has ref to mvccTCtx, sometimes NOT //verify eviction //val = cache.get(A_B, KEY); //assert val == null : "get() after evict() returned " + val; txManager.commit(); txManager.begin(); //verify miss val = cache.get(A_B, KEY); assert val == null : "get() after tx.commit returned " + val; txManager.commit(); } }
and corresponding failure expected from JBC core perspective but NOT from Hibernate perspective:java.lang.AssertionError: get() after tx.commit returned value
at org.jboss.cache.api.RemovedNodeResurrectionInSubsequentTxTest.testPferAndEvictInSameTx(RemovedNodeResurrectionInSubsequentTxTest.java:78)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619) -
16. Re: is it possible for MVCCInvocationContext to have null mv
manik Mar 17, 2009 7:18 AM (in response to dukehoops)You have a good point. Perhaps as a recommendation, READ_COMMITTED should be used *if* using JBC3, Hibernate + MVCC.
-
17. Re: is it possible for MVCCInvocationContext to have null mv
brian.stansberry Mar 17, 2009 11:01 AM (in response to dukehoops)READ_COMMITTED is the recommended isolation level for the Hibernate Second Level Cache use case[1], since the Session itself acts as a R_R cache. The reason you'd use R_R in JBC is if your app was calling Session.evict() or Session.clear() and then you wanted to re-read an entity with R_R semantics.
Manik, why is the removeNode call not removing the node? Is the MVCC code detecting that the node didn't exist when the get() call happened and therefore turning the removeNode into a no-op?
dukehoops, there's also a race at work here if there are multiple threads involved. See http://opensource.atlassian.com/projects/hibernate/browse/HHH-3817. To deal with that I think I need something more sophisticated in the Hibernate/JBC layer such that a call to org.hibernate.cache.Cache.evict(key) results in pending putForExternalRead calls being ignored. Basically turn get() + pfer() into a logical single operation that gets aborted if an evict happens in the middle. That's not so nice though, adds overhead.
[1] See page 13 of http://www.jboss.org/file-access/default/members/jbossclustering/freezone/docs/hibernate-jbosscache-guide-3.pdf -
18. Re: is it possible for MVCCInvocationContext to have null mv
manik Mar 17, 2009 11:26 AM (in response to dukehoops)"bstansberry@jboss.com" wrote:
Manik, why is the removeNode call not removing the node? Is the MVCC code detecting that the node didn't exist when the get() call happened and therefore turning the removeNode into a no-op?
Yes. This is consistent with R_R semantics in that you only modify data that you read. Now if write skew detection was enabled (by setting writeSkewCheck to true) this case would throw a DataVersioningException.
See [1] for details on write skews. And contrary to what the docs say, writeSkewCheck is false by default.
[1] http://www.jboss.org/file-access/default/members/jbosscache/freezone/docs/3.0.3.GA/userguide_en/html/transactions.html#transactions.locks.mvcc (See section on write skews) -
19. Re: is it possible for MVCCInvocationContext to have null mv
brian.stansberry Mar 17, 2009 11:48 AM (in response to dukehoops)You only modify data that you read? What if I just call removeNode() with no preceding get()?
Well, anyway, that makes it clear that MVCC + R_R is not usable for Hibernate Second Level Caching, at least not without a major rewrite of the integration layer. -
20. Re: is it possible for MVCCInvocationContext to have null mv
manik Mar 17, 2009 11:59 AM (in response to dukehoops)"bstansberry@jboss.com" wrote:
You only modify data that you read? What if I just call removeNode() with no preceding get()?
Then the read happens at the time of the removeNode(). I.e., the node to be removed is read from the data structure, then marked as removed and finally removed when the tx completes."bstansberry@jboss.com" wrote:
Well, anyway, that makes it clear that MVCC + R_R is not usable for Hibernate Second Level Caching, at least not without a major rewrite of the integration layer.
Actually, this should actually be viewed as a bug/inconsistency in the way write skews are currently handled, since:
1. A write skew involving modifying data will overwrite any intermediate data and the last commit wins.
2. In the case of a removeNode(), a write preceeding the remove will win.
To be consistent, with the same "last tx wins" approach to dealing with write skews, if write skew checking is disabled a removeNode() should remove *any* node present under the given Fqn, not necessarily the node read earlier in the same tx. -
21. Re: is it possible for MVCCInvocationContext to have null mv
brian.stansberry Mar 17, 2009 12:06 PM (in response to dukehoops)Good; thanks for thinking hard about this. :-) Something about not executing the remove seemed wrong; it was that inconsistency versus the modification.
-
22. Re: is it possible for MVCCInvocationContext to have null mv
dukehoops Mar 17, 2009 12:21 PM (in response to dukehoops)I am unsure what "inconsistency" means as relating to whether the software works as designed, but IMO this is a bug in the hib-jbc2 integration layer. Please note that if cache.evict() were used instead of removeNode() same problem would arise.
Note that the workaround involved not only using READ_COMMITTED in jbc config but also ensuring that Datasource being enlisted as XA Resource was configured to READ_COMMITTED as well. -
24. Re: is it possible for MVCCInvocationContext to have null mv
dukehoops Mar 17, 2009 1:15 PM (in response to dukehoops)Manik,
Thanks for posting the JIRA. However, at the risk of appear to be a nit-picker, I don't think JBCACHE-1493 captures the extent of the problem as it pertains to R_R, and Hibernate - JBC3 interaction. Specifically, given an application that uses Hibernate and JBC3 (MVCC) as 2nd level cache in Transactional (as opposed to read-only) mode:
1. the pattern I described in the post linked below is not an edge case:
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4217616#4217616
Basically, this will take place whenever app logic is such that the 1st time a collection is accessed the intent is to alter it.
2. as far as I can tell, enabling writeSkewCheck will merely result in data version exceptions being thrown in this use case - making collection un-cacheable if app does a write on 1st access.
Given the above pattern, and sequence of calls that HIbernate Core issues in case of Collections I think the solution is not to use pfer() in org.hibernate.cache.jbc2.collection.TransactionalAccess.putFromLoad() if R_R is in use.
-nikita -
25. Re: is it possible for MVCCInvocationContext to have null mv
brian.stansberry Mar 17, 2009 1:29 PM (in response to dukehoops)The idea of JBCACHE-1493 is not to enabled writeSkewCheck, it is to ensure that the removeNode call actually removes the node, whether or not the node existed when the tx earlier called get(). As a result the empty collection will be removed.
Note I'm not disagreeing with you that this needs to be handled better in the Hib/JBC integration. Hence http://opensource.atlassian.com/projects/hibernate/browse/HHH-3817 and perhaps others.
Just using an ordinary JBC put() to handle TransactionalAccess.putFromLoad() will not work because you'll get lock conflicts across the cluster when the call replicates."dukehoops" wrote:
Note that the workaround involved not only using READ_COMMITTED in jbc config but also ensuring that Datasource being enlisted as XA Resource was configured to READ_COMMITTED as well.
Can you comment more on why making the datasource READ_COMMITTED as well was necessary? -
26. Re: is it possible for MVCCInvocationContext to have null mv
dukehoops Mar 17, 2009 1:49 PM (in response to dukehoops)I see. Makes sense.
Can you comment more on why making the datasource READ_COMMITTED as well was necessary?
I don't know - puzzled as well. But it absolutely was the case that my multithreaded load tests kept failing until I made that change, and haven't failed since. -
27. Re: is it possible for MVCCInvocationContext to have null mv
jason.greene Mar 17, 2009 5:06 PM (in response to dukehoops)"dukehoops" wrote:
I see. Makes sense.Can you comment more on why making the datasource READ_COMMITTED as well was necessary?
I don't know - puzzled as well. But it absolutely was the case that my multithreaded load tests kept failing until I made that change, and haven't failed since.
Most DBs substitute SERIALIZABLE with REPEATABLE_READ. So if you make your DS RR you have likely just moved to single threaded access. -
28. Re: is it possible for MVCCInvocationContext to have null mv
dukehoops Mar 17, 2009 5:13 PM (in response to dukehoops)I am using mysql 5.1 with INNODB. the db reports iso as RR
-
29. Re: is it possible for MVCCInvocationContext to have null mv
manik Mar 18, 2009 7:50 AM (in response to dukehoops)"dukehoops" wrote:
1. the pattern I described in the post linked below is not an edge case:
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4217616#4217616
I agree that the pattern is not an edge case for Hib/JBC3 integration, but it is an edge case for JBC3 in general. Rest assured, just because I refer to it as an edge case doesn't mean it isn't important and won't be fixed. :-)
In fact, if you want to build JBC3 trunk and give it a whirl, I'd appreciate the feedback.