23 Replies Latest reply on Aug 4, 2008 4:50 PM by manik

    Custom data versions


      This came up while implementing handling custom data versions with MVCC in 3.0.0.

      We currently allow the setting of custom data versions with optimistic locking via the Option API:

      cache.getInvocationContext().getOptionOverrides().setDataVersion(customVersion);
      cache.put("/a/b/c", "k", "v");

      Now this is fundamentally broken in that the custom data version is only a single version and this is assumed to be applied on /a/b/c when the transaction commits. This breaks if, for example, /a/b does not exist in the cache and needs to be created as well - /a/b will be created with a default data version which may not be the intention.

      Worse, assume /a/b does exist and also has a custom data version. If lockParentForInsertRemove is set, then we would expect to increment the parent's version as well. But again, only a single data version is passed in for the leaf node. In this case, what happens is that the parent's version is *not* incremented, breaking lockParentForInsertRemove semantics.

      I think the problem really is a conceptual one: you can't expect to pass in a single custom data version and have it make any sense on an API call that modifies more than 1 node. I know this API is important for certain use cases (Hibernate's 2nd level cache), so perhaps what we need is a richer API where we do something like:

      Map<Fqn, DataVersion> customVersions = new HashMap<Fqn, DataVersion>();
      customVersions.put(Fqn.fromString("/a/b"), customVersion1);
      customVersions.put(Fqn.fromString("/a/b/c"), customVersion2);
      // the map would then be passed in with the call, e.g. via the Option API
      cache.put("/a/b/c", "k", "v");

      And with this, we can tell precisely which version applies to which node. Nodes not mentioned in the customVersions map will default to DefaultDataVersions upon creation; if a node already carries a custom version that is not passed in, attempting to increment it will throw an exception.

      What do you think? Yes, I know, a more cumbersome API, but more correct IMO.


        • 1. Re: Custom data versions

          BTW, the reason why custom data versions are important in Hibernate's use case is detailed below.

          * Node in DB has v1
          * Server1 writes v2 to the cache and the db.
          * This results in an invalidation message clearing the state of the node in neighbouring Servers.
          * Server2 does a read from the db before Server1's transaction completes. Gets v1.
          * Server2 attempts to put v1 in the cache, but there is a race condition and this happens after Server1's invalidation message comes in.
          * Server2's put succeeds and Server1's state is invalidated.

          Caches now have v1 while the db has v2!!

          • 2. Re: Custom data versions

            Just to make things clearer, tombstones are maintained after an invalidation so as long as the *actual* versions are used (v1 and v2 in the example above, which would be custom DataVersion impls from the db) then we are ok.
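The tombstone check described here can be sketched with a minimal, self-contained model. The `DataVersion` interface is redefined locally so the example compiles on its own, and `DbDataVersion` is a hypothetical DB-backed implementation (JBC's real `DataVersion` contract may carry more than `newerThan`):

```java
// Local stand-in for JBC's DataVersion so the sketch is dependency-free.
interface DataVersion {
    boolean newerThan(DataVersion other);
}

// Hypothetical version derived from a DB version column.
class DbDataVersion implements DataVersion {
    private final long version;

    DbDataVersion(long version) { this.version = version; }

    public boolean newerThan(DataVersion other) {
        return version > ((DbDataVersion) other).version;
    }
}
```

With actual versions, the race resolves itself: Server1's invalidation leaves a tombstone carrying v2, and Server2's stale put of v1 is rejected because `new DbDataVersion(1).newerThan(new DbDataVersion(2))` is false. With DefaultDataVersions, both servers' counters start at the same point, so no such comparison can tell the writes apart.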

            The problem arises where we use DefaultDataVersions which are simple incremented numbers in each JVM.

            • 3. Re: Custom data versions

              That sounds fine to me. In the Hibernate integration most (I think all) calls involve a long tree of structural nodes with a single "data" node at the bottom, using two different custom data versions: one for the structural nodes, one for the "data" node. It should be simple for me to implement an optimized Map for that use case; i.e. this approach shouldn't cost much and will eliminate a lot of hassles.

              Do you want a Map, or perhaps an interface:

              public interface DataVersionProvider {
               DataVersion getDataVersion(Fqn fqn);
              }

              If that's the only call JBC is going to make, it would be much simpler to write an optimized version of that vs. Map, where to be correct I'd have to implement a bunch of operations that I don't think would be invoked.

              We could provide a simple helper impl that takes a Map in the constructor and delegates to Map.get(Fqn).
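The Map-backed helper described above could look something like the sketch below. The interfaces are redefined locally, and `String` node paths stand in for JBC's `Fqn`, purely so the example compiles without the JBC jar; the class name `MapDataVersionProvider` is made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-ins for the JBC types so the sketch is self-contained.
interface DataVersion { }

interface DataVersionProvider {
    // The single call JBC would make: which version applies to this node?
    DataVersion getDataVersion(String fqn);
}

// Helper impl that wraps a Map and simply delegates to Map.get, as proposed.
class MapDataVersionProvider implements DataVersionProvider {
    private final Map<String, DataVersion> versions;

    MapDataVersionProvider(Map<String, DataVersion> versions) {
        this.versions = new HashMap<String, DataVersion>(versions);
    }

    public DataVersion getDataVersion(String fqn) {
        // null means "no custom version": the caller falls back to a
        // DefaultDataVersion, per the semantics discussed above.
        return versions.get(fqn);
    }
}
```

An optimized Hibernate-side implementation could instead hold just two fields, one version for structural nodes and one for the leaf "data" node, and pick between them by path depth.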

              • 4. Re: Custom data versions

                For what it is worth, my initial thought was the interface as well...

                • 5. Re: Custom data versions

                  BTW, in the Hibernate integration, the "custom" data version for the structural nodes is one that never returns a version conflict; i.e. the structural nodes hold no data and changing the child map is not treated as a meaningful change, so there can be no meaningful version conflict.

                  This seems like a pretty general use case, so I opened a JIRA to provide such a DataVersion impl in JBC itself: http://jira.jboss.com/jira/browse/JBCACHE-1389
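A never-conflicting structural version along the lines of JBCACHE-1389 could be sketched as follows. The `DataVersion` interface is redefined locally and the class name is hypothetical; JBC's real contract may differ:

```java
// Local stand-in for JBC's DataVersion.
interface DataVersion {
    boolean newerThan(DataVersion other);
}

// A version for structural nodes: it never reports itself as newer,
// so no write can ever be rejected on its account.
final class NonConflictingDataVersion implements DataVersion {
    static final NonConflictingDataVersion INSTANCE = new NonConflictingDataVersion();

    private NonConflictingDataVersion() { }

    public boolean newerThan(DataVersion other) {
        return false; // never wins a comparison -> no version conflict possible
    }
}
```

A singleton makes sense here since the version carries no state; every structural node can share the same instance.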

                  • 6. Re: Custom data versions

                    Now with MVCC I really don't have a need for data versions internally - the only reason I still use DataVersions is to support custom data versions from Hibernate.

                    I was thinking, maybe there is a way we can achieve the same consistency without the use of data versions?


                    1. Server 1 wants to write V2
                    2. Server 2 wants to read, checks the cache, nothing there, and then decides to go to the DB
                    3. Server 2 reads V1 from DB
                    4. Server 1 writes V2 in DB
                    5. Server 1 puts V2 in cache, invalidates remote caches
                    6. Server 2 puts V1 in cache, invalidates remote caches

                    And this is the problem, right?

                    What if an additional step is introduced:

                    2.5. Server 2 knows it needs to go to the DB so it starts an isolated tx, does another cache.get() with a forceWriteLock option.

                    So now Server1's invalidate message will wait until server2 has read v1 and put it in the cache, then the invalidate message will remove it which is correct.

                    Since MVCC allows for non-blocking reads, Server2 will have to check the state of the cache again after acquiring the write lock, though, to see if it still needs to go to the DB in case another reader has already read from the DB and updated the cache. (Reminiscent of double-checked locking, though thankfully the same ills don't apply here.)
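The lock-then-recheck pattern can be modeled in a self-contained way: the cache is a plain map, and the forceWriteLock get is approximated with a coarse `synchronized` block (the real JBC lock would be per-Fqn, and `loadFromDb` is a hypothetical DB read):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Analogue of step 2.5: non-blocking read first, then an exclusive
// lock and a re-check before going to the DB.
class ReadThroughCache {
    private final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();

    String get(String key) {
        String v = cache.get(key);     // non-blocking read (MVCC-style)
        if (v != null) return v;
        synchronized (this) {          // stand-in for get() with forceWriteLock
            v = cache.get(key);        // re-check: another reader may have filled it
            if (v == null) {
                v = loadFromDb(key);   // only one thread goes to the DB
                cache.put(key, v);
            }
            return v;
        }
    }

    String loadFromDb(String key) { return "v1"; } // hypothetical DB read
}
```

The second `cache.get` inside the lock is the essential step: without it, two readers that both missed on the fast path would both hit the DB and race to put stale data.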

                    What do you guys think?

                    • 7. Re: Custom data versions

                      Before I get thinking too hard about where the isolated tx gets committed and what transaction is in effect in step 3.... an implementation problem is the code in step 2.5 that "knows it needs to go to the db" is in the core hibernate module, not the JBC integration layer. So some sort of addition would need to be made to the general 2nd level cache API (and implemented by all impls) to support the step 2.5 call. I'll let Steve comment on how he sees that.

                      That said, I understand where you're coming from on wanting to find a way to get rid of this. Will think about it.

                      • 8. Re: Custom data versions

                        Hmm, the addition of step 2.5 would trigger a deadlock, I think, since both servers will try to acquire a lock on the same Fqn. Although the end result would be correct. Implementing the deadlock detection algorithm would speed up failure as well.

                        • 9. Re: Custom data versions

                          That deadlock could happen even now. Eager deadlock detection would be good to get in though, but that is a separate issue.

                          • 10. Re: Custom data versions

                            (Apologies in advance; I'm relying on recollection a lot in this post as the JBC refactor has erased my old knowledge of how to quickly verify how things work. Going to take some time to rebuild.)

                            Let me modify your use case a bit to reflect what actually happens:

                            6. Server 2 uses putForExternalRead to put V1 in cache, which with invalidation causes no cluster-wide traffic

                            In this case, from a simplistic point of view one of two things will happen on Server 2 in step 6:

                            a) the PFER arrives first, adds a node, and then the invalidation message invalidates it. This is fine.
                            b) the invalidation message from Server 1 arrives first, so the PFER call sees an existing (tombstone) node and immediately returns. IIRC, the tombstone has a limited life so some later PFER can succeed; i.e. there is no need for the PFER to analyze the tombstone's data version to see if it's allowed to resurrect it. So this seems fine.

                            I'm saying simplistic here because I'm assuming all the locking impls enforce the PFER semantic of never overwriting an existing node.

                            If we convert your case to use replication instead of invalidation, all remains the same except step 5 is a replication message and

                            6. Server 2 uses putForExternalRead to put V1 in cache, which generates a cluster-wide PFER:

                            In this case:

                            a) on Server 2 the PFER arrives first, adds a node, and then the replicated update message overwrites it. This is fine. With respect to the replicated PFER message on other nodes, either:
                            i) the PFER arrives first, and then the replicated update overwrites it
                            ii) the replicated update arrives first and the replicated PFER sees the existing node and aborts.

                            b) on Server 2 the replicated update message arrives first so the PFER sees an existing node and aborts.

                            That all seems fine as well.
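The "abort if node exists" semantic that makes all of these orderings work out can be modeled as a putIfAbsent, with a normal replicated update as an overwriting put. This is only an analogy for the real `putForExternalRead`, which also has its own transaction and failure-suppression behavior:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal model of PFER's "never overwrite an existing node" rule.
class PferModel {
    final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();

    // PFER: abort (return false) if the node already exists.
    boolean putForExternalRead(String fqn, String value) {
        return cache.putIfAbsent(fqn, value) == null;
    }

    // A replicated update simply overwrites.
    void update(String fqn, String value) {
        cache.put(fqn, value);
    }
}
```

Both orderings from the cases above come out right: if the PFER of V1 lands first, the update overwrites it with V2; if the update lands first, the PFER sees the existing node and aborts, so V2 survives either way.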

                            I think the "abort if node exists" design of PFER serves us well here. Probably we need to think more about conflicts between updates rather than between PFER. Although in that case I would expect one or the other tx to fail before the JBC synchronization's beforeCompletion call.

                            Bottom line, I'm slooooowly moving toward thinking this DataVersion stuff is not needed for the Hibernate case. And fully expect to be proven wrong. :-) If having DataVersion isn't seriously perturbing the code base, I recommend leaving it in with a JIRA to remove it before 3.0.0.CR1/late beta. That serves notice to the community that it might go while giving us time to think long and hard about it.

                            • 11. Re: Custom data versions

                              Yup that pretty much sums it up. Concurrent updates will be dealt with exactly as they are dealt with right now.

                              Regarding removing it, that is easier than you think: DataVersions were a strictly Optimistic-Locking feature. With MVCC, we never made any promises that DataVersions would apply. Just as we ignore custom data versions with Pessimistic Locking, I expect to do the same with MVCC.

                              The problem with leaving versioning logic in MVCC is that it is pretty central (using VersionedNode instead of UnversionedNode, a different write-skew check, different ways of handling custom data versions, all of which may be unnecessary). I'd rather we leave this out (if the logic above makes sense) and move on from a fragile and broken API. :-)

                              • 12. Re: Custom data versions


                                OK, all other 2nd level caching gurus, time to speak up or forever hold your peace! ;-)

                                • 13. Re: Custom data versions

                                  Bump. Any other comments or thoughts? :-)

                                  • 14. Re: Custom data versions

                                    Not from me, other than a vague thought that it would be good to use the testsuite of the Hibernate/JBC2 integration to test this as it develops. I'm not sure how much of a revamp of the integration that would take though; e.g. am I making any assertions about DataVersion values in the testsuite (hope not; seems unnecessary since I can store a "version" value inside the node's data itself.) For sure:

                                    1) the OPTIMISTIC configs would have to be changed to MVCC (better yet add new MVCC configs and tests that use them.)
                                    2) the calls that pass a DataVersion would have to be changed to not do that (or better yet the integration detects MVCC and uses the existing adaptors designed for PESSIMISTIC.)

                                    I suppose a cache-jbosscache3 module could be added to Hibernate core at some point. We'd just have to be sure it doesn't get included in the main build of Hibernate core until it's ready and the main build is ready for it (i.e. don't add a "module" element for it to the top-level Hibernate core pom.)
