9 Replies Latest reply on Jun 1, 2006 2:54 AM by ben.wang

Option to disable DataGravitation

brian.stansberry May 23, 2006 12:12 AM

The DataGravitatorInterceptor does a clustered get any time a call comes through the interceptor chain referring to a node that doesn't exist in the local cache. There will probably be situations where the caller doesn't want that behavior and having an Option to disable it would be good. A couple that come to mind:

1) As part of integrating state transfer I'm doing a put("_BUDDY_BACKUP_/192.168.1.2", null, null) in order to create the root node of the buddy backup tree. This is causing 2 clustered gets. I just need to create a node; I know there is no data, so I don't want to do a clustered get.

2) When PojoCache breaks an object down into nodes, it certainly will not want to make a call across the cluster for each put.

3) Situations like HttpSession repl, where the application doing the put knows it owns the data.

As I write this, I'm thinking maybe a global config option for the cache would be useful as well; i.e. saying "don't do a clustered get() as part of a put()".

1. Re: Option to disable DataGravitation

manik May 23, 2006 11:30 AM (in response to brian.stansberry)

There is a config option already to disable data gravitation altogether although I suspect this isn't what you're after.

Let me have a think and I'll post again in a bit ...
2. Re: Option to disable DataGravitation

manik May 26, 2006 6:20 AM (in response to brian.stansberry)

Ok, fair enough, you've convinced me.

http://jira.jboss.com/jira/browse/JBCACHE-637
3. Re: Option to disable DataGravitation

manik May 26, 2006 12:11 PM (in response to brian.stansberry)

http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossCacheOptionsAPI
4. Re: Option to disable DataGravitation

brian.stansberry May 31, 2006 1:38 AM (in response to brian.stansberry)

With the suppressDataGravitation Option, the behavior is now to do a _gravitateData call any time there is a cache miss, unless specifically overruled via the Option.

Ben and I were having a chat tonight and we realized that in many use cases of BR, the desired behavior is really the opposite -- we never want to gravitate data unless an Option (say "enableDataGravitation") is set to true.

For example, with HttpSession repl, there is a single call early in the request cycle where we check the cache to see if the session is there. If not, we want to gravitate it. Thereafter, while handling that request there are dozens of places where we could do get/put/remove calls on the session's subtree; in none of them do we want to gravitate data.

It's much easier to have gravitation disabled by default and then enable it for the single call than it is to disable it in all the other calls.

Similar thing generally applies with the implementation of putObject() calls -- a single get() call at the beginning where gravitation would be desirable, followed by numerous get/put/remove calls where it is not.

How about we add "enableDataGravitation"? Option.suppressDataGravitation turns off gravitation if buddyManager.isEnabledDataGravitation() == true. Option.enableDataGravitation does the opposite.
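To make the proposed semantics concrete, here is a minimal sketch of how the two per-call flags could combine with the cache-wide setting. It is only one reading of the proposal above: the class `GravitationDecision`, the type `CallOption`, and the method `shouldGravitate` are hypothetical names for illustration, not part of the JBoss Cache API, and the `gravitateByDefault` parameter stands in for whatever buddyManager.isEnabledDataGravitation() returns.

```java
// Hypothetical sketch only: none of these types exist in JBoss Cache.
class GravitationDecision {

    /** Per-invocation overrides, modelled on the two Option flags discussed above. */
    static final class CallOption {
        final boolean suppressDataGravitation;
        final boolean enableDataGravitation;

        CallOption(boolean suppressDataGravitation, boolean enableDataGravitation) {
            this.suppressDataGravitation = suppressDataGravitation;
            this.enableDataGravitation = enableDataGravitation;
        }
    }

    /**
     * Decides whether a cache miss should trigger a clustered get.
     *
     * @param gravitateByDefault assumed cache-wide default
     * @param option             per-call override, or null if the caller set none
     */
    static boolean shouldGravitate(boolean gravitateByDefault, CallOption option) {
        if (option == null) {
            return gravitateByDefault;
        }
        if (gravitateByDefault) {
            // Default is on: only the suppress flag can switch it off for this call.
            return !option.suppressDataGravitation;
        }
        // Default is off: only the enable flag can switch it on for this call.
        return option.enableDataGravitation;
    }

    public static void main(String[] args) {
        CallOption suppress = new CallOption(true, false);
        CallOption enable = new CallOption(false, true);
        System.out.println("default on, suppress set: " + shouldGravitate(true, suppress));
        System.out.println("default off, enable set:  " + shouldGravitate(false, enable));
        System.out.println("default off, no option:   " + shouldGravitate(false, null));
    }
}
```

In this reading each flag is ignored when it matches the default, so existing callers that set suppressDataGravitation keep working whichever default is configured.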

I've coded this up to see if it resolves unit test failures I'm seeing with FIELD granularity repl.
5. Re: Option to disable DataGravitation

manik May 31, 2006 3:55 AM (in response to brian.stansberry)

How does this help transparent failover?

Server A dies. The load balancer redirects the request to Server B. How would Server B know to add the 'enableDataGravitation' option for this call?
6. Re: Option to disable DataGravitation

ben.wang May 31, 2006 4:22 AM (in response to brian.stansberry)

Ideally, the BR aspect should not propagate to user code (like http session or ejb3). However, we know that for efficiency reasons, that won't work. Chatty put/get calls on new sub-trees will emit way too many remote calls. So this forces the cache user to be aware of the performance consequences.

Typically in PojoCache, I do something like this:

1. Lock the whole sub-tree by doing a cache.put(parentFqn, key, value)

2. Do operations for putObject, removeObject, and getObject etc. that have many finer grained put/get calls.

As a result, I only need to gravitate it once (if necessary) at step #1 at the parent fqn level.

If I need to do it the other way around, I will have to litter my code with the suppressDataGravitation Option to turn gravitation off.
7. Re: Option to disable DataGravitation

manik May 31, 2006 8:45 AM (in response to brian.stansberry)

I agree that the chattiness will be a bad thing, but if we need to explicitly state when we want to gravitate data, I think this could affect the transparency of the failover. A node will have to know that a request is coming in which was originally for a failed node to be able to handle it accurately.
8. Re: Option to disable DataGravitation

brian.stansberry May 31, 2006 12:34 PM (in response to brian.stansberry)

No, not if the application follows the pattern I described.

1) When a request comes in, do a get() with gravitation enabled for the root of the relevant subtree (e.g. the session). No need to know if it's failover or not; if it is, you'll gravitate the data.
2) Throughout the *rest of the processing of that request*, freely make changes in the subtree with gravitation off, safe in the knowledge that the initial get() made your server the owner of that subtree.

BR will only work properly (or at least efficiently) if a single server owns the data. That means we are talking about things like sessions. I believe the pattern above will be a very common one for the session use case -- an initial get of the session from the cache followed by numerous changes to the session.
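The cost difference behind this pattern can be shown with a toy model. The sketch below is not JBoss Cache code: `ToyBuddyCache`, its `get`/`put` signatures, the `gravitate` boolean, and the Fqn strings are all made up for illustration, and unlike real gravitation it moves a single node rather than a whole subtree. It simply counts how many simulated clustered gets one request causes when only the first get() gravitates, compared with gravitating on every miss.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a local map plus a "backup" map standing in for other servers.
class ToyBuddyCache {
    private final Map<String, Map<String, String>> local = new HashMap<>();
    private final Map<String, Map<String, String>> backup;
    int clusteredGets = 0; // each increment stands in for a network round trip

    ToyBuddyCache(Map<String, Map<String, String>> backup) {
        this.backup = backup;
    }

    // On a local miss, optionally ask the "cluster" for the node and take ownership.
    private void gravitateIfMissing(String fqn, boolean gravitate) {
        if (!gravitate || local.containsKey(fqn)) {
            return;
        }
        clusteredGets++;
        Map<String, String> remote = backup.remove(fqn);
        if (remote != null) {
            local.put(fqn, remote);
        }
    }

    String get(String fqn, String key, boolean gravitate) {
        gravitateIfMissing(fqn, gravitate);
        Map<String, String> node = local.get(fqn);
        return node == null ? null : node.get(key);
    }

    void put(String fqn, String key, String value, boolean gravitate) {
        gravitateIfMissing(fqn, gravitate);
        local.computeIfAbsent(fqn, f -> new HashMap<>()).put(key, value);
    }

    // One request: a gravitating get on the session root, then three puts on new child nodes.
    static int handleRequest(boolean gravitateOnEveryCall) {
        Map<String, Map<String, String>> backup = new HashMap<>();
        Map<String, String> session = new HashMap<>();
        session.put("user", "alice");
        backup.put("/session/abc", session);

        ToyBuddyCache cache = new ToyBuddyCache(backup);
        cache.get("/session/abc", "user", true); // step 1: always gravitate
        cache.put("/session/abc/attr1", "k", "v", gravitateOnEveryCall);
        cache.put("/session/abc/attr2", "k", "v", gravitateOnEveryCall);
        cache.put("/session/abc/attr3", "k", "v", gravitateOnEveryCall);
        return cache.clusteredGets;
    }

    public static void main(String[] args) {
        System.out.println("gravitate on first get only: " + handleRequest(false)); // prints 1
        System.out.println("gravitate on every miss:     " + handleRequest(true));  // prints 4
    }
}
```

Every put that creates a new node is a local miss, so with gravitation always on the number of clustered gets grows with the number of new nodes, while the single-gravitating-get pattern stays at one per request.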
9. Re: Option to disable DataGravitation

ben.wang Jun 1, 2006 2:54 AM (in response to brian.stansberry)

The other thing to keep in mind is that data gravitation should happen rarely (i.e., only during a failover). Requiring every cache.put of a new node to broadcast the request may be too expensive.