1. Re: Eager state push on failure with Buddy Replication
brian.stansberry Mar 8, 2008 1:48 PM (in response to manik)
The presence of a region can act as a "hint" about related data. That is, a region root node represents the lowest level of unrelated data; each child beneath it is a unit of related data. For example, we have a region
/JSESSION/localhost/webapp1
In buddy backup we have:
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session1/attrA
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session1/attrB
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session2/attrA
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session2/attrB
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session3/attrA
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/session3/attrB
The background thread recognizes the existence of the
/JSESSION/localhost/webapp1 region and therefore starts iterating over the children of
/_BUDDY_BACKUP/CacheA:dead_0//JSESSION/localhost/webapp1/ migrating one child at a time.
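To make the idea concrete, here is a rough sketch of that per-child migration. It deliberately does not use the JBoss Cache API: `TreeNode`, `resolve` and `migrateRegion` are hypothetical stand-ins for illustration, and a real implementation would also have to deal with locking, transactions and replication of each move.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BackupMigrationSketch {

    /** Minimal stand-in for a cache node; NOT the JBoss Cache Node API. */
    static final class TreeNode {
        final Map<String, TreeNode> children = new LinkedHashMap<>();

        /** Walks a path such as "/a/b", creating missing nodes; empty segments (e.g. from "//") are skipped. */
        TreeNode resolve(String path) {
            TreeNode current = this;
            for (String segment : path.split("/")) {
                if (!segment.isEmpty()) {
                    current = current.children.computeIfAbsent(segment, k -> new TreeNode());
                }
            }
            return current;
        }
    }

    /**
     * Moves the children of the backed-up region root into the live tree, one child at a time.
     * Each child (a session in the example) is moved as a whole subtree, so related data
     * is never split between the backup tree and the live tree.
     */
    static List<String> migrateRegion(TreeNode root, String backupRoot, String regionFqn) {
        TreeNode backupRegion = root.resolve(backupRoot + "/" + regionFqn);
        TreeNode liveRegion = root.resolve(regionFqn);
        List<String> migrated = new ArrayList<>();
        // Iterate over a copy of the child names because children are removed during the loop.
        for (String childName : new ArrayList<>(backupRegion.children.keySet())) {
            TreeNode subtree = backupRegion.children.remove(childName);
            liveRegion.children.put(childName, subtree);
            migrated.add(childName);
        }
        return migrated;
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode();
        String backupRoot = "/_BUDDY_BACKUP/CacheA:dead_0";
        String region = "/JSESSION/localhost/webapp1";
        for (int i = 1; i <= 3; i++) {
            root.resolve(backupRoot + "/" + region + "/session" + i + "/attrA");
            root.resolve(backupRoot + "/" + region + "/session" + i + "/attrB");
        }
        System.out.println(migrateRegion(root, backupRoot, region));
    }
}
```

The unit of work is the child of the region root, not the individual attribute node, which is exactly the "hint" the region provides.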
I recognize this example is very much tailored to my particular use case, but in every JBC app I've written, a region has had that kind of meaning.
A "structural" node marker can be used instead of a region, and more cleanly indicates the meaning, since the region concept is so overloaded.
Re: usefulness, I think it's pretty necessary. With buddy groups / data partitions having 2 members by default, one member leaving means only 1 copy of the data remains. Admins have to be very careful 1) to know which node has that backup data and 2) not to remove that node from service until they are sure the data isn't needed any longer, which typically means waiting half an hour or more. At half an hour or more per node, a simple rolling upgrade of a 4-node cluster takes over 2 hours, which is probably longer than a lot of service windows. Larger clusters take longer.
2. Re: Eager state push on failure with Buddy Replication
manik Mar 12, 2008 8:51 AM (in response to manik)
Are regions replicated on the buddy backup as well? The region marker, I mean? Just wondering if the concept in itself still holds true.
3. Re: Eager state push on failure with Buddy Replication
brian.stansberry Mar 12, 2008 10:55 AM (in response to manik)
I don't understand what you mean by "region marker". Are you referring to
A "structural" node marker can be used instead of a region, and more cleanly indicates the meaning, since the region concept is so overloaded.
?
If yes, AIUI the "structural" node marker concept doesn't really exist yet. There's the "resident" flag, which IIRC is not replicated.
4. Re: Eager state push on failure with Buddy Replication
manik Mar 12, 2008 12:08 PM (in response to manik)
Well, either case really. In the case of the "resident" flag, this is not replicated.
Even if we use a traditional region (created using the RegionManager), I don't believe this is recreated on the buddy backup instance, since regions are explicitly created on each instance. I could be wrong as I haven't checked the code yet, but I'm pretty sure regions aren't reflected in a buddy backup subtree.
5. Re: Eager state push on failure with Buddy Replication
brian.stansberry Mar 12, 2008 12:22 PM (in response to manik)
OK, so you're talking about a case where Node B is Node A's buddy, but whatever application created region /JSESSION/localhost/webapp1 on A hasn't been deployed on B.
Yeah, that's a problem.
If the "resident" flag were replicated, that would be a better solution. That would be a good thing anyway, although it adds cost to replication/invalidation messages.
6. Re: Eager state push on failure with Buddy Replication
manik Mar 12, 2008 12:26 PM (in response to manik)
The problem with the resident flag is that it could be used for anything, e.g. arbitrarily marking a node so that it doesn't get evicted.
7. Re: Eager state push on failure with Buddy Replication
brian.stansberry Mar 12, 2008 1:02 PM (in response to manik)
Yes, "structural node" != "resident". I meant "structural node flag" and should be disciplined about using the exact terminology. :-) You're right, substituting one concept for the other isn't correct.