The only bottleneck I can think of is on the transport layer (JGroups) when replicating to the same buddy node. Once on the buddy node, since you are talking about disjoint subtrees, there won't be any contention. Even this replication contention can be minimised if you are using async replication.
At the moment, buddy replication (BR) backs up the entire state of one node onto another node. In future (see Partitioning) we will allow different regions to be backed up on different nodes.
The only bottleneck I can think of is on the transport layer (JGroups) when replicating to the same buddy node.
That's correct, and we are indeed seeing contention in the JGroups layer. Since we're load testing at a couple of thousand accesses/replications per second, it becomes quite visible (and understandably so).
In future (see <a href="http://wiki.jboss.org/wiki/JBossCachePartitioning">Partitioning</a>) we will allow different regions to be backed up on different nodes.
Ah, I didn't know. Thanks, I'll have a look at it over the weekend.
What version of JGroups are you using?
JGroups 2.6.2 with TCP transport (we had NAK/ACK issues with UDP).
I'm guessing, then, that you are using JGroups' concurrent stack. That won't help you very much in the scenario you described, though, since the concurrent stack only parallelizes messages from different senders, not messages from the same sender.
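To make the "different senders only" point concrete, here is a rough sketch of per-sender ordered delivery. It is not JGroups code: the class `PerSenderDispatcher` and its methods are hypothetical names made up for illustration, and the one-single-threaded-executor-per-sender design is an assumption chosen only to show the principle. Messages from different senders run on different threads, but all messages from one sender are queued behind a single thread, which is why a single busy sender gains nothing.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; this is NOT how JGroups is implemented. */
public class PerSenderDispatcher {
    // One single-threaded lane per sender: keeps per-sender ordering,
    // but a single sender can never use more than one thread.
    private final Map<String, ExecutorService> lanes = new ConcurrentHashMap<>();

    /** Queue a message handler on the lane that belongs to its sender. */
    public void deliver(String sender, Runnable handler) {
        lanes.computeIfAbsent(sender, s -> Executors.newSingleThreadExecutor())
             .execute(handler);
    }

    /** Stop accepting work and wait for all queued handlers to finish. */
    public void shutdownAndWait() {
        for (ExecutorService lane : lanes.values()) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes.values()) {
                lane.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** Sends five numbered messages from each of two senders, "A" and "B". */
    public static Map<String, List<Integer>> runDemo() {
        PerSenderDispatcher dispatcher = new PerSenderDispatcher();
        Map<String, List<Integer>> seen = new ConcurrentHashMap<>();
        for (int i = 0; i < 5; i++) {
            final int seq = i;
            for (String sender : new String[] {"A", "B"}) {
                dispatcher.deliver(sender, () ->
                    seen.computeIfAbsent(sender, s -> new CopyOnWriteArrayList<>())
                        .add(seq));
            }
        }
        dispatcher.shutdownAndWait();
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> seen = runDemo();
        System.out.println("A: " + seen.get("A")); // A: [0, 1, 2, 3, 4]
        System.out.println("B: " + seen.get("B")); // B: [0, 1, 2, 3, 4]
    }
}
```

In this sketch A and B are processed in parallel with each other, while each sender's own messages stay strictly in order on one thread. Replicating everything to the same buddy from one node is the single-sender case.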
I've actually started a discussion with the JGroups devs on how we can use the concurrent stack to parallelize work from the same sender.
That's correct. Also, the lock that prompted this post was on the sending side (we're using REPL_ASYNC), where we go down the stack on a single thread.