JBoss AS 5.1: How does the cluster merge views and transfer state when a node rejoins?
hhiep May 22, 2012 8:48 AM
Hi everyone,
I'm new to this forum, so first I want to say "Hi" to all. :-)
I have a question about JBoss AS 5.1: how does a JBoss AS 5.1 cluster handle the view merge and state transfer after a node rejoins (the node is disconnected because the network goes down, and then rejoins when the network comes back)?
After searching this forum and Google for days, I have only found the following:
1. In a JBoss 5.1 cluster, JGroups handles cluster partitions when the network goes off and on. JGroups automatically detects the cluster views and merges them after the network is on again.
+ But how does JGroups determine which sub-cluster is the primary partition, or which node is the coordinator?
Is it the partition that has more nodes? Or the oldest node?
I also read that "the coordinator is the member who has been up the longest." (https://community.jboss.org/wiki/JGroupsMERGE2)
But when I test with a cluster of 2 machines, the coordinator after the view merge is not the machine that has been up longer.
+ Is there any configuration in JBoss AS 5.1 for the JGroups rules that select the coordinator?
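For reference, a minimal sketch of the rule given in the JGroups documentation: the coordinator is simply the first member in the current view's member list. So after a merge, the coordinator depends on how the merged member list is ordered, which may not match uptime. This is plain Java, not the JGroups API; the class `ViewSketch`, the method `coordinatorOf`, and the member ordering in `main` are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy model: a view is an ordered list of member names.
// This is NOT the JGroups API; real views hold org.jgroups.Address objects.
class ViewSketch {

    // By JGroups convention the coordinator is the first member of the view.
    static String coordinatorOf(List<String> viewMembers) {
        if (viewMembers.isEmpty()) {
            throw new IllegalArgumentException("a view has at least one member");
        }
        return viewMembers.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical merged view in which node B happens to be listed first.
        List<String> merged = Arrays.asList("192.168.1.110", "192.168.1.101");
        System.out.println("coordinator = " + coordinatorOf(merged));
    }
}
```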
2. After the view merge finishes, state transfer takes place. How does JBoss Cache replicate data between the caches on each node?
Is it true that the caches on nodes that are not in the primary partition are flushed and then copied from the cache in the primary partition?
I also found some discussions about evicting the cache, evicting cluster state, and so on, but how do I configure this in JBoss AS 5.1?
3. When state transfer happens after a view merge, can we detect when it has finished (so that data can be reloaded from the cache)?
I use a CacheListener and check for CacheUnblockedEvent with IsPre() == NO (this is notified after state transfer has finished), but this event may be notified more than once, and I cannot determine which node must be reloaded (only the node that is not up to date needs to reload).
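To make the problem in point 3 concrete, here is a minimal sketch of one possible approach: remember at merge time whether this node was outside the partition treated as primary, then act only on the first post-unblock event. It is plain Java with member names in place of JGroups addresses, and `ReloadTracker`, `onMerge`, and `onUnblocked` are hypothetical names, not JBoss Cache or JGroups API. It assumes that the first subgroup of the merge view is taken as the primary partition, and that the merge view is seen before the unblock event; neither is confirmed for JBoss AS 5.1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model in plain Java; member names stand in for JGroups addresses.
class ReloadTracker {
    private final String self;
    private final AtomicBoolean reloadPending = new AtomicBoolean(false);

    ReloadTracker(String self) {
        this.self = self;
    }

    // Call when a merge view arrives, passing the member lists of the sub-views.
    // ASSUMPTION: the first subgroup is treated as the primary partition.
    void onMerge(List<List<String>> subgroups) {
        List<String> primary = subgroups.get(0);
        if (!primary.contains(self)) {
            reloadPending.set(true); // this node was outside the primary partition
        }
    }

    // Call from the cache listener for each unblocked event.
    // Returns true at most once per merge, and only on a node marked stale.
    boolean onUnblocked(boolean pre) {
        if (pre) {
            return false; // ignore the "pre" notification
        }
        return reloadPending.compareAndSet(true, false);
    }

    public static void main(String[] args) {
        ReloadTracker nodeA = new ReloadTracker("A");
        nodeA.onMerge(Arrays.asList(Arrays.asList("B"), Arrays.asList("A")));
        System.out.println("first unblock, reload?  " + nodeA.onUnblocked(false));
        System.out.println("second unblock, reload? " + nodeA.onUnblocked(false));
    }
}
```

With this shape, repeated unblock notifications are harmless, because the pending flag is cleared by the first one.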
About my test:
- JBoss AS 5.1, Windows 7, 2 nodes: A (192.168.1.101) and B (192.168.1.110).
- Apache mod_jk on node A as the load balancer
- The cache is created by DefaultCacheFactory with IsolationLevel.READ_COMMITTED and CacheMode.REPL_SYNC
- Steps:
+ Start node A
+ Start node B => a 2-node cluster forms
+ Call the service through Apache on node A => the cache on all nodes is updated
+ Turn off the network on node A
+ Call the service through Apache on node A => the cache on node A is updated
+ Turn on the network on node A => the cluster automatically detects this and merges into a 2-node cluster: A and B.
+ After the new view is received and state transfer has finished, the cache data is the same as the cache on node B (which was not updated and is not the oldest node)
(log files are attached)
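The outcome of the steps above can be modelled with two in-memory maps. This is only a toy illustration of the observed behaviour, not how JBoss Cache works internally: `PartitionSketch`, the key `k`, and the values `v1`/`v2` are invented, and the final step simply assumes that node A's state is replaced by node B's, as seen in the test.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the test: each node's cache is a plain map.
class PartitionSketch {

    // Runs the test steps and returns node A's cache at the end.
    static Map<String, String> run() {
        Map<String, String> cacheA = new HashMap<>();
        Map<String, String> cacheB = new HashMap<>();

        // Cluster of A and B is up: a write on A is replicated to B.
        cacheA.put("k", "v1");
        cacheB.put("k", "v1");

        // Network off on A: the next write reaches only A.
        cacheA.put("k", "v2");

        // Network on, views merge. ASSUMPTION (matches what the test showed):
        // A's state is discarded and replaced by B's state.
        cacheA.clear();
        cacheA.putAll(cacheB);
        return cacheA;
    }

    public static void main(String[] args) {
        // The write made on A during the partition ("v2") is no longer present.
        System.out.println(run());
    }
}
```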
Would you give me some suggestions?
Thank you,
Hiep
- node 110.txt.zip 10.3 KB
- node 101.txt.zip 10.1 KB