Looks like client 94 also lost a session when node3 was stopped.
The perf03 log is showing missing responses to data gravitation requests, due to timeouts. Dominik, can you change the buddy replication config in the jboss-web-cluster.sar/META-INF/jboss-service.xml and see what happens:
We want 20 secs instead of 2 secs.
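For reference, a rough sketch of what that change might look like. The post does not name the parameter, so this assumes it is the buddyCommunicationTimeout element (in milliseconds) inside the BuddyReplicationConfig attribute of the cache MBean; check the element name against your own jboss-service.xml before editing.

```xml
<!-- Sketch only: assumes the timeout meant above is buddyCommunicationTimeout. -->
<attribute name="BuddyReplicationConfig">
   <config>
      <!-- other buddy replication settings left unchanged -->

      <!-- was 2000 (2 secs); raised to 20 secs for this test -->
      <buddyCommunicationTimeout>20000</buddyCommunicationTimeout>
   </config>
</attribute>
```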
Manik, since this param drives the wait time for data gravitation, does it make sense to you that in general the value should be the same as SyncReplTimeout?
I'd thought it was only used for the group formation messages; even there 2 secs is probably too low.
Discussion of test failure reported in http://jira.jboss.com/jira/browse/JBAS-4766
At this point you got failures. The kill/restart of node0 went OK, but then when node1 was killed, 2 of 100 sessions were lost.
Does that summarize it correctly?
Yes, that's correct. However, that's not the rule. During other test runs, sessions were lost even when the first node failed.
Yep; I continued digging and found the issue discussed in my last post. That could crop up at any point.