Platform: Solaris x86 5.10.
We have a production JBoss deployment that runs in three different cluster configurations:
Cluster 1: 2 AS nodes - JMS hosting (only queues and topics are registered here)
Cluster 2: 2 AS nodes - Core Java services (mostly remote method invocations over JMS)
Cluster 3: 3 AS nodes - Enterprise Java services, hosting mostly entity beans, one stateful session bean and a few stateless session beans
Cluster 2 is a purely symmetrical cluster. Cluster 3, on the other hand, is not, because each node keeps some state that cannot be transferred between servers. Every so often one of the Cluster 3 nodes crashes silently, with no apparent cause. Clusters 1 and 2 never have this problem.
There is no stack trace and no other symptom that tells us why this is happening. We only find out when users complain about losing connectivity. We sometimes blame garbage collection, because the crashes happen around the time the Java concurrent collector's promotion fails and a stop-the-world collection runs.
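To test the GC theory, timestamped GC activity could be logged and lined up against the crash times. Below is a minimal sketch of that idea, not something we run today. It uses the standard `java.lang.management` API; the class name `GcPollSketch`, the helper names and the default sample count and interval are made up for illustration. It only sees the JVM it runs in, so it would have to be started inside the JBoss JVM (for example from a background thread) to be useful.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: periodically prints how many collections ran and how
// much time they took, so the output can be compared with crash timestamps.
class GcPollSketch {

    // Returns {total collection count, total collection time in ms},
    // summed over every collector registered in this JVM.
    static long[] totals() {
        long count = 0;
        long timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            count += Math.max(0, gc.getCollectionCount());
            timeMs += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, timeMs };
    }

    // Formats the difference between two snapshots taken by totals().
    static String describe(long[] before, long[] after) {
        return "collections=" + (after[0] - before[0])
                + " gcTimeMs=" + (after[1] - before[1]);
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults: 3 samples, 1 second apart. Override via arguments.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        long intervalMs = args.length > 1 ? Long.parseLong(args[1]) : 1000L;

        long[] previous = totals();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(intervalMs);
            long[] current = totals();
            // A large gcTimeMs delta in one interval hints at a long pause.
            System.out.println(System.currentTimeMillis() + " " + describe(previous, current));
            previous = current;
        }
    }
}
```

A sudden jump in `gcTimeMs` shortly before a node disappears would support the GC suspicion. If the deltas stay flat, the cause is probably elsewhere.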
Is this a known issue? If not, can anyone suggest ways to debug it? At least one of the three servers goes down every week.