1 Reply Latest reply on Apr 4, 2013 5:37 AM by bevans88

    JGroups dropped messages when splitting/adding to cluster.

    bevans88

      Hi,

       

      We have an application that distributes data around an Infinispan cluster. It needs to let the user configure several clusters, connect and disconnect them on demand, and 'split' an existing cluster into smaller clusters that have no knowledge of each other. While testing this we ran into a problem, which can be reproduced with the following steps:

       

      • A cluster is created which consists of 2 workstations (WS1 & WS2), with a node on each. This will be referred to as Cluster 1.
      • This cluster forms as expected, with no log messages out of the ordinary.
      • A cluster is created which consists of 4 workstations (WS3, WS4, WS5 & WS6), with a node on each. This will be referred to as Cluster 2.
      • This cluster forms as expected, with no log messages out of the ordinary.

       

      • At this point Cluster 1 and Cluster 2 have no knowledge of each other.

       

      • WS1 and WS2 add WS3 to the initial_hosts in the tcp.xml and then restart the application. This causes JGroups and Infinispan to also restart.
      • This causes no problems.
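
      For illustration, the change on WS1 and WS2 looks roughly like the fragment below. This is only a sketch of the TCPPING element, not a copy of our file: the port 7800 and the port_range value are assumed placeholders, and the host names stand in for the real addresses.

```xml
<!-- Illustrative TCPPING fragment for WS1 and WS2 after adding WS3.
     Port 7800 and port_range="1" are assumed placeholders, not our real values. -->
<TCPPING initial_hosts="WS1[7800],WS2[7800],WS3[7800]"
         port_range="1"/>
```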

       

      • WS3, WS4, WS5 & WS6 add WS1 to the initial_hosts in the tcp.xml and then restart the application. This causes JGroups and Infinispan to also restart.
      • This causes no problems.

       

      • WS3 & WS4 now remove WS5 & WS6 from their initial_hosts in the tcp.xml.
      • WS5 & WS6 now remove WS3 & WS4 from their initial_hosts in the tcp.xml and then restart the application. This causes JGroups and Infinispan to also restart.
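
      To summarise the steps above, the initial_hosts lists should end up roughly as sketched below. This assumes each cluster originally listed only its own members, and again uses 7800 as a placeholder port; the three elements belong to three different sets of workstations and are shown together only for comparison.

```xml
<!-- Sketch of the resulting initial_hosts per group of workstations.
     Assumes each cluster started out listing only its own members;
     port 7800 is a placeholder. These are not one combined config. -->

<!-- On WS1 and WS2 (unchanged since WS3 was added) -->
<TCPPING initial_hosts="WS1[7800],WS2[7800],WS3[7800]"/>

<!-- On WS3 and WS4 (WS5 and WS6 removed) -->
<TCPPING initial_hosts="WS3[7800],WS4[7800],WS1[7800]"/>

<!-- On WS5 and WS6 (WS3 and WS4 removed) -->
<TCPPING initial_hosts="WS5[7800],WS6[7800],WS1[7800]"/>
```

      Under these assumptions, WS5 & WS6 no longer list WS3 & WS4 directly, but every list still contains WS1, and WS1 in turn still lists WS3.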

       

      This causes our application to fail to start on all nodes due to “no physical address for <uuid>, dropping message” errors.

       

      I have attached our tcp.xml file (this is missing the <config> element due to it being typed across from a network with no internet access, however it does contain all of the configuration elements that we are using).

       

      Any help would be much appreciated.

       

      Thanks,

       

      Brent