1 Reply Latest reply on Sep 1, 2011 4:13 PM by belaban

Nodes not joining cluster

benze Aug 26, 2011 5:20 PM

After several frustrating days, I finally got JDBC_PING seemingly working. Apparently JBoss AS6 / 6.1 ships with JGroups 2.11, which does not properly support JDBC_PING. I upgraded server/all/lib/jgroups.jar to JGroups 2.12.2.Final, and now my DB connection is created and my table is populated with cluster information.

I have two machines, both using the same JDBC connection properties and both with access to the DB, and yet the two machines never connect to each other. There are no firewalls on either machine, so I would expect them to behave properly.

However, on startup, I get the following log messages:

{noformat}
16:58:39,702 INFO  [STDOUT] 
16:58:39,702 INFO  [STDOUT] -------------------------------------------------------------------
16:58:39,703 INFO  [STDOUT] GMS: address=ip-10-113-51-56.ec2.internal:1099, cluster=DefaultPartition-HAPartition, physical address=10.113.51.56:57723
16:58:39,703 INFO  [STDOUT] -------------------------------------------------------------------
16:58:39,814 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] Received new cluster view: [ip-10-113-51-56.ec2.internal:1099|0] [ip-10-113-51-56.ec2.internal:1099]
16:58:40,105 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] Cache local address is ip-10-113-51-56.ec2.internal:1099, physical addresses are [10.113.51.56:57723]
16:58:40,106 INFO  [org.infinispan.factories.GlobalComponentRegistry] Infinispan version: Infinispan 'Ursus' 4.2.0.FINAL
16:58:40,140 INFO  [org.infinispan.jmx.ComponentsJmxRegistration] Could not register object with name: org.infinispan:type=Cache,name="distributed-state(repl_sync)",manager="ha-partition",component=Cache
16:58:40,140 INFO  [org.infinispan.jmx.CacheJmxRegistration] MBeans were successfully registered to the platform mbean server.
16:58:40,140 INFO  [org.infinispan.factories.ComponentRegistry] Infinispan version: Infinispan 'Ursus' 4.2.0.FINAL
16:58:40,148 INFO  [org.jboss.ha.ispn.DefaultCacheContainerFactory] Started "distributed-state" cache from "ha-partition" container
16:58:40,171 INFO  [org.jboss.ha.framework.server.ClusterPartition.DefaultPartition] Number of cluster members: 1
16:58:40,172 INFO  [org.jboss.ha.framework.server.ClusterPartition.DefaultPartition] Fetching initial service state (will wait for 30000 milliseconds for each service):
{noformat}

And no members are added to the cluster.

If I let the server run longer, I see that the node keeps removing itself from the DB and adding itself again. I assume that this is the "ping" portion, although I have to wonder why it deletes and re-inserts its entry instead of simply updating it.

{noformat}

17:16:21,357 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Removed c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition from database.
17:16:21,362 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Registered c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition into database.
17:16:22,504 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Removed c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition from database.
17:16:22,523 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Registered c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition into database.
17:16:23,171 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Removed c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition from database.
17:16:23,235 DEBUG [org.jgroups.protocols.JDBC_PING] org.jgroups.protocols.JDBC_PING Registered c1075ad6-8fa6-7969-9c42-8f620edc0779 for clustername DefaultPartition-HAPartition into database.

{noformat}

Am I missing something? Why am I not getting additional nodes in my cluster? How/where can I start debugging this?

Thanks,

Eric

1. Re: Nodes not joining cluster

belaban Sep 1, 2011 4:13 PM (in response to benze)

Note that Eric found out that a firewall prevented nodes from talking to each other. Once the firewall had rules to let JGroups traffic through, everything worked.
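
For anyone hitting the same symptom (each node registers in the DB but only ever sees itself in the view), one quick way to look for a blocked port is to try a plain TCP connection from one node to the physical address the other node logs at startup. The following is only a minimal sketch, assuming the stack uses a TCP-based transport; it does not apply to a UDP transport. The class name PortCheck and the method isReachable are made up for illustration, and the default host and port are simply the physical address from the log above, which may change on every restart unless the bind port is fixed in the JGroups configuration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of JGroups or JBoss.
final class PortCheck {

    // Returns true if a TCP connection to host:port can be opened within
    // timeoutMillis. A refused connection, a timeout, or an unresolvable
    // host all surface as IOException and are reported as "not reachable".
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the physical address from the startup log above;
        // pass the other node's host and port as arguments instead.
        String host = args.length > 0 ? args[0] : "10.113.51.56";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 57723;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + " reachable: " + ok);
    }
}
```

Run it on the second machine while the first node is up. If the port is reported as unreachable even though the node is running and listening on it, a firewall or security group rule between the two machines is a likely suspect. A successful connection only shows that this one port is open, not that all JGroups traffic gets through.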