New member servers loop through socket address errors
fahrv Nov 8, 2005 1:29 PM
My situation is this: I was given an install howto and an existing cluster of four JBoss 4.0.2 servers, and was told to add four new servers to the existing cluster.
However, when the new servers come up, they just keep looping through the same errors in the TreeCache. The log snippet below was taken from server35p, the new server I'm trying to bring online; server07p is the existing master.
2005-11-08 13:19:03,386 ERROR [org.jgroups.protocols.FD_SOCK] socket address for server07p:43430 could not be fetched, retrying
2005-11-08 13:19:11,694 ERROR [org.jgroups.protocols.FD_SOCK] socket address for server07p:43430 could not be fetched, retrying
2005-11-08 13:19:20,002 ERROR [org.jgroups.protocols.FD_SOCK] socket address for server07p:43430 could not be fetched, retrying
2005-11-08 13:19:31,978 WARN [org.jgroups.protocols.pbcast.GMS] checkSelfInclusion() failed, server35p:33178 is not a member of view [server07p:43430|48] [server07p:43430, server21p:38164, server22p:37383]; discarding view
2005-11-08 13:19:31,978 WARN [org.jgroups.protocols.pbcast.GMS] I (server35p:33178) am being shunned, will leave and rejoin group (prev_members are [server07p:43430 server21p:38164 server22p:37383 server35p:33178 ])
2005-11-08 13:19:32,793 INFO [STDOUT]
-------------------------------------------------------
GMS: address is server35p:33181
-------------------------------------------------------
2005-11-08 13:19:32,797 INFO [org.jboss.cache.TreeCache] viewAccepted(): new members: [server07p:43430, server21p:38164, server22p:37383, server35p:33181]
2005-11-08 13:19:35,798 ERROR [org.jgroups.protocols.FD_SOCK] received null cache; retrying
2005-11-08 13:19:39,302 ERROR [org.jgroups.protocols.FD_SOCK] received null cache; retrying
2005-11-08 13:19:42,806 ERROR [org.jgroups.protocols.FD_SOCK] received null cache; retrying
2005-11-08 13:19:43,310 INFO [org.jboss.cache.TreeCache] received the state (size=192 bytes)
2005-11-08 13:19:43,310 INFO [org.jboss.cache.TreeCache] transient state: 140 bytes
2005-11-08 13:19:43,310 INFO [org.jboss.cache.TreeCache] setting transient state
2005-11-08 13:19:43,311 DEBUG [org.jboss.cache.lock.IdentityLock] Cache instance is null. Use default lock strategy
2005-11-08 13:19:43,311 INFO [org.jboss.cache.TreeCache] locking the old tree
2005-11-08 13:19:43,311 INFO [org.jboss.cache.TreeCache] locking the old tree was successful
2005-11-08 13:19:43,311 INFO [org.jboss.cache.TreeCache] setting the transient state was successful
2005-11-08 13:19:43,311 INFO [org.jboss.cache.TreeCache] forcing release of all locks in old tree
2005-11-08 13:19:49,615 ERROR [org.jgroups.protocols.FD_SOCK] socket address for server07p:43430 could not be fetched, retrying
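To make the repetition in a log like this easier to see, the entries can be tallied by level and logger category. The following is only a rough sketch in Python: the `summarize` function and the `ENTRY` pattern are my own illustrative names, and the pattern assumes the timestamp layout shown in the snippet above. It splits on timestamps rather than line breaks, so it also works on a log that was pasted as one long line.

```python
import re
from collections import Counter

# One log entry: timestamp, level, [category], then the message up to the
# next timestamp (or the end of the text). Assumes the layout shown above.
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+\[([^\]]+)\] "
    r"(.*?)(?=\s*\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} |\Z)",
    re.DOTALL,
)


def summarize(log_text):
    """Count log entries per (level, category) pair."""
    counts = Counter()
    for _timestamp, level, category, _message in ENTRY.findall(log_text):
        counts[(level, category)] += 1
    return counts
```

Calling `summarize(open("server.log").read())` (the file name is a placeholder) returns a `Counter`, and `most_common()` on it lists the noisiest loggers first.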
Meanwhile on server07p I see:
2005-11-08 13:24:22,965 DEBUG [org.jboss.webservice.handler.HandlerChainBaseImpl] Enter: handleResponse
2005-11-08 13:24:22,965 DEBUG [org.jboss.webservice.handler.HandlerChainBaseImpl] Exit: handleResponse with status: true
2005-11-08 13:24:28,810 INFO [org.jboss.cache.TreeCache] viewAccepted(): new members: [server07p:43430, server21p:38164, server22p:37383]
2005-11-08 13:24:29,627 INFO [org.jboss.cache.TreeCache] viewAccepted(): new members: [server07p:43430, server21p:38164, server22p:37383, server35p:33190]
Except for the bind_addr, the /server/all/deploy/cluster-service.xml file is the same across all systems. It's just the default file, although I'll paste it if asked.
Testing the connection by telnetting to server07p:43430 from server35p doesn't work. Should it?
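For anyone who wants to repeat the telnet test in a scriptable way, here is a minimal sketch in Python. The function name `can_connect` is just an illustrative helper, and the host and port are the ones from the log above. One caveat I'm not certain about: this only exercises TCP, as telnet does. If the cluster runs on the UDP transport from the default cluster-service.xml, the port in the member address may well be a UDP port, in which case a failed TCP connection would not by itself prove anything.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host and port copied from the log above; run this from server35p.
    print(can_connect("server07p", 43430))
```

A result of `False` tells you only that no TCP listener was reachable at that address from the machine running the script; it does not say whether a firewall, name resolution, or the choice of protocol is the reason.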
If you guys can even point me toward the right documentation, I'd be thrilled. I've never worked with JBoss before this week, so I apologize in advance for my ignorance on this subject.