0 Replies Latest reply on Feb 15, 2005 8:42 PM by Zalle

    Please Help!!!

    Zalle Newbie

      Hello there,

      This is the first time I am dealing with JBoss clustering using JGroups. I have recently been handed a live application running on the JBoss 4.0 platform that uses JBoss Cache and Hibernate. The application is deployed in a clustered environment with 4 nodes, each running on a separate Win 2000 Adv. Server box.

      Everything was running fine until I changed the IP address mapping of the SMTP host (used by the nodes) in the hosts files and restarted the machines. Since then I keep getting these messages in the JBoss log files:

      2005-02-16 01:08:46,500 WARN [caw.util.hibernate.cache.JBossTreeCacheService] No transaction manager lookup class has been defined. TX will be null
      2005-02-16 01:08:46,531 INFO [caw.util.hibernate.cache.JBossTreeCacheService] interceptor chain is:
      class org.jboss.cache.interceptors.CallInterceptor
      class org.jboss.cache.interceptors.ReplicationInterceptor
      2005-02-16 01:08:46,531 INFO [caw.util.hibernate.cache.JBossTreeCacheService] cache mode is REPL_ASYNC
      2005-02-16 01:08:47,500 INFO [STDOUT]
      GMS: address is primary:1079
      2005-02-16 01:08:54,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:03,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:12,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:21,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:30,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:39,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
      2005-02-16 01:09:48,593 WARN [org.jgroups.protocols.pbcast.ClientGmsImpl] handleJoin(primary:1079) failed, retrying
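      To check whether these join failures are a network-level multicast problem rather than a JBoss Cache configuration problem, something like the following could be run between two of the nodes. This is only a sketch: the class names and flags are assumed from the test utilities shipped in the JGroups 2.x distribution, and the jar path and multicast address/port are example placeholders, not our real production values.

      ```shell
      # Assumed JGroups 2.x test utilities; jgroups.jar path and the
      # multicast address/port below are examples, not the real values.

      # On node A, start a receiver listening on a multicast group:
      java -cp jgroups.jar org.jgroups.tests.McastReceiverTest -mcast_addr 228.1.2.3 -port 12233

      # On node B, send test packets to the same group; if they never
      # arrive on node A, the problem is at the network/NIC level,
      # not in JBoss Cache or the cache configuration:
      java -cp jgroups.jar org.jgroups.tests.McastSenderTest -mcast_addr 228.1.2.3 -port 12233
      ```

      If multicast traffic does flow between the boxes, the next suspect would be the JGroups bind address on the multihomed/renamed hosts.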

      Below is the JBoss Cache and cluster configuration:

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE server>

      <!-- Configure the TransactionManager.
           Note for isolation level NONE: use no TransactionManagerLookupClass
           (comment the attribute out) -->
      <!-- attribute name="TransactionManagerLookupClass">org.jboss.cache.DummyTransactionManagerLookup</attribute -->
      <!-- Isolation level: SERIALIZABLE or
           REPEATABLE_READ (default) -->
      <!-- TODO - This should be set to READ_COMMITTED, but cannot because of a bug in Hibernate -->

      <!-- Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC -->

      <!-- Cache loader -->
      <!-- ??? attribute name="CacheLoaderClass"></attribute -->
      <!-- ??? attribute name="CacheLoaderConfig"></attribute -->
      <!-- ??? attribute name="CacheLoaderShared"></attribute -->
      <!-- ??? attribute name="CacheLoaderPreload"></attribute -->
      <!-- ??? attribute name="CacheLoaderFetchPersistentState"></attribute -->
      <!-- ??? attribute name="CacheLoaderFetchTransientState"></attribute -->

      <!-- Just used for async repl: use a replication queue -->

      <!-- Replication interval for the replication queue (in ms) -->

      <!-- Max number of elements which trigger replication -->

      <!-- Whether or not to fetch state on joining a cluster -->

      <!-- The max amount of time (in milliseconds) we wait until the
           initial state (i.e. the contents of the cache) is retrieved from
           existing members in a clustered environment -->

      <!-- Number of milliseconds to wait until all responses for a
           synchronous call have been received -->

      <!-- Max number of milliseconds to wait for a lock acquisition -->

      <!-- Name of the cluster. Needs to be the same on all nodes, in order
           for them to find each other -->

      <!-- JGroups protocol stack properties. Can also be a URL,
           e.g. file:/home/bela/default.xml
           Set the cluster properties. If the cache is to use the new properties,
           it has to be redeployed -->

      <!-- UDP: if you have a multihomed machine,
           set the bind_addr attribute to the appropriate NIC IP address -->
      <!-- UDP: on Windows machines, because the media sense feature
           is broken with multicast (even after disabling media sense),
           set the loopback attribute to true -->
      <!-- UDP mcast_addr="" mcast_port="12233" - Oscar Production
           UDP mcast_addr="" mcast_port="12234" - Oscar UAT -->
      <UDP mcast_addr="" mcast_port="12233"
           ip_ttl="64" ip_mcast="true"
           mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
           ucast_send_buf_size="150000" ucast_recv_buf_size="80000"/>
      <PING timeout="2000" num_initial_members="3"/>
      <MERGE2 min_interval="10000" max_interval="20000"/>
      <!-- <FD shun="true" up_thread="true" down_thread="true"/> -->
      <VERIFY_SUSPECT timeout="1500"
           up_thread="false" down_thread="false"/>
      <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
           max_xmit_size="8192" up_thread="false" down_thread="false"/>
      <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"/>
      <pbcast.STABLE desired_avg_gossip="20000"
           up_thread="false" down_thread="false"/>
      <FRAG frag_size="8192"
           down_thread="false" up_thread="false"/>
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
           shun="true" print_local_addr="true"/>
      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
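      Following the UDP comments in the config, a sketch of what that element might look like with the multihomed and Windows workarounds applied. The bind_addr value here is a placeholder for each node's actual NIC address (not our real value), and loopback="true" follows the Windows media-sense note; I have left mcast_addr as it appears in our config.

      ```xml
      <!-- Sketch only: bind_addr is a per-node placeholder;
           loopback="true" per the Windows media-sense note above -->
      <UDP mcast_addr="" mcast_port="12233"
           bind_addr="192.168.0.1" loopback="true"
           ip_ttl="64" ip_mcast="true"
           mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
           ucast_send_buf_size="150000" ucast_recv_buf_size="80000"/>
      ```

      Would pinning bind_addr like this be the right way to rule out the nodes binding to the wrong interface after the hosts-file change?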

      It seems like the restarted nodes are trying to join the cluster but are unable to. Can someone point me in the right direction to localize this problem further? I need to have this up and running before morning, so I am pretty desperate.

      Please Help!!

      Many thanks