-
1. Re: Strange behaviour with 3 nodes
slaboure Jan 6, 2002 5:34 AM (in response to robster)
Hello,
Thank you for your report. Could you please provide us with a JBoss log file (in debug mode), or something equivalent, that could help us find out what is going wrong? You could also file this as a bug on SourceForge.
Thank you! Cheers,
Sacha -
2. same problem
edwu00 Jan 7, 2002 1:24 PM (in response to robster)
Only 2 nodes work on 3.0.0-alpha.
I read the log files: the 3rd node can never join the partition.
When you then try to stop JBoss with Ctrl-C, it deadlocks because it is still trying to join the partition. I have to use kill -9 to stop it.
Ed
==================
Log file from the 3rd node, which cannot join the partition:
[10:04:45,904,AutoDeployer] Auto deploy of file:/home/gcg/jboss-3.0.0alpha/jboss/deploy/cluster-service.xml
[10:04:45,931,ServiceCreator] About to create the beanJBOSS-SYSTEM:service=DefaultPartition
[10:04:45,956,ServiceCreator] Created the beanJBOSS-SYSTEM:service=DefaultPartition
[10:04:45,958,ClusterPartition] SynchronizedMBeans set to [JBOSS-SYSTEM:service=HASessionState, JBOSS-SYSTEM:service=HAJNDI]
[10:04:45,961,ServiceCreator] About to create the beanJBOSS-SYSTEM:service=HASessionState
[10:04:45,983,ServiceCreator] Created the beanJBOSS-SYSTEM:service=HASessionState
[10:04:45,984,HASessionStateService] Starting
[10:04:45,985,HASessionStateService] Started
[10:04:45,987,ServiceCreator] About to create the beanJBOSS-SYSTEM:service=HAJNDI
[10:04:46,010,ServiceCreator] Created the beanJBOSS-SYSTEM:service=HAJNDI
[10:04:46,011,HANamingService] Starting
[10:04:46,011,HANamingService] Started
[10:04:46,012,ClusterPartition] Starting
[10:04:46,012,ClusterPartition] Creating JavaGroups JChannel
[10:04:46,383,ClusterPartition] Creating HAPartition...
[10:04:46,477,ClusterPartition] ...Initing HAPartition...
[10:04:46,510,HAPartition:DefaultPartition] creating SubcontextHAPartition
[10:04:46,511,HAPartition:DefaultPartition] done initing..
[10:04:46,512,ClusterPartition] ...HAPartition initialized.
[10:04:46,512,ClusterPartition] registering JBOSS-SYSTEM:service=HASessionState
[10:04:46,554,HASessionState-/HASessionState/Default] creating SubcontextHASessionState
[10:04:46,555,HASessionState-/HASessionState/Default] ...HAPartition initialized.
[10:04:46,555,ClusterPartition] registered JBOSS-SYSTEM:service=HASessionState
[10:04:46,556,ClusterPartition] registering JBOSS-SYSTEM:service=HAJNDI
[10:04:46,556,HANamingService] Initializing HAJNDI server
[10:04:46,556,HANamingService] jndi lookup of /HAPartition/DefaultPartition
[10:04:46,557,HANamingService] Create remote object
[10:04:46,563,HANamingService] initialize HAJNDI
[10:04:46,564,HAJNDI] subscribeToStateTransferEvents
[10:04:46,564,HAJNDI] registerRPCHandler
[10:04:46,564,ClusterPartition] registered JBOSS-SYSTEM:service=HAJNDI
[10:04:46,564,ClusterPartition] Starting ClusterPartition: DefaultPartition
[10:04:46,565,ClusterPartition] Connecting to channel
[10:04:46,593,Default]
-------------------------------------------------------
GMS: address is southcity:32777
-------------------------------------------------------
[10:04:48,557,HAPartition:DefaultPartition] Handle: DistributedState._set
[10:04:49,625,HAPartition:DefaultPartition] new view accepted: 0 ([southcity:32777])
[10:04:49,625,HAPartition:DefaultPartition] ViewAccepted: initial members set
[10:04:49,626,ClusterPartition] Starting channel
[10:04:49,626,HAPartition:DefaultPartition] Num cluster members: 1
[10:04:49,630,HAPartition:DefaultPartition] SetState called
[10:04:49,631,HAPartition:DefaultPartition] state is null
[10:04:49,631,HAPartition:DefaultPartition] State could not be retrieved, (must be first member of group)
[10:04:49,631,DefaultPartition:ReplicantManager] mergemembers
[10:04:49,631,DefaultPartition:ReplicantManager] start MergeMembers
[10:04:49,952,DefaultPartition:ReplicantManager] notifyKeyListeners
[10:04:49,952,ClusterPartition] registering JBOSS-SYSTEM:service=HASessionState
[10:04:49,953,HASessionState-/HASessionState/Default] HASessionState node name : southcity:32777
[10:04:49,958,DefaultPartition:ReplicantManager] notifyKeyListeners
[10:04:49,959,HASessionState-/HASessionState/Default] A new HASessionState topology needs to be computed by the master node => this node.
[10:04:49,959,HASessionState-/HASessionState/Default] New nodes: [southcity:32777]
[10:04:49,962,HASessionState-/HASessionState/Default] Computed topology : {
SessionState-'/HASessionState/Default'-Group-1:[[southcity:32777]] aka '[]'
}
[10:04:49,968,HASessionState-/HASessionState/Default] Starting repartitioning... :{
SessionState-'/HASessionState/Default'-Group-1:[[southcity:32777]] aka '[]'
}
[10:04:49,969,HASessionState-/HASessionState/Default] We were not yet connected. We connect to sub-partition SessionState-'/HASessionState/Default'-Group-1
[10:04:49,985,HAPartition:SessionState-'|HASessionState|Default'-Group-1] done initing..
[10:04:49,990,Default]
-------------------------------------------------------
GMS: address is southcity:32779
-------------------------------------------------------
[10:04:52,998,HAPartition:SessionState-'|HASessionState|Default'-Group-1] new view accepted: 0 ([southcity:32779])
[10:04:52,999,HAPartition:SessionState-'|HASessionState|Default'-Group-1] ViewAccepted: initial members set
[10:04:53,000,HAPartition:SessionState-'|HASessionState|Default'-Group-1] Num cluster members: 1
[10:04:53,000,HAPartition:SessionState-'|HASessionState|Default'-Group-1] SetState called
[10:04:53,000,HAPartition:SessionState-'|HASessionState|Default'-Group-1] state is null
[10:04:53,000,HAPartition:SessionState-'|HASessionState|Default'-Group-1] State could not be retrieved, (must be first member of group)
[10:04:53,000,SessionState-'|HASessionState|Default'-Group-1:ReplicantManager] mergemembers
[10:04:53,001,SessionState-'|HASessionState|Default'-Group-1:ReplicantManager] start MergeMembers
[10:04:53,003,SessionState-'|HASessionState|Default'-Group-1:ReplicantManager] notifyKeyListeners
[10:04:53,003,HASessionState-/HASessionState/Default] Repartitioning done.
[10:04:53,004,ClusterPartition] registered JBOSS-SYSTEM:service=HASessionState
[10:04:53,004,ClusterPartition] registering JBOSS-SYSTEM:service=HAJNDI
[10:04:53,004,HANamingService] Starting HAJNDI server
[10:04:53,004,HANamingService] Create HARMIServer proxy
[10:04:53,133,DefaultPartition:ReplicantManager] notifyKeyListeners
[10:04:53,169,HANamingService] Start listener
[10:04:53,169,HANamingService] Started hajndiPort=1100
[10:04:53,175,ClusterPartition] registered JBOSS-SYSTEM:service=HAJNDI
[10:04:53,176,ClusterPartition] Started ClusterPartition: DefaultPartition
[10:04:53,176,ClusterPartition] Started -
3. Re: same problem
slaboure Jan 8, 2002 4:53 AM (in response to robster)
Could you please provide us with your OS and JDK version?
3.0.0alpha can work with more than 2 nodes. This problem has occurred with some JDK/OS combinations. The next release will include a new JavaGroups property string that should fit better. Nevertheless, this new property string is not yet defined. In the meantime, could you please replace the default partition property string in the clustering configuration file with these properties:
UDP(mcast_addr=224.0.0.35;mcast_port=45566;ip_ttl=64;
mcast_send_buf_size=80000;mcast_recv_buf_size=80000):
PING(timeout=2000;num_initial_members=3):
MERGE2(min_interval=5000;max_interval=10000):
FD:
VERIFY_SUSPECT(timeout=1500):
pbcast.STABLE(desired_avg_gossip=20000):
pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800):
UNICAST(timeout=5000;min_wait_time=2000):
FRAG(frag_size=4096;down_thread=false;up_thread=false):
pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=true):
pbcast.STATE_TRANSFER
(If you don't use stateful session beans (SFSB), disable the SFSB service in the clustering config file.) -
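The property string above is wrapped across several lines for readability. A minimal sketch, assuming you want to paste it as a single colon-separated value and double-check that no protocol was lost while copying: the class and method names below (`StackString`, `join`, `protocolNames`) are hypothetical helpers written for this illustration, not part of JBoss or JavaGroups. Whether stray line breaks inside the configured string actually matter to JavaGroups has not been verified here.

```java
import java.util.ArrayList;
import java.util.List;

public class StackString {

    /** Concatenates the wrapped lines into one string, trimming each line. */
    public static String join(String... lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line.trim());
        }
        return sb.toString();
    }

    /** Lists protocol names in order, splitting on ':' only outside parentheses. */
    public static List<String> protocolNames(String stack) {
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : stack.toCharArray()) {
            if (c == '(') depth++;
            else if (c == ')') depth--;
            if (c == ':' && depth == 0) {
                names.add(nameOf(current.toString()));
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            names.add(nameOf(current.toString()));
        }
        return names;
    }

    /** Strips the "(key=value;...)" part, leaving only the protocol name. */
    private static String nameOf(String spec) {
        int paren = spec.indexOf('(');
        return paren < 0 ? spec : spec.substring(0, paren);
    }

    public static void main(String[] args) {
        // The stack exactly as posted in this thread, one wrapped line per argument.
        String stack = join(
            "UDP(mcast_addr=224.0.0.35;mcast_port=45566;ip_ttl=64;",
            "mcast_send_buf_size=80000;mcast_recv_buf_size=80000):",
            "PING(timeout=2000;num_initial_members=3):",
            "MERGE2(min_interval=5000;max_interval=10000):",
            "FD:",
            "VERIFY_SUSPECT(timeout=1500):",
            "pbcast.STABLE(desired_avg_gossip=20000):",
            "pbcast.NAKACK(gc_lag=50;retransmit_timeout=300,600,1200,2400,4800):",
            "UNICAST(timeout=5000;min_wait_time=2000):",
            "FRAG(frag_size=4096;down_thread=false;up_thread=false):",
            "pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=true):",
            "pbcast.STATE_TRANSFER");
        System.out.println(stack);
        System.out.println(protocolNames(stack));
    }
}
```

Running `main` prints the single-line string to paste, followed by the protocol names in order, which makes it easy to compare the stack on each of the three nodes.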
4. Re: same problem
vlada Jan 8, 2002 4:09 PM (in response to robster)
Sacha,
This issue is most likely related to InetAddress NIS resolution while there is traffic in JavaGroups. It happens on my home setup, on an ADSL modem, until I map my loopback address 127.0.0.1 to my domain name xisnext.2y.net.
In the university lab, however, it works fine.
Best,
Vladimir -
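One way to check whether host name resolution is slow on a given node is to time it directly. The following is only a diagnostic sketch, not something taken from JBoss or JavaGroups: the class name `LookupTimer` and its method are hypothetical, and it uses only the standard `java.net.InetAddress` API. A slow result would be consistent with the resolution theory above, but would not prove it is the cause of the join failure.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LookupTimer {

    /**
     * Resolves the local host and asks for its canonical name, which may
     * trigger a reverse lookup. Returns the elapsed time in milliseconds,
     * or -1 if the local host cannot be resolved at all.
     */
    public static long timeLocalLookupMillis() {
        long start = System.nanoTime();
        try {
            InetAddress local = InetAddress.getLocalHost();
            local.getCanonicalHostName();
        } catch (UnknownHostException e) {
            return -1;
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = timeLocalLookupMillis();
        if (ms < 0) {
            System.out.println("Local host name could not be resolved");
        } else {
            System.out.println("Local host lookup took " + ms + " ms");
        }
    }
}
```

If the lookup takes seconds rather than milliseconds on the node that fails to join, comparing the result before and after adding a hosts-file mapping like the one described above would show whether the mapping removes the delay.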
5. Re: same problem
slaboure Jan 11, 2002 8:22 AM (in response to robster)
Hello Vladimir,
As a JavaGroups core contributor ;) could you please expand a little bit on this? What do you suspect the problem to be? Do you suspect a reverse lookup taking place, or something like that?