17 Replies Latest reply on Oct 19, 2005 10:24 AM by belaban

[ERROR] NAKACK.handleXmitReq()

slaboure Apr 1, 2004 6:26 AM

In all JBoss versions prior to 3.2.4, clustering may log this kind of error under load:

2004-03-28 02:47:27,450 DEBUG [org.javagroups.DefaultPartition] [Sun Mar 28 02:47:27 EST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=choqtap4:32794 (additional data: 19 bytes)) message with seqno=0 not found in sent_msgs ! sent_msgs=18 17 16 15 14 13 12 11 10 9 8 7 6 5

This is due to an error in the default JGroups protocol configuration. You can easily fix it by editing deploy/cluster-service.xml and reordering the UNICAST and pbcast.STABLE protocols. The ClusterPartition MBean then becomes:

<mbean code="org.jboss.ha.framework.server.ClusterPartition"
       name="jboss:service=DefaultPartition">

  <!-- Name of the partition being built -->
  <attribute name="PartitionName">DefaultPartition</attribute>
  <!-- Determine if deadlock detection is enabled -->
  <attribute name="DeadlockDetection">False</attribute>
  <!-- The JGroups protocol configuration -->
  <attribute name="PartitionConfig">
    <Config>
      <!-- UDP: if you have a multihomed machine,
           set the bind_addr attribute to the appropriate NIC IP address -->
      <!-- UDP: On Windows machines, because of the media sense feature
           being broken with multicast (even after disabling media sense)
           set the loopback attribute to true -->
      <UDP mcast_addr="228.1.2.3" mcast_port="45566"
           ip_ttl="32" ip_mcast="true"
           mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
           ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
           loopback="false" />
      <PING timeout="2000" num_initial_members="3"
            up_thread="true" down_thread="true" />
      <MERGE2 min_interval="10000" max_interval="20000" />
      <FD shun="true" up_thread="true" down_thread="true"
          timeout="2500" max_tries="5" />
      <VERIFY_SUSPECT timeout="3000" num_msgs="3"
                      up_thread="true" down_thread="true" />
      <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                     max_xmit_size="8192"
                     up_thread="true" down_thread="true" />
      <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
               down_thread="true" />
      <pbcast.STABLE desired_avg_gossip="20000"
                     up_thread="true" down_thread="true" />
      <FRAG frag_size="8192"
            down_thread="true" up_thread="true" />
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                  shun="true" print_local_addr="true" />
      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
    </Config>
  </attribute>

</mbean>

Cheers,

Sacha
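
The two UDP comments in the configuration above can be illustrated with a small fragment. This is only a sketch: the address 192.168.0.10 is a made-up placeholder for the NIC the cluster traffic should use, and loopback="true" is the setting the comment recommends for Windows hosts. Every other attribute is copied unchanged from the configuration in this post.

```xml
<!-- Sketch only: 192.168.0.10 is a placeholder, substitute the IP address of your cluster-facing NIC -->
<!-- loopback="true" follows the Windows media sense comment; leave it "false" elsewhere -->
<UDP mcast_addr="228.1.2.3" mcast_port="45566"
     bind_addr="192.168.0.10"
     ip_ttl="32" ip_mcast="true"
     mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
     ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
     loopback="true" />
```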

1. Questions about functionality

slaboure Apr 1, 2004 6:26 AM (in response to slaboure)

We are looking into deploying a caching solution into a JBoss server, so JBossCache is an obvious candidate. It looks like it does 90%+ of what we want. I have a few questions about the remaining functionality.

1) We would like to assign a timeToIdleSeconds when we put something in the cache (programmatically, not as a global setting from the .xml). Is this possible? Can I get the RegionManager after I insert the new entry into the cache and add a new region that only affects my new entry?

2) Can JBossCache keep statistics? In order to set idleTimeToLive, maxEntries, etc. properly, we want to look at how many times a particular node was successfully retrieved from the cache, as well as how many times a retrieval was attempted for an entry that had already been evicted (or was never there).

If this functionality does not already exist, are there hooks where I could write my own implementations (and of course submit the changes back)?

-Thanks
-Kevin
2. Re: [ERROR] NAKACK.handleXmitReq()

slaboure Apr 1, 2004 6:26 AM (in response to slaboure)

Thanks Sacha for your quick response.

I have modified the proper cluster-service.xml file under JBOSSHOME/server/all/deploy directory.
I cross checked it twice.

Regards,
Monu

3. Re: [ERROR] NAKACK.handleXmitReq()

georgel Apr 1, 2004 1:36 PM (in response to slaboure)

Hi,

I tried what you/Bela suggested in the other thread,

http://jboss.org/index.html?module=bb&op=viewtopic&t=47563

Server is JBoss 3.2.1, Linux, Java 1.4.2-b28. Got the following errors, and went back to the old configuration. Should your fix work for 3.2.1 or do I get to upgrade?

thanks!

George

13:02:42,729 INFO [MainDeployer] Starting deployment of package: file:/home/jboss/jboss-3.2.1/server/all/deploy/cluster-service.xml
13:02:42,954 INFO [ClusterPartition] Creating
13:02:43,059 INFO [STDOUT] Thu Apr 01 13:02:43 CST 2004 Listening for connections ...
13:02:43,324 ERROR [ClusterPartition] Initialization failed
ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above
 at org.javagroups.JChannel.<init>(JChannel.java:141)
 at org.jboss.ha.framework.server.ClusterPartition.createService(ClusterPartition.java:202)
 at org.jboss.system.ServiceMBeanSupport.create(ServiceMBeanSupport.java:158)
 at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

13:02:43,327 WARN [ServiceController] Problem creating service jboss:service=DefaultPartition
ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above
 at org.javagroups.JChannel.<init>(JChannel.java:141)
 at org.jboss.ha.framework.server.ClusterPartition.createService(ClusterPartition.java:202)
 at org.jboss.system.ServiceMBeanSupport.create(ServiceMBeanSupport.java:158)

Depends On Me: , ObjectName: jboss:service=DefaultPartition
 state: FAILED
 I Depend On:
 Depends On Me: jboss:service=HASessionState
 jboss:service=HAJNDI
 jboss.cache:service=InvalidationBridge,type=JavaGroups
 jboss:service=FarmMember,partition=DefaultPartition
 jboss.j2ee:service=EJB,jndiName=clustering/HTTPSession
ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above, ObjectName: jboss:service=HASessionState
 state: CONFIGURED
 I Depend On: jboss:service=DefaultPartition

4. Re: [ERROR] NAKACK.handleXmitReq()

monu Apr 1, 2004 11:52 PM (in response to slaboure)

Hi Sacha,

I am still getting the same error after incorporating the changes you suggested in cluster-service.xml. We have observed that the error

2004-03-28 02:47:27,450 DEBUG [org.javagroups.DefaultPartition] [Sun Mar 28 02:47:27 EST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=choqtap4:32794 (additional data: 19 bytes)) message with seqno=0 not found in sent_msgs ! sent_msgs=18 17 16 15 14 13 12 11 10 9 8 7 6 5

appears when the other node in the cluster reports a suspected member, after which JBoss starts logging the "message with seqno=" error. It occurs under high load and creates a 20 MB server.log file every 3 minutes.

Regards,
Monu
5. Re: [ERROR] NAKACK.handleXmitReq()

monu Apr 2, 2004 3:08 AM (in response to slaboure)

Thanks Sacha for your quick response.

I have modified the proper cluster-service.xml file under JBOSSHOME/server/all/deploy directory.
I cross checked it twice.

Regards,
Monu
6. Re: [ERROR] NAKACK.handleXmitReq()

monu Apr 2, 2004 4:47 AM (in response to slaboure)

Hi Sacha/Bela,

I have copied part of the server.log from my JBoss instances running in cluster mode. I have two JBoss 3.2.2 instances running in a cluster. Each runs under heavy load and writes, reads, and removes data in the distributed state on every request.

Server.log from Node 1 :

a) After 2-3 hours of testing, the following error starts appearing:
2004-04-02 11:07:46,490 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:46 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
2004-04-02 11:07:46,728 INFO [STDOUT] [SOAK Queuing ] Connter : 444000
2004-04-02 11:07:48,984 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:48 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
2004-04-02 11:07:49,349 INFO [STDOUT] [SOAK Queuing ] Connter : 444100
2004-04-02 11:07:51,466 INFO [STDOUT] [SOAK Queuing ] Connter : 444200
2004-04-02 11:07:51,494 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:51 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
2004-04-02 11:07:53,670 INFO [STDOUT] [SOAK Queuing ] Connter : 444300
2004-04-02 11:07:54,004 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:54 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
2004-04-02 11:08:09,539 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:09 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
2004-04-02 11:08:09,829 INFO [org.jboss.ha.framework.interfaces.HAPartition.XSAM_Partition_cluster_GlobeTesting] Suspected member: jfk:63378 (additional data: 19 bytes)

b) Then it shows one node as dead, although the other node is still running:
2004-04-02 11:08:10,220 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.XSAM_Partition_cluster_GlobeTesting] New cluster view (id: 2, delta: -1) : [172.16.11.138:21099]
2004-04-02 11:08:10,224 INFO [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] Dead members: 1
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] trying to remove deadMember 172.16.11.145:21099 for key HAJNDI
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] 172.16.11.145:21099 was removed
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifyKeyListeners
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifying 1 listeners for key change: HAJNDI
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] trying to remove deadMember 172.16.11.145:21099 for key DCacheBridge-DefaultJGBridge
2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] 172.16.11.145:21099 was removed
2004-04-02 11:08:10,226 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifyKeyListeners
2004-04-02 11:08:10,226 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifying 1 listeners for key change: DCacheBridge-DefaultJGBridge
2004-04-02 11:08:10,274 INFO [STDOUT] [SOAK Queuing ] Connter : 444400
2004-04-02 11:08:13,349 INFO [STDOUT] [SOAK Queuing ] Connter : 444500
2004-04-02 11:08:16,127 INFO [STDOUT] [SOAK Queuing ] Connter : 444600
2004-04-02 11:08:18,844 INFO [STDOUT] [SOAK Queuing ] Connter : 444700
2004-04-02 11:08:21,708 INFO [STDOUT] [SOAK Queuing ] Connter : 444800
c) The machine then came back into the cluster:
2004-04-02 11:08:24,732 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.XSAM_Partition_cluster_GlobeTesting] New cluster view (id: 3, delta: 1) : [172.16.11.138:21099, 172.16.11.145:65263]
2004-04-02 11:08:24,733 INFO [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] Dead members: 0
2004-04-02 11:08:24,770 INFO [STDOUT] [SOAK Queuing ] Connter : 444900
d) Then this exception is thrown:
2004-04-02 11:08:24,956 ERROR [org.jboss.ha.framework.interfaces.HAPartition.XSAM_Partition_cluster_GlobeTesting] GetState failed
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:762)
at java.util.HashMap$EntryIterator.next(HashMap.java:804)
at java.util.HashMap.writeObject(HashMap.java:956)
at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
at java.util.HashMap.writeObject(HashMap.java:958)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
at java.util.HashMap.writeObject(HashMap.java:958)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
at org.jboss.ha.framework.server.HAPartitionImpl.objectToByteBuffer(HAPartitionImpl.java:146)
at org.jboss.ha.framework.server.HAPartitionImpl.getState(HAPartitionImpl.java:346)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passUp(MessageDispatcher.java:462)
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:292)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:515)
at org.jgroups.JChannel.up(JChannel.java:860)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:314)
at org.jgroups.stack.ProtocolStack.receiveUpEvent(ProtocolStack.java:330)
at org.jgroups.stack.Protocol.passUp(Protocol.java:470)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:103)
at org.jgroups.stack.UpHandler.run(Protocol.java:55)

Server.log from Node 2:

a) First, the error below is logged continuously for some time:
2004-04-02 11:08:22,201 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] UDP.down(): dest address of message is null, and sending to default address fails as mcast_addr is null, too ! Discarding message MethodCall (name=DistributedState._set, number of args=3)
Args:
[XSAMTransactionLog (java.lang.String)]
[0123456789:020404112151 (java.lang.String)]
[{MT_MSG_IDS=[Ljava.lang.String;@123f331, MT_NO_OF_MSG=1, MT_TIMESTAMP=1080896902190} (java.util.TreeMap)]
2004-04-02 11:08:22,213 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
2004-04-02 11:08:22,219 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
2004-04-02 11:08:22,221 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
2004-04-02 11:08:22,227 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
...
...
...
...
b) Then the error below appears, and after some time the "message with seqno=23 not found in" error repeats continuously:
2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
2004-04-02 11:08:25,395 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:25 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=22 not found in sent_msgs ! sent_msgs=94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 102 101 100 99 98 97 96 95
2004-04-02 11:08:25,396 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:25 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=23 not found in sent_msgs ! sent_msgs=94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 102 101 100 99 98 97 96 95
...
...
2004-04-02 11:08:26,004 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:26 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=34 not found in sent_msgs ! sent_msgs=

Please suggest how to resolve this issue. Once the "with seqno=34 not found in sent_msgs ! sent_msgs=" error starts appearing, JBoss writes a 20 MB server.log file every 3-4 minutes.

Thanks in advance,
--monu
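
While the root cause is being investigated, the log growth itself can probably be contained. In the excerpts above, the JavaGroups messages arrive at DEBUG priority under categories that start with org.javagroups, so a category limit along the following lines should keep them out of server.log. This is a sketch that assumes the stock log4j.xml layout in the conf directory of the server configuration; it only hides the symptom and does not fix the retransmission problem.

```xml
<!-- Assumption: placed in server/all/conf/log4j.xml next to the other <category> elements -->
<!-- Drops the DEBUG-level JavaGroups output, including the NAKACK.handleXmitReq() lines -->
<category name="org.javagroups">
  <priority value="INFO"/>
</category>
```

Remove the limit again when collecting diagnostics, since it also suppresses the messages needed to analyse the problem.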
7. Re: [ERROR] NAKACK.handleXmitReq()

xsam_jboss_user Apr 2, 2004 6:32 AM (in response to slaboure)

To add to monu's comment the above problem occurs when we run our
application under load. Our application uses distributed state extensively.
The distributed state has about 300 to 400 thousand entries.
8. Re: [ERROR] NAKACK.handleXmitReq()

slaboure Apr 2, 2004 1:10 PM (in response to slaboure)

The ConcurrentModificationException is not a big deal; what is strange is the NAKACK thing with the updated stack.

Can you try with a 3.2.4RC?

Bela?
9. Re: [ERROR] NAKACK.handleXmitReq()

monu Apr 3, 2004 2:15 AM (in response to slaboure)

Hi Sacha/Bela,

It would be nice if you could release a fix for JBoss 3.2.2 itself. Switching to an RC version of JBoss might not be a good choice for production.

Thanks,
Monu
10. Re: [ERROR] NAKACK.handleXmitReq()

belaban Apr 8, 2004 12:03 AM (in response to slaboure)

Does the NAKACK error only occur when the 2nd node is shunned, forced to leave the cluster, and then rejoins? Does this happen *immediately* after rejoining (and failing to get the state)?

Bela
11. Re: [ERROR] NAKACK.handleXmitReq()

xsam_jboss_user Apr 8, 2004 1:13 AM (in response to slaboure)

Thanks much Sacha. We tried the GC_LAG and it seems the problem is still there. The problem occurs after about 10 hours. I think the problem occurs as soon as the other node is shunned out of the cluster.
12. Re: [ERROR] NAKACK.handleXmitReq()

belaban Apr 10, 2004 10:20 PM (in response to slaboure)

Can you try this with the latest JGroups and logging enabled (trace) for org.jgroups.* ?

This will not solve the problem, but it would help me narrow it down.

Bela
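
For anyone following this request, trace logging for org.jgroups is typically switched on in the server's log4j.xml. The fragment below is a sketch under two assumptions: that the JGroups version in use logs under org.jgroups categories (as the later excerpts in this thread do), and that the JBoss-provided org.jboss.logging.XLevel class is available to supply the TRACE level.

```xml
<!-- Assumption: added to the server's conf/log4j.xml; XLevel supplies the TRACE priority -->
<category name="org.jgroups">
  <priority value="TRACE" class="org.jboss.logging.XLevel"/>
</category>
```

Any Threshold set on the target appender must also let TRACE through, and the output is very verbose, so it is best enabled only while reproducing the problem.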
13. Re: [ERROR] NAKACK.handleXmitReq()

belaban Apr 22, 2004 6:58 PM (in response to slaboure)

Okay, I fixed the ConcurrentModificationException; the fix is in head and 3.2.

I also think I fixed the NAKACK problem (in JGroups head). Anyone who wants to try this out, please check out the latest JGroups CVS and give it a shot.

If this turns out to fix the bug, I'll create a new jgroups.jar and update JBoss head and 3.2.

Bela
14. Re: [ERROR] NAKACK.handleXmitReq()

svelur Aug 12, 2005 3:40 PM (in response to slaboure)

We copied the latest JGroups jars. I am sure we have the proper configuration in cluster-service.xml and tc5-cluster-service.xml, but we are seeing the following errors. I assume this happens when one server gets shunned and tries to come back.

The cluster is perfectly fine for 4-5 hours. After that, it logs the following WARN, ERROR and exception messages:

Server1:
Error:
2005-08-12 00:02:59,687 23678836 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|28] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view
2005-08-12 00:10:51,445 24150594 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)
2005-08-12 00:10:51,446 24150595 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|30] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view
2005-08-12 00:11:07,461 24166610 WARN [org.jgroups.protocols.pbcast.CoordGmsImpl] (MergeTask thread:) merge responses from subgroup coordinators <= 1 ([]). Cancelling merge
2005-08-12 00:12:14,271 24233420 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)
2005-08-12 00:12:14,274 24233423 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|32] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view

Server2
Error 1:
2005-08-12 01:05:31,093 27097034 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.176:7800 (additional data: 16 bytes) is not a member of view [10.38.9.174:7800 (additional data: 16 bytes)|44] [10.38.9.174:7800 (additional data: 16 bytes)]; discarding view

Error 2:
2005-08-12 02:16:18,736 31344677 WARN [org.jgroups.protocols.pbcast.STABLE] (TimeScheduler.Thread:) ResumeTask resumed message garbage collection - this should be done by a RESUME_STABLE event; check why this event was not received (or increase max_suspend_time for large state transfers)

Error 3:
2005-08-12 03:44:26,673 36632614 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)

2005-08-12 09:47:12,896 58398837 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)
2005-08-12 09:47:19,752 58405693 WARN [org.jgroups.protocols.pbcast.CoordGmsImpl] (UpHandler (GMS):) merge already in progress, discarded MERGE event
2005-08-12 09:47:19,753 58405694 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)
2005-08-12 09:48:57,825 58503766 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)

Any help is greatly appreciated.

Thank you!
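
The STABLE warning under Error 2 itself names one knob, max_suspend_time. A sketch of where it would go, keeping the other pbcast.STABLE attributes from the configuration in the first post: the value 1200000 (assumed to be milliseconds) is an arbitrary placeholder and not a recommendation, and the attribute should be checked against the documentation of the JGroups version in use.

```xml
<!-- Placeholder value: 1200000 is illustrative only; size it to the duration of your state transfers -->
<pbcast.STABLE desired_avg_gossip="20000"
               max_suspend_time="1200000"
               up_thread="true" down_thread="true" />
```

This addresses only the garbage-collection resume warning; the checkSelfInclusion() and FD messages point to members being suspected and shunned, which is a separate question.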
