17 Replies. Latest reply on Oct 19, 2005 10:24 AM by Bela Ban

    [ERROR] NAKACK.handleXmitReq()

    Sacha Labourey Master

      All JBoss versions prior to 3.2.4 can generate this kind of clustering exception under load:

      2004-03-28 02:47:27,450 DEBUG [org.javagroups.DefaultPartition] [Sun Mar 28 02:47:27 EST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=choqtap4:32794 (additional data: 19 bytes)) message with seqno=0 not found in sent_msgs ! sent_msgs=18 17 16 15 14 13 12 11 10 9 8 7 6 5


      This is due to an error in the default JGroups protocol configuration. You can fix it by editing the file deploy/cluster-service.xml and reordering the UNICAST and pbcast.STABLE protocols. The ClusterPartition MBean then becomes:

      <mbean code="org.jboss.ha.framework.server.ClusterPartition"
             name="jboss:service=DefaultPartition">

        <!-- Name of the partition being built -->
        <attribute name="PartitionName">DefaultPartition</attribute>
        <!-- Determine if deadlock detection is enabled -->
        <attribute name="DeadlockDetection">False</attribute>
        <!-- The JGroups protocol configuration -->
        <attribute name="PartitionConfig">
          <Config>
            <!-- UDP: if you have a multihomed machine,
                 set the bind_addr attribute to the appropriate NIC IP address -->
            <!-- UDP: On Windows machines, because of the media sense feature
                 being broken with multicast (even after disabling media sense),
                 set the loopback attribute to true -->
            <UDP mcast_addr="228.1.2.3" mcast_port="45566"
                 ip_ttl="32" ip_mcast="true"
                 mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
                 ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
                 loopback="false" />
            <PING timeout="2000" num_initial_members="3"
                  up_thread="true" down_thread="true" />
            <MERGE2 min_interval="10000" max_interval="20000" />
            <FD shun="true" up_thread="true" down_thread="true"
                timeout="2500" max_tries="5" />
            <VERIFY_SUSPECT timeout="3000" num_msgs="3"
                            up_thread="true" down_thread="true" />
            <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
                           max_xmit_size="8192"
                           up_thread="true" down_thread="true" />
            <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
                     down_thread="true" />
            <pbcast.STABLE desired_avg_gossip="20000"
                           up_thread="true" down_thread="true" />
            <FRAG frag_size="8192"
                  down_thread="true" up_thread="true" />
            <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                        shun="true" print_local_addr="true" />
            <pbcast.STATE_TRANSFER up_thread="true" down_thread="true" />
          </Config>
        </attribute>

      </mbean>
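      In other words, the fix swaps two entries inside <Config>. A sketch trimmed to the two protocols concerned; the "before" order is inferred from the description above, and the attribute values are copied from the full listing:

      ```xml
      <!-- Before (broken default): pbcast.STABLE listed ahead of UNICAST -->
      <pbcast.STABLE desired_avg_gossip="20000"
                     up_thread="true" down_thread="true" />
      <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
               down_thread="true" />

      <!-- After (fixed): UNICAST first, then pbcast.STABLE -->
      <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
               down_thread="true" />
      <pbcast.STABLE desired_avg_gossip="20000"
                     up_thread="true" down_thread="true" />
      ```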



      Cheers,


      Sacha


        • 1. Questions about functionality
          Sacha Labourey Master

          We are looking into deploying a caching solution into a JBoss server, so JBossCache is an obvious candidate. It looks like it does 90%+ of what we want. I have a few questions about the remaining functionality.

          1) We would like to assign a timeToIdleSeconds when we put something in the cache (programmatically, not as a global setting from the .xml). Is this possible? Can I get the RegionManager after I insert the new entry into the cache and add a new region that only affects my new entry?

          2) Can JBossCache keep statistics? In order to properly set idleTimeToLive/maxEntries etc., we want to look at how many times a particular node was successfully retrieved from the cache, as well as how many times a retrieval was attempted for an entry that had already been evicted (or was never there).

          If this functionality does not already exist, are there hooks where I could write my own implementations (and of course submit the changes back)?

          -Thanks
          -Kevin
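          Pending an answer, the hit/miss counting from question 2 can be prototyped as a thin wrapper around the cache calls. A minimal sketch in Java; the SimpleCache interface and all names here are hypothetical stand-ins, not the JBossCache API:

          ```java
          import java.util.concurrent.atomic.AtomicLong;

          /** Hypothetical cache facade; real code would delegate to the actual cache. */
          interface SimpleCache {
              Object get(String fqn, Object key);
              void put(String fqn, Object key, Object value);
          }

          /** Counts lookups that found a value vs. lookups that found nothing
               (entry evicted, or never inserted). */
          class StatisticsCache implements SimpleCache {
              private final SimpleCache delegate;
              private final AtomicLong hits = new AtomicLong();
              private final AtomicLong misses = new AtomicLong();

              StatisticsCache(SimpleCache delegate) { this.delegate = delegate; }

              public Object get(String fqn, Object key) {
                  Object value = delegate.get(fqn, key);
                  (value != null ? hits : misses).incrementAndGet();
                  return value;
              }

              public void put(String fqn, Object key, Object value) {
                  delegate.put(fqn, key, value);
              }

              long hits()   { return hits.get(); }
              long misses() { return misses.get(); }
          }
          ```

          Hooking such a wrapper in at the application's cache-access layer keeps the statistics independent of the cache implementation.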

          • 2. Re: [ERROR] NAKACK.handleXmitReq()
            Sacha Labourey Master

            Thanks Sacha for your quick response.

            I have modified the cluster-service.xml file under the JBOSSHOME/server/all/deploy directory and cross-checked it twice.

            Regards,
            Monu


            • 3. Re: [ERROR] NAKACK.handleXmitReq()
              georgel Newbie

              Hi,

              I tried what you/Bela suggested in the other thread,

              http://jboss.org/index.html?module=bb&op=viewtopic&t=47563

              Server is JBoss 3.2.1, Linux, Java 1.4.2-b28. Got the following errors, and went back to the old configuration. Should your fix work for 3.2.1 or do I get to upgrade?

              thanks!

              George

              13:02:42,729 INFO [MainDeployer] Starting deployment of package: file:/home/jboss/jboss-3.2.1/server/all/deploy/cluster-service.xml
              13:02:42,954 INFO [ClusterPartition] Creating
              13:02:43,059 INFO [STDOUT] Thu Apr 01 13:02:43 CST 2004 Listening for connections ...
              13:02:43,324 ERROR [ClusterPartition] Initialization failed
              ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above
               at org.javagroups.JChannel.<init>(JChannel.java:141)
               at org.jboss.ha.framework.server.ClusterPartition.createService(ClusterPartition.java:202)
               at org.jboss.system.ServiceMBeanSupport.create(ServiceMBeanSupport.java:158)
               at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
               at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
              


              13:02:43,327 WARN [ServiceController] Problem creating service jboss:service=DefaultPartition
              ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above
               at org.javagroups.JChannel.<init>(JChannel.java:141)
               at org.jboss.ha.framework.server.ClusterPartition.createService(ClusterPartition.java:202)
               at org.jboss.system.ServiceMBeanSupport.create(ServiceMBeanSupport.java:158)
              



              Depends On Me: , ObjectName: jboss:service=DefaultPartition
               state: FAILED
               I Depend On:
               Depends On Me: jboss:service=HASessionState
               jboss:service=HAJNDI
               jboss.cache:service=InvalidationBridge,type=JavaGroups
               jboss:service=FarmMember,partition=DefaultPartition
               jboss.j2ee:service=EJB,jndiName=clustering/HTTPSession
              ChannelException: JChannel(): java.lang.Exception: Configurator.sanityCheck(): event GET_DIGEST is required by STABLE, but not provided by any of the layers above, ObjectName: jboss:service=HASessionState
               state: CONFIGURED
               I Depend On: jboss:service=DefaultPartition
              


              • 4. Re: [ERROR] NAKACK.handleXmitReq()
                Tameshwar Sahu Newbie

                Hi Sacha,

                I am still getting the same error after incorporating the changes you suggested in cluster-service.xml. The error

                2004-03-28 02:47:27,450 DEBUG [org.javagroups.DefaultPartition] [Sun Mar 28 02:47:27 EST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=choqtap4:32794 (additional data: 19 bytes)) message with seqno=0 not found in sent_msgs ! sent_msgs=18 17 16 15 14 13 12 11 10 9 8 7 6 5

                appears after the other node in the cluster reports a suspected member; JBoss then starts throwing the "message seqno=" error. It occurs under high load and produces a 20 MB server.log file every 3 minutes.
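                The behaviour described above is consistent with a retransmission buffer whose oldest entries have already been purged: NAKACK keeps only a bounded window of sent messages (governed by the gc_lag attribute in the stack configuration), so a retransmit request for a sequence number below that window cannot be served. A toy illustration of the mechanism, not JGroups code:

                ```java
                import java.util.SortedMap;
                import java.util.TreeMap;

                /** Toy retransmission buffer: keeps at most maxLag recent messages,
                     purging older seqnos the way stability garbage collection would. */
                class SentBuffer {
                    private final SortedMap<Long, String> sent = new TreeMap<>();
                    private final int maxLag;

                    SentBuffer(int maxLag) { this.maxLag = maxLag; }

                    void send(long seqno, String msg) {
                        sent.put(seqno, msg);
                        // Drop everything older than the retention window.
                        while (sent.size() > maxLag) {
                            sent.remove(sent.firstKey());
                        }
                    }

                    /** Null plays the role of "seqno not found in sent_msgs". */
                    String retransmit(long seqno) {
                        return sent.get(seqno);
                    }
                }
                ```

                A node that rejoins and asks for seqno 0 after the sender has long since purged it gets nothing back, which is exactly what the log reports.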


                Regards,
                Monu

                • 5. Re: [ERROR] NAKACK.handleXmitReq()
                  Tameshwar Sahu Newbie

                  Thanks Sacha for your quick response.

                   I have modified the cluster-service.xml file under the JBOSSHOME/server/all/deploy directory and cross-checked it twice.

                  Regards,
                  Monu


                  • 6. 3828852
                    Tameshwar Sahu Newbie

                    Hi Sacha/Bela,

                    I have copied part of the server.log from my JBoss instances running in cluster mode. I have two JBoss 3.2.2 instances running in a cluster. Each instance runs under heavy load and writes/reads/removes data to/from the distributed state for each request.

                    Server.log from Node 1 :

                    a) After 2-3 hours of testing, the error below starts appearing:
                    2004-04-02 11:07:46,490 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:46 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
                    2004-04-02 11:07:46,728 INFO [STDOUT] [SOAK Queuing ] Connter : 444000
                    2004-04-02 11:07:48,984 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:48 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
                    2004-04-02 11:07:49,349 INFO [STDOUT] [SOAK Queuing ] Connter : 444100
                    2004-04-02 11:07:51,466 INFO [STDOUT] [SOAK Queuing ] Connter : 444200
                    2004-04-02 11:07:51,494 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:51 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
                    2004-04-02 11:07:53,670 INFO [STDOUT] [SOAK Queuing ] Connter : 444300
                    2004-04-02 11:07:54,004 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:07:54 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
                    2004-04-02 11:08:09,539 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:09 CEST 2004] [ERROR] FD.Monitor.run(): ping_dest is null
                    2004-04-02 11:08:09,829 INFO [org.jboss.ha.framework.interfaces.HAPartition.XSAM_Partition_cluster_GlobeTesting] Suspected member: jfk:63378 (additional data: 19 bytes)
                    b) Then one node is shown as dead, although the other node is still running:
                    2004-04-02 11:08:10,220 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.XSAM_Partition_cluster_GlobeTesting] New cluster view (id: 2, delta: -1) : [172.16.11.138:21099]
                    2004-04-02 11:08:10,224 INFO [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] Dead members: 1
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] trying to remove deadMember 172.16.11.145:21099 for key HAJNDI
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] 172.16.11.145:21099 was removed
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifyKeyListeners
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifying 1 listeners for key change: HAJNDI
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] trying to remove deadMember 172.16.11.145:21099 for key DCacheBridge-DefaultJGBridge
                    2004-04-02 11:08:10,225 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] 172.16.11.145:21099 was removed
                    2004-04-02 11:08:10,226 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifyKeyListeners
                    2004-04-02 11:08:10,226 DEBUG [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] notifying 1 listeners for key change: DCacheBridge-DefaultJGBridge
                    2004-04-02 11:08:10,274 INFO [STDOUT] [SOAK Queuing ] Connter : 444400
                    2004-04-02 11:08:13,349 INFO [STDOUT] [SOAK Queuing ] Connter : 444500
                    2004-04-02 11:08:16,127 INFO [STDOUT] [SOAK Queuing ] Connter : 444600
                    2004-04-02 11:08:18,844 INFO [STDOUT] [SOAK Queuing ] Connter : 444700
                    2004-04-02 11:08:21,708 INFO [STDOUT] [SOAK Queuing ] Connter : 444800
                    c) Then the machine comes back into the cluster:
                    2004-04-02 11:08:24,732 INFO [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.XSAM_Partition_cluster_GlobeTesting] New cluster view (id: 3, delta: 1) : [172.16.11.138:21099, 172.16.11.145:65263]
                    2004-04-02 11:08:24,733 INFO [XSAM_Partition_cluster_GlobeTesting:ReplicantManager] Dead members: 0
                    2004-04-02 11:08:24,770 INFO [STDOUT] [SOAK Queuing ] Connter : 444900
                    d) Then this exception:
                    2004-04-02 11:08:24,956 ERROR [org.jboss.ha.framework.interfaces.HAPartition.XSAM_Partition_cluster_GlobeTesting] GetState failed
                    java.util.ConcurrentModificationException
                    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:762)
                    at java.util.HashMap$EntryIterator.next(HashMap.java:804)
                    at java.util.HashMap.writeObject(HashMap.java:956)
                    at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source)
                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                    at java.lang.reflect.Method.invoke(Method.java:324)
                    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
                    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
                    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
                    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
                    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
                    at java.util.HashMap.writeObject(HashMap.java:958)
                    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                    at java.lang.reflect.Method.invoke(Method.java:324)
                    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
                    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
                    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
                    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
                    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
                    at java.util.HashMap.writeObject(HashMap.java:958)
                    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
                    at java.lang.reflect.Method.invoke(Method.java:324)
                    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:795)
                    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1294)
                    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1245)
                    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1052)
                    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:278)
                    at org.jboss.ha.framework.server.HAPartitionImpl.objectToByteBuffer(HAPartitionImpl.java:146)
                    at org.jboss.ha.framework.server.HAPartitionImpl.getState(HAPartitionImpl.java:346)
                    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passUp(MessageDispatcher.java:462)
                    at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:292)
                    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:515)
                    at org.jgroups.JChannel.up(JChannel.java:860)
                    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:314)
                    at org.jgroups.stack.ProtocolStack.receiveUpEvent(ProtocolStack.java:330)
                    at org.jgroups.stack.Protocol.passUp(Protocol.java:470)
                    at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:103)
                    at org.jgroups.stack.UpHandler.run(Protocol.java:55)





                    Server.log from Node 2:

                    a) First it throws the error below, which continues for some time:
                    2004-04-02 11:08:22,201 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] UDP.down(): dest address of message is null, and sending to default address fails as mcast_addr is null, too ! Discarding message MethodCall (name=DistributedState._set, number of args=3)
                    Args:
                    [XSAMTransactionLog (java.lang.String)]
                    [0123456789:020404112151 (java.lang.String)]
                    [{MT_MSG_IDS=[Ljava.lang.String;@123f331, MT_NO_OF_MSG=1, MT_TIMESTAMP=1080896902190} (java.util.TreeMap)]
                    2004-04-02 11:08:22,213 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
                    2004-04-02 11:08:22,219 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
                    2004-04-02 11:08:22,221 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
                    2004-04-02 11:08:22,227 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:22 CEST 2004] [ERROR] GroupRequest.execute(): both corr and transport are null, cannot send group request
                    …
                    b) Then the error below; after some time the "message with seqno=23 not found in" errors continue:
                    2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
                    2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
                    2004-04-02 11:08:24,840 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:24 CEST 2004] [ERROR] MessageDispatcher.up(): corr == null
                    2004-04-02 11:08:25,395 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:25 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=22 not found in sent_msgs ! sent_msgs=94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 102 101 100 99 98 97 96 95
                    2004-04-02 11:08:25,396 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:25 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=23 not found in sent_msgs ! sent_msgs=94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 102 101 100 99 98 97 96 95
                    …
                    2004-04-02 11:08:26,004 DEBUG [org.javagroups.XSAM_Partition_cluster_GlobeTesting] [Fri Apr 2 11:08:26 CEST 2004] [ERROR] NAKACK.handleXmitReq(): (requester=jfk:65263) message with seqno=34 not found in sent_msgs ! sent_msgs=



                    Please give your suggestions for resolving this issue. Once the "with seqno=34 not found in sent_msgs ! sent_msgs=" errors start, JBoss creates a 20 MB server.log file every 3-4 minutes.


                    Thanks in advance,
                    --monu

                    • 7. Re: [ERROR] NAKACK.handleXmitReq()
                      xsam_jboss_user Newbie

                      To add to Monu's comment: the problem above occurs when we run our
                      application under load. Our application uses the distributed state extensively;
                      it holds about 300 to 400 thousand entries.

                      • 8. Re: [ERROR] NAKACK.handleXmitReq()
                        Sacha Labourey Master

                        The ConcurrentModificationException is not a big deal; what is strange is the NAKACK error with the updated stack.

                        Can you try with a 3.2.4RC?

                        Bela?

                        • 9. Re: [ERROR] NAKACK.handleXmitReq()
                          Tameshwar Sahu Newbie

                          Hi Sacha/Bela,

                          It would be nice if you could release a fix for JBoss 3.2.2 itself; switching to an RC version of JBoss might not be a good choice for production.

                          Thanks,
                          Monu

                          • 10. Re: [ERROR] NAKACK.handleXmitReq()
                            Bela Ban Master

                            Does the NAKACK error only occur when the 2nd node is shunned, forced to leave the cluster, and then rejoins? Does this happen *immediately* after rejoining (and failing to get the state)?

                            Bela

                            • 11. Re: [ERROR] NAKACK.handleXmitReq()
                              xsam_jboss_user Newbie

                              Thanks much, Sacha. We tried the gc_lag change and the problem still seems to be there. It shows up after about 10 hours; I think it starts as soon as the other node is shunned out of the cluster.
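                              (For readers following along: the gc_lag change referred to is presumably the gc_lag attribute on pbcast.NAKACK in cluster-service.xml, which bounds how many delivered messages are retained for retransmission. A sketch with an illustrative larger value; tune for your own load:)

                              ```xml
                              <pbcast.NAKACK gc_lag="200" retransmit_timeout="300,600,1200,2400,4800"
                                             max_xmit_size="8192"
                                             up_thread="true" down_thread="true" />
                              ```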

                              • 12. Re: [ERROR] NAKACK.handleXmitReq()
                                Bela Ban Master

                                Can you try this with the latest JGroups and trace-level logging enabled for org.jgroups.*?

                                This may not solve the problem by itself, but the trace would let me narrow it down.

                                Bela

                                • 13. Re: [ERROR] NAKACK.handleXmitReq()
                                  Bela Ban Master

                                  Okay, I fixed the ConcurrentModificationException; the fix is in head and 3.2.

                                  I also think I fixed the NAKACK problem (in JGroups head). Anyone who wants to try this out, please check out the latest JGroups CVS and give it a shot.

                                  If this turns out to fix the bug, I'll create a new jgroups.jar and update JBoss head and 3.2.

                                  Bela

                                  • 14. Re: [ERROR] NAKACK.handleXmitReq()
                                    Su V Newbie

                                    We copied the latest JGroups jars. I am sure we have the proper configuration in cluster-service.xml and tc5-cluster-service.xml, but we are seeing the following errors. I assume this happens when one server gets shunned and tries to come back.


                                    The cluster is perfectly fine for 4-5 hours. After that, it logs the following WARN and ERROR messages and exceptions:

                                    Server1:
                                    Error:
                                    2005-08-12 00:02:59,687 23678836 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|28] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view
                                    2005-08-12 00:10:51,445 24150594 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)
                                    2005-08-12 00:10:51,446 24150595 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|30] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view
                                    2005-08-12 00:11:07,461 24166610 WARN [org.jgroups.protocols.pbcast.CoordGmsImpl] (MergeTask thread:) merge responses from subgroup coordinators <= 1 ([]). Cancelling merge
                                    2005-08-12 00:12:14,271 24233420 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)
                                    2005-08-12 00:12:14,274 24233423 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.174:7800 (additional data: 16 bytes) is not a member of view [10.38.9.176:7800 (additional data: 16 bytes)|32] [10.38.9.176:7800 (additional data: 16 bytes)]; discarding view




                                    Server2
                                    Error 1:
                                    2005-08-12 01:05:31,093 27097034 WARN [org.jgroups.protocols.pbcast.GMS] (UpHandler (GMS):) checkSelfInclusion() failed, 10.38.9.176:7800 (additional data: 16 bytes) is not a member of view [10.38.9.174:7800 (additional data: 16 bytes)|44] [10.38.9.174:7800 (additional data: 16 bytes)]; discarding view


                                    Error 2:
                                    2005-08-12 02:16:18,736 31344677 WARN [org.jgroups.protocols.pbcast.STABLE] (TimeScheduler.Thread:) ResumeTask resumed message garbage collection - this should be done by a RESUME_STABLE event; check why this event was not received (or increase max_suspend_time for large state transfers)
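                                    The STABLE warning itself points at a possible remedy: for large state transfers, raising max_suspend_time on pbcast.STABLE gives the transfer more time before message garbage collection resumes. A sketch of the attribute in cluster-service.xml; the value shown is an illustrative guess, to be tuned for your state size:

                                    ```xml
                                    <pbcast.STABLE desired_avg_gossip="20000"
                                                   max_suspend_time="60000"
                                                   up_thread="true" down_thread="true" />
                                    ```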


                                    Error 3:
                                    2005-08-12 03:44:26,673 36632614 WARN [org.jgroups.protocols.FD] (UpHandler (FD):) I was suspected, but will not remove myself from membership (waiting for EXIT message)


                                    2005-08-12 09:47:12,896 58398837 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)
                                    2005-08-12 09:47:19,752 58405693 WARN [org.jgroups.protocols.pbcast.CoordGmsImpl] (UpHandler (GMS):) merge already in progress, discarded MERGE event
                                    2005-08-12 09:47:19,753 58405694 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)
                                    2005-08-12 09:48:57,825 58503766 WARN [org.jgroups.protocols.pbcast.NAKACK] (UpHandler (NAKACK):) 10.38.9.176:7800 (additional data: 16 bytes)] discarded message from non-member 10.38.9.174:7801 (additional data: 16 bytes)




                                    Any help is greatly appreciated.

                                    Thank you!
