2 Replies Latest reply on Apr 25, 2012 1:12 PM by Lin Ye

    Error message from JGroups

    Lin Ye Novice

      I upgraded to Infinispan 5.1.3.FINAL and now get the following error message in the log intermittently. T00696119-60477 is the current node, which reports the error, and T00696119-53825 is a new node joining the cluster. It looks like the current node received messages from the new node before the new node had joined the view, and rejected them. Is there a way to avoid this error?

      20:37:05,154 | ERROR | OOB-15,null  | UNICAST                      | 157 - com.ge.energy.ssi.core.datagrid.core - 2.0.0.rc1 | T00696119-60477: sender window for T00696119-53825 not found
      20:37:05,247 | INFO  | Incoming-8,null  | JGroupsTransport             |  -  -  | ISPN000094: Received new cluster view: [T00696119-60477|57] [T00696119-60477, T00696119-53825]
        • 1. Re: Error message from JGroups
          Bela Ban Master

          I don't suppose you can reproduce this, can you?


          Without more information, here's why this could happen: let's call the members A and B.

          - A sends a few unicasts to B

          - B receives them, but the first unicast from A is dropped and will later get retransmitted

          - B drops the unicast and asks A for its first unicast

          - A receives the request and tries to furnish the first unicast, but meanwhile the table was dropped. This can happen on a view change (in which B was not a member)

          - A logs the error message (I changed this to a WARN, as it shouldn't affect the system)

          - When A sends another unicast to B, it will re-establish the connection; this time both A and B will have the same connection-id
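
The sequence above can be sketched as a toy model. This is not JGroups code and does not use the JGroups API: `SenderWindowSketch`, `Node`, `SendWindow`, `installView` and `handleRetransmitRequest` are hypothetical names chosen for illustration. The sketch assumes that a view change simply discards per-peer send state for non-members. It models only A's side.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the scenario described above; all names are hypothetical. */
public class SenderWindowSketch {

    /** Per-destination state kept by a sender: a connection id plus sent messages by seqno. */
    static final class SendWindow {
        final long connId;
        final Map<Long, String> sent = new HashMap<>();
        long nextSeqno = 1;

        SendWindow(long connId) { this.connId = connId; }
    }

    static final class Node {
        final String name;
        final Map<String, SendWindow> sendTable = new HashMap<>();
        final List<String> log = new ArrayList<>();
        long nextConnId = 1;

        Node(String name) { this.name = name; }

        /** Sends a unicast; creates a window with a fresh connection id if none exists. */
        long send(String dest, String payload) {
            SendWindow win = sendTable.computeIfAbsent(dest, d -> new SendWindow(nextConnId++));
            win.sent.put(win.nextSeqno++, payload);
            return win.connId;
        }

        /** Assumed view-change behaviour: drop send state for peers not in the new view. */
        void installView(Set<String> members) {
            sendTable.keySet().retainAll(members);
        }

        /** Handles a retransmit request; returns null and logs if the window is gone. */
        String handleRetransmitRequest(String requester, long seqno) {
            SendWindow win = sendTable.get(requester);
            if (win == null) {
                log.add("WARN " + name + ": sender window for " + requester + " not found");
                return null;
            }
            return win.sent.get(seqno);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        long firstConn = a.send("B", "m1");               // A sends unicasts to B
        a.send("B", "m2");
        a.installView(Collections.singleton("A"));        // view change in which B is not a member
        String resent = a.handleRetransmitRequest("B", 1); // B asks for the first unicast
        long secondConn = a.send("B", "m3");              // next send re-establishes the connection
        System.out.println(a.log);
        System.out.println("resent=" + resent + ", firstConn=" + firstConn
                + ", secondConn=" + secondConn);
    }
}
```

In this model the retransmit request finds no window, so A only logs the message. The next `send` then creates a new window with a new connection id, which corresponds to the last step in the list.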


          This *should* not affect the correctness of the system. It would be nice if you could reproduce it, though, to make sure I'm right on this...

          • 2. Re: Error message from JGroups
            Lin Ye Novice

            I couldn't reproduce this issue with a simple example. However, I did reproduce it frequently with our production system and a testing program. It happens with Infinispan 5.1.3. When I reverted to 5.1.0, it disappeared.


            I am not sure whether what you described is my case, but it may be, as it happened around the time of a view change. I don't think it affects the correctness of the system either. In which version of JGroups did you change it to a WARN? A WARN would make me more comfortable with it, and would make it easier to convince our QA team.