1 2 Previous Next 27 Replies Latest reply on Aug 17, 2010 2:59 PM by Shane Johnson

    State Transfer/Repl Queue - Out of Order

    Shane Johnson Novice

      Is it possible that the combination of in memory state transfer and a replication queue can cause out of order or dropped messages?

       

      Here is what we are seeing...

       

      • Start Node A
      • Start Node B
      • Begin Pushing Requests to A
      • A and B are in Sync
      • Restart B (While Pushing Requests to A)
      • State Transfer from A to B
      • A and B are NOT in Sync

       

      Does a state transfer (on the responding end) block the replication queue's thread?

       

      If not, is it possible that B receives a DELETE via a flush of the replication queue and then receives a PUT as part of the state transfer later? As a result it now has an entry that it should not.

       

      Or, would the replication queue flush wait until the state transfer has been sent and acknowledged.

       

      I was looking at the following code paths, but I didn't really see a common lock between these two processes. Perhaps I am not looking in the right place.

       

      State Transfer (Responding End - A)

      • JGroupsTransport.getState
      • -->InboundInvocationHandlerImpl.generateState
      • ---->StateTransferManagerImpl.generateState
      • ---->StateTransferManagerImpl.generateInMemoryState

       

      Execution of Replication Command within JGroups

      • JGroupsTransport.invokeRemotely
      • ** flushTracker.blockUntilRelease(distributedSyncTimeout, MILLISECONDS)

       

       

      Also, is B considered a 'member' of the group before the state transfer has completed? I ask this because of this snippet in the RpcManagerImpl.

       

      if (members.size() < 2)
          return Collections.emptyList()

       

      Perhaps an explanation of the locking around the state transfer and the replication queue would be of some benefit. Any references to the code would be particularly helpful to our understanding.

        • 1. Re: State Transfer/Repl Queue - Out of Order
          Shane Johnson Novice

          We have been continuing are tests, and this still seems to be an issue. I have extract a few lines from the logs of our last run. It seems more and more apparent that the slave is received commands from the master before it has even begun the state transfer.

           

          1. Master Starts
          2. Slave Starts
          3. Execute Commands on Master
          4. Slave Stops
          5. Execute Commands on Master
          6. Slave Starts
          7. Execute Commands on Master

           

          The idea is that we will restart a node while continuing to execute commands. The assumption is that both nodes will synchronize properly after the slave has restarted.

           

          *** Initial Join ***
          17:32:06,794 JGroupsTransport : Starting JGroups Channel
          17:32:07,176 JGroupsDistSync : Closing joinInProgress gate
          17:32:07,282 JGroupsDistSync : Releasing ReclosableLatch [State = 1, empty queue] gate

           

          *** Received MultipleRpcCommand ***
          17:33:07,531 CommandAwareRpcDispatcher : Attempting to execute command: MultipleRpcCommand{cacheName='A'}
          17:33:07,531 CommandAwareRpcDispatcher : Enough waiting; replayIgnored = false, sr STATE_PREEXISTED

           

          *** The cache has not been properly started yet. ***
          17:33:37,607 InboundInvocationHandlerImpl : Cache named A does not exist on this cache manager!
          17:33:37,607 CommandAwareRpcDispatcher : Unable to execute command, got invalid response

           

          *** The cache is started and the state transfer is started/completed. ***
          17:35:48,502 StateTransferManagerImpl : Initiating state transfer process
          17:35:48,507 JGroupsTransport : Received state for cache named 'A'.  Attempting to apply state.

           

           

          My interpretation of these logs remains that the slave is receiving commands from the masters queue before it has started and completed a state transfer with the master.

           

          Thoughts? This is a blocker for us.

          • 2. Re: State Transfer/Repl Queue - Out of Order
            Galder Zamarreño Master

            Shane, do you have a unit test of this where you've replicated this?

            • 3. Re: State Transfer/Repl Queue - Out of Order
              Shane Johnson Novice

              Hi Galder,

               

              I don't have a Java unit test as this is part of a broader system wide test. I can, however, try to explain it a little better.

               

              We set up a 2 node cluster in replicated mode. We consider one to be the master and one to be the slave. We then begin pumping operations to the master. All the while it is replicating to the slave via a replication queue. In the midst of these operations we will suddenly bounce the slave. The master continues to recieve operations. At some point the slave comes back, receives some operations (I believe from the master's replication queue) and discards them, then starts and finishes a state transfer from the master.

               

              We had thought this was only occurring once in a while since our counts usually matched, but on further analysis this seems to be happening all the time. It's just that depending on whether the dropped operations are deletes or not, we may or may not see a discrepency in the counts. However, they will be inconsistent and certains puts that are updates may be dropped.

               

              Shane

              • 4. Re: State Transfer/Repl Queue - Out of Order
                Galder Zamarreño Master

                Shane, I've built a small test case (attached) of what you're saying and haven't been able to replicate what you're saying (at least in latest 4.1.x - very close to CR2).

                 

                The overall design of how state transfer works in Infinispan is explained in http://community.jboss.org/wiki/JBossCacheNon-BlockingStateTransfer. Basically, what should be happening is that once the 2nd node gets restarted, any writes from the 1st node should be logged and applied once the state has been transfered. Replication queue shouldn't be any different cos at the end of the day is uses the same underlying mechanism to send stuff except that it sends them in batches.

                 

                Maybe you can enhance the test to better match what you're seeing in your environment?

                 

                What does you config look like btw?

                • 5. Re: State Transfer/Repl Queue - Out of Order
                  Shane Johnson Novice

                  Galder,

                   

                  I'll take a look at your test and upload our configuration here shortly.

                   

                  Setting aside the replication queue/state transfer, perhaps I'll narrow it down to how are we receiving commands before the cache is even ready?

                   

                  Shane

                  • 6. Re: State Transfer/Repl Queue - Out of Order
                    Shane Johnson Novice

                    Here are both the JGroups and Infinispan configuration files that we are using.

                    • 7. Re: State Transfer/Repl Queue - Out of Order
                      Galder Zamarreño Master

                      You have defined several caches in the config, which is the one used by the test that's failing for you?

                      • 8. Re: State Transfer/Repl Queue - Out of Order
                        Shane Johnson Novice

                        It is happening to all of them to varying degrees. One cache receives much more traffic and I might see 5 dropped commands. Another receives a little less and we saw just one command dropped. One of them receives very few commands so I think it is usually alright.

                        • 9. Re: State Transfer/Repl Queue - Out of Order
                          Shane Johnson Novice

                          What we're seeing is that there is a tiny window between when the node joins and when the writes are logged. I keep feeling like maybe the logging applies to writes that haven't yet been placed in the repl queue and that maybe we're getting a few commands sent via a timed flush before the write logging starts.

                          • 10. Re: State Transfer/Repl Queue - Out of Order
                            Galder Zamarreño Master

                            Shane, if you have TRACE logs from both nodes with the issue fully replicated, why don't u attach them all? Maybe reading them I can find out what's wrong, particularly if you could point to the missing modification in the other node.

                            • 11. Re: State Transfer/Repl Queue - Out of Order
                              Shane Johnson Novice

                              We performed another test today, and we performed a more thorough analysis. It looks like we had the right idea with respect to out of order commands and the replication queue, but it had nothing to do with the join / state transfer process as we had suspected earlier. I suppose it receives some commands prior to the cache being created and the state transfer taking place, but they are appropriately dropped and that is fine. The state transfer would take care of things as it should.

                               

                              What we found was that our slave had exactly one more entry than the master. We then dumped the keys and located the extra entry. Next, we looked at the logs to find the operations on this specific key. Here is what we found.


                              Master

                               

                              2010-08-05 11:38:21,028 // Put Key: 342197608235468 (OrderCache)

                              2010-08-05 11:38:21,230 // Remove Key: 342197608235468 (OrderCache)

                              2010-08-05 11:38:23,784 // State Transfer Response (OrderCache)

                               

                              Slave

                               

                              2010-08-05 11:38:23,782 // State Transfer Request (OrderCache)

                              2010-08-05 11:38:34,529 // Remove Key: 342197608235468 (OrderCache) *Request 693*

                              2010-08-05 11:46:08,845 // Put Key: 342197608235468 (OrderCache) * Request 692*

                               

                              It looks like the master sends a PUT followed by a REMOVE. Then the state transfer takes place. However, the slave receives the PUT and REMOVE after the state transfer AND out of order. As a result, it now has an entry that it should not.

                               

                              I have attached a couple of the cleaned up logs. I'll check to see if I can get the full TRACE logs up. I may have to email them to you directly.

                              • 12. Re: State Transfer/Repl Queue - Out of Order
                                Galder Zamarreño Master

                                Shane, thanks for the latest info. This is important stuff in trying to replicate the issue. My previous test worked on different keys, so would not have detected this type of issues. I've improved my test to do a put/remove and verify that there're no contents at the end of the test but the test still passes. Could you email me the full (uncleaned) TRACE logs to galder jboss org?

                                • 13. Re: State Transfer/Repl Queue - Out of Order
                                  Galder Zamarreño Master

                                  First of all, I've created https://jira.jboss.org/browse/ISPN-577 so that you can add further details of the issue.

                                   

                                  Secondly, IIRC, yesterday Craig said on IRC that you're using CR1. This week I've fixed an issue with the order in which replication queue and underlying transport is closed (see https://jira.jboss.org/browse/ISPN-551). I wonder whether the reason I'm unable to replicate it is cos this issue has been fixed in 4.1.x branch, where I'm working on. So, I'd suggest that you guys try out with the latest snapshot that we published this morning (4.1.0-20100806.082729). To use it, simply update your maven pom.xml to use 4.1.0-SNAPSHOT version.

                                  1 2 Previous Next