8 Replies Latest reply on Jan 28, 2012 5:41 PM by manik

    Write Skew issue (versioning)




      I think I have spotted a problem with the write skew check implementation based on versioning.


      I've made this test to confirm:


      I have a global counter that is incremented concurrently by two different nodes, running ISPN with Repeatable Read isolation and write skew checking enabled. I expected each successful transaction to commit a different value.


      In detail, each node does the following:



      Integer count = cache.get("counter");

      count = count + 1;

      cache.put("counter", count);



      To rule out a mistake on my side, I ran this test on two ISPN versions: 5.1.0.CR4 and 5.0.1.Final. On 5.0.1.Final it works as expected. On 5.1.0.CR4, however, I get a lot of repeated values. After a first look at the code, my impression is that the version numbers of the keys on which the write skew check should run are not sent with the prepare command.
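The expectation behind this test can be stated as a property: no value may be committed twice. A minimal single-JVM sketch of that property follows. It is not the actual test, which ran on two Infinispan nodes; the class CounterUniquenessSketch and its method are hypothetical, and AtomicInteger.compareAndSet is used only as a stand-in for a write skew check that works.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the counter test; hypothetical names, no Infinispan involved. */
class CounterUniquenessSketch {

    /** Runs two workers and returns how many committed values were seen more than once. */
    static int countDuplicates(int attemptsPerWorker) {
        AtomicInteger counter = new AtomicInteger(0);
        Set<Integer> committed = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger(0);

        Runnable worker = () -> {
            for (int i = 0; i < attemptsPerWorker; i++) {
                int read = counter.get();                   // "get": read the counter
                // Stand-in for the write skew check: the write succeeds only
                // if the counter still holds the value this worker read.
                if (counter.compareAndSet(read, read + 1)) {
                    if (!committed.add(read + 1)) {         // value already committed?
                        duplicates.incrementAndGet();
                    }
                }
                // A failed compareAndSet models an aborted transaction.
            }
        };

        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return duplicates.get();
    }

    public static void main(String[] args) {
        System.out.println("duplicates: " + countDuplicates(10_000));
    }
}
```

With a working check the duplicate count stays at zero; the repeated values reported above correspond to a non-zero count.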


      The ISPN config file can be found here: http://pastebin.com/UCxGXw3K



      Pedro Ruivo

        • 1. Re: Write Skew issue (versioning)

          Hi Pedro. 


          I don't understand how this could have worked in 5.0.x, since write skew checks in a cluster were not supported until 5.1.


          Are you testing local mode?




          • 2. Re: Write Skew issue (versioning)

            Either way, there shouldn't be duplicate counter values, right?

            • 3. Re: Write Skew issue (versioning)



              I'm testing in replicated mode (full replication).


              In 5.0.x it works because of the locking scheme. In more detail, two cases can happen (each shown as a list of events):


              1) write skew is detected:


              localTx reads "counter" and gets the value x

              remote prepare (remoteTx) is received

              remoteTx acquires lock on "counter"

              localTx tries to acquire lock on "counter"

              remoteTx updates "counter" to x+1

              remoteTx releases the lock

              localTx acquires the lock

              localTx detects that "counter"'s value is x+1 and aborts (see [1])


              2) deadlock/timeout acquiring the locks


              localTx reads "counter" and gets the value x

              localTx acquires the lock on "counter"

              remote prepare (remoteTx) is received

              remoteTx tries to acquire lock on "counter"


              deadlock is detected (or a timeout is triggered)
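The two 5.0.x cases listed above can be modeled in plain Java. This is only a toy single-JVM sketch of the described event order, not Infinispan code: the class LockThenCompareSketch, its methods, and the timeout parameter are hypothetical, and a ReentrantLock stands in for the per-key lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of "lock the key, then compare with the value read"; hypothetical names. */
class LockThenCompareSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile int counter = 0;

    /** Reads the counter without taking the lock. */
    int read() {
        return counter;
    }

    /** Case 2 if the lock cannot be acquired in time; case 1 if the value changed since the read. */
    void write(int valueRead, int newValue, long timeoutMillis) {
        boolean locked;
        try {
            locked = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while acquiring lock", e);
        }
        if (!locked) {
            throw new IllegalStateException("Timed out acquiring lock");
        }
        try {
            if (counter != valueRead) {
                // Another transaction updated the key after this one read it.
                throw new IllegalStateException("Detected write skew");
            }
            counter = newValue;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        LockThenCompareSketch key = new LockThenCompareSketch();
        int localRead = key.read();                 // localTx reads x
        int remoteRead = key.read();                // remoteTx reads x
        key.write(remoteRead, remoteRead + 1, 100); // remoteTx updates to x+1
        try {
            key.write(localRead, localRead + 1, 100);
        } catch (IllegalStateException e) {
            System.out.println("localTx aborted: " + e.getMessage());
        }
    }
}
```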


              For 5.1.x, I was expecting behavior like this:


              localTx reads "counter" and gets the value x (version y)

              remote prepare (remoteTx) is received and updates the "counter" to x+1 (version y+1)

              localTx sends the prepare command and the coordinator performs the write skew check


              The coordinator detects that the read version (y) is different from the actual version (y+1) and aborts the transaction


              This is my "definition" of write skew.
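The version-based behavior expected for 5.1.x can be sketched the same way. This is a minimal single-JVM model under stated assumptions, not the Infinispan implementation: the Coordinator and Versioned classes are hypothetical, versions are plain long counters, and prepare and commit are collapsed into one step.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a version-based write skew check; hypothetical names, not Infinispan code. */
class VersionedWriteSkewSketch {

    /** A value paired with the version it was written at. */
    static final class Versioned {
        final int value;
        final long version;

        Versioned(int value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Hypothetical stand-in for the node that validates prepare commands. */
    static final class Coordinator {
        private final Map<String, Versioned> data = new HashMap<>();

        /** Returns the current entry, or value 0 at version 0 if the key is absent. */
        synchronized Versioned read(String key) {
            return data.getOrDefault(key, new Versioned(0, 0L));
        }

        /** Applies the write only if the version the transaction read is still current. */
        synchronized boolean prepareAndCommit(String key, long versionRead, int newValue) {
            long current = read(key).version;
            if (current != versionRead) {
                return false; // write skew: the key changed after it was read
            }
            data.put(key, new Versioned(newValue, current + 1));
            return true;
        }
    }

    public static void main(String[] args) {
        Coordinator c = new Coordinator();
        Versioned localRead = c.read("counter");  // localTx reads x (version y)
        Versioned remoteRead = c.read("counter"); // remoteTx reads x (version y)
        boolean remoteOk = c.prepareAndCommit("counter", remoteRead.version, remoteRead.value + 1);
        boolean localOk = c.prepareAndCommit("counter", localRead.version, localRead.value + 1);
        System.out.println("remote committed: " + remoteOk + ", local committed: " + localOk);
    }
}
```

The check only works if the version read by the transaction travels with the prepare; if it is missing, the coordinator has nothing to compare against the current version.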






              [1] in RepeatableReadEntry

              if (actualValue != null && actualValue != value) {


                throw new CacheException("Detected write skew");

              }


              • 4. Re: Write Skew issue (versioning)

                No, in 5.0.x you may still get dupes.

                • 5. Re: Write Skew issue (versioning)

                  BTW is this unit test in a form that can be added to the Infinispan codebase?  If you could fork the project and create a pull request with a commit containing the test that would be great.

                  • 6. Re: Write Skew issue (versioning)

                    No. The code was implemented in a modified version of radargun... However, I can try to implement it as a unit test this weekend if you are interested.


                    How hard is it to implement a unit test?

                    • 7. Re: Write Skew issue (versioning)

                      I have made a pull request with the test case. This is the first time I have created a test case and a pull request, so if anything is wrong, please let me know.




                      • 8. Re: Write Skew issue (versioning)

                        Thanks for the test case.  I've incorporated this into Infinispan's test suite.  The bug is documented here and fixed here.