8 Replies Latest reply on Nov 15, 2010 11:23 AM by infinikli

Explicit locking in DIST_SYNC mode

infinikli Nov 12, 2010 12:00 PM

Hello everybody.

We're currently evaluating Infinispan as a data grid. Because we need to control access to shared objects, we need a locking mechanism. The idea is to lock data access in the following manner:

    public <K,V> boolean performLockedCacheEntryAction(ItemAction<V> action, K key) throws NotSupportedException, SystemException {
        TransactionManager tm = null;
        try {
            Cache<K, V> c = cacheManager.getCache();
            /*
             * Obtain TransactionManager.
             */
            tm = c.getAdvancedCache().getTransactionManager();

            /*
             * Begin transaction.
             */
            tm.begin();
            /*
             * Lock key clusterwide to avoid concurrent access on the action.
             */
            c.getAdvancedCache().lock(key);

            /*
             * Perform action.
             */
            boolean changedState = action.doAction();
            if(changedState){
                /*
                 * Put object into cache when state has changed.
                 */
                c.put(key, action.getItem());
            }
            /*
             * Commit transaction and release locks.
             */
            tm.commit();
            return changedState;
        } catch (Throwable t) {
            logger.error(t,t);
            if(tm!=null && tm.getStatus() == Status.STATUS_ACTIVE){
                logger.debug("Rolling back transaction "+tm);
                tm.rollback();
            }
            return false;
        }
    }

However, when we test this code snippet on two nodes, the action is sometimes performed twice. The cache is configured as follows:

        Configuration config = new Configuration();
        config.setCacheMode(CacheMode.DIST_SYNC);
        config.setL1CacheEnabled(true);
        config.setL1Lifespan(60000);
        config.setTransactionManagerLookupClass("org.infinispan.transaction.lookup.JBossStandaloneJTAManagerLookup");
        config.setEagerLockSingleNode(true);
        cacheManager = new DefaultCacheManager(GlobalConfiguration.getClusteredDefault());

Are we missing something?

N.B.: We are using 4.2.0.BETA1 artifacts. All earlier versions had massive problems with locking, which led to TimeoutExceptions while acquiring the lock.

1. Re: Explicit locking in DIST_SYNC mode

manik Nov 15, 2010 8:28 AM (in response to infinikli)

I'm not sure I understand what you mean by the action is sometimes performed twice. Is your method (performLockedCacheEntryAction) called in a loop? By multiple threads? Both?
2. Re: Explicit locking in DIST_SYNC mode

infinikli Nov 15, 2010 8:38 AM (in response to manik)

We run this action on two different nodes simultaneously.
Node A: putIntoCache(key1, objectXY) @ T = 1
Node A: performLockedCacheEntryAction(actionXY,key1) @ T=2
Node B: performLockedCacheEntryAction(actionXY,key1) @ T=2
Where the action changes the state of the objectXY (with 'key1' as key in our distributed cache).
3. Re: Explicit locking in DIST_SYNC mode

manik Nov 15, 2010 8:47 AM (in response to infinikli)

Yes, in this case performLockedCacheEntryAction would happen twice if the tx on NodeA and the tx on NodeB don't overlap. Otherwise, one may block waiting for the other to finish, and then run.
4. Re: Explicit locking in DIST_SYNC mode

infinikli Nov 15, 2010 8:55 AM (in response to manik)

Ok, my fault. I should have explained what the action is doing:
boolean doAction() {
     if (item.state == 0) {
          item.state = 1;     // state changed
          return true;
     }
     // state not changed
     return false;
}
The problem is that this method returns true on both nodes.
5. Re: Explicit locking in DIST_SYNC mode

an1310 Nov 15, 2010 9:35 AM (in response to infinikli)

You mentioned lots of timeouts. Can you try removing the L1 caching? See https://jira.jboss.org/browse/ISPN-763 for more details.

Also, I'm unsure of where you're getting your state -- the item.state value returned by doAction(). From the code snippet above, it doesn't look like it's coming from the cache. You might be missing a get() call after you lock the key.
6. Re: Explicit locking in DIST_SYNC mode

infinikli Nov 15, 2010 9:46 AM (in response to an1310)

The item from above is an item which is shared across the cache. Before performing the action, we create an Action-object which takes this cached item as constructor-parameter.

//getFromCache performs a get() on the cache.
ItemTestImpl itl = cacheManager.<Integer,ItemTestImpl>getFromCache(currentCacheName, key);
//create the action
TestAction testAction = new TestAction(itl);
//perform the action
boolean intoCache = cacheManager.performLockedCacheEntryAction(currentCacheName, testAction, key);
if(intoCache) logger.info("Action performed");
7. Re: Explicit locking in DIST_SYNC mode

an1310 Nov 15, 2010 10:10 AM (in response to infinikli)

Have you read the locking and transaction sections of the wiki? Specifically, ISPN uses MVCC -- in a nutshell, reads don't block writers. The default isolation level is READ_COMMITTED as well.

It's certainly possible, given your chain of events, that the following is happening.

Node A: putIntoCache(key1, objectXY). Value is 1.
Node A: Reads value of objectXY. Last committed value is 1.
Node B: Reads value of objectXY. Last committed value is 1.
Node A: performLockedCacheEntryAction(actionXY,key1)
Node B: performLockedCacheEntryAction(actionXY,key1)
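
The sequence above can be replayed deterministically in plain Java. This is only a sketch: a `ConcurrentHashMap` and a `ReentrantLock` stand in for the distributed cache and the cluster-wide key lock, the `Item` and `StaleReadDemo` names are hypothetical, and the starting state is 0 to match the doAction() posted earlier in the thread. No Infinispan API is used.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the cached item: each read yields a private copy.
final class Item {
    final int state;
    Item(int state) { this.state = state; }
}

class StaleReadDemo {
    // Assumption: a local map and lock model the cache and the key lock.
    static final Map<String, Item> cache = new ConcurrentHashMap<>();
    static final ReentrantLock keyLock = new ReentrantLock();

    // Same shape as the thread's method: the item was read BEFORE the lock.
    static boolean lockedAction(String key, Item readBeforeLock) {
        keyLock.lock();
        try {
            if (readBeforeLock.state == 0) {   // decision uses the earlier copy
                cache.put(key, new Item(1));
                return true;
            }
            return false;
        } finally {
            keyLock.unlock();
        }
    }

    // Returns how many of the two simulated nodes reported a state change.
    static int run() {
        cache.put("key1", new Item(0));
        Item seenByA = cache.get("key1");      // Node A reads the committed value
        Item seenByB = cache.get("key1");      // Node B reads the same value
        int changed = 0;
        if (lockedAction("key1", seenByA)) changed++;
        if (lockedAction("key1", seenByB)) changed++;
        return changed;
    }

    public static void main(String[] args) {
        System.out.println("nodes reporting a state change: " + run());
    }
}
```

Both calls report a change even though the lock serializes them, because each decision is made on a copy taken before the lock was held.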

To get the behavior you want, you might need to either change the isolation level to REPEATABLE_READ and handle the possibility of a write-skew error, or explicitly lock the cache key before reading the value.
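
The second option, locking the key before reading the value, can be sketched in plain Java as follows. As an assumption for illustration, a `ConcurrentHashMap` and a `ReentrantLock` stand in for the cache and the cluster-wide key lock, and `LockThenReadDemo` is a hypothetical name; in the thread's method the equivalent change would be to call get() on the cache after the lock call, inside the transaction.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockThenReadDemo {
    // Assumption: a local map and lock model the cache and the key lock.
    static final Map<String, Integer> cache = new ConcurrentHashMap<>();
    static final ReentrantLock keyLock = new ReentrantLock();

    // Check-then-act with the read moved inside the locked section.
    static boolean lockedAction(String key) {
        keyLock.lock();
        try {
            int state = cache.get(key);        // read AFTER acquiring the lock
            if (state == 0) {
                cache.put(key, 1);             // state changed
                return true;
            }
            return false;                      // another caller already changed it
        } finally {
            keyLock.unlock();
        }
    }

    // Runs the action from two threads (the two "nodes") and counts
    // how many of them reported a state change.
    static int run() {
        cache.put("key1", 0);
        AtomicInteger changed = new AtomicInteger();
        Runnable node = () -> {
            if (lockedAction("key1")) changed.incrementAndGet();
        };
        Thread a = new Thread(node);
        Thread b = new Thread(node);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return changed.get();
    }

    public static void main(String[] args) {
        System.out.println("nodes reporting a state change: " + run());
    }
}
```

Whichever thread acquires the lock second sees the value written by the first, so only one of them reports a change.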
8. Re: Explicit locking in DIST_SYNC mode

infinikli Nov 15, 2010 11:23 AM (in response to an1310)

Ok, thanks for the response. I finally got the point.