14 Replies Latest reply on Nov 3, 2005 1:46 PM by ben.wang

    Problems with Optimistic Locking in 1.2.4 FINAL

    xavierpayne2

      Manik>

      Per your request I have downloaded and installed the 1.2.4 final jbosscache.jar into my 4.0.2 jboss deployment.

      I still get the same transaction exceptions with 1.2.4 final as I did with the cvs releases.

      An important thing to note is that I have 1 remote client making repeated calls to a stateless session bean. The Stateless Session bean is then writing to the cache. If I run that client by itself it can run for 100,000s of iterations just fine.

      I have another remote client that makes repeated calls to another stateless session bean that does cache reads.

      If I run these clients in parallel (at the same time but in separate VMs), it works with Pessimistic locking, but with Optimistic locking I get this exception after 50 to 100 iterations.

      10:54:13,480 INFO [OptimisticValidatorInterceptor] DataNode [/Core/Connections/4b6s1o1j-mbd0of-ef6vdhq2-1-ef6vfv4z-4/SubscriberSequences/4b6s1o1j-mbd0of-ef6vdhq2-1-ef6vfv9r-5/SubStats] version number (52) is greater than or equal to workspace node version 52
      10:54:13,480 WARN [OptimisticTxInterceptor] runPreparePhase() failed. Transaction is marked as rolled back
      org.jboss.cache.CacheException: unable to validate nodes
       at org.jboss.cache.interceptors.OptimisticValidatorInterceptor.validateNodes(OptimisticValidatorInterceptor.java:115)
       at org.jboss.cache.interceptors.OptimisticValidatorInterceptor.invoke(OptimisticValidatorInterceptor.java:70)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticLockingInterceptor.invoke(OptimisticLockingInterceptor.java:87)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticReplicationInterceptor.invoke(OptimisticReplicationInterceptor.java:76)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.handleLocalPrepare(OptimisticTxInterceptor.java:262)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.access$000(OptimisticTxInterceptor.java:30)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor$SynchronizationHandler.beforeCompletion(OptimisticTxInterceptor.java:600)
       at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
       at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1384)
       at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1076)
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:296)
       at org.jboss.tm.TxManager.commit(TxManager.java:200)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.handleLocalTx(OptimisticTxInterceptor.java:216)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.invoke(OptimisticTxInterceptor.java:109)
       at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:4339)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3083)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3024)
       at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:585)
       at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:141)
       at org.jboss.mx.server.Invocation.dispatch(Invocation.java:80)
       at org.jboss.mx.server.Invocation.invoke(Invocation.java:72)
       at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:249)
       at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:644)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.invokeServerMethod(JBossSerializableCacheServiceHelper.java:266)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.put(JBossSerializableCacheServiceHelper.java:183)
       at infrastructure.platform.server.util.cache.impl.SerializableJBossCacheMap.put(SerializableJBossCacheMap.java:374)
       at infrastructure.platform.server.util.cache.CachedMap.put(CachedMap.java:255)
       at infrastructure.platform.shared.capi.core.plugin.j2ee.jms.impl.SubProliferator.disseminate(SubProliferator.java:223)
       at infrastructure.platform.server.publication.core.impl.PluggableDisseminator.disseminate(PluggableDisseminator.java:84)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.exec(PSDisseminationWorker.java:270)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.run(PSDisseminationWorker.java:350)
       at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:743)
       at java.lang.Thread.run(Thread.java:595)
      10:54:13,527 WARN [OptimisticTxInterceptor] Rolling back exception encountered
      org.jboss.tm.JBossRollbackException: Unable to commit, tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=WILEY/669, BranchQual=, localId=669] status=STATUS_NO_TRANSACTION; - nested throwable: (org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.CacheException: unable to validate nodes))
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:344)
       at org.jboss.tm.TxManager.commit(TxManager.java:200)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.handleLocalTx(OptimisticTxInterceptor.java:216)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.invoke(OptimisticTxInterceptor.java:109)
       at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:4339)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3083)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3024)
       at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:585)
       at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:141)
       at org.jboss.mx.server.Invocation.dispatch(Invocation.java:80)
       at org.jboss.mx.server.Invocation.invoke(Invocation.java:72)
       at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:249)
       at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:644)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.invokeServerMethod(JBossSerializableCacheServiceHelper.java:266)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.put(JBossSerializableCacheServiceHelper.java:183)
       at infrastructure.platform.server.util.cache.impl.SerializableJBossCacheMap.put(SerializableJBossCacheMap.java:374)
       at infrastructure.platform.server.util.cache.CachedMap.put(CachedMap.java:255)
       at infrastructure.platform.shared.capi.core.plugin.j2ee.jms.impl.SubProliferator.disseminate(SubProliferator.java:223)
       at infrastructure.platform.server.publication.core.impl.PluggableDisseminator.disseminate(PluggableDisseminator.java:84)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.exec(PSDisseminationWorker.java:270)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.run(PSDisseminationWorker.java:350)
       at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:743)
       at java.lang.Thread.run(Thread.java:595)
      Caused by: org.jboss.util.NestedRuntimeException: ; - nested throwable: (org.jboss.cache.CacheException: unable to validate nodes)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor$SynchronizationHandler.beforeCompletion(OptimisticTxInterceptor.java:631)
       at org.jboss.cache.interceptors.OrderedSynchronizationHandler.beforeCompletion(OrderedSynchronizationHandler.java:72)
       at org.jboss.tm.TransactionImpl.doBeforeCompletion(TransactionImpl.java:1384)
       at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1076)
       at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:296)
       ... 24 more
      Caused by: org.jboss.cache.CacheException: unable to validate nodes
       at org.jboss.cache.interceptors.OptimisticValidatorInterceptor.validateNodes(OptimisticValidatorInterceptor.java:115)
       at org.jboss.cache.interceptors.OptimisticValidatorInterceptor.invoke(OptimisticValidatorInterceptor.java:70)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticLockingInterceptor.invoke(OptimisticLockingInterceptor.java:87)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticReplicationInterceptor.invoke(OptimisticReplicationInterceptor.java:76)
       at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:41)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.handleLocalPrepare(OptimisticTxInterceptor.java:262)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.access$000(OptimisticTxInterceptor.java:30)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor$SynchronizationHandler.beforeCompletion(OptimisticTxInterceptor.java:600)
       ... 28 more
      10:54:13,527 WARN [OptimisticTxInterceptor] Roll back failed encountered
      java.lang.IllegalStateException: No transaction.
       at org.jboss.tm.TxManager.rollback(TxManager.java:331)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.handleLocalTx(OptimisticTxInterceptor.java:224)
       at org.jboss.cache.interceptors.OptimisticTxInterceptor.invoke(OptimisticTxInterceptor.java:109)
       at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:4339)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3083)
       at org.jboss.cache.TreeCache.put(TreeCache.java:3024)
       at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:585)
       at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:141)
       at org.jboss.mx.server.Invocation.dispatch(Invocation.java:80)
       at org.jboss.mx.server.Invocation.invoke(Invocation.java:72)
       at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:249)
       at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:644)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.invokeServerMethod(JBossSerializableCacheServiceHelper.java:266)
       at infrastructure.platform.server.util.services.cache.impl.JBossSerializableCacheServiceHelper.put(JBossSerializableCacheServiceHelper.java:183)
       at infrastructure.platform.server.util.cache.impl.SerializableJBossCacheMap.put(SerializableJBossCacheMap.java:374)
       at infrastructure.platform.server.util.cache.CachedMap.put(CachedMap.java:255)
       at infrastructure.platform.shared.capi.core.plugin.j2ee.jms.impl.SubProliferator.disseminate(SubProliferator.java:223)
       at infrastructure.platform.server.publication.core.impl.PluggableDisseminator.disseminate(PluggableDisseminator.java:84)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.exec(PSDisseminationWorker.java:270)
       at infrastructure.platform.server.publication.core.utils.impl.PSDisseminationWorker.run(PSDisseminationWorker.java:350)
       at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:743)
       at java.lang.Thread.run(Thread.java:595)


        • 1. Re: Problems with Optimistic Locking in 1.2.4 FINAL
          belaban

          Why is it rolling back the TX if the versions are the same (52) ? This should not be the case...

          • 2. Re: Problems with Optimistic Locking in 1.2.4 FINAL
            manik

            It does do this, because the moment a node is copied from the tree to the workspace it has its version incremented. So the node in the workspace should always have a greater version than the node in the tree. If this isn't the case (i.e., the versions are equal, or the tree's is greater), someone has updated the version of the node in the tree in the meantime and the workspace copy is out of date.

            I suspect there may be a concurrency bug around this, when a node is copied from the tree to the workspace. Looking into it.
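A minimal sketch of the version check described above, for readers following along. `VersionedNode` and `WorkspaceCopy` are hypothetical stand-ins invented for illustration, not JBoss Cache classes, and the real interceptor logic may well differ. It only shows why a tree version equal to the workspace version (52 and 52 in the log) means another writer got in first.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a node in the tree; not a JBoss Cache class.
final class VersionedNode {
    final AtomicLong version = new AtomicLong();
}

// Hypothetical stand-in for the per-transaction copy held in the workspace.
final class WorkspaceCopy {
    private final VersionedNode source;
    private final long version;

    WorkspaceCopy(VersionedNode source) {
        this.source = source;
        // Copying into the workspace bumps the version, as described in the post.
        this.version = source.version.get() + 1;
    }

    // Valid only while the tree node is still strictly behind the workspace copy.
    boolean validate() {
        return source.version.get() < version;
    }

    // Publish the new version only if nobody else has committed since the copy.
    boolean commit() {
        return source.version.compareAndSet(version - 1, version);
    }
}

final class VersionCheckDemo {
    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        WorkspaceCopy first = new WorkspaceCopy(node);
        WorkspaceCopy second = new WorkspaceCopy(node);
        System.out.println("first commits: " + first.commit());
        // The tree version now equals second's workspace version, so it fails.
        System.out.println("second validates: " + second.validate());
    }
}
```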

            • 3. Re: Problems with Optimistic Locking in 1.2.4 FINAL
              manik

              There are some basic premises with optimistic locking. To achieve the levels of concurrency, optimistic locking is only really recommended in a read-mostly scenario.

              The moment you have concurrent writes to a node (even if it is just two tx's writing to the same node) one of the tx's will be forced to roll back (via a RollbackException) when validation fails.

              In your test, does the SLSB write to the cache under the same fqn every time? If this is the case I can see why this happens, and I'd consider this normal behaviour ... something the application would have to be prepared to deal with by re-trying the tx.
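As a rough illustration of "re-trying the tx" on the application side, here is a minimal sketch. Everything in it is an assumption made for the example: `TxRolledBackException` is a hypothetical unchecked stand-in for the rollback exception the caller would actually see, and `runWithRetry` is an invented helper, not part of any JBoss API. The supplied work is expected to cover the whole unit of work (begin, cache operations, commit), so each attempt starts from fresh data.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the rollback exception seen by the application.
final class TxRolledBackException extends RuntimeException {
    TxRolledBackException(String message) {
        super(message);
    }
}

final class TxRetry {
    // Runs the whole unit of work again, up to maxAttempts times, when it rolls back.
    static <T> T runWithRetry(Supplier<T> txWork, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TxRolledBackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return txWork.get();
            } catch (TxRolledBackException e) {
                last = e; // validation failed; try again from scratch
            }
        }
        throw last; // every attempt rolled back
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String outcome = runWithRetry(() -> {
            // Simulate two failed validations before a successful commit.
            if (++calls[0] < 3) {
                throw new TxRolledBackException("unable to validate nodes");
            }
            return "committed";
        }, 5);
        System.out.println(outcome + " after " + calls[0] + " attempts");
    }
}
```

A bounded attempt count keeps a heavily contended node from spinning forever; whether to back off between attempts is left to the application.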

              • 4. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                xavierpayne2

                Yes the slsb is writing to the same fqn every time...

                As more of a philosophical question: should a developer using jbosscache have to worry about this (catching and retrying a failed write), or should this behavior be inlined inside the api and only get used when the cache is configured for optimistic locking? (The max retries could even be configurable in the service descriptor.)

                As a developer using the api it would be nice if I didn't need to worry about the semantics involved with enabling optimistic locking. (much like I don't need to worry about transactions at all as the transaction manager can handle them.)

                In our implementation we toggle many properties of the 4 caches we use based on the deployment environment. (For example, if it's being used in a cluster we might change one cache from repl_async to repl_sync to ensure consistency.) It's nice to be able to do this and not have to worry about our 200,000 plus lines of code blowing up because the semantics of the api we are calling have changed under the hood and now regularly throw exceptions we never had to catch before.

                Just my thoughts. I won't be at all offended if they land in the bit bucket. :)

                • 5. Re: Problems with Optimistic Locking in 1.2.4 FINAL

                  Yes, the user should expect that a put operation will not succeed every time, for instance due to write contention. JBoss Cache should throw something like LockTimeoutException, etc.

                  But in this case, we may need to be more explicit.

                  -Ben

                  • 6. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                    belaban

                    We thought about this before. The problem is that when such an exception occurs, the TX will be rolled back, so you would have to submit the entire batch of operations that you submitted within the TX scope again, not just the failed operation. We chose to let the developer handle this.

                    • 7. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                      manik

                      Precisely. Since the tx may span resources beyond the scope of JBossCache, the best that can be done is to throw a RollbackException so all resources may be rolled back and the tx potentially re-run.

                      If we tried to do this transparently, we could only do so within the scope of JBossCache and other resources participating in the tx may be using stale/invalid cache data.

                      • 8. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                        xavierpayne2

                        But this is only the case if I am also writing the transactions myself right? Meaning if I as the developer open a new transaction do a bunch of puts then commit.

                        If I am just calling put over and over with no transaction code of my own in the middle then each interaction (each put) is an individual isolated transaction right?

                        It's in this case that I think there might be some value to inlining the retries. I agree that in the context of transactions that involve multiple puts/gets this behavior would be a bad thing.

                        At the end of the day you guys probably have the best idea of what the right thing to do is. So again, I take no offense if mine's a lame idea. I just tossed it out there as fodder for discussion. :)

                        • 9. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                          belaban

                          I get your point, if we don't use TXs, then it may be worthwhile to do some retries.
                          This leads to different semantics for TXs/no-TXs though.
                          wHAT DO OTHER PEOPLE THINK ?

                          • 10. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                            manik

                             

                            "bela@jboss.com" wrote:

                            wHAT DO OTHER PEOPLE THINK ?


                            Well, Bela, now there's no reason to shout! :)

                            DashV, makes sense. With optimistic locking, we first check for an existing transaction, and if there isn't one, we implicitly start one. Before the method returns, if we started the tx ourselves, we make sure we commit it. So perhaps there is scope to 'replay' the tx a fixed number of times here if it were to throw a RollbackException on commit.

                            Bela, I don't think this leads to very different semantics - at the moment, if the developer explicitly starts a tx before any cache operations, the cache may throw a RollbackException if it is unable to commit. If the developer doesn't explicitly start a tx, the cache may fail silently if it is unable to commit, logging errors but not throwing any exceptions. So there is scope here for silent retries.
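The distinction drawn here (replay only when the cache itself started the tx) could look roughly like the sketch below. All names are hypothetical and invented for this example: `ImplicitTxRunner`, `CommitFailedException` and the `callerTxActive` flag are assumptions, and this is not how JBoss Cache is implemented. A real version would ask the transaction manager whether a tx is already associated with the thread.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a failed commit.
final class CommitFailedException extends RuntimeException {
    CommitFailedException(String message) {
        super(message);
    }
}

final class ImplicitTxRunner {
    private final int maxReplays;

    ImplicitTxRunner(int maxReplays) {
        if (maxReplays < 0) {
            throw new IllegalArgumentException("maxReplays must not be negative");
        }
        this.maxReplays = maxReplays;
    }

    // 'operation' stands for one cache call wrapped in its own begin/commit.
    <T> T invoke(boolean callerTxActive, Supplier<T> operation) {
        if (callerTxActive) {
            // The caller owns the tx and may have enlisted other resources,
            // so never replay: let the failure propagate to the caller.
            return operation.get();
        }
        CommitFailedException last = null;
        // One initial attempt plus up to maxReplays replays of the implicit tx.
        for (int attempt = 0; attempt <= maxReplays; attempt++) {
            try {
                return operation.get();
            } catch (CommitFailedException e) {
                last = e;
            }
        }
        throw last;
    }
}
```

With an explicit tx the single failure reaches the caller unchanged; with an implicit one, the same failure is absorbed as long as a replay succeeds within the limit.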

                            • 11. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                              belaban

                               

                              "manik.surtani@jboss.com" wrote:
                              "bela@jboss.com" wrote:

                              wHAT DO OTHER PEOPLE THINK ?


                              Well, Bela, now there's no reason to shout! :)


                              Damn caps lock keys :-)


                              DashV, makes sense. With optimistic locking, we first check for an existing transaction, and if there isn't one, we implicitly start one. Before the method returns, if we started the tx ourselves, we make sure we commit it. So perhaps there is scope to 'replay' the tx a fixed number of times here if it were to throw a RollbackException on commit.

                              Bela, I don't think this leads to very different semantics - at the moment, if the developer explicitly starts a tx before any cache operations, the cache may throw a RollbackException if it is unable to commit. If the developer doesn't explicitly start a tx, the cache may fail silently if it is unable to commit, logging errors but not throwing any exceptions. So there is scope here for silent retries.


                              I agree. Let's create a JIRA feature for this. Put it into 1.3, although this one may get moved to a subsequent release...

                              • 12. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                                manik
                                • 13. Re: Problems with Optimistic Locking in 1.2.4 FINAL
                                  rino_salvade

                                  I do think that performing retries automatically, i.e. as part of the API, can lead to some strange effects. The reason that the optimistic locking fails is normally that someone else has modified the node in the meantime. Since the node contains data that is relevant to the application, the application is the only place where I can decide if I would like to keep these changes or if I want to replace them with the data of my original transaction.
                                  That said, I think the only valid solution (even in the case with no transactions) is to throw an exception (maybe a special type) and let the application do the rest.

                                  • 14. Re: Problems with Optimistic Locking in 1.2.4 FINAL

                                    I agree. I don't think it is the job of the cache to retry (because then you will need to provide another parameter for MAX_RETRY, etc.).

                                    What we need is a more specific exception that tells the caller what went wrong, like LOCK_TIMEOUT or REPL_FAILED. Callers should handle the exception just like they would a tx rollback.

                                    -Ben
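One possible shape for such a "more specific exception", purely as a sketch: `CacheWriteException`, its `Reason` enum and `WriteFailureHandler` are hypothetical names made up for this example and are not part of JBoss Cache. Only the two reasons mentioned in the post are modelled.

```java
// Hypothetical exception type carrying the reason a write failed.
final class CacheWriteException extends RuntimeException {
    enum Reason { LOCK_TIMEOUT, REPL_FAILED }

    private final Reason reason;

    CacheWriteException(Reason reason, String message) {
        super(message);
        this.reason = reason;
    }

    Reason reason() {
        return reason;
    }
}

// Hypothetical caller-side handling that branches on the reason.
final class WriteFailureHandler {
    static String describe(CacheWriteException e) {
        switch (e.reason()) {
            case LOCK_TIMEOUT:
                return "lock timeout: " + e.getMessage();
            case REPL_FAILED:
                return "replication failed: " + e.getMessage();
            default:
                return "write failed: " + e.getMessage();
        }
    }
}
```

The caller still decides what to do next (retry, merge, or give up); the exception only makes the cause explicit.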