1 2 Previous Next 20 Replies Latest reply on Sep 25, 2016 4:39 PM by jamat

    HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException

    gilby

      Hello,

       

      We recently upgraded our two-server cluster from JBoss EAP 6.1.0 Alpha to Wildfly 8.2.0.Final. We run a web app built with Wicket (http://wicket.apache.org/), and use Wildfly’s distributed web cache for http session failover. Ever since the upgrade, we’ve been receiving intermittent org.infinispan.util.concurrent.TimeoutExceptions (sample stack trace below).

       

      The exception occurs when a client sends multiple requests to our application (e.g. in different browser tabs), and one page (or AJAX action) takes longer than 15 seconds to render/complete (way too long for a user to wait, but that’s a different problem). The long-running request causes the user’s other requests to block (ostensibly because of an Infinispan lock on the session), and they eventually fail with the org.infinispan.util.concurrent.TimeoutException.

       

      Here’s our distributed web cache config from standalone-full-ha.xml:

       

        <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
           <transport lock-timeout="60000"/>
           <distributed-cache name="dist" mode="ASYNC" batching="true" l1-lifespan="0" owners="2">
                <file-store/>
           </distributed-cache>
        </cache-container>
      
      

       

      We experimented with other locking isolation levels (e.g. READ_UNCOMMITTED), but that didn’t make any difference. We also tried turning off batching, but that caused Wildfly to complain about our cache config on startup. So, we’re wondering: is there a way to configure Infinispan with a distributed web cache that doesn’t effectively queue requests from the same user (i.e. session)? Concurrent session access seemed to work well in JBoss EAP 6.1.0.

       

      2015-03-30 11:26:25,240 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-8) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [15 seconds] on key [4KPG9Zr7y8hvxHMlOLNHWgVs] for requestor [GlobalTransaction:<null>:15:local]! Lock held by [GlobalTransaction:<null>:14:local]
      at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:198)
      at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:171)
      at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:169)
      at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.visitGetKeyValueCommand(PessimisticLockingInterceptor.java:70)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
      at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:112)
      at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:74)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
      at org.infinispan.interceptors.TxInterceptor.enlistReadAndInvokeNext(TxInterceptor.java:226)
      at org.infinispan.interceptors.TxInterceptor.visitGetKeyValueCommand(TxInterceptor.java:221)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
      at org.infinispan.interceptors.CacheMgmtInterceptor.visitGetKeyValueCommand(CacheMgmtInterceptor.java:92)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
      at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:110)
      at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:73)
      at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:74)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
      at org.infinispan.interceptors.BatchingInterceptor.handleDefault(BatchingInterceptor.java:79)
      at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:74)
      at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:40)
      at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:333)
      at org.infinispan.CacheImpl.get(CacheImpl.java:377)
      at org.infinispan.DecoratedCache.get(DecoratedCache.java:396)
      at org.infinispan.AbstractDelegatingCache.get(AbstractDelegatingCache.java:271)
      at org.jboss.as.clustering.infinispan.invoker.Locator$FindOperation.invoke(Locator.java:54)
      at org.jboss.as.clustering.infinispan.invoker.Locator$LockingFindOperation.invoke(Locator.java:71)
      at org.jboss.as.clustering.infinispan.invoker.SimpleCacheInvoker.invoke(SimpleCacheInvoker.java:34)
      at org.jboss.as.clustering.infinispan.invoker.RetryingCacheInvoker.invoke(RetryingCacheInvoker.java:87)
      at org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionFactory.findValue(CoarseSessionFactory.java:109)
      at org.wildfly.clustering.web.infinispan.session.coarse.CoarseSessionFactory.findValue(CoarseSessionFactory.java:55)
      at org.wildfly.clustering.web.infinispan.session.InfinispanSessionManager.findSession(InfinispanSessionManager.java:148)
      at org.wildfly.clustering.web.undertow.session.DistributableSessionManager.getSession(DistributableSessionManager.java:115)
      at io.undertow.servlet.spec.ServletContextImpl.getSession(ServletContextImpl.java:688) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.servlet.spec.HttpServletRequestImpl.getSession(HttpServletRequestImpl.java:364) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at org.jboss.weld.servlet.SessionHolder.requestInitialized(SessionHolder.java:47) [weld-core-impl-2.2.6.Final.jar:2014-10-03 10:05]
      at org.jboss.weld.servlet.HttpContextLifecycle.requestInitialized(HttpContextLifecycle.java:212) [weld-core-impl-2.2.6.Final.jar:2014-10-03 10:05]
      at org.jboss.weld.servlet.WeldInitialListener.requestInitialized(WeldInitialListener.java:160) [weld-core-impl-2.2.6.Final.jar:2014-10-03 10:05]
      at io.undertow.servlet.core.ApplicationListeners.requestInitialized(ApplicationListeners.java:216) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:260) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:247) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:76) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:166) [undertow-servlet-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.server.Connectors.executeRootHandler(Connectors.java:197) [undertow-core-1.1.0.Final.jar:1.1.0.Final]
      at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:759) [undertow-core-1.1.0.Final.jar:1.1.0.Final]
      
      
        • 1. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
          gilby

          Update: https://developer.jboss.org/wiki/ClusteringChangesInWildFly8 states, "Pessimistic locking of cache entries effectively ensures that only a single client on a single node ever accesses a given session at any given time." This seems to explain the problem we're having, so now I can make my question more specific: is there a way to configure Infinispan with a distributed web cache that uses optimistic locking?

          • 2. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
            pferraro

            Add the following to your cache configuration:

            <transaction locking="OPTIMISTIC"/>
            
            • 3. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
              gilby

              Ah...I had added

              <transaction mode="NON_XA" locking="OPTIMISTIC"/>
              

              via the web console, but that prevented the server from starting up. Apparently, mode="NONE" is the default when you omit the mode attribute.

               

              Serving concurrent http requests within the same http session seems like a pretty common scenario, so I'm curious: why is PESSIMISTIC the default locking policy for the distributable web cache? Is there some danger in using OPTIMISTIC locking that I'm unaware of? As I understand it, OPTIMISTIC locking could cause a request to fail if the session is modified during concurrent session access, but otherwise, it seems to provide a significant performance boost.

               

              pferraro thanks so much for your response!

              • 4. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                gilby

                pferraro When I added <transaction locking="OPTIMISTIC"/> to our <distributed-cache> config, Wildfly logged this error on startup:

                 

                org.jboss.msc.service.StartException in service jboss.infinispan.web.dist.config: org.infinispan.commons.CacheConfigurationException: Write-skew checking is only supported in REPL_SYNC, DIST_SYNC and LOCAL modes.  DIST_ASYNC cannot be used with write-skew checking
                

                 

                So, it looks like OPTIMISTIC locking can only be used with synchronous or local caches. We'd prefer to continue using an asynchronous cache for performance reasons...do you know of any other way to configure the cache to allow concurrent requests within the same session?

                • 5. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                  gilby

                  We finally found a way to address this issue:

                   

                              <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
                                  <transport lock-timeout="60000"/>
                                  <replicated-cache name="repl" mode="ASYNC" batching="true">
                                      <transaction locking="OPTIMISTIC"/>
                                      <locking isolation="READ_COMMITTED"/>
                                      <file-store/>
                                  </replicated-cache>
                              </cache-container>
                  

                   

                  By using OPTIMISTIC locking with an isolation level of READ_COMMITTED, we are now able to serve concurrent requests within the same HTTP session while also supporting failover. Although we chose to switch to an asynchronous replicated cache, I believe the transaction and locking settings shown above will also work with synchronous or asynchronous distributed or replicated caches.

                   

                  A bit more detail: By default, Wildfly configures distributed and replicated Infinispan caches to use PESSIMISTIC locking with REPEATABLE_READ isolation, which causes reads from the session cache to block other read attempts. As I understand it, this modified configuration allows concurrent reads from the session cache, but will still lock around writes (If someone has a different/more accurate understanding of this, please chime in!).


                  Here are some links to the Infinispan documentation RE: these settings: http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_transaction_modes, http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_isolation_levels.

                  • 6. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                    tabs_ls

                    Hi Allen,

                     

                    Thank you for sharing this info. The configuration that you posted seems to solve the deadlock we were seeing when using   @Suspended AsyncResponse

                    in a rest call. (long polling)

                     

                    https://developer.jboss.org/message/928746?et=watches.email.thread#928746

                     

                    Regards,

                    David.

                    • 7. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                      pferraro

                      I used PESSIMISTIC, REPEATABLE_READ in the default configuration in order to enforce 7.7.2 of the servlet specification:

                      "Within an application marked as distributable, all requests that are part of a session must be handled by one JVM at a time."

                      This is, admittedly, overly coarse, since exclusivity should ideally be per-node, not per-thread.

                      • 8. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                        humanbeing

                        I agree with David,

                         

                        This has saved my day.  Thanks to everyone!

                        • 9. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                          tabs_ls

                          Hi Paul,

                           

                          Thanks for the reply.

                           

                          Correct me if I`m wrong but would the spec conflict with the usage of AsyncResponse?.

                           

                          We were under the impression that with the introduction of AsyncResponse, the browser would be able to maintain a full-duplex connection with the server.

                           

                          eg using @Suspend on the server to wait for events/messages to turn up from an external source whilst simultaneously being able to send data to server.

                           

                          Regards,

                          David.

                          • 10. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                            pferraro

                            The spec doesn't conflict with the usage of AsyncResponse - but our default configuration forces serial access to the session, which isn't ideal.

                            I'm working on a way to allow concurrent access to the session from a single node only, such that we still enforce the single-JVM requirement of the spec.

                            • 11. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                              scyta1e

                              Guys, it looks like the config change to OPTIMISTIC/READ_COMMITTED changes the problem from being an immediate locking problem into a memory leak. Our overnight tests failed within hours with 1.6m - org.infinispan.commands.write.ReplaceCommand queued up. Going to do some more investigation today and will report back.

                              • 12. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                                scyta1e

                                Quick update - we couldn't reproduce this memory leak problem in a days worth of testing. However we are seeing another problem which manifests as simply random errors when trying to access the HTTP session during form based authentication in JAAS. For example:

                                 

                                2015-05-15 10:09:53,767 ERROR [io.undertow.request] (default task-25) Undertow request failed HttpServerExchange{ POST /FXOEService/api/j_security_check}: java.lang.NullPointerException

                                        at io.undertow.servlet.spec.ServletInputStreamImpl.readIntoBuffer(ServletInputStreamImpl.java:171)

                                 

                                Caused by trying to access the HTTP session and it being null randomly. There's a few other variants of this which crop up dependent on where the session is being accessed. This is when running OPTIMISTIC/READ_COMMITTED.

                                • 13. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                                  scyta1e

                                  Does anyone know if there's a way of just turning off the web session part of clustering and keep ejb + hornetq which seems to be working fine?

                                  At the moment (less than a month from production) we don't strictly need web session clustering - it's a nice to have..

                                  • 14. Re: HTTP Session Replication - org.infinispan.util.concurrent.TimeoutException
                                    tabs_ls

                                    Hi Paul,

                                     

                                    If you limit concurrent access to only one node at one time, does this not mean that if we are using async. with suspend from one request and that request gets handled by one node, and the user (browser) makes another request, to the server to say another node, the second request will still deadlock on the second node?. (because of the 7.7.2 spec)

                                     

                                    But nonetheless, the changes you are making would be more than welcome since we can use sticky session to redirect to node only.

                                     

                                    Regards,

                                    David.

                                    1 2 Previous Next