3 Replies Latest reply on Mar 27, 2007 10:25 PM by brian.stansberry

    JBPAPP-85 Discussion Thread

    brian.stansberry

      Discussion of http://jira.jboss.com/jira/browse/JBPAPP-85.

      Dominik tried replacing the JBC and JGroups in 4.0.5 to see if the error appeared there. It didn't. He also tried using the 4.2.0.CR1 protocol stack config in 4.0.5 w/ JG 2.4.1.SP1 and also saw no OOME.

      He reports it went away in 4.2.0.CR1 if JDK 1.6 is used, which is interesting since in 4.2.0.CR1's session repl layer one EDU.oswego ConcurrentHashMap was replaced with the java.util.concurrent version.

        • 1. Re: JBPAPP-85 Discussion Thread
          brian.stansberry

           

          "Dominik Pospisil" wrote:
          I was not able to reproduce the OOME with 4.0.5 using the replaced config. However, I received the following error with both 4.0.5 and 4.2. It happens only very rarely, ~ 1 error / 20 min running time.

          [JBoss] java.lang.RuntimeException: JBossCacheService: exception occurred in cache put ...
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:150)
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheService.putSession(JBossCacheService.java:319)
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:121)
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1097)
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheManager.storeSession(JBossCacheManager.java:652)
          [JBoss] at org.jboss.web.tomcat.service.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:49)
          [JBoss] at org.jboss.web.tomcat.service.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:98)
          [JBoss] at org.jboss.web.tomcat.service.session.JvmRouteValve.invoke(JvmRouteValve.java:84)
          [JBoss] at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
          [JBoss] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
          [JBoss] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:104)
          [JBoss] at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:156)
          [JBoss] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
          [JBoss] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:216)
          [JBoss] at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:437)
          [JBoss] at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:447)
          [JBoss] at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:445)
          [JBoss] at java.lang.Thread.run(Thread.java:595)
          [JBoss] Caused by: java.lang.RuntimeException: java.lang.RuntimeException: failed executing request [GroupRequest:
          [JBoss] req_id=1174905878791
          [JBoss] caller=10.16.0.123:32816
          [JBoss] 10.16.0.121:32813: sender=10.16.0.121:32813, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.124:32817: sender=10.16.0.124:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.126:32817: sender=10.16.0.126:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.127:32795: sender=10.16.0.127:32795, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.122:32808: sender=10.16.0.122:32808, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.125:32817: sender=10.16.0.125:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.128:32795: sender=10.16.0.128:32795, retval=null, received=false, suspected=false
          [JBoss]
          [JBoss] request_msg: [dst: <null>, src: <null> (1 headers), size = 5369 bytes]
          [JBoss] rsp_mode: GET_NONE
          [JBoss] done: true
          [JBoss] timeout: 20000
          [JBoss] expected_mbrs: 0
          [JBoss] ]
          [JBoss] at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5782)
          [JBoss] at org.jboss.cache.TreeCache.put(TreeCache.java:3759)
          [JBoss] at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
          [JBoss] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          [JBoss] at java.lang.reflect.Method.invoke(Method.java:585)
          [JBoss] at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
          [JBoss] at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
          [JBoss] at org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
          [JBoss] at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
          [JBoss] at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
          [JBoss] at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
          [JBoss] at $Proxy70.put(Unknown Source)
          [JBoss] at org.jboss.web.tomcat.service.session.JBossCacheWrapper.put(JBossCacheWrapper.java:141)
          [JBoss] ... 17 more
          [JBoss] Caused by: java.lang.RuntimeException: failed executing request [GroupRequest:
          [JBoss] req_id=1174905878791
          [JBoss] caller=10.16.0.123:32816
          [JBoss] 10.16.0.121:32813: sender=10.16.0.121:32813, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.124:32817: sender=10.16.0.124:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.126:32817: sender=10.16.0.126:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.127:32795: sender=10.16.0.127:32795, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.122:32808: sender=10.16.0.122:32808, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.125:32817: sender=10.16.0.125:32817, retval=null, received=false, suspected=false
          [JBoss] 10.16.0.128:32795: sender=10.16.0.128:32795, retval=null, received=false, suspected=false
          [JBoss]
          [JBoss] request_msg: [dst: <null>, src: <null> (1 headers), size = 5369 bytes]
          [JBoss] rsp_mode: GET_NONE
          [JBoss] done: true
          [JBoss] timeout: 20000
          [JBoss] expected_mbrs: 0
          [JBoss] ]
          [JBoss] at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:432)
          [JBoss] at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:192)
          [JBoss] at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
          [JBoss] at org.jboss.cache.TreeCache.callRemoteMethodsViaReflection(TreeCache.java:4404)
          [JBoss] at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4358)
          [JBoss] at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4311)
          [JBoss] at org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4422)
          [JBoss] at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:110)
          [JBoss] at org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:88)
          [JBoss] at org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:119)
          [JBoss] at org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:88)
          [JBoss] at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
          [JBoss] at org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:365)
          [JBoss] at org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:160)
          [JBoss] at org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
          [JBoss] at org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
          [JBoss] at org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5776)
          [JBoss] ... 29 more
          [JBoss] Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
          [JBoss] at java.util.ArrayList.RangeCheck(ArrayList.java:546)
          [JBoss] at java.util.ArrayList.get(ArrayList.java:321)
          [JBoss] at org.jgroups.protocols.FC.handleDownMessage(FC.java:390)
          [JBoss] at org.jgroups.protocols.FC.down(FC.java:320)
          [JBoss] at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
          [JBoss] at org.jgroups.protocols.FC.receiveDownEvent(FC.java:314)
          [JBoss] at org.jgroups.stack.Protocol.passDown(Protocol.java:551)
          [JBoss] at org.jgroups.protocols.FRAG2.down(FRAG2.java:167)
          [JBoss] at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
          [JBoss] at org.jgroups.stack.Protocol.passDown(Protocol.java:551)
          [JBoss] at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:294)
          [JBoss] at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
          [JBoss] at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:385)
          [JBoss] at org.jgroups.JChannel.down(JChannel.java:1231)
          [JBoss] at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:788)
          [JBoss] at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passDown(MessageDispatcher.java:765)
          [JBoss] at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:299)
          [JBoss] at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:444)
          [JBoss] at org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:193)
          [JBoss] at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:429)
          [JBoss] ... 45 more


          I've opened a JIRA for this. See http://jira.jboss.com/jira/browse/JGRP-447.
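
The thread does not establish why FC.handleDownMessage hit "Index: 1, Size: 1"; that is what JGRP-447 is for. As a purely illustrative sketch, here is one generic way that exact shape of exception can arise: a list's size is read, the list shrinks, and the stale size is then used as an index. The class and the member list below are hypothetical and are not taken from FC's source; the "concurrent" change is simulated in a single thread so the example is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleIndexSketch {

    /** Check-then-act on a list that shrinks in between: returns true if the read fails. */
    public static boolean staleReadFails() {
        // Hypothetical stand-in for a list of cluster members
        List<String> members = new ArrayList<>(Arrays.asList("A", "B"));
        int size = members.size();      // check: size is 2
        members.remove("B");            // simulated change by another thread
        try {
            members.get(size - 1);      // act: index 1, but size is now 1
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    /** Same sequence, but reading from a private copy taken before the change. */
    public static String snapshotRead() {
        List<String> members = new ArrayList<>(Arrays.asList("A", "B"));
        // In real multi-threaded code this copy would be made under a lock
        List<String> snapshot = new ArrayList<>(members);
        members.remove("B");            // later changes do not affect the copy
        return snapshot.get(snapshot.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println("stale read fails: " + staleReadFails());
        System.out.println("snapshot read: " + snapshotRead());
    }
}
```

If the real cause turns out to be something like this, it would also fit the low frequency Dominik reports, since the window between the check and the read is small.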

          • 2. Re: JBPAPP-85 Discussion Thread
            dpospisil

            A few notes after completing more test cycles.

            - The OOME can be easily reproduced by continuously increasing server load. The OOME is the first error the server fails with (this is not the case with 4.0.5).

            - The OOME can be reproduced using a high constant load but not with a low load. If the server is loaded with ~ 1/2 of the failing configuration, it can run for hours without throwing an OOME.

            This leads to the question of whether it should be treated as an error. Or is it just the server reaching its limits?

            • 3. Re: JBPAPP-85 Discussion Thread
              brian.stansberry

              Notes from our IRC discussion today:

              1) The OOME needs to be treated as an error. When the server reaches its limits it should degrade more or less gracefully, but not fail.

              2) The OOME occurs when your tests get the load up to levels that aren't attainable with 4.0.5. Thus the absence of the problem in 4.0.5 + JBC 1.4.1 + JG 2.4.1 doesn't really say all that much. The good thing, though, is that this is likely not a regression.

              3) You're going to try to confirm the absence of the problem with JDK 6. Before, you didn't have enough runs to say for sure it doesn't appear there. You'll also try setting UDP.use_outgoing_packet_handler="true".
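
For the last part of point 3, the attribute goes on the UDP element of the JGroups protocol stack config. A minimal fragment, assuming the XML-style stack config that is nested in the cache's ClusterConfig attribute; every other UDP attribute and the rest of the protocol stack are elided here and should stay exactly as they are in the config under test:

```xml
<config>
    <!-- Only the attribute being tested is shown; keep all other
         UDP attributes from the existing config unchanged -->
    <UDP use_outgoing_packet_handler="true" />
    <!-- ... remaining protocols of the stack unchanged ... -->
</config>
```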