Infinispan cache not clustering on AS 7.1.1 startup.
safetytrick Apr 25, 2012 11:47 AM
I have an Infinispan cache container and a few caches defined in my standalone.xml. When I start up a cluster (currently 4 nodes I'm testing with), I have a 50% chance that one of the nodes won't join the cluster correctly. The JGroups subsystem shows all 4 of my nodes, and each node is aware of the other three. My first access to the cache on the affected node throws this exception:
org.infinispan.CacheException: Unable to invoke method private void org.infinispan.statetransfer.BaseStateTransferManagerImpl.start() throws java.lang.Exception on object
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:238)
at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:882)
at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:637)
at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:626)
at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:173)
at org.infinispan.CacheImpl.start(CacheImpl.java:499)
at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:626)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:516)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:530)
at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:148)
at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:139)
at com.me.cache.InfinispanCacheFactory.createCache(InfinispanCacheFactory.java:29)
at com.me.cache.CacheService.getCache(CacheService.java:34)
at com.me.session.SessionService.getCache(SessionService.java:265)
at com.me.session.SessionService.writeSession(SessionService.java:81)
at com.me.session.SessionService.writeLocalSessions(SessionService.java:282)
at com.me.util.ThreadEnvironmentManager$1.run(ThreadEnvironmentManager.java:41)
at com.me.util.ThreadEnvironmentManager$ThreadEnvironment.destroy(ThreadEnvironmentManager.java:126)
at com.me.util.ThreadEnvironmentManager$ThreadEnvironment.access$000(ThreadEnvironmentManager.java:119)
at com.me.util.ThreadEnvironmentManager.stop(ThreadEnvironmentManager.java:77)
at com.me.r15.servlet.RequestInitializationFilter.doFilter(RequestInitializationFilter.java:44)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
at org.jboss.as.web.session.ClusteredSessionValve.handleRequest(ClusteredSessionValve.java:125)
at org.jboss.as.web.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:91)
at org.jboss.as.web.session.JvmRouteValve.invoke(JvmRouteValve.java:88)
at org.jboss.as.web.session.LockingValve.invoke(LockingValve.java:56)
at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:671)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to x1-1/node
at org.infinispan.util.Util.rewrapAsCacheException(Util.java:524)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:168)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:478)
at org.infinispan.cacheviews.CacheViewsManagerImpl.join(CacheViewsManagerImpl.java:214)
at org.infinispan.statetransfer.BaseStateTransferManagerImpl.start(BaseStateTransferManagerImpl.java:146)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:236)
... 38 more
Caused by: org.jgroups.TimeoutException: timeout sending message to x1-1/node
at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:360)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:263)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:163)
... 46 more
After the exception is thrown, cache put operations succeed without problems, but get operations almost never return correct data (they succeed for only a few milliseconds). The node only starts participating in the cache once it has been restarted.
My cache container in standalone.xml looks like this:
<cache-container name="me" jndi-name="java:jboss/infinispan/me" start="EAGER">
    <transport lock-timeout="60000"/>
    <distributed-cache name="SESSION" virtual-nodes="48" mode="ASYNC" batching="true">
        <eviction strategy="LRU" max-entries="132000"/>
        <expiration max-idle="28800000" lifespan="-1" interval="60000"/>
        <file-store/>
    </distributed-cache>
    <distributed-cache name="RECENT_UPDATES" virtual-nodes="48" mode="ASYNC" batching="true">
        <eviction strategy="LRU" max-entries="32768"/>
        <expiration max-idle="-1" lifespan="1800000" interval="60000"/>
        <file-store/>
    </distributed-cache>
</cache-container>
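For anyone reading the numbers in that config: the timeout and expiration values are in milliseconds, and -1 means no limit. Here is a minimal sketch that converts them to readable durations. The class name ConfigDurations and the describe helper are made up for illustration and are not part of Infinispan or JBoss AS.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Infinispan or JBoss AS.
public class ConfigDurations {

    // Formats a millisecond value as hours, minutes and seconds.
    // Negative values (-1 in the config) are reported as "no limit".
    public static String describe(long millis) {
        if (millis < 0) {
            return "no limit";
        }
        long hours = TimeUnit.MILLISECONDS.toHours(millis);
        long minutes = TimeUnit.MILLISECONDS.toMinutes(millis) % 60;
        long seconds = TimeUnit.MILLISECONDS.toSeconds(millis) % 60;
        return hours + " h " + minutes + " min " + seconds + " s";
    }

    public static void main(String[] args) {
        // Values copied from the cache-container definition above.
        System.out.println("transport lock-timeout:  " + describe(60000L));
        System.out.println("expiration interval:     " + describe(60000L));
        System.out.println("SESSION max-idle:        " + describe(28800000L));
        System.out.println("SESSION lifespan:        " + describe(-1L));
        System.out.println("RECENT_UPDATES max-idle: " + describe(-1L));
        System.out.println("RECENT_UPDATES lifespan: " + describe(1800000L));
    }
}
```

So SESSION entries expire after 8 hours of inactivity, RECENT_UPDATES entries live for 30 minutes, and both the transport lock timeout and the expiration check interval are one minute.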
The rest of my standalone.xml is very standard; the JGroups configuration is identical to what ships with 7.1.1.
At one point I switched from the JGroups UDP stack to the TCP stack, but this made the problem significantly worse: 2 nodes out of 4 would get this error on every startup.
Any suggestions on how to proceed?
Michael