3 Replies Latest reply on Apr 9, 2019 8:32 AM by Dan Berindei

    Infinispan node cannot join cache after crash

    Nikolai Sahattchiev

      Hi,

      We have a distributed Infinispan cache running on 4 nodes, with global state and persistence enabled:

      <global-state>

          <persistent-location path="rocksdb/${localNodeId}/persistent" />
          <shared-persistent-location path="rocksdb/${localNodeId}/shared"/>
          <temporary-location path="rocksdb/${localNodeId}/tmp"/>
          <overlay-configuration-storage />
      </global-state>
      .....
      <distributed-cache name="EiwoDistributedCache" mode="SYNC" remote-timeout="300000" owners="2" segments="100">
          <locking concurrency-level="1000" acquire-timeout="60000"/>
          <transaction mode="NONE"/>

          <persistence passivation="false">
              <rocksdbStore:rocksdb-store preload="true" fetch-state="true" path="rocksdb/${localNodeId}/data/">
                  <rocksdbStore:expiration path="rocksdb/${localNodeId}/expired/"/>
              </rocksdbStore:rocksdb-store>
          </persistence>
          <indexing index="NONE"/>

          <state-transfer timeout="120000" await-initial-transfer="true"></state-transfer>
      </distributed-cache>
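
      To make the resulting on-disk layout explicit, here is a minimal sketch that resolves the five path templates from the configuration above for one node. The class name NodeLayout and the default node id "node3" are hypothetical, chosen only for illustration. The sketch assumes that ${localNodeId} is a property substituted at startup and that the relative paths resolve against the working directory of the process.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Infinispan: it only expands the path
// templates from the XML configuration so the per-node directories can be checked.
public class NodeLayout {

    // Path templates copied verbatim from the configuration in this post.
    static final Map<String, String> TEMPLATES = new LinkedHashMap<>();
    static {
        TEMPLATES.put("persistent-location", "rocksdb/${localNodeId}/persistent");
        TEMPLATES.put("shared-persistent-location", "rocksdb/${localNodeId}/shared");
        TEMPLATES.put("temporary-location", "rocksdb/${localNodeId}/tmp");
        TEMPLATES.put("rocksdb-store", "rocksdb/${localNodeId}/data/");
        TEMPLATES.put("rocksdb-store expiration", "rocksdb/${localNodeId}/expired/");
    }

    // Replaces the ${localNodeId} placeholder in every template with the given id.
    static Map<String, Path> resolve(String localNodeId) {
        Map<String, Path> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : TEMPLATES.entrySet()) {
            String expanded = entry.getValue().replace("${localNodeId}", localNodeId);
            resolved.put(entry.getKey(), Paths.get(expanded));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // "node3" is an assumed example id; pass the real one as the first argument.
        String nodeId = args.length > 0 ? args[0] : "node3";
        resolve(nodeId).forEach((name, path) ->
            System.out.println(name + " -> " + path
                + (Files.isDirectory(path) ? "" : " (missing)")));
    }
}
```

      Running it from the node's working directory with the node id as the argument lists each configured directory and flags any that do not exist.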

      After an out-of-memory error on node 2, the whole cluster became unstable and we tried to restart it. Nodes 1, 2 and 4 started without any issues, but node 3 always failed with the following exception:

      org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.start() throws java.lang.Exception on object of type StateTransferManagerImpl
           at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:83)
           at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:71)
           at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:76)
           at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:185)
           at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:968)
           at org.infinispan.factories.AbstractComponentRegistry.lambda$invokePrioritizedMethods$6(AbstractComponentRegistry.java:703)
           at org.infinispan.factories.SecurityActions.lambda$run$1(SecurityActions.java:72)
           at org.infinispan.security.Security.doPrivileged(Security.java:44)
           at org.infinispan.factories.SecurityActions.run(SecurityActions.java:71)
           at org.infinispan.factories.AbstractComponentRegistry.invokePrioritizedMethods(AbstractComponentRegistry.java:696)
           at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:689)
           at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:607)
           at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:244)
           at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:1051)
           at org.infinispan.cache.impl.AbstractDelegatingCache.start(AbstractDelegatingCache.java:421)
           at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:646)
           at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:591)
           at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:477)
           at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:463)
           at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:449)
           ..........
      Caused by: org.infinispan.topology.CacheJoinException: ISPN000410: Node eiwopoc-14554 attempting to join cache EiwoDistributedCache with incompatible state
           at org.infinispan.topology.ClusterCacheStatus.addMember(ClusterCacheStatus.java:233)
           at org.infinispan.topology.ClusterCacheStatus.doJoin(ClusterCacheStatus.java:692)
           at org.infinispan.topology.ClusterTopologyManagerImpl.handleJoin(ClusterTopologyManagerImpl.java:212)
           at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:178)
           at org.infinispan.topology.CacheTopologyControlCommand.invokeAsync(CacheTopologyControlCommand.java:160)
           at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.invokeReplicableCommand(GlobalInboundInvocationHandler.java:169)
           at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.runReplicableCommand(GlobalInboundInvocationHandler.java:150)
           at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.lambda$handleReplicableCommand$1(GlobalInboundInvocationHandler.java:144)
           at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$RunnableWrapper.run(BlockingTaskAwareExecutorServiceImpl.java:212)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)

      How can we get node 3 up and running again without losing any data? We are using Infinispan 9.3.6.Final.

      Regards

      Nikolai