5 Replies Latest reply on May 20, 2015 3:11 AM by hchiorean

    Infinispan OutOfMemoryError when transferring state in a WildFly cluster

    ryanweber

      Running ModeShape 4.1 as a subsystem on WildFly 8.2.0.Final, in a cluster of two nodes (A and B). On each node, ModeShape is configured with a replicated repository cache plus two replicated caches for the binary store (one for data, one for metadata), each backed by a file store -- see below.

       

              <subsystem xmlns="urn:jboss:domain:infinispan:2.0">
                  ...
                  <cache-container name="modeshape" module="org.modeshape">
                      <transport lock-timeout="60000"/>
                      <replicated-cache name="myrepository" batching="false" mode="SYNC">
                          <transaction mode="NON_XA"/>
                          <file-store passivation="false" purge="false" relative-to="jboss.server.data.dir" path="modeshape/store/myrepository"/>
                      </replicated-cache>
                  </cache-container>
                  <cache-container name="modeshape-binary-store" module="org.modeshape">
                      <transport lock-timeout="60000"/>
                      <replicated-cache name="myrepository-binary-data" batching="false" mode="SYNC">
                          <transaction mode="NON_XA"/>
                          <file-store passivation="false" purge="false" relative-to="jboss.server.data.dir" path="modeshape/binary-store/myrepository"/>
                      </replicated-cache>
                      <replicated-cache name="myrepository-binary-metadata" batching="false" mode="SYNC">
                          <transaction mode="NON_XA"/>
                          <file-store passivation="false" purge="false" relative-to="jboss.server.data.dir" path="modeshape/binary-store/myrepository-metadata"/>
                      </replicated-cache>
                  </cache-container>
              </subsystem>
              <subsystem xmlns="urn:jboss:domain:modeshape:2.0">
                  <repository name="myrepository" cache-name="myrepository" cache-container="modeshape">
                      <cache-binary-storage data-cache-name="myrepository-binary-data" metadata-cache-name="myrepository-binary-metadata" cache-container="modeshape-binary-store"/>
                      <workspaces allow-workspace-creation="false">
                          <workspace name="default"/>
                      </workspaces>
                  </repository>
              </subsystem>

       

      The cache container on Node A starts when it is first accessed. When the cache container on Node B is then accessed and starts, the following exception occurs on Node A:

       

      2015-04-13 23:05:01,460 ERROR [org.infinispan.statetransfer.OutboundTransferTask] (transport-thread-11) Failed to send entries to node A/modeshape-binary-store : java.lang.OutOfMemoryError: Java heap space: org.infinispan.commons.CacheException: java.lang.OutOfMemoryError: Java heap space
              at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:294) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:233) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:174) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_75]
              at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_75]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_75]
              at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_75]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_75]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_75]
              at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_75]
      Caused by: java.lang.OutOfMemoryError: Java heap space
              at org.infinispan.commons.io.ExposedByteArrayOutputStream.write(ExposedByteArrayOutputStream.java:71)
              at org.jboss.marshalling.SimpleDataOutput.write(SimpleDataOutput.java:108)
              at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:300)
              at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
              at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
              at org.infinispan.container.entries.metadata.MetadataImmortalCacheEntry$Externalizer.writeObject(MetadataImmortalCacheEntry.java:58) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.container.entries.metadata.MetadataImmortalCacheEntry$Externalizer.writeObject(MetadataImmortalCacheEntry.java:54) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.writeObject(ExternalizerTable.java:395) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:138)
              at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
              at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
              at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:25)
              at org.infinispan.marshall.exts.ListExternalizer.writeObject(ListExternalizer.java:44) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.exts.ListExternalizer.writeObject(ListExternalizer.java:26) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.writeObject(ExternalizerTable.java:395) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:138)
              at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
              at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
              at org.infinispan.statetransfer.StateChunk$Externalizer.writeObject(StateChunk.java:80) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.statetransfer.StateChunk$Externalizer.writeObject(StateChunk.java:65) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.writeObject(ExternalizerTable.java:395) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:138)
              at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
              at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
              at org.infinispan.commons.marshall.MarshallUtil.marshallCollection(MarshallUtil.java:25)
              at org.infinispan.marshall.exts.ListExternalizer.writeObject(ListExternalizer.java:44) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.exts.ListExternalizer.writeObject(ListExternalizer.java:26) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.infinispan.marshall.core.ExternalizerTable$ExternalizerAdapter.writeObject(ExternalizerTable.java:395) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
              at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:138)
              at org.jboss.marshalling.AbstractObjectOutput.writeObject(AbstractObjectOutput.java:58)
              at org.jboss.marshalling.AbstractMarshaller.writeObject(AbstractMarshaller.java:111)
              at org.infinispan.marshall.exts.ReplicableCommandExternalizer.writeCommandParameters(ReplicableCommandExternalizer.java:57) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]

       

      This configuration was working last week; since then, more files were added to the repository. Even so, the repository currently contains no more than 50 files total, each under 3 MB.

       

      Increasing the JVM heap size to 3 GB (from the defaults) on each node resolves the issue for now, but I am concerned that the exception will recur as the repository grows.
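
       

      For reference, on a default WildFly install the heap is raised in bin/standalone.conf (or standalone.conf.bat on Windows). The exact values below are illustrative of what I used, not a recommendation:

              # bin/standalone.conf -- illustrative values; adjust -Xmx for your environment
              JAVA_OPTS="-Xms512m -Xmx3072m $JAVA_OPTS"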

        • 1. Re: Infinispan OutOfMemoryError when transferring state in a WildFly cluster
          hchiorean

          None of your caches have eviction enabled. Without eviction, all cache data is held in memory and never released, hence the OOM.

          • 2. Re: Infinispan OutOfMemoryError when transferring state in a WildFly cluster
            ryanweber

            I decreased the JVM heap size back to the defaults and added the following eviction element to all three ModeShape caches:

             

            <eviction strategy="LRU" max-entries="10"/>

             

            The OutOfMemoryError is still occurring. 10 is a very low max-entries value, implying minimal in-memory usage; thus, I do not think this is a simple case of the in-memory cache exceeding available memory.

             

            Also, to clarify, the error is occurring on initial state transfer from Node A to Node B. After the OutOfMemoryError occurs on Node A, the following exception eventually occurs on Node B:

             

            javax.jcr.RepositoryException: Error while starting 'myrepository' repository: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
            • 3. Re: Infinispan OutOfMemoryError when transferring state in a WildFly cluster
              hchiorean

              This is not something we've seen before, so you should profile and see where the OOM is coming from. It may be a bug in Infinispan or something else. ModeShape has no direct control over how the state transfer works; it simply expects Infinispan "to do its job".

              • 4. Re: Infinispan OutOfMemoryError when transferring state in a WildFly cluster
                ryanweber

                Setting fetch-state="false" on the file store on Node A clears up the issue, though this may only work around it. There appear to be no negative effects on behavior. According to the Infinispan User Guide, fetch-state (fetchPersistentState) fetches "the persistent state of a cache when joining a cluster" and applies it "to the local cache store of the joining node." (It appears that WildFly turns on fetch-state by default.)
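
                 

                The change was just adding fetch-state="false" to each file-store element. Shown here for the binary data cache from my config above; the other caches were changed the same way:

                        <replicated-cache name="myrepository-binary-data" batching="false" mode="SYNC">
                            <transaction mode="NON_XA"/>
                            <!-- fetch-state="false" stops the persistent store contents from being
                                 fetched by the joining node during state transfer -->
                            <file-store passivation="false" purge="false" fetch-state="false"
                                        relative-to="jboss.server.data.dir" path="modeshape/binary-store/myrepository"/>
                        </replicated-cache>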

                 

                Also, upon analysis of a heap dump (from the OutOfMemoryError), an ExposedByteArrayOutputStream is present that contains a "buf" byte array whose size is 615 MB, roughly the same size as the .dat binary store file on disk at present. (This is not surprising, as the exception is thrown inside ExposedByteArrayOutputStream.) This implies the entire repository was being included in the state transfer in the form of this stream, causing the OutOfMemoryError. Thus, the required heap size increases with the size of the repository. With the aforementioned configuration (and fetch-state="true"), is the expected behavior to include the entire repository in the state transfer? Again, configuring eviction and expiration had no effect.
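
                 

                (In case it helps anyone reproduce this: the dump can be captured automatically at the moment of failure with standard HotSpot flags, e.g. added to JAVA_OPTS in standalone.conf. The path below is illustrative:)

                        # Write a .hprof heap dump when an OutOfMemoryError is thrown
                        JAVA_OPTS="$JAVA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/modeshape-oom.hprof"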

                • 5. Re: Infinispan OutOfMemoryError when transferring state in a WildFly cluster
                  hchiorean

                  is the expected behavior to include the entire repository in the state transfer?

                  ModeShape has no influence whatsoever over what is or isn't transferred during state transfer. That's something handled by the Infinispan internals.

                   

                  However, though not necessarily related to state transfer, there is this bug: [MODE-2466] ModeShape's EAP/WF kit does not use the correct workspace cache configuration, causing potential OOM errors … which can cause OOMs in WildFly. This will be fixed in 4.3; in the meantime you should update your config and try the workaround described in the JIRA issue.