4 Replies Latest reply on Sep 23, 2014 4:12 AM by vladimir_v

    broker failover not triggered on java.lang.OutOfMemoryError

    vladimir_v

      Hi folks,

       

      I have JBoss Fuse 6.1 running in Fabric with three root containers. Each root container has a child container running A-MQ (one Master, two Slaves), and the brokers use NFSv4 as back-end storage.

       

      Today I saw that the current A-MQ Master had run out of memory, and I'm wondering why failover was not triggered. The Master's logs show:

       

      2014-09-16 14:53:59,724 | WARN | 987588645-167914 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed

      2014-09-16 14:55:38,839 | WARN | 987588645-167921 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed

      java.lang.OutOfMemoryError: Java heap space

      2014-09-16 14:54:28,065 | ERROR | pool-17-thread-1 | GitDataStore | abric8.git.internal.GitDataStore 1117 | 84 - io.fabric8.fabric-git - 1.0.0.redhat-387 | Failed to pull from the remote git repo /services/jboss-fuse/fabric-01-brq/fabric8-karaf-1.0.0.redhat-379/instances/brq-amq01/data/git/local/fabric. Reason: java.lang.OutOfMemoryError: Java heap space

      java.lang.OutOfMemoryError: Java heap space

      2014-09-16 14:58:46,844 | WARN | qtp987588645-101 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed

      java.lang.OutOfMemoryError: Java heap space

      2014-09-16 14:58:41,707 | WARN | TCP Accept-44445 | tcp | sun.rmi.runtime.Log$LoggerLog 236 | - - | RMI TCP Accept-44445: accept loop for ServerSocket[addr=/0.0.0.0,localport=44445] throws

      java.lang.OutOfMemoryError: Java heap space

      2014-09-16 14:58:12,520 | WARN | tyMonitor Worker | FailoverTransport | sport.failover.FailoverTransport 267 | 107 - org.apache.activemq.activemq-osgi - 5.9.0.redhat-610387 | Transport (ssl://fuse-fabric-02-stg.jboss.org/10.34.40.177:61616) failed, attempting to automatically reconnect

      2014-09-16 14:57:37,410 | WARN | per@0.0.0.0:8182 | AbstractConnector | erver.AbstractConnector$Acceptor 955 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 |

      java.lang.OutOfMemoryError: Java heap space

      2014-09-16 14:57:21,957 | WARN | 645-99 Selector0 | nio | io.nio.SelectorManager$SelectSet 518 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 |

      java.lang.OutOfMemoryError: Java heap space

      From the CLI, "fabric:container-list" showed the Master container as "false", and "cluster-list" showed no bindings for it. The JVM process itself was still running.

       

      On the Slave side, the logs showed errors saying it could not obtain the lock on KahaDB. That is expected, because the Master JVM was still holding the lock.
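
      To illustrate the locking behaviour described above, here is a minimal sketch. It assumes the KahaDB lock behaves like an ordinary java.nio file lock (an assumption, not checked against the A-MQ source). The class name LockDemo, the helper tryAcquire and the temporary file are made up for the illustration; "master" and "slave" are just two channels on the same file within one JVM.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LockDemo {

    /** Tries to take an exclusive lock; returns null if the lock is already held. */
    static FileLock tryAcquire(FileChannel channel) throws IOException {
        try {
            // tryLock() returns null when another process holds the lock.
            return channel.tryLock();
        } catch (OverlappingFileLockException e) {
            // Thrown when this same JVM already holds the lock via another channel.
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the broker's lock file.
        Path lockFile = Files.createTempFile("kahadb-demo", ".lock");
        try (FileChannel master = FileChannel.open(lockFile, StandardOpenOption.WRITE);
             FileChannel slave = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {

            FileLock held = tryAcquire(master);
            System.out.println("master holds lock: " + (held != null));

            // While the holder keeps the lock, a second attempt cannot succeed.
            System.out.println("slave acquired while held: " + (tryAcquire(slave) != null));

            // Only once the holder releases (or its process exits) can the lock move on.
            held.release();
            System.out.println("slave acquired after release: " + (tryAcquire(slave) != null));
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

      The point of the sketch is that the lock is tied to the holder: as long as the holding JVM stays up without releasing it, nobody else can take it, even if that JVM is no longer doing useful work.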

       

      My question is: how can I avoid this? When the Master is in a failed/false state, I want its JVM to be killed so that the NFS lock is released and a Slave can take over.
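
      As a rough sketch of the behaviour asked for above (terminate the JVM once it hits an OutOfMemoryError), something like the following could be installed as the default uncaught-exception handler. The class name OomWatchdog, the exit code and the injected terminator are made up for the illustration; nothing here is part of Fuse, Fabric or A-MQ. Note the limitation: it only fires for errors that escape a thread, and in the logs above the errors are caught and logged by Jetty and others, so this alone would not have covered every case.

```java
import java.util.function.IntConsumer;

public class OomWatchdog implements Thread.UncaughtExceptionHandler {

    /** Arbitrary exit code chosen for this sketch. */
    static final int EXIT_CODE = 3;

    /** Action that ends the process; injected so the logic can be tried without killing the JVM. */
    private final IntConsumer terminator;

    OomWatchdog(IntConsumer terminator) {
        this.terminator = terminator;
    }

    /** Returns true if the throwable or one of its causes is an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        // Depth limit guards against a cyclic cause chain.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        if (isOutOfMemory(error)) {
            terminator.accept(EXIT_CODE);
        }
    }

    public static void main(String[] args) {
        // Demo with a terminator that only prints; a real setup would pass
        // code -> Runtime.getRuntime().halt(code) and register the handler with
        // Thread.setDefaultUncaughtExceptionHandler(...).
        OomWatchdog watchdog = new OomWatchdog(code -> System.out.println("would halt with " + code));
        watchdog.uncaughtException(Thread.currentThread(),
                new RuntimeException("wrapped", new OutOfMemoryError("simulated")));
    }
}
```

      Runtime.halt() is used in the comment rather than System.exit() because it skips shutdown hooks, which may themselves fail when the heap is exhausted. At the JVM level, HotSpot also has the -XX:OnOutOfMemoryError="<command>" option, which runs a command the first time an OutOfMemoryError is thrown; whether and how that can be set for a Fabric child container is something I have not verified.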

       

      Thanks