broker failover not triggered on java.lang.OutOfMemoryError
vladimir_v · Sep 18, 2014 4:37 AM

Hi folks,
I have JBoss Fuse 6.1 running in Fabric with three root containers. Each root container has a child container running A-MQ (one Master, two Slaves), using NFSv4 as the shared back-end storage for the brokers.
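For reference, the brokers use the usual shared file system Master/Slave setup, roughly like the sketch below (the broker name and NFS path are placeholders, not my actual values, and in Fabric this is generated from the mq profile rather than hand-edited):

```xml
<!-- Sketch of the shared-storage broker config; example path, not the real mount -->
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="amq-broker">
  <persistenceAdapter>
    <!-- KahaDB directory on the NFSv4 mount shared by Master and Slaves;
         whichever broker acquires the file lock becomes the Master -->
    <kahaDB directory="/mnt/nfs/amq/kahadb"/>
  </persistenceAdapter>
</broker>
```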
Today I saw that the current A-MQ Master was out of memory, and I'm wondering why failover was not triggered. In the Master's logs I have:
2014-09-16 14:53:59,724 | WARN | 987588645-167914 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed
2014-09-16 14:55:38,839 | WARN | 987588645-167921 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed
java.lang.OutOfMemoryError: Java heap space
2014-09-16 14:54:28,065 | ERROR | pool-17-thread-1 | GitDataStore | abric8.git.internal.GitDataStore 1117 | 84 - io.fabric8.fabric-git - 1.0.0.redhat-387 | Failed to pull from the remote git repo /services/jboss-fuse/fabric-01-brq/fabric8-karaf-1.0.0.redhat-379/instances/brq-amq01/data/git/local/fabric. Reason: java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
2014-09-16 14:58:46,844 | WARN | qtp987588645-101 | nio | tty.io.nio.SelectChannelEndPoint 697 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 | handle failed
java.lang.OutOfMemoryError: Java heap space
2014-09-16 14:58:41,707 | WARN | TCP Accept-44445 | tcp | sun.rmi.runtime.Log$LoggerLog 236 | - - | RMI TCP Accept-44445: accept loop for ServerSocket[addr=/0.0.0.0,localport=44445] throws
java.lang.OutOfMemoryError: Java heap space
2014-09-16 14:58:12,520 | WARN | tyMonitor Worker | FailoverTransport | sport.failover.FailoverTransport 267 | 107 - org.apache.activemq.activemq-osgi - 5.9.0.redhat-610387 | Transport (ssl://fuse-fabric-02-stg.jboss.org/10.34.40.177:61616) failed, attempting to automatically reconnect
2014-09-16 14:57:37,410 | WARN | per@0.0.0.0:8182 | AbstractConnector | erver.AbstractConnector$Acceptor 955 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 |
java.lang.OutOfMemoryError: Java heap space
2014-09-16 14:57:21,957 | WARN | 645-99 Selector0 | nio | io.nio.SelectorManager$SelectSet 518 | 95 - org.eclipse.jetty.aggregate.jetty-all-server - 8.1.14.v20131031 |
java.lang.OutOfMemoryError: Java heap space
From the CLI, "fabric:container-list" showed the container as "false", and "cluster-list" showed no bindings for it. The JVM process itself was still running.
On the Slave side, the brokers were logging errors that they could not obtain the lock on KahaDB. That is correct, because the Master JVM was still holding the lock.
My question is: how can I avoid this? If the Master is in a failed/false state, I want its JVM to be killed and the NFS lock released so that a Slave can take over.
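One approach I'm considering (not yet verified in my environment) is to have the JVM kill itself on the first OutOfMemoryError via the HotSpot -XX:OnOutOfMemoryError hook, set for example in the container's bin/setenv. The EXTRA_JAVA_OPTS variable name assumes the stock Karaf setenv script:

```shell
# Assumption: the container picks up EXTRA_JAVA_OPTS from bin/setenv (stock Karaf behaviour).
# On the first OutOfMemoryError the JVM runs the given command; %p expands to its own pid,
# so the process dies and the KahaDB file lock on NFS is released for a Slave to acquire.
EXTRA_JAVA_OPTS='-XX:OnOutOfMemoryError="kill -9 %p"'
export EXTRA_JAVA_OPTS
```

Would this be a reasonable way to force failover, or is there a Fabric-level mechanism for this?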
Thanks