I am having a strange issue where the queues of a topic block at random. It is always the same topic, as it is my messaging backbone.
We first noticed this problem when we got random OutOfMemory crashes; after investigating, we found that a topic queue was filling up until memory was exhausted.
We run on very limited hardware (quad-core ARMv7, 2 GB RAM). The system runs fine during my performance tests (load average: 4.98, 4.85, 4.44; memory stable at 68%), but then it randomly has issues:
2016-07-14 09:10:06,776 WARN [org.hornetq.core.client] (Thread-1683) HQ212054: Destination address=jms.topicMyChannel is blocked. If the system is configured to block make sure you consume messages on this configuration.
It can run for days doing the same tests (every 5 minutes we do something that 'floods' the system) and work fine, and then out of the blue the warning above appears.
There are no known outside influences, and the system has the same set-up all the time. The only slight differences are in the monitoring & diagnostic tools (we use the jbosscli to gain info).
Very occasionally I also get this warning:
HQ222172: Queue jms.queue.QueueName was busy for more than 10,000 milliseconds. There are possibly consumers hanging on a network operation
The topic itself is set not to use persistence, a DLQ, or paging. I did not remove the journal entries from the configuration, though; I did not think they would hurt if configured but not used.
Connectors and acceptors are set to in-vm.
The first time, the affected listener used a wildcard in its selector, but after replacing it with multiple separate listeners, the issue remains.
So far, it has also been different topics/queues that filled up. The subsequent handlers all do (more or less) the same basic things and write to the same Infinispan cache.
The listeners are a mix of annotated ones and programmatically created ones (the wildcard replacements). I have seen the issue on both.
On the occasion when I was able to check a system that was starting to fill up, I saw that there was a consumer; it just did not seem to do anything.
I do see two different patterns, though:
a) A topic's queue fills up until it consumes all the memory and the system crashes. With the tools I was able to see the queue and the number of messages it held until it crashed.
This is clearly visible from the large number of log warnings.
b) A topic's queue fills up, but then seemingly empties out again.
Here I see entries for only a short time.
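To make the two patterns concrete, this is roughly how a series of sampled message counts could be told apart. It is only a sketch: `BacklogPattern` and `BacklogClassifier` are hypothetical names, not HornetQ classes, and the rule (a backlog that never shrinks versus one that returns to zero) is my own simplification of what I observe.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical labels for the behaviour of one subscription queue.
enum BacklogPattern { IDLE, GROWING, DRAINED, FLUCTUATING }

final class BacklogClassifier {

    // Classifies message counts sampled over time, oldest sample first.
    static BacklogPattern classify(List<Long> samples) {
        long peak = 0;
        long previous = 0;
        boolean everDecreased = false;
        for (long sample : samples) {
            if (sample < previous) {
                everDecreased = true;
            }
            peak = Math.max(peak, sample);
            previous = sample;
        }
        if (peak == 0) {
            return BacklogPattern.IDLE;        // never held any messages
        }
        if (previous == 0) {
            return BacklogPattern.DRAINED;     // case b): filled up, then emptied
        }
        // Still holding messages at the last sample.
        return everDecreased ? BacklogPattern.FLUCTUATING
                             : BacklogPattern.GROWING; // case a): never shrank
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(0L, 10L, 500L, 4000L)));
        System.out.println(classify(Arrays.asList(0L, 300L, 120L, 0L)));
    }
}
```

A queue reported as GROWING over several samples while a consumer is attached would be the case I am worried about.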
I have two monitors running:
one checks the delivering count of the topic
another iterates through all queues of the topic and checks the message count. If it is > 0, it prints out more details
Both use jbosscli remotely, which, while faster than running it locally (because of ARM), is still rather slow.
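For reference, the logic of the second monitor is roughly the following. This is a sketch and not my actual script (which drives jbosscli): `QueueStats` and `TopicQueueMonitor` are hypothetical names, and the values are assumed to have already been read from the management interface for each queue of the topic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical snapshot of one queue of the topic (not a HornetQ class).
final class QueueStats {
    final String name;
    final long messageCount;
    final long deliveringCount;
    final int consumerCount;

    QueueStats(String name, long messageCount, long deliveringCount, int consumerCount) {
        this.name = name;
        this.messageCount = messageCount;
        this.deliveringCount = deliveringCount;
        this.consumerCount = consumerCount;
    }

    @Override
    public String toString() {
        return name + " messages=" + messageCount
                + " delivering=" + deliveringCount
                + " consumers=" + consumerCount;
    }
}

final class TopicQueueMonitor {

    // Returns only the queues that currently hold messages.
    static List<QueueStats> backlogged(List<QueueStats> queues) {
        List<QueueStats> result = new ArrayList<>();
        for (QueueStats queue : queues) {
            if (queue.messageCount > 0) {
                result.add(queue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        List<QueueStats> snapshot = Arrays.asList(
                new QueueStats("subscriber-a", 0, 0, 1),
                new QueueStats("subscriber-b", 1250, 0, 1));
        // Print details only for queues with a backlog.
        for (QueueStats queue : backlogged(snapshot)) {
            System.out.println(queue);
        }
    }
}
```

In the bad case, the printed details show a message count above zero together with an attached consumer.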
Any ideas what to look for or how to improve my monitoring & diagnostics?
Is any other information needed to help?
I saw ActiveMQ mentioned in this thread. Should I switch over?
Note: I do have an issue with Oracle's ARMv7 HF JVM segfaulting for unknown reasons. Could be related but I am not betting on it.