Lost HornetQ messages
venom27 · Sep 2, 2015 4:57 PM

We have a big issue with HornetQ under WildFly 8.2.0.Final: JMS messages are sometimes lost, without any errors or notifications.
We have a cluster of 21 WildFly nodes, each with a single application deployed. Most applications are 'single-noded'; several are deployed on two nodes for scalability. The configuration is the same on all nodes and is almost entirely the stock standalone-full-ha.xml. The HornetQ config:
<subsystem xmlns="urn:jboss:domain:messaging:2.0">
    <hornetq-server>
        <security-enabled>false</security-enabled>
        <cluster-user>jmscluster</cluster-user>
        <cluster-password>${jboss.messaging.cluster.password:R2xld2VrRGltZWw3}</cluster-password>
        <journal-file-size>102400</journal-file-size>

        <connectors>
            <http-connector name="http-connector" socket-binding="http">
                <param key="http-upgrade-endpoint" value="http-acceptor"/>
            </http-connector>
            <http-connector name="http-connector-throughput" socket-binding="http">
                <param key="http-upgrade-endpoint" value="http-acceptor-throughput"/>
                <param key="batch-delay" value="50"/>
            </http-connector>
            <in-vm-connector name="in-vm" server-id="0"/>
        </connectors>

        <acceptors>
            <http-acceptor http-listener="default" name="http-acceptor"/>
            <http-acceptor http-listener="default" name="http-acceptor-throughput">
                <param key="batch-delay" value="50"/>
                <param key="direct-deliver" value="false"/>
            </http-acceptor>
            <in-vm-acceptor name="in-vm" server-id="0"/>
        </acceptors>

        <broadcast-groups>
            <broadcast-group name="bg-group1">
                <socket-binding>messaging-group</socket-binding>
                <connector-ref>http-connector</connector-ref>
            </broadcast-group>
        </broadcast-groups>

        <discovery-groups>
            <discovery-group name="dg-group1">
                <socket-binding>messaging-group</socket-binding>
            </discovery-group>
        </discovery-groups>

        <cluster-connections>
            <cluster-connection name="my-cluster">
                <address>jms</address>
                <connector-ref>http-connector</connector-ref>
                <discovery-group-ref discovery-group-name="dg-group1"/>
            </cluster-connection>
        </cluster-connections>

        <security-settings>
            <security-setting match="#">
                <permission type="send" roles="guest"/>
                <permission type="consume" roles="guest"/>
                <permission type="createNonDurableQueue" roles="guest"/>
                <permission type="deleteNonDurableQueue" roles="guest"/>
            </security-setting>
        </security-settings>

        <address-settings>
            <address-setting match="#">
                <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                <max-size-bytes>10485760</max-size-bytes>
                <page-size-bytes>2097152</page-size-bytes>
                <message-counter-history-day-limit>10</message-counter-history-day-limit>
                <redistribution-delay>1000</redistribution-delay>
            </address-setting>
        </address-settings>

        <jms-connection-factories>
            <connection-factory name="InVmConnectionFactory">
                <connectors>
                    <connector-ref connector-name="in-vm"/>
                </connectors>
                <entries>
                    <entry name="java:/ConnectionFactory"/>
                </entries>
            </connection-factory>
            <connection-factory name="RemoteConnectionFactory">
                <connectors>
                    <connector-ref connector-name="http-connector"/>
                </connectors>
                <entries>
                    <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>
                </entries>
                <ha>true</ha>
                <block-on-acknowledge>true</block-on-acknowledge>
                <reconnect-attempts>-1</reconnect-attempts>
            </connection-factory>
            <pooled-connection-factory name="hornetq-ra">
                <transaction mode="xa"/>
                <connectors>
                    <connector-ref connector-name="in-vm"/>
                </connectors>
                <entries>
                    <entry name="java:/JmsXA"/>
                    <entry name="java:jboss/DefaultJMSConnectionFactory"/>
                </entries>
            </pooled-connection-factory>
        </jms-connection-factories>

        <jms-destinations>
            ...
        </jms-destinations>
    </hornetq-server>
</subsystem>
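For reference, remote (non-in-VM) clients reach this server through the exported RemoteConnectionFactory entry above. A minimal standalone lookup sketch, assuming the default http port 8080 and using one of our node names:

import java.util.Properties;
import javax.jms.ConnectionFactory;
import javax.naming.Context;
import javax.naming.InitialContext;

public class RemoteLookup {
    public static void main(String[] args) throws Exception {
        Properties env = new Properties();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "org.jboss.naming.remote.client.InitialContextFactory");
        // "some-srv2" is one of our node names; 8080 is the default http port
        env.put(Context.PROVIDER_URL, "http-remoting://some-srv2:8080");
        Context ctx = new InitialContext(env);
        // the java:jboss/exported/ prefix is stripped for remote lookups
        ConnectionFactory cf =
                (ConnectionFactory) ctx.lookup("jms/RemoteConnectionFactory");
        System.out.println("Looked up: " + cf);
    }
}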
Each server starts like this:
java -D[Standalone] -Xms128m -Xmx1024m -XX:MaxPermSize=512m \
    -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman \
    -Djava.awt.headless=true -server -XX:MaxDirectMemorySize=256M \
    -XX:+UseThreadPriorities -XX:+AggressiveOpts -XX:+UseBiasedLocking \
    -XX:+UseFastAccessorMethods -XX:+UseCompressedOops -XX:+OptimizeStringConcat \
    -XX:+UseStringCache -XX:+UseCodeCacheFlushing -XX:+UseLargePages -XX:+UseG1GC \
    -XX:+UseTLAB -XX:+CMSClassUnloadingEnabled \
    -Dorg.jboss.boot.log.file=/opt/wildfly/standalone/log/server.log \
    -Dlogging.configuration=file:/opt/wildfly/standalone/configuration/logging.properties \
    -jar /opt/wildfly/jboss-modules.jar -mp /opt/wildfly/modules org.jboss.as.standalone \
    -Djboss.home.dir=/opt/wildfly -Djboss.server.base.dir=/opt/wildfly/standalone \
    -c standalone-full-ha.xml \
    -Djboss.bind.address=0.0.0.0 -Djboss.bind.address.unsecure=0.0.0.0 \
    -Djboss.bind.address.management=0.0.0.0 \
    -Djboss.messaging.group.address=231.7.7.9 \
    -Djboss.node.name=some-srv2 \
    -Djboss.default.multicast.address=230.0.44.195
Here, jboss.messaging.group.address has the same value on all nodes, while jboss.default.multicast.address is unique per node. Servers hosting the same application use addresses from the same subnet (230.0.44.195 and 230.0.44.35, for example).
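Since node discovery runs over UDP multicast on the messaging-group binding, a plain multicast listener is a quick way to sanity-check that broadcasts actually reach a node. A sketch, using our group address and assuming the default messaging-group port 9876:

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastCheck {
    public static void main(String[] args) throws Exception {
        // 231.7.7.9 is our jboss.messaging.group.address;
        // 9876 is assumed from the default messaging-group socket-binding
        InetAddress group = InetAddress.getByName("231.7.7.9");
        try (MulticastSocket socket = new MulticastSocket(9876)) {
            socket.joinGroup(group);
            byte[] buf = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            socket.receive(packet); // blocks until a broadcast arrives
            System.out.println("Heard broadcast from " + packet.getAddress());
        }
    }
}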
On the Java side we use the hornetq-ra connection factory, with the usual Java EE MDB classes for receiving and JMSContext for sending.
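Roughly, the factory is injected from the java:/JmsXA entry defined above. A sketch (the bean outline is illustrative; the full sendEvent() is shown in the update below):

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.jms.ConnectionFactory;
import javax.jms.Queue;

@Stateless
public class MessageSender {

    // pooled XA factory from the hornetq-ra definition above
    @Resource(lookup = "java:/JmsXA")
    private ConnectionFactory connectionFactory;

    // the queue the receiving MDB listens on
    @Resource(lookup = "java:jms/queue/someInQueue")
    private Queue destination;

    // sendEvent(Event) is shown in the update below
}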
On top of that, we sometimes get errors like this on server startup:
ERROR [Thread-5 (HornetQ-server-HornetQServerImpl::serverUUID=95088f81-5139-11e5-b318-6fe8ac3312f9-171344418)] [] [core.client] HQ214016: Failed to create netty connection: java.nio.channels.UnresolvedAddressException
On every cluster start the issue appears on different nodes. We tried to work out a proper server start order as a workaround; nothing helps. With 5 nodes the cluster usually works fine. Any help will be appreciated.
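Given the UnresolvedAddressException, one thing we can at least rule out is DNS: every node must be able to resolve the host names the other nodes broadcast for their connectors. A trivial check (node name taken from the startup line above):

import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // must succeed on every node that joins the cluster
        InetAddress addr = InetAddress.getByName("some-srv2");
        System.out.println("some-srv2 -> " + addr.getHostAddress());
    }
}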
Update
We 'detect' lost messages through the logs. We have a common MessageSender which, in short, looks like this:
public void sendEvent(Event event) {
    try (JMSContext jmsContext = connectionFactory.createContext()) {
        jmsContext.createProducer().send(destination, event);
        logger.info("Event with type [" + event.getType() + "] successfully sent to destination: " + destination);
    }
}
And the MDB on the other side receives the message like this:
@MessageDriven(name = "AppMDB", activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:jms/queue/someInQueue"),
        @ActivationConfigProperty(propertyName = "minSession", propertyValue = "1"),
        @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "4")
})
public class AppMDB implements MessageListener {

    @Override
    @EventEntryPoint
    public void onMessage(Message message) {
        logger.info("App received incoming event: " + message);
        ...
    }
}
I see the successful send log on one side and no 'received' log on the other; that is how we detect the 'lost' ones.
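To make the send/receive correlation unambiguous, we could also log the provider-assigned JMSMessageID on both sides. A sketch of the adjusted sender, wrapping the payload in an ObjectMessage explicitly so the ID is readable after send() returns:

// imports: javax.jms.JMSContext, javax.jms.JMSException, javax.jms.ObjectMessage
public void sendEvent(Event event) throws JMSException {
    try (JMSContext jmsContext = connectionFactory.createContext()) {
        // JMS sets the JMSMessageID on the message once send() completes
        ObjectMessage msg = jmsContext.createObjectMessage(event);
        jmsContext.createProducer().send(destination, msg);
        logger.info("Event [" + event.getType() + "] sent, JMSMessageID=" + msg.getJMSMessageID());
    }
}

// and in AppMDB.onMessage():
// logger.info("App received JMSMessageID=" + message.getJMSMessageID());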
BTW,
- The cluster runs on Amazon AWS
- JTA transactions are used (see the sketch below)
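On the JTA point: hornetq-ra is an XA pooled factory, so a send made inside a container-managed transaction is only actually delivered when that transaction commits, while the "successfully sent" log line fires at send() time, before the commit. A minimal sketch of the pattern (class and method names illustrative, not our real code):

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.inject.Inject;

// Illustrative only: shows that the send in MessageSender.sendEvent()
// joins the caller's JTA transaction.
@Stateless
public class EventService {

    @Inject
    private MessageSender messageSender;

    @TransactionAttribute(TransactionAttributeType.REQUIRED)
    public void process(Event event) {
        messageSender.sendEvent(event); // "successfully sent" is logged here
        // if anything later rolls this transaction back, the message is
        // never delivered, despite the log line above
    }
}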