1 Reply Latest reply on May 16, 2017 8:57 AM by wwang2016

    HA for messaging in a true clustered setting (with mod_cluster) does not work consistently

    wwang2016

      Hi,

       

      The tests were on Windows 10. Eventually, it will be tested in Linux environment.

       

      I am setting up a cluster of two Wildfly 10 servers with a shared-store configuration for messaging. The mod_cluster is set up so that requests are sent to the load balancer (mod_cluster 1.3.1), and get distributed to the two wildfly servers. All tests were fine except for one scenario which does not show consistent HA for messaging.

       

      The application:

      This is the wildfly 10 quick start example (helloworld-mdb) which has a servlet (HelloWorldMDBServletClient) to send 10000 messages to a queue, and mdb (HelloWorldQueueMDB) process the messages from the queue

       

      Test set-up:

       

      Start up wildfly server 1: standalone.bat --server-config=standalone-full-ha.xml -Djboss.node.name=nodeOne -Djboss.socket.binding.port-offset=100

      start up wildfly server 2 : standalone.bat --server-config=standalone-full-ha.xml -Djboss.node.name=nodeTwo -Djboss.socket.binding.port-offset=200

      start up mod_cluster 1.3.1

      Accessing http://localhost/mod_cluster_manager It showed that two wildfly instances were running

       

      Test 1 : consistently working

      send 10000 requests/ messages to the load balancer while both wildfly 1 and wildfly 2 are live at ALL time:

      both servers processed messages. All 10000 messages were processed -> good

       

      Test 2: Not consistently working

      send 10000 requests/ messages to the load balancer, and in the middle, wildfly 1 was shut down while wildfly 2 and the load balancer were live at ALL time:

      both wildfly servers processed messages. Sometimes all 10000 messages were sent from client and got processed, and sometimes, only a portion of messages were sent from client and got processed.

       

      In a total of four tests:

       

      Test 2.1 (No good)

      shut down wildfly 1 in the middle, the web site showed  only 6262 messages created and sent. Wildfly #1 log showed 1828 messages processed and Wildfly #2 log showed 4434 messages processed. The total number of messages sent from client does match that of the total messages processed.

       

      However, all 10000 messages should be sent and processed since the load balancer and the wildfly #2 were alive. Somehow the shutting down of wildfly #1 interrupted the client?

       

      Test 2.2 (good)

      shut down wildfly 1 in the middle, the web site showed  10000  messages created and sent. Wildfly #1 log showed 852 messages processed and Wildfly #2 log showed 9147 messages processed. The total number of messages sent from client does match that of the total messages processed. All 10000 messages were sent and processed

       

       

      Test 2.3 (good)

      shut down wildfly 1 in the middle, the web site showed  10000  messages created and sent. Wildfly #1 log showed 1103 messages processed and Wildfly #2 log showed 8897 messages processed. The total number of messages sent from client does match that of the total messages processed. All 10000 messages were sent and processed

       

      Test 2.4 (No good)

      shut down wildfly 1 in the middle (I let it run for a longer time before shutting it down), the web site showed  only 8743 messages created and sent. Wildfly #1 log showed 3069 messages processed and Wildfly #2 log showed 5674 messages processed. The total number of messages sent from client does match that of the total messages processed.

       

      However, all 10000 messages should be sent and processed since the load balancer and the wildfly #2 were alive. Somehow the shutting down of wildfly #1 interrupted the client?

       

      The server log in Test 2.2 and Test 2.3 showed clean shutdown of wildfly #1. However, the server log in Test 2.1 and 2.4 showed the shutting down was not very clean

       

      Test 2.1 server #1 log

       

      2017-05-09 09:14:06,923 WARN  [org.jboss.modcluster] (management-handler-thread - 4) MODCLUSTER000022: Failed to drain 1 remaining pending requests from default-host:/wildfly-helloworld-mdb within 10.0 seconds

      2017-05-09 09:14:06,931 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-177 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3641

      2017-05-09 09:14:06,939 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-342 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3643

      2017-05-09 09:14:06,946 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-475 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3645

      2017-05-09 09:14:06,954 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-301 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3647

      2017-05-09 09:14:06,962 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-165 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3649

      2017-05-09 09:14:06,970 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-312 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3651

      2017-05-09 09:14:06,979 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-124 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3653

      2017-05-09 09:14:06,986 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-328 (ActiveMQ-client-global-threads-215146872)) Received Message from queue: This is message 3655

      2017-05-09 09:14:07,031 INFO  [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested.

      2017-05-09 09:14:07,067 INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 81) WFLYMSGAMQ0006: Unbound messaging object to jndi name java:jboss/exported/jms/RemoteConnectionFactory

      2017-05-09 09:14:07,070 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-6) ISPN000080: Disconnecting JGroups channel hibernate

      2017-05-09 09:14:07,070 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-7) ISPN000080: Disconnecting JGroups channel web

      2017-05-09 09:14:07,070 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-5) ISPN000080: Disconnecting JGroups channel ejb

      2017-05-09 09:14:07,071 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-8) ISPN000080: Disconnecting JGroups channel server

      2017-05-09 09:14:07,085 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-8) ISPN000082: Stopping the RpcDispatcher for channel server

      2017-05-09 09:14:07,092 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-7) ISPN000082: Stopping the RpcDispatcher for channel web

      2017-05-09 09:14:07,100 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-5) ISPN000082: Stopping the RpcDispatcher for channel ejb

      2017-05-09 09:14:07,107 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-6) ISPN000082: Stopping the RpcDispatcher for channel hibernate

      2017-05-09 09:14:07,132 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 84) AMQ221002: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-011 [4777348b-3428-11e7-83c7-ade9e84c0a03] stopped

      2017-05-09 09:14:17,146 WARN  [org.jboss.modcluster] (ServerService Thread Pool -- 91) MODCLUSTER000022: Failed to drain 1 remaining pending requests from default-host:/wildfly-helloworld-mdb within 10.0 seconds

      2017-05-09 09:14:17,157 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 91) WFLYUT0022: Unregistered web context: /wildfly-helloworld-mdb

      2017-05-09 09:14:17,182 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-5) WFLYUT0019: Host default-host stopping

      2017-05-09 09:14:17,184 INFO  [org.jboss.weld.deployer] (MSC service thread 1-7) WFLYWELD0010: Stopping weld service for deployment wildfly-helloworld-mdb.war

      2017-05-09 09:14:17,184 ERROR [io.undertow.request] (default task-1) UT005023: Exception handling request to /wildfly-helloworld-mdb/HelloWorldMDBServletClient: javax.jms.IllegalStateRuntimeException: The session is closed

        at org.apache.activemq.artemis.jms.client.JmsExceptionUtils.convertToRuntimeException(JmsExceptionUtils.java:59)

       

      Test 2.4 server #1 log

       

      2017-05-09 11:03:41,644 WARN  [org.jboss.modcluster] (management-handler-thread - 4) MODCLUSTER000022: Failed to drain 1 remaining pending requests from default-host:/wildfly-helloworld-mdb within 10.0 seconds

      2017-05-09 11:03:41,650 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-439 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6123

      2017-05-09 11:03:41,658 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-14 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6125

      2017-05-09 11:03:41,666 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-341 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6127

      2017-05-09 11:03:41,673 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-204 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6129

      2017-05-09 11:03:41,681 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-307 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6131

      2017-05-09 11:03:41,689 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-278 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6133

      2017-05-09 11:03:41,697 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-53 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6135

      2017-05-09 11:03:41,706 INFO  [class org.jboss.as.quickstarts.mdb.HelloWorldQueueMDB] (Thread-218 (ActiveMQ-client-global-threads-772300193)) Received Message from queue: This is message 6137

      2017-05-09 11:03:41,759 INFO  [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested.

      2017-05-09 11:03:41,785 INFO  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool -- 80) AMQ221002: Apache ActiveMQ Artemis Message Broker version 1.1.0.wildfly-011 [4777348b-3428-11e7-83c7-ade9e84c0a03] stopped

      2017-05-09 11:03:41,785 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-5) ISPN000080: Disconnecting JGroups channel server

      2017-05-09 11:03:41,805 INFO  [org.wildfly.extension.messaging-activemq] (ServerService Thread Pool -- 82) WFLYMSGAMQ0006: Unbound messaging object to jndi name java:/ConnectionFactory

      2017-05-09 11:03:41,805 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000080: Disconnecting JGroups channel ejb

      2017-05-09 11:03:41,805 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-6) ISPN000080: Disconnecting JGroups channel web

      2017-05-09 11:03:41,806 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000080: Disconnecting JGroups channel hibernate

      2017-05-09 11:03:41,808 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-5) ISPN000082: Stopping the RpcDispatcher for channel server

      2017-05-09 11:03:41,817 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-6) ISPN000082: Stopping the RpcDispatcher for channel web

      2017-05-09 11:03:41,828 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000082: Stopping the RpcDispatcher for channel ejb

      2017-05-09 11:03:41,867 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-3) ISPN000082: Stopping the RpcDispatcher for channel hibernate

      2017-05-09 11:03:51,816 WARN  [org.jboss.modcluster] (ServerService Thread Pool -- 83) MODCLUSTER000022: Failed to drain 1 remaining pending requests from default-host:/wildfly-helloworld-mdb within 10.0 seconds

      2017-05-09 11:03:51,825 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 83) WFLYUT0022: Unregistered web context: /wildfly-helloworld-mdb

      2017-05-09 11:03:51,846 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0019: Host default-host stopping

      2017-05-09 11:03:51,847 INFO  [org.jboss.weld.deployer] (MSC service thread 1-1) WFLYWELD0010: Stopping weld service for deployment wildfly-helloworld-mdb.war

      2017-05-09 11:03:51,849 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 80) MODCLUSTER000002: Initiating mod_cluster shutdown

      2017-05-09 11:03:51,882 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-4) WFLYJCA0010: Unbound data source [java:jboss/datasources/ExampleDS]

      2017-05-09 11:03:51,882 INFO  [org.wildfly.extension.messaging-activemq] (MSC service thread 1-7) WFLYMSGAMQ0006: Unbound messaging object to jndi name java:jboss/DefaultJMSConnectionFactory

      2017-05-09 11:03:51,882 INFO  [org.jboss.as.connector.deployment] (MSC service thread 1-5) WFLYJCA0011: Unbound JCA ConnectionFactory [java:/JmsXA]

      2017-05-09 11:03:51,883 INFO  [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-8) WFLYJCA0019: Stopped Driver service with driver-name = h2

      2017-05-09 11:03:51,897 WARN  [org.jboss.jca.core.connectionmanager.pool.strategy.PoolByCri] (ServerService Thread Pool -- 81) IJ000615: Destroying active connection in pool: ActiveMQConnectionDefinition (org.apache.activemq.artemis.ra.ActiveMQRAManagedConnection@452af8dc)

      2017-05-09 11:03:51,902 ERROR [io.undertow.request] (default task-1) UT005023: Exception handling request to /wildfly-helloworld-mdb/HelloWorldMDBServletClient: java.lang.RuntimeException: javax.naming.NameNotFoundException: java:comp/TransactionSynchronizationRegistry