7 Replies Latest reply on May 25, 2017 4:25 PM by Paul Ferraro

    Wildfly 10 - Infinispan - Error synchronizing between servers

    Paulo Souza Newbie

      Hi!

       

      I am trying to configure WildFly 10 to distribute messages between two servers running as active-active.

       

      [Attached diagram: Architecture.png]

      Application Server A and Application Server B are two independent servers, not instances inside a single WildFly installation. Requests are distributed between them by a third server, an Alteon load balancer.

      This follows several fixes described in the earlier discussion about message replication: Re: Wildfly 10 - Messages distributed by clusters.

       

      The problem now is that WildFly is logging the trace below. The error occurs when a server tries to synchronize the JMS queue with its peer:

       

      2017-05-22 13:16:25,644 INFO  [org.apache.activemq.artemis.core.server] (Thread-9 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@2595555a-282792407)) AMQ221027: Bridge ClusterConnectionBridge@5bc02572 [name=sf.desev-cluster.b11a846a-3f09-11e7-b6ad-8f3628e6c16f, queue=QueueImpl[name=sf.desev-cluster.b11a846a-3f09-11e7-b6ad-8f3628e6c16f, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=b11a846b-3f09-11e7-b6ad-8f3628e6c16f]]@3b7918d1 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@5bc02572 [name=sf.desev-cluster.b11a846a-3f09-11e7-b6ad-8f3628e6c16f, queue=QueueImpl[name=sf.desev-cluster.b11a846a-3f09-11e7-b6ad-8f3628e6c16f, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=b11a846b-3f09-11e7-b6ad-8f3628e6c16f]]@3b7918d1 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=in-vm, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1818062455[nodeUUID=b11a846b-3f09-11e7-b6ad-8f3628e6c16f, connector=TransportConfiguration(name=http-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEnabled=true&httpPpgradeEndpoint=http-acceptor&port=8480&host=10-201-36-103, address=10,dev,des,jms, server=ActiveMQServerImpl::serverUUID=b11a846b-3f09-11e7-b6ad-8f3628e6c16f])) [initialConnectors=[TransportConfiguration(name=in-vm, factory=org-apache-activemq-artemis-core-remoting-impl-invm-InVMConnectorFactory) ?serverId=0], discoveryGroupConfiguration=null]] is connected
      2017-05-22 13:16:52,695 ERROR [org.infinispan.CLUSTER] (transport-thread--p16-t3) ISPN000196: Failed to recover cluster state after the current node became the coordinator (or after merge): java.util.concurrent.ExecutionException: org.infinispan.remoting.transport.jgroups.SuspectException: Cache not running on node desen-server-domain-widfly:10.171.193.102-jms-live
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
        at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:100)
        at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:567)
        at org.infinispan.topology.ClusterTopologyManagerImpl.recoverClusterStatus(ClusterTopologyManagerImpl.java:437)
        at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:358)
        at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.lambda$handleViewChange$0(ClusterTopologyManagerImpl.java:692)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.runInternal(SemaphoreCompletionService.java:172)
        at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.run(SemaphoreCompletionService.java:151)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.jboss.as.clustering.infinispan.ClassLoaderThreadFactory.lambda$newThread$12(ClassLoaderThreadFactory.java:48)
        at java.lang.Thread.run(Thread.java:748)
      Caused by: org.infinispan.remoting.transport.jgroups.SuspectException: Cache not running on node desen-server-domain-widfly:10.171.193.102-jms-live
        at org.infinispan.remoting.transport.AbstractTransport.checkResponse(AbstractTransport.java:46)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:795)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$1(JGroupsTransport.java:642)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
        at org.infinispan.remoting.transport.jgroups.RspListFuture.futureDone(RspListFuture.java:31)
        at org.jgroups.blocks.Request.checkCompletion(Request.java:152)
        at org.jgroups.blocks.GroupRequest.receiveResponse(GroupRequest.java:116)
        at org.jgroups.blocks.RequestCorrelator.dispatch(RequestCorrelator.java:427)
        at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:357)
        at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:245)
        at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:664)
        at org.jgroups.JChannel.up(JChannel.java:738)
        at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:120)
        at org.jgroups.stack.Protocol.up(Protocol.java:380)
        at org.jgroups.protocols.FORK.up(FORK.java:114)
        at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:390)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:1040)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)
        at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1070)
        at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:785)
        at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:426)
        at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:649)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155)
        at org.jgroups.protocols.FD.up(FD.java:260)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:310)
        at org.jgroups.protocols.MERGE3.up(MERGE3.java:285)
        at org.jgroups.protocols.Discovery.up(Discovery.java:296)
        at org.jgroups.protocols.MPING.up(MPING.java:178)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1601)
        at org.jgroups.protocols.TP$SingleMessageHandler.run(TP.java:1817)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.jboss.as.clustering.jgroups.ClassLoaderThreadFactory.lambda$newThread$4(ClassLoaderThreadFactory.java:52)
        ... 1 more
      
      
        • 1. Re: Wildfly 10 - Error synchronizing JMS queue between servers
          Justin Bertram Master

          As I noted on your other thread on this topic, this doesn't appear to have anything to do with Artemis. As far as I can see, the stack trace from Infinispan is just coincidental and doesn't indicate that Artemis itself is having any trouble. I recommend you change the title and description of your post so it will attract the attention of people who work with Infinispan.

          • 3. Re: Wildfly 10 - Infinispan - Error synchronizing between servers
            Paul Ferraro Master

            It looks like the channel used for messaging is using the same JGroups stack as the channel used by Infinispan. Can you attach your messaging-activemq subsystem configuration?

            • 4. Re: Wildfly 10 - Infinispan - Error synchronizing between servers
              Paulo Souza Newbie

              Here is the messaging-activemq subsystem configuration. The Infinispan settings are the stock WildFly defaults.

               

                          <subsystem xmlns="urn:jboss:domain:messaging-activemq:1.0">
                              <server name="jms-master-live">
                                  <security enabled="false"/>
                                  <cluster password="${jboss.messaging.cluster.password:changeIt}"/>
                                  <shared-store-master failover-on-server-shutdown="true"/>
                                  <bindings-directory path="${path-live}/bindings" relative-to="jms.path.live"/>
                                  <journal-directory path="${path-live}/journal" relative-to="jms.path.live"/>
                                  <large-messages-directory path="${path-live}/largemessages" relative-to="jms.path.live"/>
                                  <paging-directory path="${path-live}/paging" relative-to="jms.path.live"/>
                                  <security-setting name="#">
                                      <role name="guest" send="true" consume="true" create-non-durable-queue="true" delete-non-durable-queue="true"/>
                                  </security-setting>
                                  <address-setting name="#" dead-letter-address="jms.queue.DLQ" expiry-address="jms.queue.ExpiryQueue" max-size-bytes="1073741824" page-size-bytes="2097152" message-counter-history-day-limit="10" redistribution-delay="0"/>
                                  <http-connector name="http-connector" socket-binding="http" endpoint="http-acceptor"/>
                                  <http-connector name="http-connector-throughput" socket-binding="http" endpoint="http-acceptor-throughput">
                                      <param name="batch-delay" value="50"/>
                                  </http-connector>
                                  <remote-connector name="netty" socket-binding="messaging">
                                      <param name="use-nio" value="true"/>
                                      <param name="use-nio-global-worker-pool" value="true"/>
                                  </remote-connector>
                                  <in-vm-connector name="in-vm" server-id="0"/>
                                  <http-acceptor name="http-acceptor" http-listener="default"/>
                                  <http-acceptor name="http-acceptor-throughput" http-listener="default">
                                      <param name="batch-delay" value="50"/>
                                      <param name="direct-deliver" value="false"/>
                                  </http-acceptor>
                                  <in-vm-acceptor name="in-vm" server-id="0"/>
                                  <broadcast-group name="bg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" broadcast-period="250" connectors="http-connector in-vm"/>
                                  <discovery-group name="dg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" refresh-timeout="250"/>
                                  <cluster-connection name="desev-cluster" address="10,dev,des,jms" connector-name="http-connector" message-load-balancing-type="STRICT" discovery-group="dg-desen-group"/>
                                  <jms-queue name="ExpiryQueue" entries="java:/jms/queue/ExpiryQueue"/>
                                  <jms-queue name="DLQ" entries="java:/jms/queue/DLQ"/>
                                  <jms-queue name="TSupervisorSupervisaoQueue" entries="java:/queue/tSupervisorSupervisaoQueue java:jboss/exported/jms/queue/tSupervisorSupervisaoQueue"/>
                                  <jms-queue name="TSupervisorAgenteQueue" entries="java:/queue/tSupervisorAgenteQueue java:jboss/exported/jms/queue/tSupervisorAgenteQueue"/>
                                  <jms-queue name="tSupervisorMonitoringScheduledQueue" entries="java:/queue/tSupervisorMonitoringScheduledQueue java:jboss/exported/jms/queue/tSupervisorMonitoringScheduledQueue"/>
                                  <connection-factory name="InVmConnectionFactory" entries="java:/ConnectionFactory" connectors="http-connector in-vm" ha="true" consumer-window-size="0"/>
                                  <connection-factory name="RemoteConnectionFactory" entries="java:jboss/exported/jms/RemoteConnectionFactory" connectors="http-connector" ha="true" consumer-window-size="0" block-on-acknowledge="true" reconnect-attempts="-1"/>
                                  <pooled-connection-factory name="activemq-ra" entries="java:/JmsXA java:jboss/DefaultJMSConnectionFactory" connectors="http-connector in-vm" consumer-window-size="0" transaction="xa"/>
                              </server>
                              <server name="jms-master-backup">
                                  <security enabled="false"/>
                                  <cluster password="${jboss.messaging.cluster.password:changeIt}"/>
                                  <shared-store-slave failover-on-server-shutdown="true"/>
                                  <bindings-directory path="${path-bkp}/bindings" relative-to="jms.path.live"/>
                                  <journal-directory path="${path-bkp}/journal" relative-to="jms.path.live"/>
                                  <large-messages-directory path="${path-bkp}/largemessages" relative-to="jms.path.live"/>
                                  <paging-directory path="${path-bkp}/paging" relative-to="jms.path.live"/>
                                  <security-setting name="#">
                                      <role name="guest" send="true" consume="true" create-durable-queue="true" delete-durable-queue="true" create-non-durable-queue="true" delete-non-durable-queue="true" manage="true"/>
                                  </security-setting>
                                  <address-setting name="#" dead-letter-address="jms.queue.DLQ" expiry-address="jms.queue.ExpiryQueue" max-size-bytes="10485760" page-size-bytes="2097152" message-counter-history-day-limit="10" redistribution-delay="0"/>
                                  <http-connector name="http-connector" socket-binding="http" endpoint="http-acceptor"/>
                                  <http-connector name="http-connector-throughput" socket-binding="http" endpoint="http-acceptor-throughput">
                                      <param name="batch-delay" value="50"/>
                                  </http-connector>
                                  <remote-connector name="netty" socket-binding="messaging-backup">
                                      <param name="use-nio" value="true"/>
                                      <param name="use-nio-global-worker-pool" value="true"/>
                                  </remote-connector>
                                  <remote-acceptor name="netty" socket-binding="messaging-backup"/>
                                  <broadcast-group name="bg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" broadcast-period="250" connectors="netty"/>
                                  <discovery-group name="dg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" refresh-timeout="250"/>
                                  <cluster-connection name="desev-cluster" address="10,dev,des,jms" connector-name="netty" discovery-group="dg-desen-group"/>
                              </server>
                          </subsystem>
              
              • 5. Re: Wildfly 10 - Infinispan - Error synchronizing between servers
                Paul Ferraro Master

                Can you attach your jgroups and infinispan subsystem configurations as well? There seems to be a conflict: multiple services are trying to create JGroups channels using the same protocol stack, inadvertently creating a cluster with a mix of messaging and Infinispan members.

                 

                If you remove your jgroups-stack attribute, then WildFly should be able to service both messaging and Infinispan services using the same JGroups channel.  But, again, it depends on what your infinispan subsystem configuration looks like.

                • 6. Re: Wildfly 10 - Infinispan - Error synchronizing between servers
                  Paulo Souza Newbie

                  Here are the configurations:

                   

                              <subsystem xmlns="urn:jboss:domain:infinispan:4.0">
                                  <cache-container name="server" aliases="singleton cluster" default-cache="default" module="org.wildfly.clustering.server">
                                      <transport lock-timeout="60000"/>
                                      <replicated-cache name="default" mode="SYNC">
                                          <transaction mode="BATCH"/>
                                      </replicated-cache>
                                  </cache-container>
                                  <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
                                      <transport lock-timeout="60000"/>
                                      <replicated-cache name="sso" mode="SYNC"/>
                                      <distributed-cache name="dist" mode="ASYNC" l1-lifespan="0" owners="2">
                                          <locking isolation="REPEATABLE_READ"/>
                                          <transaction mode="BATCH"/>
                                          <file-store/>
                                      </distributed-cache>
                                      <distributed-cache name="concurrent" mode="SYNC" l1-lifespan="0" owners="2">
                                          <file-store/>
                                      </distributed-cache>
                                  </cache-container>
                                  <cache-container name="ejb" aliases="sfsb" default-cache="dist" module="org.wildfly.clustering.ejb.infinispan">
                                      <transport lock-timeout="60000"/>
                                      <distributed-cache name="dist" mode="ASYNC" l1-lifespan="0" owners="2">
                                          <locking isolation="REPEATABLE_READ"/>
                                          <transaction mode="BATCH"/>
                                          <file-store/>
                                      </distributed-cache>
                                  </cache-container>
                                  <cache-container name="hibernate" default-cache="local-query" module="org.hibernate.infinispan">
                                      <transport lock-timeout="60000"/>
                                      <local-cache name="local-query">
                                          <eviction strategy="LRU" max-entries="10000"/>
                                          <expiration max-idle="100000"/>
                                      </local-cache>
                                      <invalidation-cache name="entity" mode="SYNC">
                                          <transaction mode="NON_XA"/>
                                          <eviction strategy="LRU" max-entries="10000"/>
                                          <expiration max-idle="100000"/>
                                      </invalidation-cache>
                                      <replicated-cache name="timestamps" mode="ASYNC"/>
                                  </cache-container>
                                  <cache-container name="security" default-cache="auth-cache" statistics-enabled="false">
                                      <transport lock-timeout="60000"/>
                                      <local-cache name="replicated-local">
                                          <eviction strategy="LRU" max-entries="60000"/>
                                          <expiration max-idle="60000"/>
                                      </local-cache>
                                      <invalidation-cache name="security-invalidation" mode="SYNC">
                                          <transaction mode="NON_XA"/>
                                          <eviction strategy="LRU" max-entries="60000"/>
                                          <expiration max-idle="60000"/>
                                      </invalidation-cache>
                                      <replicated-cache name="security-replicated" mode="ASYNC"/>
                                  </cache-container>
                              </subsystem>
                  

                   

                   

                              <subsystem xmlns="urn:jboss:domain:jgroups:4.0">
                                  <channels default="activemq-cluster">
                                      <channel name="activemq-cluster" stack="tcp"/>
                                  </channels>
                                  <stacks>
                                      <stack name="udp">
                                          <transport type="UDP" socket-binding="jgroups-udp"/>
                                          <protocol type="PING"/>
                                          <protocol type="MERGE3"/>
                                          <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
                                          <protocol type="FD_ALL"/>
                                          <protocol type="VERIFY_SUSPECT"/>
                                          <protocol type="pbcast.NAKACK2"/>
                                          <protocol type="UNICAST3"/>
                                          <protocol type="pbcast.STABLE"/>
                                          <protocol type="pbcast.GMS"/>
                                          <protocol type="UFC"/>
                                          <protocol type="MFC"/>
                                          <protocol type="FRAG2"/>
                                      </stack>
                                      <stack name="tcp">
                                          <transport type="TCP" socket-binding="jgroups-tcp"/>
                                          <protocol type="MPING" socket-binding="jgroups-mping"/>
                                          <protocol type="MERGE3"/>
                                          <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
                                          <protocol type="FD"/>
                                          <protocol type="VERIFY_SUSPECT"/>
                                          <protocol type="pbcast.NAKACK2"/>
                                          <protocol type="UNICAST3"/>
                                          <protocol type="pbcast.STABLE"/>
                                          <protocol type="pbcast.GMS"/>
                                          <protocol type="MFC"/>
                                          <protocol type="FRAG2"/>
                                      </stack>
                                      <stack name="tcphq">
                                          <transport type="TCP" socket-binding="jgroups-tcp"/>
                                          <protocol type="TCPPING">
                                              <property name="initial_hosts">
                                                  DEVCTX232SPODB[7600],DEVCTX233SPODB[7600]
                                              </property>
                                              <property name="port_range">
                                                  400
                                              </property>
                                          </protocol>
                                          <protocol type="MERGE3"/>
                                          <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
                                          <protocol type="FD"/>
                                          <protocol type="VERIFY_SUSPECT"/>
                                          <protocol type="pbcast.NAKACK2"/>
                                          <protocol type="UNICAST3"/>
                                          <protocol type="pbcast.STABLE"/>
                                          <protocol type="pbcast.GMS"/>
                                          <protocol type="MFC"/>
                                          <protocol type="FRAG2"/>
                                      </stack>
                                  </stacks>
                              </subsystem>
                  
                  • 7. Re: Wildfly 10 - Infinispan - Error synchronizing between servers
                    Paul Ferraro Master

                    OK - I see the problem. Your Infinispan cache-containers are configured to use the default channel, and your messaging broadcast/discovery groups are configured to create a channel using the same stack and cluster name used by Infinispan. This won't work: both will join the same cluster, yet neither can make sense of the other's messages.
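
                    For reference, here are the colliding fragments from the configurations you posted (attributes abbreviated with "...", comments added for illustration):

                        <!-- jgroups subsystem: "activemq-cluster" is the default channel, so every
                             Infinispan cache-container with a <transport/> clusters over it -->
                        <channels default="activemq-cluster">
                            <channel name="activemq-cluster" stack="tcp"/>
                        </channels>

                        <!-- messaging-activemq subsystem: the broadcast/discovery groups create a second
                             channel over the same "tcp" stack with the same cluster name, so the messaging
                             and Infinispan members end up in one mixed cluster -->
                        <broadcast-group name="bg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" .../>
                        <discovery-group name="dg-desen-group" jgroups-stack="tcp" jgroups-channel="activemq-cluster" .../>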

                     

                    You have two options:

                    1. Configure messaging to *share* the channel used by Infinispan.
                    2. Configure messaging and Infinispan to use distinct channels.

                     

                    To use option #1, remove the jgroups-stack attribute (or, alternatively, set it to "activemq-cluster"). This will result in messaging using a ForkChannel created from the default (i.e. "activemq-cluster") JChannel. In this setup there is only one JGroups channel, shared by both Infinispan and messaging.
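
                    For example, applied to the jms-master-live server above, the broadcast/discovery groups would become (a sketch based on simply dropping the attribute; the jms-master-backup groups change the same way):

                        <broadcast-group name="bg-desen-group" jgroups-channel="activemq-cluster" broadcast-period="250" connectors="http-connector in-vm"/>
                        <discovery-group name="dg-desen-group" jgroups-channel="activemq-cluster" refresh-timeout="250"/>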

                    To use option #2, configure a different default channel in your jgroups subsystem. The stock configuration uses a channel named "ee" for all Infinispan cache-containers. In this setup there are two channels: one used by messaging and the other used by Infinispan.
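
                    A sketch of option #2 in the jgroups subsystem ("ee" mirrors the stock WildFly channel name; any name distinct from "activemq-cluster" works):

                        <channels default="ee">
                            <channel name="ee" stack="tcp"/>
                        </channels>

                    Your existing broadcast/discovery groups (jgroups-stack="tcp" jgroups-channel="activemq-cluster") stay as they are: messaging continues to create its own channel over the tcp stack, now separate from the Infinispan channel "ee".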