2 Replies Latest reply on Dec 4, 2015 10:27 AM by thiago.presa

Setting up jgroups for multiple applications in domain mode?

thiago.presa Oct 20, 2015 4:29 PM

Hi,

I have a WildFly 9 cluster set up, and the first application works fine, including session failover and load balancing. When I bring up the second application, I get the following errors:

2015-10-20 18:11:52,842 ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p3-t5) ISPN000196: Failed to recover cluster state after the current node became the coordinator: org.infinispan.commons.CacheException: Unsuccessful response received from node host-slave-2:deployment1: CacheNotFoundResponse

at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:482)

at org.infinispan.topology.ClusterTopologyManagerImpl.recoverClusterStatus(ClusterTopologyManagerImpl.java:350)

at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:286)

at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener$1.run(ClusterTopologyManagerImpl.java:590)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

2015-10-20 18:11:52,909 INFO [org.infinispan.CLUSTER] (transport-thread--p4-t9) ISPN000310: Starting cluster-wide rebalance for cache dist, topology CacheTopology{id=5, rebalanceId=2, currentCH=DefaultConsistentHash{ns=80, owners = (2)[host-slave-1:deployment1: 40+40, host-slave-2:deployment1: 40+40]}, pendingCH=DefaultConsistentHash{ns=80, owners = (3)[host-slave-2:deployment2: 26+27, host-slave-1:deployment1: 27+27, host-slave-2:deployment1: 27+26]}, unionCH=null, actualMembers=[host-slave-2:deployment2, host-slave-1:deployment1, host-slave-2:deployment1]}

2015-10-20 18:11:53,101 INFO [org.infinispan.CLUSTER] (transport-thread--p4-t21) ISPN000336: Finished cluster-wide rebalance for cache dist, topology id = 5

2015-10-20 18:12:43,287 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-14,dev,host-slave-2:deployment2) ISPN000093: Received new, MERGED cluster view for channel web: MergeView::[host-slave-2:deployment2|4] (4) [host-slave-2:deployment2, host-slave-1:deployment1, host-slave-2:deployment1, 15fc2146-a245-54fc-ca19-db43c80fbd7b], 2 subgroups: [host-slave-1:deployment1|3] (3) [host-slave-1:deployment1, host-slave-2:deployment1, 15fc2146-a245-54fc-ca19-db43c80fbd7b], [host-slave-2:deployment2|2] (3) [host-slave-2:deployment2, host-slave-1:deployment1, host-slave-2:deployment1]

2015-10-20 18:12:43,289 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-14,dev,host-slave-2:deployment2) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[host-slave-2:deployment2|4] (4) [host-slave-2:deployment2, host-slave-1:deployment1, host-slave-2:deployment1, 15fc2146-a245-54fc-ca19-db43c80fbd7b], 2 subgroups: [host-slave-1:deployment1|3] (3) [host-slave-1:deployment1, host-slave-2:deployment1, 15fc2146-a245-54fc-ca19-db43c80fbd7b], [host-slave-2:deployment2|2] (3) [host-slave-2:deployment2, host-slave-1:deployment1, host-slave-2:deployment1]

2015-10-20 18:12:43,290 SEVERE [org.jgroups.protocols.pbcast.GMS] (Incoming-14,dev,host-slave-2:deployment2) JGRP000027: failed passing message up: java.lang.RuntimeException: java.lang.NullPointerException

at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:680)

at org.jgroups.JChannel.up(JChannel.java:739)

at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1029)

at org.jgroups.protocols.FORK.up(FORK.java:110)

at org.jgroups.protocols.RSVP.up(RSVP.java:201)

at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)

at org.jgroups.protocols.FlowControl.up(FlowControl.java:394)

at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:735)

at org.jgroups.protocols.pbcast.CoordGmsImpl.handleViewChange(CoordGmsImpl.java:244)

at org.jgroups.protocols.pbcast.GMS.up(GMS.java:925)

at org.jgroups.stack.Protocol.up(Protocol.java:412)

at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:294)

at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:474)

at org.jgroups.protocols.pbcast.NAKACK2.deliverBatch(NAKACK2.java:982)

at org.jgroups.protocols.pbcast.NAKACK2.removeAndPassUp(NAKACK2.java:912)

at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:846)

at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:618)

at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155)

at org.jgroups.protocols.FD.up(FD.java:260)

at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:311)

at org.jgroups.protocols.MERGE3.up(MERGE3.java:286)

at org.jgroups.protocols.Discovery.up(Discovery.java:295)

at org.jgroups.protocols.TP.passMessageUp(TP.java:1577)

at org.jgroups.protocols.TP$3.run(TP.java:1511)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.NullPointerException

at org.wildfly.clustering.server.group.ChannelNodeFactory.createNode(ChannelNodeFactory.java:55)

at org.wildfly.clustering.server.group.ChannelNodeFactory.createNode(ChannelNodeFactory.java:40)

at org.wildfly.clustering.server.dispatcher.ChannelCommandDispatcherFactory.getNodes(ChannelCommandDispatcherFactory.java:195)

at org.wildfly.clustering.server.dispatcher.ChannelCommandDispatcherFactory.getNodes(ChannelCommandDispatcherFactory.java:189)

at org.wildfly.clustering.server.dispatcher.ChannelCommandDispatcherFactory.viewAccepted(ChannelCommandDispatcherFactory.java:204)

at org.jgroups.blocks.MessageDispatcher.handleUpEvent(MessageDispatcher.java:600)

at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:677)

... 26 more

Should both deployments cluster together? If not, how should I configure domain.xml so that each one forms its own cluster?

Thanks!

1. Re: Setting up jgroups for multiple applications in domain mode?

pferraro Dec 4, 2015 9:30 AM (in response to thiago.presa)

This should be fixed in WF10 by this commit:
WFLY-5189 Eliminate "discarding discovery request for cluster=X from … · wildfly/wildfly@25e62d6 · GitHub
2. Re: Setting up jgroups for multiple applications in domain mode?

thiago.presa Dec 4, 2015 10:27 AM (in response to pferraro)

I've switched to WF10 CR4 and I'm still seeing the same issue. It also prevents the servers from booting correctly.