
    JMS HornetQ NAT'ed IP's issue with clustered environment

    jasonglass

      Hey All!  I'm having some serious issues making our JBoss environment accessible in a SaaS setup, i.e. one of our vendors is moving to SaaS and they connect to the JBoss environment at our company.

       

      Version: JBoss EAP 6.1.0.GA (AS 7.2.0.Final-redhat-8)

       

      So this previous discussion I had posted worked fine for the vendor (see the last couple comments at the end)

      This worked fine in a single-node domain, i.e. the client had the FQDN mapped to the external NAT IP and the JBoss server had the FQDN mapped to its local, non-NAT'ed IP:

      https://developer.jboss.org/thread/275282

      And even a solution like this might work with all four nodes if they were behind an F5 with rules and the FQDN pointed to an F5 VIP...

       

      However, we have four nodes in a domain, and the same issue is occurring as in the previous post above: instead of the server names or FQDNs being returned (I had been calling it "advertised") by the HornetQConnectionFactory, it's their individual IPs.  If someone could advise how to do it, advertising the NAT IPs instead would probably work; if push came to shove, I could add those NAT IPs as virtual IPs/aliases on the servers' network adapters so they would be able to route to themselves (the NAT IPs are unrouteable on our network, but adding them as VIPs makes things work).  Preferably, though, the server names or FQDNs would be best; even a single FQDN would do if it could be set somehow, as it could then be routed to an F5 virtual IP with the four nodes behind that VIP.
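
      For reference, the split hosts mapping that makes the FQDN approach work in the single-node case looks roughly like this (the 192.0.2.x address is just a stand-in for the real NAT/external IP):

        # /etc/hosts on the vendor's SaaS client - the FQDN resolves to the external NAT IP
        192.0.2.153     an00sigap001u.uat.corp.mycompany.net

        # /etc/hosts on the JBoss server itself - the same FQDN resolves to the local, non-NAT'ed IP
        10.140.40.153   an00sigap001u.uat.corp.mycompany.net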


      I tried the client-mapping destination-address workaround from this thread, but it didn't work: AS 7 Messaging (HornetQ) Client access remote:// Problem with Server listening on 0.0.0.0 Interface
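
      What I tried there was along these lines on the messaging socket binding (the destination-address below is just a placeholder for the external NAT address):

        <socket-binding name="messaging" port="10206">
            <client-mapping destination-address="external.nat.example.com" destination-port="10206"/>
        </socket-binding>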


      I also tried netty-connectors and acceptors similar to this:

      https://developer.jboss.org/thread/155468

       

      That also didn't work.

       

      This simplified official Red Hat support solution doesn't work either - likely because we have a domain with four nodes:

      https://access.redhat.com/solutions/461873

       

      I had also asked Justin Bertram (jbertram) about this in the thread below since it was related, but I figured I'd be asked to open a new discussion, so I am more than happy to oblige!

      https://developer.jboss.org/message/973776#973776

       

      Again, any help would be greatly appreciated as I've been fighting with and troubleshooting/testing/researching the issue for days!

       

      Also, *possibly*, if I could just modify the client's code that connects to us, that would work; however, I am having trouble finding a full example (and again, we'd be hesitant to modify code that's been in use for the last two years) - e.g. modifying it to use a list of TransportConfigurations like this (note: 10206 is the messaging port):

       

      List<TransportConfiguration> transportConfigurationList = new ArrayList<TransportConfiguration>();

      String[] hosts = {
              "an00sigap001u.uat.corp.mycompany.net",
              "an00sigap002u.uat.corp.mycompany.net",
              "an00sigap003u.uat.corp.mycompany.net",
              "an00sigap004u.uat.corp.mycompany.net" };

      // one netty connector configuration per node, all on the messaging port
      for (String host : hosts) {
          Map<String, Object> transportProperties = new HashMap<String, Object>();
          transportProperties.put("host", host);
          transportProperties.put("port", 10206);
          transportConfigurationList.add(new TransportConfiguration(
                  "org.hornetq.core.remoting.impl.netty.NettyConnectorFactory", transportProperties));
      }

      HornetQJMSConnectionFactory connectionFactory = new HornetQJMSConnectionFactory(true,
              transportConfigurationList.toArray(new TransportConfiguration[transportConfigurationList.size()]));

       

      I was trying to integrate that with some old code that works without the NAT'ed IP (or with it in the very first link referenced above, i.e. by FQDN), but I am a little stuck on merging the two - a rough sketch of what I'm picturing follows the old code below.  I do believe Justin said the JNDI wasn't really relevant or necessary, but doesn't it help to look up the other servers if the first server that successfully connects goes down?

       

      Properties props = new Properties();

       

                  String jndiProps =

                          "java.naming.factory.url.pkgs=org.jboss.ejb.client.naming\n"

                                  + "java.naming.factory.initial=org.jboss.naming.remote.client.InitialContextFactory\n"

                                  + "java.naming.provider.url=remote://10.140.40.153:10202,remote://10.140.40.154:10202,remote://10.140.40.155:10202,remote://10.140.40.156:10202\n"

                                  + "java.naming.security.principal=myUserId\n"

                                  + "java.naming.security.credentials=myPasword\n"

                                  + "jboss.naming.client.ejb.context=true\n"

                                  + "jboss.naming.client.connect.options.org.xnio.Options.SASL_POLICY_NOANONYMOUS=true\n"

                                  + "jboss.naming.client.connect.options.org.xnio.Options.SASL_DISALLOWED_MECHANISMS=JBOSS-LOCAL-USER\n"

                                  + "jboss.naming.client.connect.options.org.xnio.Options.SASL_POLICY_NOPLAINTEXT=false\n"

                                  + "jboss.naming.client.connect.options.org.xnio.Options.SSL_STARTTLS=true\n"

                                  + "jboss.naming.client.remote.connectionprovider.create.options.org.xnio.Options.SSL_ENABLED=true";

                  props = new Properties();

                  props.load(new StringReader(jndiProps));

       

      QueueSender sender = null;

        QueueSession session2 = null;

        QueueConnectionFactory factory = null;

        QueueConnection queueConnection = null;

        Context ctx = null;

        Queue queue1 = null;

       

        ctx = new InitialContext(props);

                 

        queue1 = (Queue) ctx.lookup("java:/com/sigma/samp/imp/cableone/ejb/NotificationQueue");

                  System.out.println("--------------------------------------------------------------------------------Looked up Initial Context");

       

        factory = (QueueConnectionFactory) ctx.lookup("java:/System/Vendor/ApplicationType/OrderStuff/Application/4-3;1-0;GMP/Comp/QueueConnectionFactory");

        System.out.println("--------------------------------------------------------------------------------Looked up QueueConnectionFactory");

                 

        queueConnection = factory.createQueueConnection("myUserId", "myPassword");

        System.out.println("--------------------------------------------------------------------------------Created QueueConnection");

                 

        session2 = queueConnection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);

        System.out.println("--------------------------------------------------------------------------------Created QueueSession");

       

        QueueReceiver receiver = session2.createReceiver(queue1);

        System.out.println("--------------------------------------------------------------------------------Created Queue Receiver");

                 

        queueConnection.start();

                  System.out.println("--------------------------------------------------------------------------------Created QueueSession");

       

        while (true) {
            Message msg = receiver.receive();
            if (msg != null) {
                if (msg instanceof TextMessage) {
                    System.out.println("Received:***********************\r\n" + ((TextMessage) msg).getText() + "\r\n");
                } else {
                    break;
                }
            }
        }
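
      For what it's worth, the merge I'm picturing is roughly this - skipping JNDI entirely and building everything from the TransportConfiguration list above.  This is only a rough sketch; the core queue name below is a guess on my part and would really need to come from our server config.

        // assumes the HornetQ client jars from EAP 6.1 are on the classpath
        // imports: javax.jms.*, org.hornetq.api.core.TransportConfiguration,
        //          org.hornetq.api.jms.HornetQJMSClient, org.hornetq.jms.client.HornetQJMSConnectionFactory

        // transportConfigurationList built exactly as above (the four nodes, port 10206)
        HornetQJMSConnectionFactory factory = new HornetQJMSConnectionFactory(true,
                transportConfigurationList.toArray(new TransportConfiguration[transportConfigurationList.size()]));

        // no JNDI lookup - address the queue by its core name instead (the name here is a guess)
        Queue queue = HornetQJMSClient.createQueue("NotificationQueue");

        Connection connection = factory.createConnection("myUserId", "myPassword");
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);
        connection.start();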

       

      Any help, suggestions and comments are greatly appreciated!

       

      Thank you in advance for your time!

       

      Jay

        • 1. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
          jbertram

          Couple of things to keep in mind:

          • JNDI and JMS are 100% independent.  Making configuration changes to get NATted JNDI lookups working won't necessarily help with JMS.
          • When a client looks up a JMS connection factory it just gets back a stub with a few bits of information on how to connect.  The information in that stub is configured on the server which is why you can, for example, set a different hostname and port on the connector used by the connection factory than what is used for the corresponding acceptor which actually receives the connection from the client.  There's nothing magic in the connector.  It's just a stub of information used by a client to connect.
          • In an environment with network address translation the connection factory used by the client must be configured with the host and port that is accessible to it (i.e. the "external" host/port).  The NAT layer will take care of translating that to the host/port required to actually reach the server.  See the sketch after this list.
          • JMS connections are stateful unlike HTTP connections.  They can't be pushed through a load-balancer without taking this into account.
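
          As a sketch of the idea (the host value and names here are just placeholders), the connector referenced by the connection factory carries the externally reachable address while the acceptor keeps binding to the local interface:

            <connectors>
                <!-- this host/port is what clients get in the stub, so it must be reachable from the client side;
                     in a domain each server would need its own value here -->
                <connector name="netty-external">
                    <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                    <param key="host" value="node1.external.example.com"/>
                    <param key="port" value="10206"/>
                </connector>
            </connectors>

            <acceptors>
                <!-- the acceptor still binds locally via the socket binding -->
                <netty-acceptor name="netty" socket-binding="messaging"/>
            </acceptors>

            <jms-connection-factories>
                <connection-factory name="RemoteConnectionFactory">
                    <connectors>
                        <connector-ref connector-name="netty-external"/>
                    </connectors>
                    <entries>
                        <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>
                    </entries>
                </connection-factory>
            </jms-connection-factories>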

           

          It's not clear to me why the domain use-case wouldn't work when the single node use-case works.  Are you using the exact same configuration on every node such that each node doesn't actually have a connector with the specific host/port that the client needs to use?

           

          What logging does your client code emit?  How far does it get?

          • 2. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
            jasonglass

            Hi Justin - thank you for getting back to me on this!  Very much appreciated!  One huge thing to keep in mind for the time being is that most of my testing is being done from our local network.  I can get someone to perform tests from the SaaS server, but it's a bit more troublesome.  Also, my apologies, as I couldn't figure out how to quote you while breaking things into chunks, so I've interleaved my replies for easier readability.

             

            Justin Bertram wrote:

             

            Couple of things to keep in mind:

            • JNDI and JMS are 100% independent. Making configuration changes to enable NATted JNDI lookup work won't necessarily help with JMS.

            J.G.: Understood Justin, thank you.  I saw you had pointed that out a few times before in other posts...

            • When a client looks up a JMS connection factory it just gets back a stub with a few bits of information on how to connect. The information in that stub is configured on the server which is why you can, for example, set a different hostname and port on the connector used by the connection factory than what is used for the corresponding acceptor which actually receives the connection from the client. There's nothing magic in the connector. It's just a stub of information used by a client to connect.

            J.G.: Thank you for the very thorough explanation, as well as the clarification on what I was calling the *advertised* port and IP/name - "the stub sent back" makes things much clearer.  So it looks to be the connector where I need to try and configure things, but all my configuration attempts failed by sending back nothing other than the four servers' local, non-NAT'ed IP addresses.  Note: each server is started with the bind address set to its own hostname.  When I instead changed the bind address to all interfaces, i.e. 0.0.0.0, I did see that the stub sent back contained the servers' names instead of the local IP addresses for all four nodes - however, that looks to have broken clustering, as I received the following exceptions:

            ERROR 13:37:57 (ServerService Thread Pool -- 39) org.jboss.msc.service.fail> MSC000001: Failed to start service jboss.infinispan.ejb.repl: org.jboss.msc.service.StartException in service jboss.infinispan.ejb.repl: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport

                    at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:87)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

                    at org.jboss.threads.JBossThread.run(JBossThread.java:122)

            Caused by: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.remoting.transport.jgroups.JGroupsTransport.start() on object of type JGroupsTransport

                    at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)

                    at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:886)

                    at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:657)

                    at org.infinispan.factories.AbstractComponentRegistry.registerComponentInternal(AbstractComponentRegistry.java:226)

                    at org.infinispan.factories.AbstractComponentRegistry.registerComponent(AbstractComponentRegistry.java:175)

                    at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:296)

                    at org.infinispan.factories.ComponentRegistry.getOrCreateComponent(ComponentRegistry.java:158)

            .

            <removed for brevity>

            .

                    at org.infinispan.factories.InternalCacheFactory.bootstrap(InternalCacheFactory.java:101)

                    at org.infinispan.factories.InternalCacheFactory.createAndWire(InternalCacheFactory.java:80)

                    at org.infinispan.factories.InternalCacheFactory.createCache(InternalCacheFactory.java:64)

                    at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:682)

                    at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)

                    at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:545)

                    at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:559)

                    at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:109)

                    at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:100)

                    at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)

                    at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:82)

                    ... 4 more

            Caused by: org.infinispan.CacheException: Unable to start JGroups Channel

                    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:209)

                    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.start(JGroupsTransport.java:198)

                    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

                    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

                    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

                    at java.lang.reflect.Method.invoke(Method.java:606)

                    at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)

                    ... 91 more

            Caused by: java.lang.Exception: connecting to channel "null" failed

                    at org.jgroups.JChannel._connect(JChannel.java:542)

                    at org.jgroups.JChannel.connect(JChannel.java:283)

                    at org.jgroups.JChannel.connect(JChannel.java:268)

                    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.startJGroupsChannelIfNeeded(JGroupsTransport.java:207)

                    ... 97 more

            Caused by: java.lang.IllegalArgumentException: failed to start server socket

                    at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:308)

                    at org.jgroups.protocols.FD.down(FD.java:290)

                    at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:80)

                    at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:569)

                    at org.jgroups.protocols.UNICAST2.down(UNICAST2.java:544)

                    at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:329)

                    at org.jgroups.protocols.pbcast.GMS.down(GMS.java:931)

                    at org.jgroups.protocols.FlowControl.down(FlowControl.java:351)

                    at org.jgroups.protocols.FlowControl.down(FlowControl.java:351)

                    at org.jgroups.protocols.FRAG2.down(FRAG2.java:147)

                    at org.jgroups.protocols.RSVP.down(RSVP.java:143)

                    at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)

                    at org.jgroups.JChannel.down(JChannel.java:722)

                    at org.jgroups.JChannel._connect(JChannel.java:536)

                    ... 100 more

            Caused by: java.net.BindException: bind_addr /0.0.0.0 is not a valid interface: java.net.BindException: Address already in use

                    at org.jgroups.util.Util.createServerSocket(Util.java:3404)

                    at org.jgroups.protocols.FD_SOCK.startServerSocket(FD_SOCK.java:568)

                    at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:305)

                    ... 113 more

            • In an environment with network address translation the connection factory used by the client must be configured with the host and port that is accessible to it (i.e. the "external" host/port). The NAT layer will take care of translating that to the host/port required to actually reach the server.

            J.G.: Understood on this, Justin; it's basically what I am trying to accomplish, I believe.  With a single node in domain mode (I know that doesn't make much sense, but it's a different environment (QA) meant to mimic UAT, just with a single node in the domain instead of four), manually setting/overriding the single netty connector and setting the host to the FQDN worked perfectly: it was sent back in the stub, the external client and my local client code were able to connect since the FQDN was mapped to the NAT/external IP in the client's /etc/hosts file, and the JBoss server itself could connect fine since it maps the FQDN to its local IP.  It's just that, for some reason, with four nodes, even after overriding/manually setting the netty connector to that same or similar FQDN (more on this below, as I performed some experimentation), the stub sent back instead contained the local IPs of all four nodes.

            • JMS connections are stateful unlike HTTP connections. They can't be pushed through a load-balancer without taking this into account.

            J.G.: Fully understood on this; it would be considered a last resort.  State would be maintained on the load balancer unless the current node fully went down, in which case the client would hopefully try again and the load balancer would switch the load to an active server.  Again, a last resort.

             

            It's not clear to me why the domain use-case wouldn't work when the single node use-case works. Are you using the exact same configuration on every node such that each node doesn't actually have a connector with the specific host/port that the client needs to use?
            J.G.: I'm not sure on this either.  I did try creating four netty connectors, named netty, netty1, netty2 and netty3.  That didn't have any effect; then I tried adding the four as connector-refs to the RemoteConnectionFactory, which also didn't help.  I also tried adding them to jms-connection-factories->connection-factory... (more on this below, as I did some more experimenting and had different results).
            J.G.: I believe all the nodes have the same configuration.  They use the socket group information provided by the domain controller's domain.xml - but I've attached one of their host.xml files in case you can identify anything that might be overriding things!

             

            What logging does your client code emit? How far does it get?

            J.G.: My example client code works when the FQDNs are used; it also works when the local IPs of the servers are sent in the stub.  When the external IPs are manually put in (as was tested when there was a single node) and returned in the stub, my local client can't connect, as there's no route to that IP, and the server can't even connect to itself - the remote client was able to connect, however, as it has a route to that network.  When FQDNs are used, everything works perfectly for everyone involved - I just can't get the domain and nodes to send out the FQDNs or server names instead of IPs - frustrating ;-(

             

            Update: with more testing of additional netty connectors I ended up with better results, e.g. I was able to get the clustered nodes' hostnames returned in the stub, but then clustering apparently broke.  I have also attached the domain.xml - a few parts are commented out, as you can see in the broadcast group, discovery group and clustering sections; that's what I had tried that partially worked, but I ended up switching back to the normal configuration.


            Here's some of what I tried - I got closer but "no cigar" - and my apologies if it seems a little disjointed, as I did a ton of testing following your response!

             

            The nodes do pull configuration from the Domain/Admin node...  they have their own host.xml...


            For now, and to save time restarting servers, I just have two nodes turned on.  The first node logs this at HornetQ INFO level (this is with three netty connectors defined):

            INFO 08:49:56 (Thread-19 (HornetQ-server-HornetQServerImpl::serverUUID=695e9545-5ce2-11e7-9cc4-8b965ec64abc-2104874164)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@4a802436 [name=sf.my-cluster.9561ad1d-5ce2-11e7-ae99-7f6c99db2876, queue=QueueImpl[name=sf.my-cluster.9561ad1d-5ce2-11e7-ae99-7f6c99db2876, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=695e9545-5ce2-11e7-9cc4-8b965ec64abc]]@22c97b70 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@4a802436 [name=sf.my-cluster.9561ad1d-5ce2-11e7-ae99-7f6c99db2876, queue=QueueImpl[name=sf.my-cluster.9561ad1d-5ce2-11e7-ae99-7f6c99db2876, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=695e9545-5ce2-11e7-9cc4-8b965ec64abc]]@22c97b70 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-154&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@2142069593[nodeUUID=695e9545-5ce2-11e7-9cc4-8b965ec64abc, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-153&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=695e9545-5ce2-11e7-9cc4-8b965ec64abc])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-154&ssl-enabled=true], discoveryGroupConfiguration=null]] is connected

             

            The second node shows this:

            INFO 08:49:56 (Thread-5 (HornetQ-server-HornetQServerImpl::serverUUID=9561ad1d-5ce2-11e7-ae99-7f6c99db2876-1162798272)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@1c2c877 [name=sf.my-cluster.695e9545-5ce2-11e7-9cc4-8b965ec64abc, queue=QueueImpl[name=sf.my-cluster.695e9545-5ce2-11e7-9cc4-8b965ec64abc, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=9561ad1d-5ce2-11e7-ae99-7f6c99db2876]]@4d7ef98c targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1c2c877 [name=sf.my-cluster.695e9545-5ce2-11e7-9cc4-8b965ec64abc, queue=QueueImpl[name=sf.my-cluster.695e9545-5ce2-11e7-9cc4-8b965ec64abc, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=9561ad1d-5ce2-11e7-ae99-7f6c99db2876]]@4d7ef98c targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-153&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@727567710[nodeUUID=9561ad1d-5ce2-11e7-ae99-7f6c99db2876, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-154&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=9561ad1d-5ce2-11e7-ae99-7f6c99db2876])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-153&ssl-enabled=true], discoveryGroupConfiguration=null]] is connected

             

            Note how there's no reference to netty1 or netty2.

             

            My netty connectors looked like this at the time; note that nothing had been altered in the broadcast, clustering or discovery group configuration yet (but I believe I had added the connector-refs to the connection factories):

                                   <netty-connector name="netty" socket-binding="messaging">

                                        <param key="ssl-enabled" value="true"/>

                                    </netty-connector>

             

                                    <connector name="netty1">

                                    <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                    <param key="host" value="an00sigap001u"/>

                                    <param key="port" value="10206"/>

                                    <param key="ssl-enabled" value="true"/>

                                    </connector>

             

                                    <connector name="netty2">

                                    <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                    <param key="host" value="an00sigap002u"/>

                                    <param key="port" value="10206"/>

                                    <param key="ssl-enabled" value="true"/>

             

             

            an00sigap001u (10.140.40.153) is the first node, an00sigap002u (10.140.40.154) is the second node.  10206 is the messaging port.

             

             

            The client receives this stub (I've just provided the pertinent part, I believe) with javax.net debug on...

            Note: if the IP could be replaced with the NAT/external IP (less desirable), it might work provided the server has a virtual IP for that NAT/external IP; more desirable would be the FQDN of the server, so a VIP on the servers wouldn't be needed.

            Padded plaintext after DECRYPTION:  len = 288

            ...r............

            $.$695e9545-5ce2

            -11e7-9cc4-8b965

            ec64abc...\..F>.

            .....n.e.t.t.y..

            .:.:org.hornetq.

            core.remoting.im

            pl.netty.NettyCo

            nnectorFactory..

            .......p.o.r.t..

            ....1.0.2.0.6...

            ..h.o.s.t.......

            10.140.40.153...

            ...ssl-enabled..

            ....t.r.u.e.....

            ....undefined.N.

            ].y.>..2u.....I'

            V...............

            [Raw read (bb)]: length = 37

            <omitted for brevity>

            Padded plaintext after DECRYPTION:  len = 32

            <omitted for brevity>

            [Raw read (bb)]: length = 293

            <omitted for brevity>

            Padded plaintext after DECRYPTION:  len = 288

            ...r............

            $.$9561ad1d-5ce2

            -11e7-ae99-7f6c9

            9db2876...\..q..

            .....n.e.t.t.y..

            .:.:org.hornetq.

            core.remoting.im

            pl.netty.NettyCo

            nnectorFactory..

            .......p.o.r.t..

            ....1.0.2.0.6...

            ..h.o.s.t.......

            10.140.40.154...

            ...ssl-enabled..

            ....t.r.u.e.....

            ....undefined...

            #..p0.9...q.....

            B...............

             

            When I try to add the two netty connectors to cluster-connections->cluster-connection as connector-refs, I get an exception that one already exists:

            <cluster-connections>

                                    <cluster-connection name="my-cluster">

                                        <address>jms</address>

                                        <!--connector-ref>netty</connector-ref-->

                                        <connector-ref>netty1</connector-ref>

                                        <connector-ref>netty2</connector-ref>

                                        <discovery-group-ref discovery-group-name="dg-group1"/>

                                    </cluster-connection>

                                </cluster-connections>

             

            After duplicating the netty connectors and commenting out "netty", then adding them where needed (e.g. connection factories, clustering, etc.), I got this on the second node:

            INFO 10:14:24 (ServerService Thread Pool -- 37) org.jboss.as.messaging> JBAS011601: Bound messaging object to jndi name java:/System/MyCompany/ApplicationType/OrderManagement/Application/4-3;1-0;DGP/Comp/QueueConnectionFactory

             

             

            INFO 10:14:24 (Thread-9 (HornetQ-server-HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb-1036412157)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@61b22585 [name=sf.my-cluster.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@2bcab6bb targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@61b22585 [name=sf.my-cluster.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@2bcab6bb targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@944742520[nodeUUID=61b4303c-5cee-11e7-a626-2552f6e426cb, connector=TransportConfiguration(name=netty1, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap001u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb])) [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], di

             

            INFO 10:14:26 (Thread-19 (HornetQ-server-HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb-1036412157)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@1468c040 [name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@6925b437 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1468c040 [name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@6925b437 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1960919995[nodeUUID=61b4303c-5cee-11e7-a626-2552f6e426cb, connector=TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb])) [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]] is connected


            This seems better - it's using the hostnames - but the trace below the first one looks strange in that it only refers to netty2/node 2, while the first one has netty1 and netty2 as well as the two nodes.

             

            That looks good, but then I get exceptions on both servers:

            Server1:

            ERROR 10:17:11 (Thread-11 (HornetQ-client-global-threads-769191995)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue61b4303c-5cee-11e7-a626-2552f6e426cb

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerClosed(ClusterConnectionImpl.java:1570)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1288)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

             

             

            ERROR 10:17:10 (Thread-11 (HornetQ-client-global-threads-769191995)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue61b4303c-5cee-11e7-a626-2552f6e426cb on ClusterConnectionImpl@1456404745[nodeUUID=1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, connector=TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=1d7f8ea2-5cee-11e7-82fe-9752c08c42dd]

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:1510)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1282)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

             

             

            Server2:

            ERROR 10:17:08 (Thread-7 (HornetQ-client-global-threads-218225804)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue61b4303c-5cee-11e7-a626-2552f6e426cb

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerClosed(ClusterConnectionImpl.java:1570)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1288)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

             

             

            ERROR 10:17:08 (Thread-7 (HornetQ-client-global-threads-218225804)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue61b4303c-5cee-11e7-a626-2552f6e426cb on ClusterConnectionImpl@1960919995[nodeUUID=61b4303c-5cee-11e7-a626-2552f6e426cb, connector=TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:1510)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1282)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744

             

            Note, my test client is currently running on the local network

             

            I can see connections from my test client as well as node 2 on 10206/messaging on node 1 using netstat.  On node 2, there aren't any connections from my test client, only connections from node 2 itself as well as node 1.

             

            The client now receives the following just prior to creating the queue connection (a little weird - again the same node is listed twice, ditto the same netty connector name - but hey, there are names in there!):

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FC 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 31 64 37 66 38   65 61 32 2D 35 63 65 65  $.$1d7f8ea2-5cee

            0020: 2D 31 31 65 37 2D 38 32   66 65 2D 39 37 35 32 63  -11e7-82fe-9752c

            0030: 30 38 63 34 32 64 64 00   00 01 5C F4 D7 0D 48 FF  08c42dd...\...H.

            0040: 00 00 00 06 00 6E 00 65   00 74 00 74 00 79 00 32  .....n.e.t.t.y.2

            0050: 00 00 00 3A 00 3A 6F 72   67 2E 68 6F 72 6E 65 74  ...:.:org.hornet

            0060: 71 2E 63 6F 72 65 2E 72   65 6D 6F 74 69 6E 67 2E  q.core.remoting.

            0070: 69 6D 70 6C 2E 6E 65 74   74 79 2E 4E 65 74 74 79  impl.netty.Netty

            0080: 43 6F 6E 6E 65 63 74 6F   72 46 61 63 74 6F 72 79  ConnectorFactory

            0090: 00 00 00 03 00 00 00 04   00 70 00 6F 00 72 00 74  .........p.o.r.t

            00A0: 03 00 00 00 05 00 31 00   30 00 32 00 30 00 36 00  ......1.0.2.0.6.

            00B0: 00 00 04 00 68 00 6F 00   73 00 74 03 00 00 00 0D  ....h.o.s.t.....

            00C0: 00 0D 61 6E 30 30 73 69   67 61 70 30 30 32 75 00  ..an00sigap002u.

            00D0: 00 00 0B 00 0B 73 73 6C   2D 65 6E 61 62 6C 65 64  .....ssl-enabled

            00E0: 03 00 00 00 04 00 74 00   72 00 75 00 65 00 00 01  ......t.r.u.e...

            00F0: 00 00 00 09 00 09 75 6E   64 65 66 69 6E 65 64 F0  ......undefined.

            0100: 66 1A 29 61 A5 42 D7 0F   43 35 F4 79 09 D4 DC CD  f.)a.B..C5.y....

            0110: 1A 0C 55 0C 0C 0C 0C 0C   0C 0C 0C 0C 0C 0C 0C 0C  ..U.............

            [Raw read (bb)]: length = 37

            0000: 17 03 01 00 20 26 05 44   A5 26 55 95 CC 03 6F 32  .... &.D.&U...o2

            0010: BA 03 A7 4D 9D DC 51 96   78 21 B1 E9 CA 4E A3 1A  ...M..Q.x!...N..

            0020: 8C 04 BF 6A 58                                     ...jX

            Padded plaintext after DECRYPTION:  len = 32

            0000: 00 13 F7 E0 00 D9 1A 1C   F5 8E F0 E7 D8 3A 90 7F  .............:..

            0010: 6F 5B DD AD 10 0A 0A 0A   0A 0A 0A 0A 0A 0A 0A 0A  o[..............

            [Raw read (bb)]: length = 293

            0000: 17 03 01 01 20 CD 7B 8A   0A 8D 73 78 BA 78 F1 5E  .... .....sx.x.^

            0010: 59 D8 3B 55 72 F0 50 9E   BC D4 B7 86 D6 B3 80 37  Y.;Ur.P........7

            0020: 54 17 1C 1E 87 C7 A8 9B   4D E4 F9 67 3F D2 1B 4A  T.......M..g?..J

            0030: DA 8E FC 50 FB 02 00 2F   C1 FE 64 6E F9 72 82 40  ...P.../..dn.r.@

            0040: F0 F3 2C C3 78 B0 7C 70   80 97 3F 50 DE 4F 1D 34  ..,.x..p..?P.O.4

            0050: 46 1F 0D FB BB 36 8B 2E   F5 93 D6 29 C3 A0 23 79  F....6.....)..#y

            0060: 86 A9 0B BC 5B 2D 0F 95   8D FB 12 D6 C8 D0 6D 43  ....[-........mC

            0070: 79 4F 46 7F FF FD A3 A3   C7 BA 3C 70 FD 5C 46 E4  yOF.......<p.\F.

            0080: EE 0D 90 9C 16 C5 3C 39   83 2A 07 E6 44 B9 A4 B7  ......<9.*..D...

            0090: 22 C7 19 E4 0F 3D 39 BC   AC EC 8F 47 45 89 4D E8  "....=9....GE.M.

            00A0: C8 D5 4B 33 19 63 F9 CF   92 68 CA 2B 6E 84 33 23  ..K3.c...h.+n.3#

            00B0: 93 E6 F8 43 07 67 88 5C   E2 9C E3 05 4B 75 BB 5C  ...C.g.\....Ku.\

            00C0: 06 AE 84 18 BF BA 4A 9A   CC 03 CE 8B FD 23 5A ED  ......J......#Z.

            00D0: CC F1 00 C6 0B FC 57 E9   EE 77 79 5D 1E 8E 36 C0  ......W..wy]..6.

            00E0: 06 30 C6 36 68 FE F9 87   59 D8 55 D7 C1 95 1B A7  .0.6h...Y.U.....

            00F0: 1A 26 06 74 1E D1 27 A5   CF AC C0 E2 03 2B B4 4B  .&.t..'......+.K

            0100: EB A2 1F 99 4A C7 9F 3E   C5 73 D1 2D 6A CB B6 A5  ....J..>.s.-j...

            0110: AD CA C0 C0 94 C9 CD F6   21 F8 B2 F0 08 64 0D CE  ........!....d..

            0120: DA 5F 55 78 99                                     ._Ux.

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FC 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 36 31 62 34 33   30 33 63 2D 35 63 65 65  $.$61b4303c-5cee

            0020: 2D 31 31 65 37 2D 61 36   32 36 2D 32 35 35 32 66  -11e7-a626-2552f

            0030: 36 65 34 32 36 63 62 00   00 01 5C F4 D8 C5 90 FF  6e426cb...\.....

            0040: 00 00 00 06 00 6E 00 65   00 74 00 74 00 79 00 32  .....n.e.t.t.y.2

            0050: 00 00 00 3A 00 3A 6F 72   67 2E 68 6F 72 6E 65 74  ...:.:org.hornet

            0060: 71 2E 63 6F 72 65 2E 72   65 6D 6F 74 69 6E 67 2E  q.core.remoting.

            0070: 69 6D 70 6C 2E 6E 65 74   74 79 2E 4E 65 74 74 79  impl.netty.Netty

            0080: 43 6F 6E 6E 65 63 74 6F   72 46 61 63 74 6F 72 79  ConnectorFactory

            0090: 00 00 00 03 00 00 00 04   00 70 00 6F 00 72 00 74  .........p.o.r.t

            00A0: 03 00 00 00 05 00 31 00   30 00 32 00 30 00 36 00  ......1.0.2.0.6.

            00B0: 00 00 04 00 68 00 6F 00   73 00 74 03 00 00 00 0D  ....h.o.s.t.....

            00C0: 00 0D 61 6E 30 30 73 69   67 61 70 30 30 32 75 00  ..an00sigap002u.

            00D0: 00 00 0B 00 0B 73 73 6C   2D 65 6E 61 62 6C 65 64  .....ssl-enabled

            00E0: 03 00 00 00 04 00 74 00   72 00 75 00 65 00 FF 01  ......t.r.u.e...

            00F0: 00 00 00 09 00 09 75 6E   64 65 66 69 6E 65 64 3D  ......undefined=

            0100: 73 1F B7 8F 57 43 82 4F   6C 9A 51 64 71 99 69 DE  s...WC.Ol.Qdq.i.

            0110: 56 4D B6 0C 0C 0C 0C 0C   0C 0C 0C 0C 0C 0C 0C 0C  VM..............

             

            It seems to just be listing node 2... but the client JMS connection was indeed made to node 1; when I stopped node 1, the client caught the consumer-close event and moved to node 2.

             

            When I restarted node 1, I received the following on node 2:

            WARN 11:01:39 (hornetq-discovery-group-thread-dg-group1) org.hornetq.core.client> HQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=1d7f8ea2-5cee-11e7-82fe-9752c08c42dd

             

            WARN 11:01:39 (hornetq-discovery-group-thread-dg-group2) org.hornetq.core.client> HQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=1d7f8ea2-5cee-11e7-82fe-9752c08c42dd

             

            INFO 11:01:40 (Incoming-15,shared=udp) org.jboss.as.clustering> JBAS010225: New cluster view for partition web (id: 3, delta: 1, merge: false) : [an00sigap002u:DGP_svr2/web, an00sigap001u:DGP_svr1/web]

            INFO 11:01:40 (Incoming-15,shared=udp) org.infinispan.remoting.transport.jgroups.JGroupsTransport> ISPN000094: Received new cluster view: [an00sigap002u:DGP_svr2/web|3] [an00sigap002u:DGP_svr2/web, an00sigap001u:DGP_svr1/web]

             

             

            And my client was notified with

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FC 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 31 64 37 66 38   65 61 32 2D 35 63 65 65  $.$1d7f8ea2-5cee

            0020: 2D 31 31 65 37 2D 38 32   66 65 2D 39 37 35 32 63  -11e7-82fe-9752c

            0030: 30 38 63 34 32 64 64 00   00 01 5C F5 04 0A E6 FF  08c42dd...\.....

            0040: 00 00 00 06 00 6E 00 65   00 74 00 74 00 79 00 31  .....n.e.t.t.y.1

            0050: 00 00 00 3A 00 3A 6F 72   67 2E 68 6F 72 6E 65 74  ...:.:org.hornet

            0060: 71 2E 63 6F 72 65 2E 72   65 6D 6F 74 69 6E 67 2E  q.core.remoting.

            0070: 69 6D 70 6C 2E 6E 65 74   74 79 2E 4E 65 74 74 79  impl.netty.Netty

            0080: 43 6F 6E 6E 65 63 74 6F   72 46 61 63 74 6F 72 79  ConnectorFactory

            0090: 00 00 00 03 00 00 00 04   00 70 00 6F 00 72 00 74  .........p.o.r.t

            00A0: 03 00 00 00 05 00 31 00   30 00 32 00 30 00 36 00  ......1.0.2.0.6.

            00B0: 00 00 04 00 68 00 6F 00   73 00 74 03 00 00 00 0D  ....h.o.s.t.....

            00C0: 00 0D 61 6E 30 30 73 69   67 61 70 30 30 31 75 00  ..an00sigap001u.

            00D0: 00 00 0B 00 0B 73 73 6C   2D 65 6E 61 62 6C 65 64  .....ssl-enabled

            00E0: 03 00 00 00 04 00 74 00   72 00 75 00 65 00 00 01  ......t.r.u.e...

            00F0: 00 00 00 09 00 09 75 6E   64 65 66 69 6E 65 64 C9  ......undefined.

            0100: 6B FA 1C 4C B8 21 62 43   E7 6D A8 5D 72 F9 B1 B6  k..L.!bC.m.]r...

            0110: D1 FB AF 0C 0C 0C 0C 0C   0C 0C 0C 0C 0C 0C 0C 0C  ................

            [Raw read (bb)]: length = 37

            0000: 17 03 01 00 20 A0 DE 9F   B8 8A 18 64 CB EE D8 9F  .... ......d....

            0010: 95 E3 84 68 2B AC 38 F8   40 34 12 39 20 7A C4 A4  ...h+.8.@4.9 z..

            0020: A6 70 3F 67 41                                     .p?gA

            Padded plaintext after DECRYPTION:  len = 32

            0000: 00 F9 5A 3C 33 2E B6 C5   6A E8 55 54 5A 5A C5 BF  ..Z<3...j.UTZZ..

            0010: 57 53 86 CF B5 0A 0A 0A   0A 0A 0A 0A 0A 0A 0A 0A  WS..............

            [Raw read (bb)]: length = 293

            0000: 17 03 01 01 20 5B F1 4E   7E 26 90 C1 7D 63 BC E1  .... [.N.&...c..

            0010: 4D 69 BF 31 CB F8 D3 05   9A ED AE 67 21 9E DC 26  Mi.1.......g!..&

            0020: 21 43 CB 62 72 52 BB F7   A4 DD 63 DD 34 B8 1D 6A  !C.brR....c.4..j

            0030: 5C E8 F2 F2 F9 11 E3 2A   20 00 9E C7 95 0A 72 B2  \......* .....r.

            0040: 4B DE 8C D8 71 48 91 85   86 DF 81 AD 5D BD D2 E2  K...qH......]...

            0050: A6 9B C7 76 56 A0 84 0A   30 41 3A A6 47 4E 70 BB  ...vV...0A:.GNp.

            0060: 67 4C D2 A5 45 03 8C 8D   CC 4F 93 DC 24 90 A1 48  gL..E....O..$..H

            0070: 8F CF AC F8 90 AB 07 06   0F DC 7F 5C 66 BB AE 15  ...........\f...

            0080: A5 BC C5 17 8E F2 D2 3A   7C 9E 8C 0D C2 59 71 92  .......:.....Yq.

            0090: 9B 27 89 62 78 FB BD 08   5B E1 6F 74 D9 15 27 9D  .'.bx...[.ot..'.

            00A0: 64 32 04 AC 8F 1B 40 14   63 CB 35 CB D8 E8 D9 0A  d2....@.c.5.....

            00B0: BD 2D 26 8C 8C 9E 97 DA   33 FF BB 00 00 ED 5A 6A  .-&.....3.....Zj

            00C0: 21 EE 8B 27 5B F5 C6 38   78 60 8C DA BD 98 7D FC  !..'[..8x`......

            00D0: 45 2A 6B 06 C1 4A 3E E7   D0 EE E5 94 8B 3A 21 28  E*k..J>......:!(

            00E0: 2E 60 8E BF 10 82 06 8A   65 7D 06 8D D7 89 53 72  .`......e.....Sr

            00F0: 39 DD BC 49 09 5B E4 C4   D1 64 76 A0 BA A3 FF 07  9..I.[...dv.....

            0100: 11 DA 8E 23 C8 CD DC C5   3F C1 84 D5 F8 DE E5 32  ...#....?......2

            0110: 27 7C 93 EC 8C 39 55 6B   8B 8B 82 D1 94 7D AA 82  '....9Uk........

            0120: A4 0C 0B F7 F0                                     .....

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FC 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 31 64 37 66 38   65 61 32 2D 35 63 65 65  $.$1d7f8ea2-5cee

            0020: 2D 31 31 65 37 2D 38 32   66 65 2D 39 37 35 32 63  -11e7-82fe-9752c

            0030: 30 38 63 34 32 64 64 00   00 01 5C F5 04 0A FD FF  08c42dd...\.....

            0040: 00 00 00 06 00 6E 00 65   00 74 00 74 00 79 00 32  .....n.e.t.t.y.2

            0050: 00 00 00 3A 00 3A 6F 72   67 2E 68 6F 72 6E 65 74  ...:.:org.hornet

            0060: 71 2E 63 6F 72 65 2E 72   65 6D 6F 74 69 6E 67 2E  q.core.remoting.

            0070: 69 6D 70 6C 2E 6E 65 74   74 79 2E 4E 65 74 74 79  impl.netty.Netty

            0080: 43 6F 6E 6E 65 63 74 6F   72 46 61 63 74 6F 72 79  ConnectorFactory

            0090: 00 00 00 03 00 00 00 04   00 70 00 6F 00 72 00 74  .........p.o.r.t

            00A0: 03 00 00 00 05 00 31 00   30 00 32 00 30 00 36 00  ......1.0.2.0.6.

            00B0: 00 00 04 00 68 00 6F 00   73 00 74 03 00 00 00 0D  ....h.o.s.t.....

            00C0: 00 0D 61 6E 30 30 73 69   67 61 70 30 30 32 75 00  ..an00sigap002u.

            00D0: 00 00 0B 00 0B 73 73 6C   2D 65 6E 61 62 6C 65 64  .....ssl-enabled

            00E0: 03 00 00 00 04 00 74 00   72 00 75 00 65 00 00 01  ......t.r.u.e...

            00F0: 00 00 00 09 00 09 75 6E   64 65 66 69 6E 65 64 59  ......undefinedY

            0100: F4 40 45 44 8A 0F CF F3   4E 80 F9 54 E4 00 D9 AD  .@ED....N..T....

            0110: 44 27 F9 0C 0C 0C 0C 0C   0C 0C 0C 0C 0C 0C 0C 0C  D'..............

             

             

            And my client was still connected to node 2.  Interestingly, the above shows both netty connectors and the corresponding hostnames.

             

            I then created two cluster-connection entries, one for netty1 and one for netty2, specified two new discovery groups, and finally changed the broadcast group's single entry to have two connector-refs, netty1 and netty2 (roughly as sketched below).  Things were better, but the nodes couldn't communicate with each other.
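
            Roughly, the relevant pieces looked like this (paraphrasing from the attached domain.xml; the broadcast group name and timeout values here are from memory and may differ slightly):

                <broadcast-groups>
                    <broadcast-group name="bg-group1">
                        <socket-binding>messaging-group</socket-binding>
                        <broadcast-period>5000</broadcast-period>
                        <connector-ref>netty1</connector-ref>
                        <connector-ref>netty2</connector-ref>
                    </broadcast-group>
                </broadcast-groups>

                <discovery-groups>
                    <discovery-group name="dg-group1">
                        <socket-binding>messaging-group</socket-binding>
                        <refresh-timeout>10000</refresh-timeout>
                    </discovery-group>
                    <discovery-group name="dg-group2">
                        <socket-binding>messaging-group</socket-binding>
                        <refresh-timeout>10000</refresh-timeout>
                    </discovery-group>
                </discovery-groups>

                <cluster-connections>
                    <cluster-connection name="my-cluster">
                        <address>jms</address>
                        <connector-ref>netty1</connector-ref>
                        <discovery-group-ref discovery-group-name="dg-group1"/>
                    </cluster-connection>
                    <cluster-connection name="my-cluster2">
                        <address>jms</address>
                        <connector-ref>netty2</connector-ref>
                        <discovery-group-ref discovery-group-name="dg-group2"/>
                    </cluster-connection>
                </cluster-connections>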

             

            When I removed the clustering values for netty1 and netty2, setting them back to the default of "netty", the stub sent to the client changed back to the IP addresses of the servers (basically breaking the NAT'ing functionality again), so possibly the problem area to look at now is getting the cluster communication working while still being able to specify other netty connectors?

             

            I do see this in the documentation, so it's apparently related:

            About Server Discovery

             

            Servers use a mechanism called "server discovery" to:

            -Forward their connection details to messaging clients: Messaging clients intend to connect to servers of a cluster without specific details on the servers which are up and running at a given point of time

            -Connect to other servers: Servers in a cluster want to establish cluster connections with other servers without specific details of all other servers in a cluster

             

            Information about servers is sent to messaging clients via normal HornetQ connections and to other servers via cluster connections.

             

            Altering the broadcast groups back to netty1 and netty2...

            Servers start up fine

            I receive this on node 1 (which was definitely started first):

            INFO 12:17:16 (Incoming-1,shared=udp) org.jboss.as.clustering> JBAS010225: New cluster view for partition web (id: 1, delta: 1, merge: false) : [an00sigap001u:DGP_svr1/web, an00sigap002u:DGP_svr2/web]

             

             

            INFO 12:17:16 (Incoming-1,shared=udp) org.infinispan.remoting.transport.jgroups.JGroupsTransport> ISPN000094: Received new cluster view: [an00sigap001u:DGP_svr1/web|1] [an00sigap001u:DGP_svr1/web, an00sigap002u:DGP_svr2/web]

             

            INFO 12:17:20 (Thread-19 (HornetQ-server-HornetQServerImpl::serverUUID=5d58dc73-5cff-11e7-89c1-97151e7bb4dd-1261542558)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@3233bb3f [name=sf.my-cluster.8e467f22-5cff-11e7-a98c-1d9673d66c09, queue=QueueImpl[name=sf.my-cluster.8e467f22-5cff-11e7-a98c-1d9673d66c09, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=5d58dc73-5cff-11e7-89c1-97151e7bb4dd]]@69d51f1a targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@3233bb3f [name=sf.my-cluster.8e467f22-5cff-11e7-a98c-1d9673d66c09, queue=QueueImpl[name=sf.my-cluster.8e467f22-5cff-11e7-a98c-1d9673d66c09, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=5d58dc73-5cff-11e7-89c1-97151e7bb4dd]]@69d51f1a targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@26246010[nodeUUID=5d58dc73-5cff-11e7-89c1-97151e7bb4dd, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=10-140-40-153&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=5d58dc73-5cff-11e7-89c1-97151e7bb4dd])) [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]] is connected

             

            INFO 12:17:20 (Incoming-3,shared=udp) org.jboss.as.clustering> JBAS010225: New cluster view for partition ejb (id: 1, delta: 1, merge: false) : [an00sigap001u:DGP_svr1/ejb, an00sigap002u:DGP_svr2/ejb]

             

            INFO 12:17:20 (Incoming-3,shared=udp) org.infinispan.remoting.transport.jgroups.JGroupsTransport> ISPN000094: Received new cluster view: [an00sigap001u:DGP_svr1/ejb|1] [an00sigap001u:DGP_svr1/ejb, an00sigap002u:DGP_svr2/ejb]

             

             

            And the local JMS client gets this (broken again):

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FA 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 35 64 35 38 64   63 37 33 2D 35 63 66 66  $.$5d58dc73-5cff

            0020: 2D 31 31 65 37 2D 38 39   63 31 2D 39 37 31 35 31  -11e7-89c1-97151

            0030: 65 37 62 62 34 64 64 00   00 01 5C F5 48 07 23 FF  e7bb4dd...\.H.#.

            0040: 00 00 00 05 00 6E 00 65   00 74 00 74 00 79 00 00  .....n.e.t.t.y..

            0050: 00 3A 00 3A 6F 72 67 2E   68 6F 72 6E 65 74 71 2E  .:.:org.hornetq.

            0060: 63 6F 72 65 2E 72 65 6D   6F 74 69 6E 67 2E 69 6D  core.remoting.im

            0070: 70 6C 2E 6E 65 74 74 79   2E 4E 65 74 74 79 43 6F  pl.netty.NettyCo

            0080: 6E 6E 65 63 74 6F 72 46   61 63 74 6F 72 79 00 00  nnectorFactory..

            0090: 00 03 00 00 00 04 00 70   00 6F 00 72 00 74 03 00  .......p.o.r.t..

            00A0: 00 00 05 00 31 00 30 00   32 00 30 00 36 00 00 00  ....1.0.2.0.6...

            00B0: 04 00 68 00 6F 00 73 00   74 03 00 00 00 0D 00 0D  ..h.o.s.t.......

            00C0: 31 30 2E 31 34 30 2E 34   30 2E 31 35 33 00 00 00  10.140.40.153...

            00D0: 0B 00 0B 73 73 6C 2D 65   6E 61 62 6C 65 64 03 00  ...ssl-enabled..

            00E0: 00 00 04 00 74 00 72 00   75 00 65 00 00 01 00 00  ....t.r.u.e.....

            00F0: 00 09 00 09 75 6E 64 65   66 69 6E 65 64 36 A6 9F  ....undefined6..

            0100: AA 7A 17 49 71 57 30 C6   AC 8E 3C C8 BD D8 A7 37  .z.IqW0...<....7

            0110: B5 0E 0E 0E 0E 0E 0E 0E   0E 0E 0E 0E 0E 0E 0E 0E  ................

            [Raw read (bb)]: length = 37

            0000: 17 03 01 00 20 41 F8 36   43 FA F0 9A A0 3B E4 B3  .... A.6C....;..

            0010: DA 4F CC 33 07 44 D0 EE   F5 F3 85 FB 80 62 43 AC  .O.3.D.......bC.

            0020: 5B 32 CA 6C 83                                     [2.l.

            Padded plaintext after DECRYPTION:  len = 32

            0000: 00 77 8C E3 C0 58 8F 21   92 17 D8 33 73 2F 30 84  .w...X.!...3s/0.

            0010: 96 BE 27 8F 6E 0A 0A 0A   0A 0A 0A 0A 0A 0A 0A 0A  ..'.n...........

            [Raw read (bb)]: length = 293

            0000: 17 03 01 01 20 7E 03 C7   53 FF D6 F2 AC B2 EC 20  .... ...S......

            0010: B5 64 EE 88 DB EE D3 53   99 6F 02 FF 0D 4E 7F DA  .d.....S.o...N..

            0020: C6 42 B6 36 02 B7 7D FB   DD 13 4B 3C 39 B1 3B 2E  .B.6......K<9.;.

            0030: 06 7D 71 6E 28 A9 40 F3   EC C9 71 03 31 C9 3D 87  ..qn(.@...q.1.=.

            0040: A7 6C B3 4F 43 F2 B8 E5   A6 02 40 23 22 37 FC A4  .l.OC.....@#"7..

            0050: 3C E8 62 A1 F1 D8 C7 2D   38 CF 5C 11 B4 BA 41 68  <.b....-8.\...Ah

            0060: FF E9 8D 9E 3A 68 8E 71   7A 1B 5D F2 52 29 0C 1C  ....:h.qz.].R)..

            0070: F5 F4 18 3E A7 5B B0 D7   A2 FB 30 33 C7 69 D4 CD  ...>.[....03.i..

            0080: 2D 13 48 9F 61 A4 2A CE   F3 C6 4B 4A 66 36 6B 5B  -.H.a.*...KJf6k[

            0090: 14 3D 24 53 DF 4C 09 52   0C AB B6 8F 7E C1 21 D9  .=$S.L.R......!.

            00A0: B0 EA 89 6F 76 60 8A F3   1E 3D 64 86 92 F7 A9 BB  ...ov`...=d.....

            00B0: 44 DD B3 15 97 83 4C E5   B3 C8 58 A7 65 17 E0 74  D.....L...X.e..t

            00C0: 81 98 E4 AC 67 A7 8C 1E   EB F8 4E 80 FA DB 5A A3  ....g.....N...Z.

            00D0: 99 CD 05 52 48 9D 93 EF   57 A4 42 E4 84 B0 DF 2D  ...RH...W.B....-

            00E0: FE 6B 08 49 C6 CE D6 60   CA 6C CD 6E E0 52 3A E9  .k.I...`.l.n.R:.

            00F0: 6D F9 21 13 CC 7C FB 2A   3B 27 E1 64 9B CD 1F 4E  m.!....*;'.d...N

            0100: 66 3C 6D CC 03 30 80 02   FE 06 CD 65 B3 C9 89 72  f<m..0.....e...r

            0110: D7 B8 C5 A5 25 D8 63 A6   6F 4A 10 44 DA 38 B9 00  ....%.c.oJ.D.8..

            0120: A8 CD A3 8C 52                                     ....R

            Padded plaintext after DECRYPTION:  len = 288

            0000: 00 00 FA 72 00 00 00 00   00 00 00 00 00 00 00 00  ...r............

            0010: 24 00 24 38 65 34 36 37   66 32 32 2D 35 63 66 66  $.$8e467f22-5cff

            0020: 2D 31 31 65 37 2D 61 39   38 63 2D 31 64 39 36 37  -11e7-a98c-1d967

            0030: 33 64 36 36 63 30 39 00   00 01 5C F5 49 52 20 FF  3d66c09...\.IR .

            0040: 00 00 00 05 00 6E 00 65   00 74 00 74 00 79 00 00  .....n.e.t.t.y..

            0050: 00 3A 00 3A 6F 72 67 2E   68 6F 72 6E 65 74 71 2E  .:.:org.hornetq.

            0060: 63 6F 72 65 2E 72 65 6D   6F 74 69 6E 67 2E 69 6D  core.remoting.im

            0070: 70 6C 2E 6E 65 74 74 79   2E 4E 65 74 74 79 43 6F  pl.netty.NettyCo

            0080: 6E 6E 65 63 74 6F 72 46   61 63 74 6F 72 79 00 00  nnectorFactory..

            0090: 00 03 00 00 00 04 00 70   00 6F 00 72 00 74 03 00  .......p.o.r.t..

            00A0: 00 00 05 00 31 00 30 00   32 00 30 00 36 00 00 00  ....1.0.2.0.6...

            00B0: 04 00 68 00 6F 00 73 00   74 03 00 00 00 0D 00 0D  ..h.o.s.t.......

            00C0: 31 30 2E 31 34 30 2E 34   30 2E 31 35 34 00 00 00  10.140.40.154...

            00D0: 0B 00 0B 73 73 6C 2D 65   6E 61 62 6C 65 64 03 00  ...ssl-enabled..

            00E0: 00 00 04 00 74 00 72 00   75 00 65 00 FF 01 00 00  ....t.r.u.e.....

            00F0: 00 09 00 09 75 6E 64 65   66 69 6E 65 64 F0 F4 AB  ....undefined...

            0100: EB 96 C7 73 6F 94 C6 25   C0 F2 55 8C F5 6E E4 CD  ...so..%..U..n..

            0110: D8 0E 0E 0E 0E 0E 0E 0E   0E 0E 0E 0E 0E 0E 0E 0E  ................

             

            When I try to set the broadcast group back to netty and set the cluster connections back to using the two discovery groups specifying netty1 and netty2, node1 (started first) starts throwing these exceptions:

            ERROR 12:42:20 (Thread-6 (HornetQ-client-global-threads-775765468)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue69442259-5d02-11e7-b19f-45effa48faaf

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerClosed(ClusterConnectionImpl.java:1570)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1288)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

             

             

            ERROR 12:42:20 (Thread-11 (HornetQ-client-global-threads-775765468)) org.hornetq.core.server> HQ224037: cluster connection Failed to handle message: java.lang.IllegalStateException: Cannot find binding for jms.queue.JSR264XmlRequestQueue69442259-5d02-11e7-b19f-45effa48faaf on ClusterConnectionImpl@752363295[nodeUUID=b5876822-5d01-11e7-8bc7-cd0a8ca44de2, connector=TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=b5876822-5d01-11e7-8bc7-cd0a8ca44de2]

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:1510)

                    at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:1282)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1085)

                    at org.hornetq.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:57)

                    at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1220)

                    at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106)

                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                    at java.lang.Thread.run(Thread.java:744)

             

             

            And the cluster only seems to see the second cluster discovery group, as the above only shows an00sigap002u and its netty2 (the second) connector listed.

            The JMS client goes back to only receiving stub entries for the netty2 connector.

             

             

            And I also see this

            Discovery groups are mainly used by cluster connections and Java Messaging Service (JMS) clients to obtain initial connection information in order to download the required topology.

             

            NOTE

            You must configure each discovery group with an appropriate broadcast endpoint which matches its broadcast group counterpart (UDP or JGroups).

             

            So would I need to create a discovery group, add it to the pertinent connection factories, and then also add new socket bindings with those sockets bound to the external IP?  If so, would I need custom code for the clients specifying the DiscoveryGroup?
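            e.g. something along these lines is what I have in mind, though I haven't actually tried it yet (just a sketch; the names are made up and the external socket binding would still need to exist):

            <discovery-group name="dg-external">
                <socket-binding>messaging-group-external</socket-binding>
                <refresh-timeout>10000</refresh-timeout>
            </discovery-group>

            <connection-factory name="ExternalDiscoveryConnectionFactory">
                <discovery-group-ref discovery-group-name="dg-external"/>
                <entries>
                    <entry name="java:jboss/exported/jms/ExternalDiscoveryConnectionFactory"/>
                </entries>
            </connection-factory>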

             

            But when this is done with two discovery groups (because the cluster-connection sub-element of cluster-connections doesn't accept two connector-refs), that's when things seem to get broken.

             

            Thanks Again for the help Justin and anyone else!

             

            Jay

            • 3. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
              jasonglass

              I also then tried to add another broadcast-group, since the discovery-groups (and likewise the cluster connections) seem to depend on them, but then I got the following exception on the master...

              JBAS014775:    New missing/unsatisfied dependencies:

                    service jboss.binding.messaging-group2 (missing) dependents: [service jboss.messaging.default.bindings.broadcast.bg-group2, service jboss.messaging.default.bindings.discovery.dg-group2]

               

              And that was with the following

              <broadcast-groups>

                                      <broadcast-group name="bg-group1">

                                          <socket-binding>messaging-group</socket-binding>

                                          <broadcast-period>5000</broadcast-period>

                                          <connector-ref>netty1</connector-ref>

                                      </broadcast-group>

                                      <broadcast-group name="bg-group2">

                                          <socket-binding>messaging-group2</socket-binding>

                                          <broadcast-period>5000</broadcast-period>

                                          <connector-ref>netty2</connector-ref>

                                      </broadcast-group>

                                  </broadcast-groups>

               

                                  <discovery-groups>

                                      <discovery-group name="dg-group1">

                                          <socket-binding>messaging-group</socket-binding>

                                          <refresh-timeout>10000</refresh-timeout>

                                      </discovery-group>

                                      <discovery-group name="dg-group2">

                                          <socket-binding>messaging-group2</socket-binding>

                                          <refresh-timeout>10000</refresh-timeout>

                                      </discovery-group>

                                  </discovery-groups>

               

                                  <cluster-connections>

                                      <cluster-connection name="my-cluster">

                                          <address>jms</address>

                                          <connector-ref>netty1</connector-ref>

                                          <discovery-group-ref discovery-group-name="dg-group1"/>

                                      </cluster-connection>

                                      <cluster-connection name="my-cluster2">

                                          <address>jms</address>

                                          <connector-ref>netty2</connector-ref>

                                          <discovery-group-ref discovery-group-name="dg-group2"/>

                                      </cluster-connection>

                                  </cluster-connections>

               

              And I figured I needed to add the following socket binding (I just tweaked the multicast port down by a few):

              <socket-binding name="messaging-group2" port="0" multicast-address="${jboss.messaging.group.address:231.7.7.7}" multicast-port="${jboss.messaging.group.port:9869}"/>

               

              Hmm... I had actually put it in the wrong socket binding group; once it was added to the "master" socket binding group instead of the full-ha one - no more error....

               

              But now when starting one of the apps on the nodes I get the same or a similar error

              INFO 15:22:54 (Controller Boot Thread) org.jboss.as.controller> JBAS014774: Service status report

              JBAS014775:    New missing/unsatisfied dependencies:

                    service jboss.binding.messaging-group2 (missing) dependents: [service jboss.messaging.default.bindings.broadcast.bg-group2, service jboss.messaging.default.bindings.discovery.dg-group2]

               

              Which is basically fatal; I can no longer log in since one of the deployed apps' MBeans handles login ;-(  Rollback.
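              In hindsight, the new binding presumably also needs to be declared in the socket binding group the app nodes' server group actually references (full-ha-sockets in a stock domain.xml), not just the one the master picked up.  Roughly (a sketch; the group name is assumed from the default profile, and note that if jboss.messaging.group.port is ever set as a system property both bindings would resolve to the same port, so a separate property might be safer):

              <socket-binding-group name="full-ha-sockets" default-interface="public">
                  ...
                  <socket-binding name="messaging-group" port="0" multicast-address="${jboss.messaging.group.address:231.7.7.7}" multicast-port="${jboss.messaging.group.port:9876}"/>
                  <socket-binding name="messaging-group2" port="0" multicast-address="${jboss.messaging.group.address:231.7.7.7}" multicast-port="${jboss.messaging.group.port:9869}"/>
                  ...
              </socket-binding-group>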

              • 4. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                jbertram

                I'm not sure how much help I'm going to be able to provide you with the current problem-solving and communication strategy.  There's just too much information to be able to clearly respond.  For every paragraph you write about an experiment you've conducted and results you've gathered I could respond with 2 or 3 paragraphs of my own.  Unfortunately there's just not time to weed through all of it and figure out what really matters.  If we were in a room together face to face then we could sort through all this stuff quickly, but this kind of forum is not nearly as efficient.  Combine that with the fact that I have a shallow understanding of EAP domain mode configuration and semantics (I work on the Artemis message broker) and I think you might understand why we might need a different strategy.

                 

                I think Gall's Law can help here.  Previously you had a domain of one node working, and your goal a domain of 4 nodes.  However, rather than jumping directly to a 4 node domain I would suggest you simply get 2 standalone, clustered nodes working (which will eliminate the domain complexity from the equation).

                 

                Also, it would be good to clarify the desired client semantics now that the client is connecting to a cluster vs. a single node.  Are you looking for connection load-balancing?  Do you want fail-over?

                 

                In summary, let's step back and clarify the goal and simplify the configuration.  Once we get a clear, simple use-case working then we can move to something more complex.

                • 5. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                  jasonglass

                   Hi Justin!  Thank you for the reply!  I fully understand about the "current problem-solving and communication strategy" and my apologies on the length; I was just trying to convey all the testing and configuration I had tried.  It does seem 100% cluster-related now: once the netty connectors were created and a hostname was added as the host instead of the IP, the stub was 100% correct, but then the cluster appeared to be broken and no longer able to communicate.

                   

                   I like your suggestion of two clustered standalone servers; my only problem is that these are three live environments, and the vendor that communicates with the JBoss servers is having its system moved to SaaS, hence the NAT/firewall issue we're now encountering.  The three JBoss environments are actually another vendor's, and that vendor has them configured in domain mode with one admin node and 4 app nodes, so I can't really break them apart.  I also understand about the "shallow understanding of EAP domain mode configuration and semantics" and really appreciate your willingness to even try and help a bit!

                   

                   On the clarification for the client: the client is our vendor's code (but I also test with "client" Java test code from my laptop on the corporate network, so no SaaS and firewalls involved).  They perform the JNDI naming lookup against the 4 remote hosts with the connection string of:
                  "java.naming.provider.url=remote://10.250.40.153:10202,remote://10.250.40.154:10202,remote://10.250.40.155:10202,remote://10.250.40.156:10202\n"

                   

                   That's got the NAT/external IP's in it.  The connection is made fine, and when the stub with the connection factory information is returned, it unfortunately lists the internal non-NAT'd messaging IP's and port of
                  10.140.40.153:10206, 10.140.40.154:10206, 10.140.40.155:10206, and 10.140.40.156:10206

                   

                   It's at this point that the connection(s) fail, as the local/internal IPs are not routeable.  That's basically why I'm trying to get the host names in there instead: then JBoss has no issues connecting to itself as a client, and the vendor doesn't have a problem since the hostnames on their side map to the external NAT'd IP's.

                   

                   All four servers share the same copies of the JMS queues, and when an item is placed on a single node's queue, it's replicated to the other nodes since they're clustered.  In testing, even with the JNDI lookup above, a connection is only made to the first node where a connection attempt succeeds; if it connects, it stays connected to that node, and if that node goes down, it catches the exception and moves on to the next available node until a working one is found - so that's where the clustering comes in.  I'm almost there but just missing something ;-(  frustrating!

                   

                  Thanks again Justin - again I really appreciate the feedback!

                   

                  Jay

                  • 6. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                    jbertram

                    When a client connects to a cluster it will receive a "topology" update to let it know about the other nodes in the cluster (e.g. for things like automatic fail-over and connection load-balancing).  The topology information the client receives is based on the connectors that the cluster nodes broadcast among each other (i.e. the connector-ref in the node's cluster-connection configuration).  In your situation this connector-ref must refer to the "internal" host/port so that cluster nodes can actually find each other and make valid connections.  However, this "internal" host/port topology information is then sent to the client even though the client actually needs to use the "external" host/port information in order to make a valid connection.  This may end up being a design weakness with clustering in NATted environments because I'm not sure how to deal with it at this point.

                     

                    That said, I don't understand why the client's initial connection fails.  When it looks up the connection factory in JNDI it should receive the stub with the proper configuration (I assume you've configured this properly since you've done this before with a single node domain), make the connection, and only then should it receive the topology information from the cluster and it should only use that information for additional connections from that same connection factory (which would normally be load-balanced across the cluster nodes) or in the case of fail-over (although by all appearances you're not actually configuring live/backup pairs so fail-over shouldn't actually ever happen).  In other words, even with this apparent design weakness I think a single connection should work.  It's possible there's a moving piece that I'm simply overlooking that would explain the issue.  I think this is where the investigation should focus.

                     

                    Let's hit the reset button and start from scratch to build up a configuration that's simple to understand and test.  I know you've done a lot of this configuration already, but resetting will help us get on the same page and should clarify your configuration.

                     

                    Since what you want on the internal network is simply a 4-node cluster then you should start with the "full-ha" profile.  If you have a 4-node domain using the full-ha profile that should form a cluster out of the box.  Next, you want to access that cluster from a client on an "external" network via NAT.  To facilitate that you should create a new connection factory that references a new connector that's configured with the host/port information that the client on the external network needs, e.g.:

                     

                    <connector name="external-connector">

                       <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                       <param key="host" value="externalHost"/>

                       <param key="port" value="10206"/>

                    </connector>

                     

                    <connection-factory name="ExternalConnectionFactory">

                       <connectors>

                          <connector-ref connector-name="external-connector"/>

                       </connectors>

                       <entries>

                          <entry name="java:jboss/exported/ExternalConnectionFactory"/>

                       </entries>

                    </connection-factory>

                     

                    The value of "externalHost" on the connector would obviously be different on every node since it will need to resolve to that specific node via NAT.
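                     One way to keep a single profile in domain mode might be to use an expression for the host value and then set a per-server system property in each host.xml, e.g. (a sketch only; I haven't verified expression support for connector params on your exact version, "external.messaging.host" is just a made-up property name, and the server/group names are placeholders for yours):

                     <connector name="external-connector">
                        <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                        <param key="host" value="${external.messaging.host}"/>
                        <param key="port" value="10206"/>
                     </connector>

                     and in host.xml on each node:

                     <server name="DGP_svr1" group="main-server-group">
                        <system-properties>
                           <property name="external.messaging.host" value="an00sigap001u"/>
                        </system-properties>
                     </server>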

                     

                    Do whatever you did before (if anything) to get NAT working for the JNDI lookup.

                     

                    Don't configure SSL or any other stuff that isn't absolutely necessary to test this use-case.  Once that's all done give it a shot and let me know how it goes.

                     

                    Now to address something else you mentioned...

                     

                     All four servers share the same copies of the JMS queues, and when an item is placed on a single node's queue, it's replicated to the other nodes since they're clustered.

                     I don't see any evidence that you've actually configured replication.  Perhaps this is a misunderstanding about exactly what "replication" means in this context.  In HornetQ "replication" refers to the process where a "live" broker replicates its message data to its "backup" broker across the network to enable message high-availability (i.e. HA).  Clustering and HA are separate concepts in HornetQ.  I can only see where you've configured clustering, not HA.  To be clear, if a node in the cluster goes down without a backup then the messages on that node are lost (at least until that failed node can be recovered/restarted).
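                     Just to illustrate the distinction (not something you need for this use-case), configuring HA would mean marking a second hornetq-server as the backup of a live one, along these lines (a minimal shared-store sketch, most details omitted):

                     <hornetq-server>
                        <backup>true</backup>
                        <shared-store>true</shared-store>
                        <!-- the journal, bindings, paging, and large-messages directories must
                             point at the same shared storage the live server uses -->
                     </hornetq-server>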

                    • 7. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                      jasonglass

                      Hi Justin, thanks again for your time!  And if you want to let me know how to quote your responses in the basic editor for easier readability I'd be more than happy to!

                       

                      When a client connects to a cluster it will receive a "topology" update to let it know about the other nodes in the cluster (e.g. for things like automatic fail-over and connection load-balancing).  The topology information the client receives is based on the connectors that the cluster nodes broadcast among each other (i.e. the connector-ref in the node's cluster-connection configuration).  In your situation this connector-ref must refer to the "internal" host/port so that cluster nodes can actually find each other and make valid connections.  However, this "internal" host/port topology information is then sent to the client even though the client actually needs to use the "external" host/port information in order to make a valid connection.  This may end up being a design weakness with clustering in NATted environments because I'm not sure how to deal with it at this point.

                       

                      J.G.:  Yes, I see that; I didn't know it was called that ;-) but now that you mention it I saw references to it elsewhere in my research, so that's similar or at least slightly similar to the stub?  I've also seen the information sent with different available nodes when stopping and starting one or more of them.  On this: "However, this "internal" host/port topology information is then sent to the client even though the client actually needs to use the "external" host/port information in order to make a valid connection" - that seems to be exactly my problem.

                       

                       

                      That said, I don't understand why the client's initial connection fails.  When it looks up the connection factory in JNDI it should receive the stub with the proper configuration (I assume you've configured this properly since you've done this before with a single node domain), make the connection, and only then should it receive the topology information from the cluster and it should only use that information for additional connections from that same connection factory (which would normally be load-balanced across the cluster nodes) or in the case of fail-over (although by all appearances you're not actually configuring live/backup pairs so fail-over shouldn't actually ever happen).  In other words, even with this apparent design weakness I think a single connection should work.  It's possible there's a moving piece that I'm simply overlooking that would explain the issue.  I think this is where the investigation should focus.

                       

                       

                      J.G.:  So, sorry if I said this wrong... the JNDI lookup does work; it's when the code tries to create the ConnectionFactory that things fail in the NAT'ed environment, because, as you indicated, the cluster is advertising the internal IP and not an external IP or DNS name.  A DNS name would 100% work if I could figure out how to get it in there without breaking the clustering; like I said, when I bind to 0.0.0.0 it did use the name instead of the IP, but then that broke a number of other things.

                       

                      Let's hit the reset button and start from scratch to build up a configuration that's simple to understand and test.  I know you've done a lot of this configuration already, but resetting will help us get on the same page and should clarify your configuration.

                       

                       

                      J.G.:  I'll try to work on this a bit over the weekend, and just so you know, since it's a rather pressing matter I did open a ticket with Red Hat support through my company, so I don't want to waste your time - but I think you seem a little intrigued, as I am now!  You're also probably like me and don't like to give up!  Also, as a test, my boss had me just start up the admin node with the netty connector specifying a single hostname in it; I then started that specific node/host, and the client test code had no issues connecting to it, and the node had no issues connecting to itself.  So it really does seem like it's clustering-related as you indicated, and if the clustering could just use and understand the host names instead of the IP addresses, everything would just work!

                       

                      Since what you want on the internal network is simply a 4-node cluster then you should start with the "full-ha" profile.  If you have a 4-node domain using the full-ha profile that should form a cluster out of the box.  Next, you want to access that cluster from a client on an "external" network via NAT.  To facilitate that you should create a new connection factory that references a new connector that's configured with the host/port information that the client on the external network needs, e.g.:

                       

                      <connector name="external-connector">

                         <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                         <param key="host" value="externalHost"/>

                         <param key="port" value="10206"/>

                      </connector>

                       

                      <connection-factory name="ExternalConnectionFactory">

                         <connectors>

                            <connector-ref connector-name="external-connector"/>

                         </connectors>

                         <entries>

                            <entry name="java:jboss/exported/ExternalConnectionFactory"/>

                         </entries>

                      </connection-factory>

                       

                      The value of "externalHost" on the connector would obviously be different on every node since it will need to resolve to that specific node via NAT.

                       

                      Do whatever you did before (if anything) to get NAT working for the JNDI lookup.

                       

                      Don't configure SSL or any other stuff that isn't absolutely necessary to test this use-case.  Once that's all done give it a shot and let me know how it goes.

                       

                      Now to address something else you mentioned...

                       

                      All four servers share the same copies of the JMS queues, and when an item is placed on a single node's queue, it's replicated to the other nodes since they're clustered.

                      I don't see any evidence that you've actually configured replication.  Perhaps this is a misunderstanding about exactly what "replication" means in this context.  In HornetQ "replication" refers to the process where a "live" broker replicates its message data to its "backup" broker across the network to enable message high-availability (i.e. HA).  Clustering and HA are separate concepts in HornetQ.  I can only see where you've configured clustering, not HA.  To be clear, if a node in the cluster goes down without a backup then the messages on that node are lost (at least until that failed node can be recovered/restarted).

                       

                       

                      J.G.:  I could be wrong on this, but in the non-NAT'ed environment I can test with: when I had four nodes up and was, say, connected to node1 through the JMS client, if I triggered a JMS produce event on node4 (e.g. adding a message), the message was received by my client that was connected to node1.  When I created a JMS message event on node3, it appeared on the connection to node1, and ditto with node2.  Also, since they all seemed to be broken at various times and unable to communicate with each other in certain configurations, I had likely wrongly assumed that, being clustered and having messaging-groups set up, they were somehow replicating the JMS messages between themselves.  I also got this log message which made me think that - well, not an exception, but it's a cluster connection JMS bridge?

                      INFO 10:14:26 (Thread-19 (HornetQ-server-HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb-1036412157)) org.hornetq.core.server> HQ221027: Bridge ClusterConnectionBridge@1468c040 [name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@6925b437 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@1468c040 [name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, queue=QueueImpl[name=sf.my-cluster2.1d7f8ea2-5cee-11e7-82fe-9752c08c42dd, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb]]@6925b437 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1960919995[nodeUUID=61b4303c-5cee-11e7-a626-2552f6e426cb, connector=TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true, address=jms, server=HornetQServerImpl::serverUUID=61b4303c-5cee-11e7-a626-2552f6e426cb])) [initialConnectors=[TransportConfiguration(name=netty2, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=10206&host=an00sigap002u&ssl-enabled=true], discoveryGroupConfiguration=null]] is connected

                      • 8. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                        jasonglass

                        Hey Justin jbertram - hope you had a great weekend!  Hey, I didn't realize that HornetQ now appears to be falling under Artemis and that you were one of the developers?  Or maybe it's more of an ActiveMQ-related thing?  Anyway, I had an idea to post directly in the HornetQ forums when I realized you were one of the top contributors! ;-)

                         

                        Anyway, you don't think something like this might work, do you?
                        HornetQ clustering issues when binding JBoss EAP to 0.0.0.0 - Red Hat Customer Portal

                         

                        Also, if you're familiar or even slightly familiar with the code, would it be possible to see if there's anything in the code that alludes to being able to use a hostname in the topology discovery instead of IP's?  I know it's supported, because when I had the normal netty connector plus 4 others all by hostname, the stub that was sent contained the test netty connector's FQDN, and netty1 through netty4 were also listed, each with hostnames.  Just wondering if there's a way to infer from the code the setting that might be missing to get the topology updates and ConnectionFactory proxy/stub code to use hostnames instead of IP's.  Thanks!

                         

                        Your thoughts?  And thanks again!

                         

                        Jay

                        • 9. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                          jbertram

                          I'm catching back up on this after two weeks of vacation.  Let me first address a few questions from your last two comments...

                           

                          • The solution you linked (i.e. HornetQ clustering issues when binding JBoss EAP to 0.0.0.0) is just another way to configure the same thing I specified in my last comment.  It's a "cleaner" way to specify a custom host and port on a connector but the method explained in the linked document was added more recently and I can never remember which versions it works on so it's usually simpler to just explain the method that should always work (even if it isn't as "clean").
                          • Cluster nodes establish connections with each other and form "bridges" so messages can be exchanged (not replicated) between nodes.  The logging you see saying, "HQ221027: Bridge...is connected," is that bridge being formed when the node joins the cluster.
                          • The HornetQ code-base was donated to the Apache ActiveMQ community a couple of years back and is now continuing life as the Apache ActiveMQ Artemis broker.  I was a developer on HornetQ and am now an Apache committer working on Artemis.
                          • I'm not aware of any code that really cares about IP vs. hostname.

                           

                          Again, the thing that puzzles me is that the initial connection (after JNDI lookup) fails when the connection factory has a connector statically configured with the proper host and port.  Of course, this may be moot by now.  Have you made any progress in the last 2 weeks?

                          • 10. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                            jasonglass

                            Hi Justin!   Hope you had a great vacation!

                             

                            Sorry I didnt get back to you sooner as well, I also had some things come up.

                             

                            On bullet point one: understood on the cleaner way of doing it.  I've tried it out and below is the configuration; I did try a few different host entries, as you can see from the ones that are commented out.  I still can't get it to work.

                             

                            On bullet two: oh, understood on replicated vs. exchanged.  I'm still not exactly sure if the messages are being exchanged between the servers; e.g. say you have a JMS order object, and server four handles the order and puts it on its queue.  Is that order also "exchanged" with the other servers until a remote client consumes it?

                             

                            On bullet three: okay, now I understand HornetQ vs. ActiveMQ!  Thanks!

                             

                            On bullet four.... yeah.... ;-)

                             

                            On your last line: again, no progress.  And this is what I last tried - note I added the NAT acceptor.  When I use the host names entry, the ConnectionFactory stub sent back perfectly contains all the hostnames of the servers - but then the servers cannot connect (form bridges) to each other!  Again, thanks for all your help on this; work's starting to panic a bit as this is supposed to be in prod ASAP.

                             

                            Config:

                            Note: oh, and the client does correctly connect to the new ConnectionFactory, e.g. QueueConnectionFactory2, but then fails on a topology timeout exception, likely because the JMS service crashed trying to come up on the servers; it's almost like the servers are trying to use the NAT as well and it's preventing them from bridging to each other.
                            Note: oh, and also I haven't been able to get rid of the SSL; I keep trying, but the client then fails to connect when the SSL-related lines below are removed.

                             

                                                <connectors>

                                                    <netty-connector name="netty" socket-binding="messaging">

                                                        <param key="ssl-enabled" value="true"/>

                                                        <param key="key-store-path" value="${smp.home}/ssl/smp_cert_key_store.jks"/>

                                                        <param key="key-store-password" value="pass"/>

                                                    </netty-connector>

                             

                                    <connector name="netty-nat1">

                                      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                      <param key="host" value="an00sigap001u"/>

                                      <param key="port" value="30000"/>

                                    </connector>

                             

                                    <connector name="netty-nat2">

                                    <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                      <param key="host" value="an00sigap002u"/>

                                      <param key="port" value="30000"/>

                                     </connector>

                             

                                    <connector name="netty-nat3">

                                      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                      <param key="host" value="an00sigap003u"/>

                                      <param key="port" value="30000"/>

                                    </connector>

                             

                                    <connector name="netty-nat4">

                                      <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>

                                      <param key="host" value="an00sigap004u"/>

                                      <param key="port" value="30000"/>

                                    </connector>

                             

                                                <acceptors>

                                                    <netty-acceptor name="netty" socket-binding="messaging">

                                                        <param key="ssl-enabled" value="true"/>

                                                        <param key="key-store-path" value="${smp.home}/ssl/smp_cert_key_store.jks"/>

                                                        <param key="key-store-password" value="pass"/>

                                                    </netty-acceptor>

                             

                             

                            <acceptor name="netty-nat">

                            <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                            <param key="ssl-enabled" value="true"/>

                            <param key="key-store-path" value="${smp.home}/ssl/smp_cert_key_store.jks"/>

                            <param key="key-store-password" value="pass"/>

                            <param key="host" value="10.250.240.25,10.250.240.26,10.250.240.27,10.250.240.28"/>

                            <!--param key="host" value="an00sigap001u,an00sigap002u,an00sigap003u,an00sigap004u"/-->

                            <param key="host" value="an00sigap001u,an00sigap002u,an00sigap003u,an00sigap004u,10.250.240.25,10.250.240.26,10.250.240.27,10.250.240.28,10.140.40.153,10.140.40.154,10.140.40.155,10.140.40.156"/>

                            <param key="port" value="30000"/>

                            </acceptor>

                             

                                                    <connection-factory name="RemoteConnectionFactory">

                                                        <connectors>

                                                            <connector-ref connector-name="netty"/>

                                                        </connectors>

                                                        <entries>

                                                            <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>

                                                        </entries>

                                                        <ha>true</ha>

                                                        <compress-large-messages>true</compress-large-messages>

                                                        <block-on-acknowledge>true</block-on-acknowledge>

                                                        <retry-interval>1000</retry-interval>

                                                        <retry-interval-multiplier>1.0</retry-interval-multiplier>

                                                        <reconnect-attempts>-1</reconnect-attempts>

                                                    </connection-factory>

                                                    <connection-factory name="MyCoQueueConnectionFactory">

                                                        <connectors>

                                                            <connector-ref connector-name="netty"/>

                                                        </connectors>

                                                        <entries>

                                                            <entry name="java:jboss/exported/System/myco/ApplicationType/OrderManagement/Application/4-3;1-0;SMP/Comp/QueueConnectionFactory"/>

                                                            <entry name="java:/System/myco/ApplicationType/OrderManagement/Application/4-3;1-0;SMP/Comp/QueueConnectionFactory"/>

                                                        </entries>

                                                        <compress-large-messages>true</compress-large-messages>

                                                    </connection-factory>

                             

                             

                            <connection-factory name="MyCoQueueConnectionFactory2">

                              <connectors>

                              <connector-ref connector-name="netty-nat1"/>

                              <connector-ref connector-name="netty-nat2"/>

                              <connector-ref connector-name="netty-nat3"/>

                              <connector-ref connector-name="netty-nat4"/>

                              </connectors>

                            <entries>

                            <entry name="java:jboss/exported/System/myco/ApplicationType/OrderManagement/Application/4-3;1-0;SMP/Comp/QueueConnectionFactory2"/>

                            <entry name="java:/System/myco/ApplicationType/OrderManagement/Application/4-3;1-0;SMP/Comp/QueueConnectionFactory2"/>

                            </entries>

                            <compress-large-messages>true</compress-large-messages>

                            </connection-factory>

                             

                            When starting the nodes I get the following (I also tried other ports, and nothing is bound to those ports):

                            Node1:

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221000: live server is starting with configuration HornetQ Configuration (clustered=true,backup=false,sharedStore=true,journalDirectory=/home/jboss/domains/myco/servers/smp_svr1/data/messagingjournal,bindingsDirectory=/home/jboss/domains/myco/servers/smp_svr1/data/messagingbindings,largeMessagesDirectory=/home/jboss/domains/myco/servers/smp_svr1/data/messaginglargemessages,pagingDirectory=/home/jboss/domains/myco/servers/smp_svr1/data/messagingpaging)

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221006: Waiting to obtain live lock

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221013: Using NIO Journal

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221034: Waiting to obtain live lock

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221035: Live Server Obtained live lock

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221020: Started Netty Acceptor version 3.6.2.Final-redhat-1-c0d783c 10.140.40.153:10208 for CORE protocol

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221020: Started Netty Acceptor version 3.6.2.Final-redhat-1-c0d783c 10.140.40.153:10206 for CORE protocol

                            ERROR 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ224000: Failure in initialisation: org.jboss.netty.channel.ChannelException: Failed to bind to: an00sigap002u/10.140.40.154:30000

                                    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)

                                    at org.hornetq.core.remoting.impl.netty.NettyAcceptor.startServerChannels(NettyAcceptor.java:525)

                                    at org.hornetq.core.remoting.impl.netty.NettyAcceptor.start(NettyAcceptor.java:478)

                                    at org.hornetq.core.remoting.server.impl.RemotingServiceImpl.start(RemotingServiceImpl.java:256)

                                    at org.hornetq.core.server.impl.HornetQServerImpl.initialisePart2(HornetQServerImpl.java:1604)

                                    at org.hornetq.core.server.impl.HornetQServerImpl.access$1400(HornetQServerImpl.java:169)

                                    at org.hornetq.core.server.impl.HornetQServerImpl$SharedStoreLiveActivation.run(HornetQServerImpl.java:2073)

                                    at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:425)

                                    at org.hornetq.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:483)

                                    at org.jboss.as.messaging.jms.JMSService.start(JMSService.java:112)

                                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811)

                                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746)

                                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

                                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

                                    at java.lang.Thread.run(Thread.java:744)

                            Caused by: java.net.BindException: Cannot assign requested address

                                    at java.net.PlainSocketImpl.socketBind(Native Method)

                                    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)

                                    at java.net.ServerSocket.bind(ServerSocket.java:376)

                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.bind(OioServerSocketPipelineSink.java:128)

                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.handleServerSocket(OioServerSocketPipelineSink.java:79)

                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.eventSunk(OioServerSocketPipelineSink.java:53)

                                    at org.jboss.netty.channel.Channels.bind(Channels.java:561)

                                    at org.jboss.netty.channel.AbstractChannel.bind(AbstractChannel.java:189)

                                    at org.jboss.netty.bootstrap.ServerBootstrap$Binder.channelOpen(ServerBootstrap.java:382)

                                    at org.jboss.netty.channel.Channels.fireChannelOpen(Channels.java:170)

                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannel.<init>(OioServerSocketChannel.java:78)

                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannelFactory.newChannel(OioServerSocketChannelFactory.java:127)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannelFactory.newChannel(OioServerSocketChannelFactory.java:87)
                                    at org.jboss.netty.bootstrap.ServerBootstrap.bindAsync(ServerBootstrap.java:329)
                                    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:266)
                                    ... 14 more

                            INFO 12:06:04 (MSC service thread 1-9) org.hornetq.core.server> HQ221001: HornetQ Server version 2.3.1.Final (Wild Hornet, 123) [9f3d6523-72fe-11e7-88c6-ffa1275a95de]

                            Node2 (slightly different in that it talks about a live lock):

                            INFO 12:16:30 (MSC service thread 1-12) org.hornetq.core.server> HQ221034: Waiting to obtain live lock
                            INFO 12:16:30 (MSC service thread 1-12) org.hornetq.core.server> HQ221035: Live Server Obtained live lock
                            ERROR 12:16:30 (MSC service thread 1-12) org.hornetq.core.server> HQ224000: Failure in initialisation: org.jboss.netty.channel.ChannelException: Failed to bind to: an00sigap001u/10.140.40.153:30000
                                    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
                                    at org.hornetq.core.remoting.impl.netty.NettyAcceptor.startServerChannels(NettyAcceptor.java:525)
                                    at org.hornetq.core.remoting.impl.netty.NettyAcceptor.start(NettyAcceptor.java:478)
                                    at org.hornetq.core.remoting.server.impl.RemotingServiceImpl.start(RemotingServiceImpl.java:256)
                                    at org.hornetq.core.server.impl.HornetQServerImpl.initialisePart2(HornetQServerImpl.java:1604)
                                    at org.hornetq.core.server.impl.HornetQServerImpl.access$1400(HornetQServerImpl.java:169)
                                    at org.hornetq.core.server.impl.HornetQServerImpl$SharedStoreLiveActivation.run(HornetQServerImpl.java:2073)
                                    at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:425)
                                    at org.hornetq.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:483)
                                    at org.jboss.as.messaging.jms.JMSService.start(JMSService.java:112)
                                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811)
                                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746)
                                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                                    at java.lang.Thread.run(Thread.java:744)
                            Caused by: java.net.BindException: Cannot assign requested address
                                    at java.net.PlainSocketImpl.socketBind(Native Method)
                                    at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
                                    at java.net.ServerSocket.bind(ServerSocket.java:376)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.bind(OioServerSocketPipelineSink.java:128)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.handleServerSocket(OioServerSocketPipelineSink.java:79)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketPipelineSink.eventSunk(OioServerSocketPipelineSink.java:53)
                                    at org.jboss.netty.channel.Channels.bind(Channels.java:561)
                                    at org.jboss.netty.channel.AbstractChannel.bind(AbstractChannel.java:189)
                                    at org.jboss.netty.bootstrap.ServerBootstrap$Binder.channelOpen(ServerBootstrap.java:382)
                                    at org.jboss.netty.channel.Channels.fireChannelOpen(Channels.java:170)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannel.<init>(OioServerSocketChannel.java:78)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannelFactory.newChannel(OioServerSocketChannelFactory.java:127)
                                    at org.jboss.netty.channel.socket.oio.OioServerSocketChannelFactory.newChannel(OioServerSocketChannelFactory.java:87)
                                    at org.jboss.netty.bootstrap.ServerBootstrap.bindAsync(ServerBootstrap.java:329)
                                    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:266)
                                    ... 14 more

                            INFO 12:16:30 (MSC service thread 1-12) org.hornetq.core.server> HQ221001: HornetQ Server version 2.3.1.Final (Wild Hornet, 123) [14af0cd9-7300-11e7-af10-d17fc8cc263c]

                            • 11. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                              jasonglass

                              Hi Justin!  Also, I did try "HornetQ clustering issues when binding JBoss EAP to 0.0.0.0 - Red Hat Customer Portal" and that's not working; you were likely right about it not being supported anymore, as the netty connector apparently doesn't understand what an outbound socket binding is.

                               

                              On another note, in case it suggests anything: with the help of our IT guys I've finally managed to bind those NAT IPs to the servers as virtual IPs, and the virtual IP of one server can be pinged from the others.  One thing I've been trying without much success is binding the netty connector, acceptor and connection factory to those NAT IPs, thinking I could get the JMS messaging to use them so that the four servers would connect to each other over the NAT IPs and the remote client would then receive those NAT IPs in the factory and topology updates.

                               

                              Another thought is to add another interface for the NAT IPs and bind just messaging to it through the socket bindings/groups, but I'm not sure whether different socket binding groups can be specified per node, e.g. one group for the normal stuff like HTTP and another for messaging, possibly pointing at the NAT interface?

                              • 12. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                                jbertram

                                I'm still not exactly sure if the messages are being exchanged between the servers, e.g. say you have a JMS order object and server four handles the order and puts it on its queue. Is that order also "exchanged" with the other servers until a remote client consumes it?

                                Say, for example you have a cluster of 2 nodes - A and B.  A producer sends a message to queue X on node A.  A consumer connects and starts listening to queue X on node B.  Since there are 0 messages in queue X on node B and there are 0 consumers on queue X on node A the cluster will "redistribute" the message from node A to node B across the cluster bridge so that the consumer on queue X can receive it.
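 

                                Just to make that concrete: whether redistribution happens for an address is governed by the redistribution-delay in the messaging subsystem's address-settings (a negative value disables it).  A minimal sketch, with an illustrative match and delay value rather than anything from your actual config:

                                <address-settings>
                                <address-setting match="#">
                                <!-- a value >= 0 lets the cluster move messages off a queue with no consumers -->
                                <redistribution-delay>1000</redistribution-delay>
                                </address-setting>
                                </address-settings>

                                If I remember right the stock HA profiles already set something like this, so it's usually not anything you have to add yourself.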

                                 

                                Note: oh, and also I haven't been able to get rid of the SSL; I keep trying, but the client then fails to connect when the SSL-related lines below are removed

                                You'll need to remove the SSL configuration from both the connector and acceptor.

                                 

                                <acceptor name="netty-nat">

                                <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>

                                <param key="ssl-enabled" value="true"/>

                                <param key="key-store-path" value="${smp.home}/ssl/smp_cert_key_store.jks"/>

                                <param key="key-store-password" value="pass"/>

                                <param key="host" value="10.250.240.25,10.250.240.26,10.250.240.27,10.250.240.28"/>

                                <!--param key="host" value="an00sigap001u,an00sigap002u,an00sigap003u,an00sigap004u"/-->

                                <param key="host" value="an00sigap001u,an00sigap002u,an00sigap003u,an00sigap004u,10.250.240.25,10.250.240.26,10.250.240.27,10.250.240.28,10.140.40.153,10.140.40.154,10.140.40.155,10.140.40.156"/>

                                <param key="port" value="30000"/>

                                </acceptor>

                                I wouldn't expect this configuration to work, and I'm not clear on why you would need it in the first place.  An acceptor listens for incoming network connections on a host:port combo.  You can't give it an arbitrary number of hosts to listen on.  The only thing that should be using the NAT information are the connectors for the clients which will be going through the NAT layer.  Those values should be translated via NAT to the host:port combo where the acceptor is actually listening.
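 

                                To sketch the shape I mean (the hostnames, IP and port are placeholders pulled from your posts, not a tested config, and I've left the SSL params out as discussed above): the acceptor binds to the one address the server itself can actually bind, and only the connector carries the NAT'ed name the remote client will dial:

                                <acceptor name="netty-nat">
                                <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
                                <!-- the node's own, bindable address -->
                                <param key="host" value="10.140.40.153"/>
                                <param key="port" value="30000"/>
                                </acceptor>

                                <connector name="netty-nat">
                                <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                                <!-- placeholder: whatever NAT'ed name the client can reach, which NAT translates to the acceptor above -->
                                <param key="host" value="nat-fqdn-the-client-can-reach"/>
                                <param key="port" value="30000"/>
                                </connector>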

                                • 13. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                                  jbertram

                                  One thing I've been trying without much success is binding the netty connector, acceptor and connection factory to those NAT IPs, thinking I could get the JMS messaging to use them so that the four servers would connect to each other over the NAT IPs and the remote client would then receive those NAT IPs in the factory and topology updates.

                                  I'm not a network expert by any means, but I know I can create virtual loop-back interfaces on my (Linux) box and have JBoss EAP bind to that interface.  I used to use that functionality all the time when testing local clusters.

                                   

                                  Another thought is to add another interface for the NAT IPs and bind just messaging to it through the socket bindings/groups, but I'm not sure whether different socket binding groups can be specified per node, e.g. one group for the normal stuff like HTTP and another for messaging, possibly pointing at the NAT interface?

                                  Individual socket-binding elements can specify which "interface" to use.
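 

                                  For example (the interface name, address, and port here are placeholders to show the shape, not something I've tested in your domain):

                                  <interfaces>
                                  <interface name="nat">
                                  <!-- the NAT/virtual IP bound on this node -->
                                  <inet-address value="10.250.240.25"/>
                                  </interface>
                                  </interfaces>

                                  <socket-binding-group name="full-ha-sockets" default-interface="public">
                                  <!-- only the messaging binding is pointed at the "nat" interface -->
                                  <socket-binding name="messaging" interface="nat" port="30000"/>
                                  </socket-binding-group>

                                  So a separate socket-binding-group just for messaging shouldn't be necessary.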

                                   

                                   

                                  At this stage it still seems that the configuration is changing rapidly with you experimenting and reporting multiple results at a time.  It's hard for me to keep track of where we are with the "simplified" configuration.

                                  • 14. Re: JMS HornetQ NAT'ed IP's issue with clustered environment
                                    jbertram

                                    At this point here's what I'd like to see:

                                    • Simplest possible EAP configuration for a 2-node messaging cluster
                                    • A remote-connection-factory configured to use a single host:port combo which the client can use through NAT

                                     

                                    My understanding is that in this scenario the cluster will form properly but the client will not be able to connect.  I'm mainly interested in why the client cannot connect in this scenario.  Once you've got this configured and the client is failing to connect let me know.  Obviously if you have any trouble configuring this let me know as well.  Once it's configured and failing then we can gather more detailed logs from the client to hopefully see what's going on.
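 

                                    If it helps, the shape I have in mind for that second bullet is roughly the following (the connector and factory names and the host are placeholders; use whatever single NAT'ed address and port the client can actually reach):

                                    <connector name="netty-remote">
                                    <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                                    <!-- placeholder: the one NAT'ed host:port the remote client can dial -->
                                    <param key="host" value="single-nat-fqdn"/>
                                    <param key="port" value="30000"/>
                                    </connector>

                                    <jms-connection-factories>
                                    <connection-factory name="RemoteConnectionFactory">
                                    <connectors>
                                    <connector-ref connector-name="netty-remote"/>
                                    </connectors>
                                    <entries>
                                    <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>
                                    </entries>
                                    </connection-factory>
                                    </jms-connection-factories>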