14 Replies Latest reply on Jul 1, 2010 8:15 AM by instanceof

    Core Bridge Target Server isn't forming cluster.

    artp

      Thanks for your help on the issue below; upgrading to trunk fixed that issue. However, during more testing I noticed that if the target server of our core bridge is part of a statically configured cluster, it will not re-form the cluster after the target server is taken down while the other nodes in the cluster stay up. I see the exception below on the other node in the cluster (not the target server of the bridge) from ClusterConnectionImpl, and only see messages on the target server of the core bridge.

       

      Environment: JBoss 5.1. Attached is the config used on both Cluster B servers...

       

      Here are the steps I take to get into this state. I have a source Cluster A that has a bridge to one machine in Cluster B. In Cluster B I have two nodes: one that is the target (B1) of the core bridge, and another node (B2) that is simply clustered with static cluster connectors. I start with all nodes up in a good state. I can send messages from the source Cluster A and receive them on B1 and B2. Next, I take B1 down, leaving B2 up. If I now bring B1 back up, the core bridge connects and I can send messages from Cluster A, but they only go to B1, never to B2; I just see the exceptions below. My queues and topics on B2 don't seem to be clustered.
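
      For illustration, the bridge side of this setup might look roughly like the following in hornetq-configuration.xml on a Cluster A node. This is only a sketch: the connector name, bridge name, queue names and IP address are made up for the example, and the element layout and factory class are the ones used around HornetQ 2.1, so check them against the default config shipped with your version.

      ```xml
      <!-- On a Cluster A node. All names and the address below are hypothetical. -->
      <connectors>
         <!-- Connector pointing at B1, the bridge target (192.0.2.21 is a placeholder) -->
         <connector name="b1-connector">
            <factory-class>org.hornetq.integration.transports.netty.NettyConnectorFactory</factory-class>
            <param key="host" value="192.0.2.21"/>
            <param key="port" value="5445"/>
         </connector>
      </connectors>

      <bridges>
         <!-- Core bridge: consumes from a local queue and forwards to an address on B1 -->
         <bridge name="a-to-b1-bridge">
            <queue-name>jms.queue.outboundQueue</queue-name>
            <forwarding-address>jms.queue.inboundQueue</forwarding-address>
            <connector-ref connector-name="b1-connector"/>
         </bridge>
      </bridges>
      ```

      With a layout like this, the bridge only delivers to B1; whether B2 also sees the messages depends entirely on the cluster connection between B1 and B2.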

       

      Thanks for your help...

       

       

      Other issue

      https://community.jboss.org/thread/152231?tstart=30

       

       

      ERROR [org.hornetq.core.server.cluster.impl.ClusterConnectionImpl] (Thread-17 (group:HornetQ-client-global-threads-34275781)) Failed to handle message
      java.lang.IllegalStateException: Cannot find binding for jms.queue.HeartbeatEventQueueddfefa33-7296-11df-bd36-d00d4b3608ce
          at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:774)
          at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:568)
          at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:809)
          at org.hornetq.core.client.impl.ClientConsumerImpl.access$100(ClientConsumerImpl.java:46)
          at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:927)
          at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:96)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:619)
      2010-06-08 17:31:53,956 ERROR [org.hornetq.core.server.cluster.impl.ClusterConnectionImpl] (Thread-17 (group:HornetQ-client-global-threads-34275781)) Failed to handle message
      java.lang.IllegalStateException: Cannot find binding for jms.queue.OTSEventQueueddfefa33-7296-11df-bd36-d00d4b3608ce
          at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreated(ClusterConnectionImpl.java:774)
          at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.onMessage(ClusterConnectionImpl.java:568)
          at org.hornetq.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:809)
          at org.hornetq.core.client.impl.ClientConsumerImpl.access$100(ClientConsumerImpl.java:46)
          at org.hornetq.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:927)
          at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:96)
          at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
          at java.lang.Thread.run(Thread.java:619)

        • 1. Re: Core Bridge Target Server isn't forming cluster.
          clebert.suconic

          First things first: what version of HornetQ are you using?

           

           

          If you're not using 2.1.0.GA, the first thing we will ask you is to try the latest version.

          • 2. Re: Core Bridge Target Server isn't forming cluster.
            artp

            I'm using HornetQ Server version 2.1.0.CR1 (auraria, 118); I built it from trunk on May 21.

            • 3. Re: Core Bridge Target Server isn't forming cluster.
              clebert.suconic

              As we discussed on IRC, it would be nice to have a test case to replicate the issue.

              • 4. Re: Core Bridge Target Server isn't forming cluster.
                artp

                I'll put together something that I can send you as a test case sometime tomorrow.

                • 5. Re: Core Bridge Target Server isn't forming cluster.
                  artp

                  Hi, here are the steps to reproduce.

                  1) Deploy HornetQ on two machines, HostA and HostB. In my case I used HornetQ trunk from May 26th with JBoss 5.1, following the Quick Start guide to install HornetQ.

                  2) Use the hornetq-configuration.xml on both boxes. Change the host names from HostA and HostB to match your machines.

                  3) Start the servers in this order: HostA, then HostB.

                  4) Use the DirectConsumerExample file to start the consumer on HostB. To start it, just put it under one of the example directories (e.g. javaee/hajndi) and add an ant task to start it.

                  5) In the DirectProducerExample file, change the host to point to HostA. Start it the same way you did the DirectConsumerExample, in a different terminal.

                  6) At this point you'll be publishing messages to HostA, and HostB should receive the topic messages and some queue messages. For some reason the messages aren't redistributed, even though HostA doesn't have a consumer. Possibly another issue.

                  7) Now take down HostA, leaving HostB and the DirectConsumerExample running the entire time.

                  8) Bring HostA back up.

                  9) Use DirectProducerExample to publish some messages to HostA. You should notice that no topic messages are sent to HostB.

                  • 6. Re: Core Bridge Target Server isn't forming cluster.
                    artp

                    To be clear, I was able to reproduce the issue without a core bridge being used. It appears to be a clustering issue that occurs when nodes are taken down.

                    • 7. Re: Core Bridge Target Server isn't forming cluster.
                      instanceof

                      Hi,

                       

                      I've been affected by this issue also. It's exactly the same in terms of the behaviour.

                       

                      We have 5 nodes each using cluster-connections to establish core bridges to target nodes. If we take down one of the nodes, when it's brought back up it doesn't re-establish the core bridge and messages published reach all other bridge targets except the one that was brought down.

                       

                      It's very reliably reproducible, and I've replicated this using 2.1.0.Final and, this morning, 2.1.1 checked out from the anonymous SVN trunk last night.

                       

                      May I ask whether there is a bug raised for this issue and whether it's scheduled for any work?

                       

                      Thanks for your help.

                      • 8. Re: Core Bridge Target Server isn't forming cluster.
                        mcleanl

                        Hello Geoff,

                         

                        Are you using a static or dynamic cluster (UDP/Multicast)? I appear to be having a very similar issue with a static cluster and am trying to work out what my error is.

                         

                        https://community.jboss.org/thread/153541?tstart=0

                         

                        I note that Art mentioned a static cluster above and wondered whether you also run static. Thank you.

                        • 9. Re: Core Bridge Target Server isn't forming cluster.
                          artp

                          I had two issues. 1) With the static configuration, I had every node in my cluster listed in the configuration deployed to all nodes. When I took the local host out of the cluster connections in each node's configuration, things started working better (the current node can't be in its own cluster connection's connectors). 2) My deployment was running inside JBoss 5.1 with TCP configured for the JGroups channels (i.e. JBoss Cache), which appeared to be colliding with the HornetQ ports. After I changed JGroups to use UDP, everything worked properly.

                           

                          I haven't tried my test case yet, but I will, and I'll let you know the outcome. Hope this helps.

                           

                          A

                          • 10. Re: Core Bridge Target Server isn't forming cluster.
                            artp

                            I was unable to reproduce the issue with my test case. It works now that I have the correct config in place in hornetq-configuration.xml and I'm NOT using JGroups TCP. I found out JGroups was causing issues because we have another cluster that was using HornetQ with UDP discovery and JGroups UDP, and we never saw these issues there. I hope this helps others, because it took some time to figure out that JGroups TCP was colliding with HornetQ.

                             

                            A

                            • 11. Re: Core Bridge Target Server isn't forming cluster.
                              artp

                              Also, I did notice an issue with setting the Netty acceptors' host to 0.0.0.0. In my case it was preventing my static connectors from building the cluster connections correctly. Once I set the Netty acceptor host to listen on the IP of the host, the cluster formed correctly.
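
                              As a hedged sketch of that change, an acceptor bound to the machine's own address rather than 0.0.0.0 could look something like this. The acceptor name and the 192.0.2.13 address are placeholders, and the factory class package is the one used around HornetQ 2.1 (later releases moved it), so compare against the config that ships with your version.

                              ```xml
                              <acceptors>
                                 <acceptor name="netty">
                                    <factory-class>org.hornetq.integration.transports.netty.NettyAcceptorFactory</factory-class>
                                    <!-- Listen on this host's real IP instead of 0.0.0.0 (placeholder address) -->
                                    <param key="host" value="192.0.2.13"/>
                                    <!-- 5445 is the default HornetQ Netty port -->
                                    <param key="port" value="5445"/>
                                 </acceptor>
                              </acceptors>
                              ```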

                              • 12. Re: Core Bridge Target Server isn't forming cluster.
                                instanceof

                                Thank you Art.

                                 

                                I read your post and removed localhost from each of our nodes' cluster configuration, and then the cluster started re-forming perfectly!

                                 

                                Regards,

                                 

                                Geoff

                                • 13. Re: Core Bridge Target Server isn't forming cluster.
                                  timfox

                                  Geoff Pole wrote:

                                   

                                  Thank you Art.

                                   

                                  I read your post and removed localhost from each of our nodes' cluster configuration, and then the cluster started re-forming perfectly!

                                   

                                  Regards,

                                   

                                  Geoff

                                  http://hornetq.sourceforge.net/docs/hornetq-2.1.0.Final/user-manual/en/html/configuring-transports.html#d0e3015

                                   

                                  See the yellow box

                                  • 14. Re: Core Bridge Target Server isn't forming cluster.
                                    instanceof

                                    Thanks Tim,

                                     

                                    although I didn't literally remove localhost from the cluster configuration. What I meant to say is that I removed each node's own connector reference (not a reference to localhost).

                                     

                                    i.e. on node 3, I commented out the connector-ref for itself:

                                     

                                    <cluster-connection name="my-topic-cluster">
                                       <address>jms.topic.myTopic</address>
                                       <retry-interval>5000</retry-interval>
                                       <use-duplicate-detection>true</use-duplicate-detection>
                                       <forward-when-no-consumers>true</forward-when-no-consumers>
                                       <max-hops>1</max-hops>
                                       <connector-ref connector-name="netty-app1"/>
                                       <connector-ref connector-name="netty-app2"/>
                                       <!--connector-ref connector-name="netty-app3"/-->
                                    </cluster-connection>
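
                                    For reference, the connectors named in those connector-ref elements might be defined roughly as follows. This is a sketch only: the 192.0.2.x addresses are placeholders standing in for each node's real IP, the port is the HornetQ default, and the factory class package is the one used around HornetQ 2.1, so verify it against your own distribution.

                                    ```xml
                                    <connectors>
                                       <!-- One connector per remote cluster node; addresses are placeholders -->
                                       <connector name="netty-app1">
                                          <factory-class>org.hornetq.integration.transports.netty.NettyConnectorFactory</factory-class>
                                          <param key="host" value="192.0.2.11"/>
                                          <param key="port" value="5445"/>
                                       </connector>
                                       <connector name="netty-app2">
                                          <factory-class>org.hornetq.integration.transports.netty.NettyConnectorFactory</factory-class>
                                          <param key="host" value="192.0.2.12"/>
                                          <param key="port" value="5445"/>
                                       </connector>
                                    </connectors>
                                    ```

                                    On node 3 the cluster connection then references only netty-app1 and netty-app2, never the node's own connector.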

                                     

                                    The connectors are all defined with real IPs, not using localhost, which I think makes this issue different from the warning in that article (Unless I'm mistaken which isn't unknown