11 Replies Latest reply on Sep 7, 2015 2:22 AM by werrenmi

    Wrong routing result

    werrenmi

      Hello

       

      First of all, I have to say that we run some of our HornetQ clients in an OSGi environment. The HornetQ server is a plain JBoss AS 8.1.Final. In our scenario we restart a client, and when it comes up again, messages coming from the producers on the OSGi clients are occasionally routed wrongly. Debugging has shown that the producer seems to be configured correctly, but the HornetQ trace log on the server says "Message after routed=ServerMessage...". The message is routed to the wrong address, and therefore the wrong consumer is invoked.

       

      Has anyone seen the same behavior, or does anyone have a suggestion?

       

      Thanks in advance.

       

      Regards

      Michel

        • 1. Re: Wrong routing result
          jbertram

          I haven't seen that behavior before.  What version of HornetQ are you using?  Also, are you using Wildfly 8.1.Final?  I ask because there's really no such thing as JBoss AS 8.1.Final.

           

          Finally, do you have a test-case that reproduces the problem?

          • 2. Re: Wrong routing result
            werrenmi

            Sorry for the mix-up; yes, we use WildFly 8.1.Final.

             

            Unfortunately there is no test-case per se, but it's reproducible (most of the time) when the client goes down unexpectedly without unsubscribing. After the client reconnects, this behavior may occur. The only remedy is to restart the server as well. We have set the connection TTL to 1 hour.

            The confusing thing is that this client also hosts a producer for the very address that the misrouted messages end up on, so at first it looked as if simply the wrong producer had been used.

            • 3. Re: Wrong routing result
              clebert.suconic

              TBH I don't even understand what the issue is.

               

              With a git grep on "Message after routed" I see that this is logged every time a message is routed (it's a trace message, so not an issue whatsoever).

               

               

              A common issue with consumers, since you're running in OSGi (maybe you have two instances of the consumer, or maybe a consumer leak), is to end up with two consumers, each getting only half of the messages; in that case just close your consumers properly. I'm not saying this is the issue, since I have no data here, but just giving you a hint on what it could be.
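 
              For example, make sure the consumer (and the session behind it) is closed when the bundle or component stops. A minimal sketch with the core API (class and queue names here are made up, not taken from your setup):
 
              import org.hornetq.api.core.client.ClientConsumer;
              import org.hornetq.api.core.client.ClientSession;

              public class QueueConsumerHolder implements AutoCloseable {

                  private final ClientSession session;
                  private final ClientConsumer consumer;

                  public QueueConsumerHolder(ClientSession session) throws Exception {
                      this.session = session;
                      this.consumer = session.createConsumer("some.queue");
                      this.session.start();
                  }

                  // Call this from the bundle/component stop callback so no stale
                  // consumer is left registered on the server.
                  @Override
                  public void close() throws Exception {
                      consumer.close();
                      session.close();
                  }
              }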

               

               

              We can look at it if you provide us with some more data.

               

               

              Just an aside: since you are an OSGi user, we are in the process of designing OSGi support for ActiveMQ Artemis (which is now the upstream for HornetQ... HornetQ being the old / legacy version). Maybe you could be part of the initial effort, at least by helping us understand what users need... ARTEMIS-93

              • 4. Re: Wrong routing result
                jbertram

                I'm not sure I really understand the problem, either.  Can you clarify the use-case a bit more?  What kind of clients do you have (e.g. producers or consumers) and how many of each type?  How should they be interacting and what are you actually observing?

                 

                Lastly, a connection TTL of 1 hour seems a bit high to me.  That means that the subscription for any non-durable subscriber that disconnects without explicitly unsubscribing will still be valid and collecting messages for up to 1 hour after the client has disconnected.  This could cause performance problems on the server as messages accumulate in defunct subscriptions without any valid consumers.  Can you elaborate on why you set the connection TTL so high?

                • 5. Re: Wrong routing result
                  werrenmi

                  Hello together

                   

                  First, I have to say that we use the core API and that the HornetQ version is 2.4.1.Final.

                   

                  We have, among others, two addresses / queues: address.a (queue.a) and address.b (queue.b). The consumer for queue.a is deployed on one WildFly 8.1.Final (which is also the HornetQ server) and the consumer for queue.b on another WildFly 8.1.Final. There is just one queue (consumer) per address.

                   

                  The producers for both addresses are deployed on multiple Apache Karaf 2.3.5 instances (one producer per address on each Karaf). Each producer lives in a separate bundle. The HornetQ client session is registered as an OSGi service (Blueprint singleton) and is therefore shared between both producers. The session and producers are designed to be thread-safe, so the session is never accessed, and no message sent, concurrently. The Karaf instances run on ARMv5 embedded machines.
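 
                  To illustrate, the sharing looks roughly like this (a simplified sketch; class and variable names are made up):
 
                  import org.hornetq.api.core.client.ClientMessage;
                  import org.hornetq.api.core.client.ClientProducer;
                  import org.hornetq.api.core.client.ClientSession;

                  // The single ClientSession is published as a Blueprint singleton and injected
                  // into both producer bundles; sends are serialized on the shared session.
                  public class AddressAProducer {

                      private final ClientSession sharedSession;   // same instance in both bundles
                      private final ClientProducer producer;

                      public AddressAProducer(ClientSession sharedSession) throws Exception {
                          this.sharedSession = sharedSession;
                          this.producer = sharedSession.createProducer("address.a");
                      }

                      public void send(byte[] payload) throws Exception {
                          synchronized (sharedSession) {
                              ClientMessage msg = sharedSession.createMessage(true); // durable
                              msg.getBodyBuffer().writeBytes(payload);
                              producer.send(msg);
                          }
                      }
                  }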

                   

                  The issue occurs when a Karaf goes down unexpectedly and comes up again. It can then happen that messages sent by the producer registered to address.a are routed and delivered to the consumer for queue.b, while the messages sent by the producer on address.b are still routed and delivered correctly to the consumer for queue.b.

                  When we restart the WildFly that acts as the HornetQ server, everything works again as expected.

                   

                  We have such a high TTL because these embedded machines (Karaf clients) run in dedicated customer LANs, so the risk of longer connection losses is relatively high.

                   

                  I will investigate this further next weekend. Maybe more information will be available after that.

                   

                  It is also important to note that we don't have any additional configuration like diverts, and all queues are durable.

                   

                   

                  <subsystem xmlns="urn:jboss:domain:messaging:2.0">
                      <hornetq-server>
                          <bindings-directory path="${hornetq.bindings.directory}"/>
                          <journal-directory path="${hornetq.journal.directory}"/>
                          <paging-directory path="${hornetq.paging.directory}"/>
                          <large-messages-directory path="${hornetq.large-messages.directory}"/>
                          <persistence-enabled>true</persistence-enabled>
                          <security-domain>other</security-domain>
                          <security-enabled>false</security-enabled>
                          <async-connection-execution-enabled>true</async-connection-execution-enabled>
                          <journal-type>ASYNCIO</journal-type>
                          <journal-file-size>102400</journal-file-size>
                          <journal-min-files>2</journal-min-files>
                          <persist-id-cache>true</persist-id-cache>
                          <message-expiry-scan-period>10000</message-expiry-scan-period>

                          <core-queues>
                              ...
                          </core-queues>

                          <connectors>
                              <netty-connector name="netty" socket-binding="messaging">
                                  <param key="use-nio" value="true"/>
                              </netty-connector>
                              <netty-connector name="netty-ssl" socket-binding="messaging-ssl">
                                  <param key="use-nio" value="true"/>
                                  <param key="ssl-enabled" value="true"/>
                              </netty-connector>
                              <netty-connector name="netty-throughput" socket-binding="messaging-throughput">
                                  <param key="batch-delay" value="50"/>
                              </netty-connector>
                              <in-vm-connector name="in-vm" server-id="0"/>
                          </connectors>

                          <acceptors>
                              <netty-acceptor name="netty" socket-binding="messaging">
                                  <param key="use-nio" value="true"/>
                              </netty-acceptor>
                              <netty-acceptor name="netty-throughput" socket-binding="messaging-throughput">
                                  <param key="batch-delay" value="50"/>
                                  <param key="direct-deliver" value="false"/>
                              </netty-acceptor>
                              <netty-acceptor name="netty-ssl" socket-binding="messaging-ssl">
                                  <param key="use-nio" value="true"/>
                                  <param key="ssl-enabled" value="true"/>
                                  <param key="key-store-path" value="${jboss.home.dir}/hornetq-keystore.jks"/>
                                  <param key="key-store-password" value="${VAULT::msg-keystore::password::0}"/>
                                  <param key="trust-store-path" value="${jboss.home.dir}/hornetq-dev-truststore.jks"/>
                                  <param key="trust-store-password" value="${VAULT::msg-truststore::password::0}"/>
                                  <param key="need-client-auth" value="true"/>
                              </netty-acceptor>
                              <in-vm-acceptor name="in-vm" server-id="0"/>
                          </acceptors>

                          <security-settings>
                              ...
                          </security-settings>

                          <address-settings>
                              <address-setting match="#">
                                  <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                                  <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                                  <redelivery-delay>0</redelivery-delay>
                                  <max-size-bytes>10485760</max-size-bytes>
                                  <page-size-bytes>7864320</page-size-bytes>
                                  <page-max-cache-size>3</page-max-cache-size>
                                  <address-full-policy>PAGE</address-full-policy>
                                  <message-counter-history-day-limit>10</message-counter-history-day-limit>
                              </address-setting>
                          </address-settings>
                      </hornetq-server>
                  </subsystem>





                  On the client side, the following additional configurations are defined (see the sketch after the list for how they are applied):


                  - blockOnAcknowledge = true

                  - blockOnDurableSend = false

                  - blockOnNonDurableSend = false

                  - preAck = false
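 
                  These are applied on the client's ServerLocator, together with the connection TTL mentioned above, roughly as follows (a sketch; the class name, host and port are placeholders, not our real socket binding):
 
                  import java.util.HashMap;
                  import java.util.Map;

                  import org.hornetq.api.core.TransportConfiguration;
                  import org.hornetq.api.core.client.HornetQClient;
                  import org.hornetq.api.core.client.ServerLocator;
                  import org.hornetq.core.remoting.impl.netty.NettyConnectorFactory;

                  public class ClientLocatorConfig {

                      public static ServerLocator create() {
                          // Placeholder connector parameters; the real client points at the
                          // server's "messaging" socket binding.
                          Map<String, Object> params = new HashMap<String, Object>();
                          params.put("host", "server.example.com");
                          params.put("port", 5445);

                          ServerLocator locator = HornetQClient.createServerLocatorWithoutHA(
                                  new TransportConfiguration(NettyConnectorFactory.class.getName(), params));

                          locator.setConnectionTTL(60L * 60L * 1000L);  // 1 hour
                          locator.setBlockOnAcknowledge(true);
                          locator.setBlockOnDurableSend(false);
                          locator.setBlockOnNonDurableSend(false);
                          locator.setPreAcknowledge(false);
                          return locator;
                      }
                  }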

                   

                   

                  Regards

                  Michel

                  • 6. Re: Wrong routing result
                    jbertram

                    I recommend you change your producers so that they don't share a session at all.  I realize you said the session is shared in a thread-safe manner, but my hunch is that sharing the session is still causing this problem.

                    • 7. Re: Wrong routing result
                      jbertram

                      For what it's worth, sharing a session is never recommended because sessions weren't designed to be shared. Giving your application the ability to share sessions is just extra work and complexity that is simply unnecessary (and could potentially hurt performance significantly). Just share the connection object and create non-shared sessions from that.
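 
                      In core-API terms the shared piece would be the ClientSessionFactory (roughly the "connection"), with each producer getting its own session. A minimal sketch, assuming the ServerLocator already exists (class and method names are just for illustration):
 
                      import org.hornetq.api.core.client.ClientProducer;
                      import org.hornetq.api.core.client.ClientSession;
                      import org.hornetq.api.core.client.ClientSessionFactory;
                      import org.hornetq.api.core.client.ServerLocator;

                      public class PerProducerSessions {

                          public static void wireProducers(ServerLocator locator) throws Exception {
                              // Share the connection-level object...
                              ClientSessionFactory factory = locator.createSessionFactory();

                              // ...but give every producer its own, non-shared session.
                              ClientSession sessionA = factory.createSession();
                              ClientProducer producerA = sessionA.createProducer("address.a");

                              ClientSession sessionB = factory.createSession();
                              ClientProducer producerB = sessionB.createProducer("address.b");
                          }
                      }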

                      • 8. Re: Wrong routing result
                        werrenmi

                        Thanks Justin for this hint!

                         

                        I will do it that way. Just to be clear: is the best approach to obtain a session for each producer? ... and therefore also for each consumer? I ask because that may result in thousands of sessions in our case.

                         

                        Regards

                        Michel

                        • 9. Re: Wrong routing result
                          werrenmi

                          Clebert... thanks for the information about the OSGi integration. As we have no further issues with HornetQ in OSGi at the moment, I agree with your comment on that issue about the uber jar. When I see other possible improvements, I will let you know.

                          • 10. Re: Wrong routing result
                            jbertram

                            Is the best approach to obtain a session for each producer? ... and therefore also for each consumer?

                            Yes.

                             

                            I ask because that may result in thousands of sessions in our case.

                            I don't see a problem with that at this point.

                            • 11. Re: Wrong routing result
                              werrenmi

                              Hello Justin

                               

                              As far as I can tell so far, the refactoring has successfully fixed the entire issue.

                               

                              Thanks a lot!

                               

                              Regards

                              Michel