6 Replies Latest reply on Feb 6, 2008 6:07 AM by beve

    Messaging Cluster issue

    beve

      Hi,

      we are using JBM 1.4.0.SP3 configured in a cluster. We have a four node cluster and use custom correlation ids to correlate messages.

      Our messaging clients post a message to a queue and wait a specified amount of time for a message to appear on a response queue with the correlation id they expect.

      The problem we are experiencing is that when several concurrent calls are made, we are sometimes unable to retrieve the message from the clustered queue. We have verified that the message is in fact there, with the correct correlation id.

      We have tried to simulate this behaviour with the test class below.

      import java.util.Enumeration;
      import java.util.Hashtable;

      import javax.jms.JMSException;
      import javax.jms.Message;
      import javax.jms.MessageProducer;
      import javax.jms.Queue;
      import javax.jms.QueueBrowser;
      import javax.jms.QueueConnection;
      import javax.jms.QueueConnectionFactory;
      import javax.jms.QueueSession;
      import javax.jms.TextMessage;
      import javax.naming.Context;
      import javax.naming.InitialContext;
      import javax.naming.NamingException;

      import org.junit.Ignore;
      import org.junit.Test;

      public class DestinationPeeker
      {
          private static final String QUEUE_NAME = "queue/clusteredQueue";
          private static final String JNDI_SERVER = "hostname:1100";

          private static final String CORRELATION_ID = "12345";

          // Selector that matches only messages carrying our correlation id.
          private static final String messageSelector = "JMSCorrelationID = '" + CORRELATION_ID + "'";

          // Browses the queue with the selector and prints every matching message.
          @Test
          public void peek() throws NamingException, JMSException
          {
              Context ctx = getContext();
              Queue queue = (Queue) ctx.lookup( QUEUE_NAME );
              // Note: this looks up "ConnectionFactory", while putMessageOnQueue() uses "/ClusteredConnectionFactory".
              QueueConnectionFactory factory = (QueueConnectionFactory) ctx.lookup( "ConnectionFactory" );
              QueueConnection cnn = factory.createQueueConnection();
              QueueSession session = cnn.createQueueSession( false, QueueSession.AUTO_ACKNOWLEDGE );

              QueueBrowser browser = session.createBrowser( queue, messageSelector );

              Enumeration<?> enumeration = browser.getEnumeration();
              while ( enumeration.hasMoreElements() ) {
                  Message jmsMsg = (Message) enumeration.nextElement();
                  System.out.print( "JMSMessageID : " + jmsMsg.getJMSMessageID() );
                  System.out.print( ", JMSCorrelationID : " + jmsMsg.getJMSCorrelationID() );
                  System.out.print( ", JMSExpiration : " + jmsMsg.getJMSExpiration() );
                  System.out.println();
              }
              browser.close();
              session.close();
              cnn.close();
              ctx.close();
          }

          // Puts one message with the known correlation id on the queue. Run once, then ignore.
          @Test
          @Ignore
          public void putMessageOnQueue() throws NamingException, JMSException
          {
              Context ctx = getContext();
              Queue queue = (Queue) ctx.lookup( QUEUE_NAME );
              QueueConnectionFactory factory = (QueueConnectionFactory) ctx.lookup( "/ClusteredConnectionFactory" );
              QueueConnection cnn = factory.createQueueConnection();
              QueueSession session = cnn.createQueueSession( false, QueueSession.AUTO_ACKNOWLEDGE );
              MessageProducer producer = session.createProducer( queue );
              TextMessage msg = session.createTextMessage();
              msg.setJMSCorrelationID( CORRELATION_ID );
              producer.send( msg );
              producer.close();
              session.close();
              cnn.close();
              ctx.close();
          }

          // Builds a JNDI context that points at the JNDI server of the cluster.
          private Context getContext() throws NamingException
          {
              Hashtable<String, String> env = new Hashtable<String, String>();
              env.put( Context.INITIAL_CONTEXT_FACTORY, "org.jnp.interfaces.NamingContextFactory" );
              env.put( Context.URL_PKG_PREFIXES, "org.jboss.naming" );
              env.put( Context.PROVIDER_URL, JNDI_SERVER );
              return new InitialContext( env );
          }
      }
      

      Note that we are using a QueueBrowser to peek at the queue. I've also tried this by consuming from the queue and seen the same behaviour.
      When I run the above (having first run it once executing only putMessageOnQueue()), I sometimes get a message back and sometimes don't. It's not deterministic.

      Is this a valid way to verify the functionality of clustering with message correlation ids?

      Has anyone seen this sort of behaviour before?

      Any comments or suggestions are welcome.

      Thanks,

      /Daniel

        • 1. Re: Messaging Cluster issue
          timfox

          Can you explain your topology in more detail - i.e., where are the clients that put messages on the queue and where are the clients that remove messages from the queue? (It's important to know what node they're on).

          Also can you post your message consumer code? Thanks

          • 2. Re: Messaging Cluster issue
            beve

            Hi Tim,

            thanks for your quick response!

            The clients that put messages on the queue are Web Services that exist on two nodes in our messaging cluster.
            Their responsibility is to send the SOAP message to a queue that our ESB servers listen to.
            The ESB service performs its actions, and one of these is to send a response message to a response queue.

            It's a little difficult for me to post the actual code, but the "test" class in my previous post can simulate the behaviour. This can be done with a two-node messaging cluster.

            Are there any tests in the messaging project that I could run against our configuration to verify that we have not configured something incorrectly? The system has been running in production for several months without any warnings or errors. We upgraded to 1.4.0.SP3 right before Christmas.

            Thanks,

            Daniel

            • 3. Re: Messaging Cluster issue
              timfox

              So you have a clustered response queue, and, say two consumers on it on different nodes....

              A response message gets posted to the queue. Clearly the response message is destined for a specific consumer, but if you have two consumers on the queue, you can't be sure it gets to the "right" consumer (how would JBM know what is the "right" consumer?).

              Clustering will make sure it gets to one of the consumers, but not necessarily the one you expect. Am I missing something here, or misunderstanding what you are trying to achieve?

              • 4. Re: Messaging Cluster issue
                beve

                Yep, that is correct. The response queue is clustered and we have two consumers listening to that queue.

                I'm sorry, but I forgot to mention that these consumers are using a message selector (like the one in the example code in my first post). They use the correlation id to make sure that they only take response messages that correlate to the message they have sent.

                I might have misunderstood this, but I thought that if I publish a message to a clustered queue and then use a message selector to receive messages from the queue, I would get back the message regardless of where the message physically exists in the cluster.

                Does this make sense?

                Regards,

                Daniel

                • 5. Re: Messaging Cluster issue
                  timfox

                  In most cases, allowing selectors on a JMS queue is an anti-pattern since it can cause the queue to be scanned frequently - i.e. give poor performance.

                  Also JMS selectors only work on the *local* queue - i.e. each clustered queue is made up of n local partial queues - one on each node. If your consumer has a selector then that does not determine whether or not messages are pulled to or from that node.

                  This can result in messages being pulled from one node to another, where they won't be consumed because the selector doesn't match.
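
The effect described above can be reproduced with a toy model in plain Java. This is only a sketch under stated assumptions: `PartialQueueModel`, `send`, and `tryReceive` are hypothetical names invented for illustration, no JBM classes are involved, and message redistribution between nodes is deliberately left out. Each node holds its own partial queue, and a consumer's selector is evaluated only against the partial queue of the node it is connected to.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy model (not JBM code): a "clustered queue" built from one partial queue per node. */
final class PartialQueueModel {

    // One local partial queue per node; a message is represented by its correlation id only.
    private final List<Deque<String>> partialQueues = new ArrayList<>();

    PartialQueueModel(int nodes) {
        for (int i = 0; i < nodes; i++) {
            partialQueues.add(new ArrayDeque<>());
        }
    }

    /** A producer connected to the given node puts the message on that node's partial queue. */
    void send(int node, String correlationId) {
        partialQueues.get(node).addLast(correlationId);
    }

    /**
     * A consumer connected to the given node applies its selector to that node's
     * partial queue only. Returns true if a matching message was found and removed.
     */
    boolean tryReceive(int node, String correlationId) {
        Iterator<String> it = partialQueues.get(node).iterator();
        while (it.hasNext()) {
            if (it.next().equals(correlationId)) {
                it.remove();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PartialQueueModel cluster = new PartialQueueModel(2);
        cluster.send(1, "12345"); // the response lands on node 1
        System.out.println("node 0: " + cluster.tryReceive(0, "12345"));
        System.out.println("node 1: " + cluster.tryReceive(1, "12345"));
    }
}
```

Running `main` prints `node 0: false` and then `node 1: true`: the message is in the cluster the whole time, but the consumer attached to node 0 never sees it.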

                  Making message redistribution cluster aware would be extremely difficult. Think about it. Imagine messages are pulled to one node based on the selectors on that node, then the consumer changes on that node, and another one starts on another node that matches. We would have to maintain a global view of what selectors were on each node, and messages would be shifted en masse back and forth every time a selector changed!

                  If you want to do clustered request-response, then you could either
                  a) Use a *topic* with selectors. (In general, if you ever see yourself using selectors with queues, it's always a good idea to see if you can refactor to use topics.)
                  b) Use a temporary request/response queue - in this case you don't need selectors since the response queue is only used by you.
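
Option b) can be sketched without a broker. The block below is a model only and assumes nothing about JBM internals: `ReplyQueueSketch`, `Request`, `startService`, and `call` are hypothetical names, and `java.util.concurrent` queues stand in for JMS destinations. In real JMS code the private reply queue would come from `Session.createTemporaryQueue()`, the requester would attach it with `Message.setJMSReplyTo(...)`, and the service would send its answer to the destination returned by `Message.getJMSReplyTo()`. Because each requester owns its reply queue, no selector is needed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Broker-free model (not JMS code) of request/response over a private reply queue. */
final class ReplyQueueSketch {

    /** A request carries its payload and the private queue the reply should go to. */
    static final class Request {
        final String body;
        final BlockingQueue<String> replyTo;

        Request(String body, BlockingQueue<String> replyTo) {
            this.body = body;
            this.replyTo = replyTo;
        }
    }

    /** Shared request queue that the service listens on. */
    static final BlockingQueue<Request> REQUESTS = new LinkedBlockingQueue<>();

    /** Starts a daemon thread that answers the given number of requests, then stops. */
    static Thread startService(final int requests) {
        Thread t = new Thread(() -> {
            try {
                for (int i = 0; i < requests; i++) {
                    Request req = REQUESTS.take();
                    // Reply on the queue named in the request, not on a shared queue.
                    req.replyTo.put("reply to " + req.body);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Creates a private reply queue, sends the request, and waits for the reply.
     * Returns null if no reply arrives within timeoutMillis or the caller is interrupted.
     */
    static String call(String body, long timeoutMillis) {
        BlockingQueue<String> replyQueue = new LinkedBlockingQueue<>();
        try {
            REQUESTS.put(new Request(body, replyQueue));
            return replyQueue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        startService(1);
        System.out.println(call("ping", 1000));
    }
}
```

Running `main` prints `reply to ping`. The design point is that the reply address travels with the request, so it does not matter which node the response is produced on.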

                  • 6. Re: Messaging Cluster issue
                    beve

                    Hi Tim,

                    thanks for the detailed explanation on this, it is much appreciated!

                    I'll refactor our code to use temporary queues instead. Is there any performance loss compared to using topics with selectors?

                    Regards,

                    Daniel