5 Replies Latest reply on Nov 8, 2011 12:16 AM by clebert.suconic

    Selector + 3 node cluster causing lost messages

    dskiles

      I am encountering the following issue with HornetQ JMS 2.2.5.GA on JBoss 5.1. 

      In a three node cluster, some messages are lost when different producers set different values for a JMS text property on their messages and the consumers filter on that property with a selector.


      I currently have a three node JBoss cluster configured.  JBoss is using the tcp JGroups networking stack.  Discovery is handled using TCPPING.

      On this JBoss cluster, I have HornetQ running as a JBoss service.  HornetQ is using TCP networking with static connectors between all three nodes.  We are not using multicast discovery at all. 

      Each instance of HornetQ is running a non-persistent queue.

      We have a copy of our internally-developed application running on each JBoss node as well.

      Each instance of the application contains a producer (using Spring Framework's JmsTemplate) that adds items to the queue.  Before it adds them, the producer sets a text property named LEGAL_NODES.  LEGAL_NODES is a comma-separated list of instances of our application that are allowed to process the message.

      Each instance of the application also uses the Spring Framework to create a JMS consumer that listens for items in the HornetQ queue.  Each listener bean has a selector defined that restricts which JMS messages it can process.  The selector is defined as follows, where ${cluster_identifier} is a system property that identifies each node.

      LEGAL_NODES LIKE '%${cluster_identifier}%' OR LEGAL_NODES = ''
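
      For reference, the matching this selector is meant to perform can be approximated in plain Java. This is only a sketch of the selector semantics, not HornetQ's implementation; the class name and the node names (node1, node2, ...) are hypothetical, and it assumes the cluster identifier contains no LIKE wildcard characters ('%' or '_').

```java
public class SelectorSketch {

    /**
     * Approximates: LEGAL_NODES LIKE '%<clusterId>%' OR LEGAL_NODES = ''
     * Assumption: clusterId contains no LIKE wildcards ('%' or '_').
     */
    public static boolean matches(String legalNodes, String clusterId) {
        if (legalNodes == null) {
            // Property not set at all: both comparisons evaluate to "unknown"
            // under JMS selector rules, so the message is not selected.
            return false;
        }
        // LIKE '%x%' behaves as a substring test; = '' matches the blank value.
        return legalNodes.contains(clusterId) || legalNodes.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(matches("node1,node2", "node2")); // true
        System.out.println(matches("", "node3"));            // true
        System.out.println(matches("node1", "node3"));       // false
        System.out.println(matches("node10", "node1"));      // true (substring match)
        System.out.println(matches(null, "node1"));          // false
    }
}
```

      Two things stand out in the sketch: a message with no LEGAL_NODES property at all is selected by nobody, and because the test is a substring match, an identifier that is a prefix of another identifier will match both.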


      When we have a cluster composed of two nodes, this configuration works in all scenarios that we have tested.  When using a cluster of three nodes, this configuration will break in one scenario that we have found.

      When we are running multiple producers, and the producers have different, non-blank values for LEGAL_NODES, then some messages are not processed by the consumers.

      • If we use a three node cluster with a single producer, all messages are processed on the correct cluster node.
      • If we use a three node cluster with multiple producers, but LEGAL_NODES is blank on all producers, all messages are processed.
      • If we use a three node cluster with multiple producers where LEGAL_NODES is set to the same value for all producers, all messages are processed on the correct cluster node.
      • If we use a three node cluster with multiple producers, but LEGAL_NODES is set on one and blank on another, all messages are processed on the correct cluster nodes.
      • If we use a three node cluster with multiple producers where LEGAL_NODES is set to one value for one producer and another value for a second producer, a varying number of messages are not processed.
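
      One way to quantify the last scenario is to record each message ID on send and on receipt, then count what never arrived per LEGAL_NODES value. The helper below is a minimal in-memory sketch of that bookkeeping; the class and method names are hypothetical, it is not thread-safe, and in a real cluster the records would have to be gathered from the logs of all three nodes.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class DeliveryAudit {
    // message ID -> LEGAL_NODES value set by the producer
    private final Map<String, String> sent = new LinkedHashMap<>();
    // message ID -> node whose consumer processed the message
    private final Map<String, String> received = new HashMap<>();

    public void recordSent(String messageId, String legalNodes) {
        sent.put(messageId, legalNodes);
    }

    public void recordReceived(String messageId, String node) {
        received.put(messageId, node);
    }

    /** Counts messages that were sent but never received, grouped by LEGAL_NODES value. */
    public Map<String, Integer> missingByLegalNodes() {
        Map<String, Integer> missing = new TreeMap<>();
        for (Map.Entry<String, String> e : sent.entrySet()) {
            if (!received.containsKey(e.getKey())) {
                Integer count = missing.get(e.getValue());
                missing.put(e.getValue(), count == null ? 1 : count + 1);
            }
        }
        return missing;
    }
}
```

      If the missing messages cluster under one LEGAL_NODES value, or under messages sent from one particular node, that narrows down where in the cluster they stop moving.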

       

       

      Any ideas on what's going on here?  I'm pretty well stumped.  Is there any information in terms of logging or configuration that would make it easier to get to the bottom of it?