8 Replies Latest reply on Oct 1, 2007 4:05 PM by fuzzybinary

    Clustering appears okay, but no clustered messages

    fuzzybinary

      I'm running a pretty standard JBoss AS server (4.2.1.GA) with JBoss Messaging 1.4.0.CR3 installed as per the documentation, basing the server on the "all" profile supplied with JBoss 4.2.1.GA, working against a MySQL server on a remote machine. I've tested the straight queue installation and I'm working on the clustered installation, but I can't get the distributed-queue test to pass.

      It looks like clustering is working from the following log items (servers running on 10.214.162.204/206):
      17:31:46,715 INFO [TreeCache] viewAccepted(): [10.214.162.204:1044|1] [10.214.162.204:1044, 10.214.162.206:1077]
      17:31:46,815 INFO [TreeCache] TreeCache local address is 10.214.162.206:1077
      17:31:46,979 INFO [TreeCache] received the state (size=1024 bytes)
      17:31:47,110 INFO [TreeCache] state was retrieved successfully (in 290 milliseconds)
      17:31:47,110 INFO [TreeCache] parseConfig(): PojoCacheConfig is empty
      17:31:47,534 WARN [FD_SOCK] I was suspected by 10.214.162.204:1044; ignoring the SUSPECT message
      17:31:50,040 INFO [ServiceEndpointManager] jbossws-1.2.1.GA (build=200704151756)
      17:31:52,392 INFO [SnmpAgentService] SNMP agent going active
      17:31:53,017 INFO [DefaultPartition] Initializing
      17:31:53,073 INFO [STDOUT]
      -------------------------------------------------------
      GMS: address is 10.214.162.206:1080
      -------------------------------------------------------
      17:31:55,489 INFO [DefaultPartition] Number of cluster members: 2
      17:31:55,490 INFO [DefaultPartition] Other members: 1
      17:31:55,495 INFO [DefaultPartition] Fetching state (will wait for 30000 milliseconds):
      17:31:55,543 WARN [FD_SOCK] I was suspected by 10.214.162.204:1047; ignoring the SUSPECT message
      17:31:55,755 INFO [DefaultPartition] state was retrieved successfully (in 256 milliseconds)

      But when I go to run the distributed-queue example I get:
      [java] Distributed queue /queue/testDistributedQueue exists
      [java] java.lang.RuntimeException: Assertion failed, 1 == 1
      [java] at org.jboss.example.jms.common.ExampleSupport.assertNotEquals(ExampleSupport.java:85)
      [java] at org.jboss.example.jms.distributedqueue.DistributedQueueExample.example(DistributedQueueExample.java:83)
      [java] at org.jboss.example.jms.common.ExampleSupport.run(ExampleSupport.java:147)
      [java] at org.jboss.example.jms.distributedqueue.DistributedQueueExample.main(DistributedQueueExample.java:167)

      [java] #####################
      [java] ### FAILURE! ###
      [java] #####################

      I've tested JGroups against the properties and it seems to be working fine.

      Is there anywhere I can look on the jmx console to check the status of the cluster? Am I missing anything in the installation here?

        • 1. Re: Clustering appears okay, but no clustered messages
          clebert.suconic

          Did you create the two instances as required by the test?


          You need to configure two instances and use the BindingManager to assign them separate ports, as described in the documentation.


          You can already try that with 1.4.0.GA. I just did it from scratch to validate the release, and it worked.

          • 2. Re: Clustering appears okay, but no clustered messages
            fuzzybinary

            Does that mean I have to set up two instances on the same machine at the same IP? If so, does one of the examples allow me to test a distributed queue on two separate machines?

            (Sorry, not at work and not close to the documentation at the moment. Will double check everything when I go in later today.)

            • 3. Re: Clustering appears okay, but no clustered messages
              timfox

               

              "fuzzybinary" wrote:
              Does that mean I have to set up two instances on the same machine at the same IP? If so, does one of the examples allow me to test a distributed queue on two separate machines?

              (Sorry, not at work and not close to the documentation at the moment. Will double check everything when I go in later today.)


              There is a file: jndi.properties in the etc directory of the relevant example. This gives the address of the server used to do the first lookup. As long as this points to one of the servers in your cluster you should be ok.

              You should first validate that the example works as specified in the user guide, though.
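As a rough illustration of the jndi.properties point above, the sketch below reads such a file and reports which server the first lookup would go to. The class name `JndiProviderCheck` and the file path argument are hypothetical; `java.naming.provider.url` is the standard JNDI provider URL key, and whether the example's file sets it is an assumption to verify against your copy.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

public class JndiProviderCheck {
    /** Returns the value of java.naming.provider.url, or null if the key is absent. */
    public static String providerUrl(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        return props.getProperty("java.naming.provider.url");
    }

    /** Crude textual check: true if the URL names localhost or 127.0.0.1. */
    public static boolean pointsAtLocalhost(String url) {
        return url != null && (url.contains("localhost") || url.contains("127.0.0.1"));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real location of the example's file.
        String path = args.length > 0 ? args[0] : "etc/jndi.properties";
        try (Reader in = new FileReader(path)) {
            String url = providerUrl(in);
            System.out.println("provider url: " + url);
            if (pointsAtLocalhost(url)) {
                System.out.println("lookup goes to the local machine, not a remote cluster node");
            }
        }
    }
}
```

If the reported URL names the local machine while the cluster nodes run elsewhere, the example is not looking up the queue on the cluster at all.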

              • 4. Re: Clustering appears okay, but no clustered messages
                fuzzybinary

                The example does indeed work out of the box when I run it with two instances on the same machine (as described in the user guide). But moving to two different machines is still broken, even after changing to an IP address bind.

                I noticed when starting up two instances on the same machine I didn't get any SUSPECT messages. Is there a reason I would be getting those in a two machine situation? Would it affect the outcome of the test?

                • 5. Re: Clustering appears okay, but no clustered messages
                  clebert.suconic

                  UDP routing between the two machines? (Network config?)

                  • 6. Re: Clustering appears okay, but no clustered messages
                    fuzzybinary

                    Probably not a UDP problem since I've checked that JGroups is actually working through the examples provided on the wiki, but I think it may be a network config problem somewhere.

                    I changed log4j to TRACE for the examples and the connection to the IP is being refused. I have discovered that I should be using port 1100 (because I'm using HAJNDI), but even with that change, the connection is still refused. I have to figure out exactly why that's happening, since netstat shows that 1100 is actually listening for connections.

                    • 7. Re: Clustering appears okay, but no clustered messages
                      fuzzybinary

                      Looking at it, 1100 seems to be listening only for connections coming from localhost (the binding is localhost:1100, not *:1100). Is this correct? Should I be trying to connect to a different port?
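A quick way to confirm a localhost-only binding like this is to attempt a plain TCP connection from the other machine. The following is a minimal sketch using only the JDK; the class name `PortProbe`, the defaults, and the 2-second timeout are assumptions for illustration, while port 1100 is the HAJNDI port mentioned above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "connection refused" as well as timeouts and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java PortProbe <host> <port>
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1100;
        boolean ok = canConnect(host, port, 2000);
        System.out.println(host + ":" + port + (ok ? " reachable" : " NOT reachable"));
    }
}
```

If the probe succeeds when run on the server against localhost but fails from the second machine against the server's IP address, that is consistent with the listener being bound to the loopback interface only (a firewall between the machines would look the same, so this narrows the problem rather than proving it).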

                      • 8. Re: Clustering appears okay, but no clustered messages
                        fuzzybinary

                        run.sh -b IPaddress works, and I've narrowed this down to a clustering problem, not a messaging problem. Thanks for all your help guys!