7 Replies Latest reply on Dec 4, 2009 5:35 AM by timfox

    Tests status

    timfox

      Looks like most things are passing now.

      The only intermittent failures I can see are in NettyMultiThreadRandomReattachTest

        • 1. Re: Tests status
          clebert.suconic

          NettySymmetricClusterWithDiscoveryTest::testStartStop still fails:

          Error Message

          Timed out waiting for bindings (bindingCount = 22, totConsumers = 22)

          Stacktrace

          java.lang.IllegalStateException: Timed out waiting for bindings (bindingCount = 22, totConsumers = 22)
          at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.waitForBindings(ClusterTestBase.java:290)
          at org.hornetq.tests.integration.cluster.distribution.SymmetricClusterTest.testStartStopServers(SymmetricClusterTest.java:1498)



          • 2. Re: Tests status
            timfox

            This is because of the bindings reset I added to clear remote bindings on bridge failure.

            I have commented it out for now to make the test pass; someone needs to look at it tomorrow.

            • 3. Re: Tests status
              clebert.suconic

              Just for the record, it still failed after that change:

              Error Message
              
              Timed out waiting for bindings (bindingCount = 17, totConsumers = 17)
              
              Stacktrace
              
              java.lang.IllegalStateException: Timed out waiting for bindings (bindingCount = 17, totConsumers = 17)
               at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.waitForBindings(ClusterTestBase.java:290)
               at org.hornetq.tests.integration.cluster.distribution.SymmetricClusterTest.testStartStopServers(SymmetricClusterTest.java:1498)
              


              • 4. Re: Tests status
                clebert.suconic

                Another update on the tests:

                Out of Order over failover:

                So far, the "easiest" way for me to replicate the out of order issue during failover was using RandomReattachStressTest::testF

                I've changed doTestF to use 500 messages and 50 messages. But even so, the out-of-order delivery happened after 303 iterations.

                I've added some logging to the method to differentiate between message loss and out-of-order delivery. The test didn't miss any messages (at least in the runs where I could reproduce it), so it is receiving them out of order.
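                For illustration, here is a minimal sketch of how such a check could tell message loss apart from out-of-order delivery. It is not the code in RandomReattachStressTest; the DeliveryCheck class, its Result enum and the assumption that each message carries a sequence number from 0 to expected - 1 are all hypothetical.

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of the HornetQ test suite.
final class DeliveryCheck {
    enum Result { IN_ORDER, OUT_OF_ORDER, MESSAGE_LOSS }

    // 'received' holds the (assumed) sequence number of each message in arrival order;
    // the producer is assumed to have sent the numbers 0 .. expected - 1 exactly once.
    static Result classify(List<Integer> received, int expected) {
        BitSet seen = new BitSet(expected);
        boolean ordered = true;
        int previous = -1;
        for (int seq : received) {
            if (seq < 0 || seq >= expected) {
                throw new IllegalArgumentException("unexpected sequence number " + seq);
            }
            // A number that does not increase means reordering (or a duplicate).
            if (seq <= previous) {
                ordered = false;
            }
            previous = seq;
            seen.set(seq);
        }
        // Loss is reported first: at least one sequence number never arrived.
        if (seen.cardinality() < expected) {
            return Result.MESSAGE_LOSS;
        }
        return ordered ? Result.IN_ORDER : Result.OUT_OF_ORDER;
    }
}
```

                With a check like this, a run that returns OUT_OF_ORDER has received every message but in the wrong sequence, which matches what the logging showed here.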

                I have also seen those tests timing out on sendPacket in some of the runs. I'm not sure whether that is the same issue.



                MultiThreadConsumerStressTest:

                I will need to take a look at a possible race between transaction callbacks and compacting. I tried to spot it today and have a good idea of what it could be, so I will be able to fix it tomorrow.

                • 5. Re: Tests status
                  clebert.suconic

                  Another failure:

                  GroupingFailoverSharedServerTest.testGroupingLocalHandlerFailsMultipleGroups

                  Error Message
                  
                  queue queue0ab8fd17b-e078-11de-a9b9-fd395404ae6f has been removed cannot deliver message, queues should not be removed when grouping is used
                  
                  Stacktrace
                  
                  HornetQException[errorCode=100 message=queue queue0ab8fd17b-e078-11de-a9b9-fd395404ae6f has been removed cannot deliver message, queues should not be removed when grouping is used]
                   at org.hornetq.core.remoting.impl.ChannelImpl.sendBlocking(ChannelImpl.java:268)
                   at org.hornetq.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:259)
                   at org.hornetq.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:134)
                   at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.sendInRange(ClusterTestBase.java:505)
                   at org.hornetq.tests.integration.cluster.distribution.ClusterTestBase.sendWithProperty(ClusterTestBase.java:475)
                   at org.hornetq.tests.integration.cluster.failover.GroupingFailoverTestBase.testGroupingLocalHandlerFailsMultipleGroups(GroupingFailoverTestBase.java:236)
                  


                  • 6. Re: Tests status
                    timfox

                     

                    "clebert.suconic@jboss.com" wrote:

                    I will need to take a look at a possible race between transaction callbacks and compacting. I tried to spot it today and have a good idea of what it could be, so I will be able to fix it tomorrow.


                    That's not very helpful.

                    If you have an idea and know how to fix it please say what it is.

                    This allows someone else to look at it in the morning before you arrive, rather than sitting around for hours waiting for you or, worse, duplicating effort by coming up with the same fix.

                    This is especially important considering we have a release to do today.

                    • 7. Re: Tests status
                      timfox

                      This reminds me of Fermat's famous last "proof", when he wrote:

                      "I've found a remarkable proof of this fact, but there is not enough space in the margin [of the book] to write it."

                      He never did write it down, and it took mathematicians about 350 years before they finally came up with the proof ;)

                      http://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem