0 Replies Latest reply on Jan 12, 2009 12:19 PM by gaohoward

DistributedTopicTest failures

gaohoward Jan 12, 2009 12:19 PM

Hi, I sometimes get test failures when running the DistributedTopicTest class, specifically in testClusteredTopicSharedDurableNoLocalSubPersistent and clusteredTopicSharedDurableNoLocalSub. The tests do the following:

1. create three connections (con1, con2 and con3) on nodes 0, 1 and 2 respectively.
2. create three sessions (sess1, sess2 and sess3) on con1, con2 and con3 respectively.
3. create two durable subscribers (cons1 and cons2) on the topic at nodes 1 and 2 respectively.
4. create a producer (prod) on the topic at node 0.
5. after starting the connections, prod sends 100 messages to the topic at node 0.
6. as there are no receivers at node 0, all the messages will be received by cons1 and cons2 on the other two nodes.

The tests assume that the messages will be sucked over to the two nodes in a round-robin way, which is not always the case. See ServerSessionEndpoint.promptDelivery(Channel channel):

this.executor.execute(new Runnable() { public void run() { channel.deliver();} } );

The deliver() calls on node 1 and node 2 are triggered when con2 and con3 are started. Each deliver() causes the sucker to begin sucking messages from node 0. If the suckers on node 1 (sucker-node-1) and node 2 (sucker-node-2) are both available when node 0 starts to deliver messages, then the messages are distributed to sucker-node-1 and sucker-node-2 in round-robin fashion. However, because deliver() is executed asynchronously, there is no guarantee that both are ready when node 0 delivers the first message. It is possible that the first several messages are delivered to only one of the two nodes while the other is still unavailable (depending on thread scheduling).
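
To make the effect concrete, here is a small deterministic simulation of the scenario above. It is only a sketch and does not use any JBoss Messaging code: the class RoundRobinRace, the method distribute and the parameter lateBy are hypothetical names invented for illustration, and the late start of the second sucker is modelled as a fixed number of messages rather than real thread scheduling.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinRace {
    /**
     * Distributes `total` messages round robin across the consumers that are
     * ready at the time each message is delivered.
     * Consumer 0 is ready from the start; consumer 1 (hypothetical stand-in for
     * the slower sucker) only becomes ready after `lateBy` messages have gone out.
     * Returns how many messages each consumer received.
     */
    static int[] distribute(int total, int lateBy) {
        int[] received = new int[2];
        List<Integer> ready = new ArrayList<Integer>();
        ready.add(0);
        int next = 0; // round-robin position
        for (int msg = 0; msg < total; msg++) {
            if (msg == lateBy) {
                ready.add(1); // the second consumer joins only now
            }
            int target = ready.get(next % ready.size());
            received[target]++;
            next++;
        }
        return received;
    }

    public static void main(String[] args) {
        int[] even = distribute(100, 0);    // both ready before the first message
        int[] skewed = distribute(100, 10); // second consumer misses the first 10
        System.out.println(even[0] + "/" + even[1]);     // prints 50/50
        System.out.println(skewed[0] + "/" + skewed[1]); // prints 55/45
    }
}
```

A test that asserts an exact 50/50 split passes in the first case and fails in the second, even though no message is lost in either.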

If we insert a delay before sending the messages, both nodes will have more time to become ready and the test will be more stable.
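
In the test itself, the most direct form of this would be a Thread.sleep before the send loop. As a rough sketch of a variation on the same idea, the code below waits for an explicit "ready" signal with a timeout instead of a fixed sleep. This is an assumption for illustration only: WaitBeforeSend and awaitConsumersReady are hypothetical names, the runnables merely stand in for the asynchronous channel.deliver() calls, and whether JBoss Messaging exposes a comparable readiness hook is not something the test shows.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WaitBeforeSend {
    /**
     * Starts `consumers` asynchronous tasks (stand-ins for the deliver() calls)
     * and waits up to timeoutMs for all of them to report ready.
     * Returns true only if every consumer became ready within the timeout.
     */
    static boolean awaitConsumersReady(int consumers, long timeoutMs) {
        ExecutorService executor = Executors.newFixedThreadPool(consumers);
        final CountDownLatch ready = new CountDownLatch(consumers);
        try {
            for (int i = 0; i < consumers; i++) {
                executor.execute(new Runnable() {
                    public void run() {
                        // Hypothetical: the real sucker would attach to node 0 here.
                        ready.countDown();
                    }
                });
            }
            return ready.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        if (awaitConsumersReady(2, 5000)) {
            System.out.println("both consumers ready, safe to send");
        }
    }
}
```

Compared with a fixed sleep, a bounded wait returns as soon as both sides are ready and fails visibly if they never are, but it only helps if the code under test can actually report readiness.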