-
1. Re: Duplicated Messages during failover and NullPersistence
clebert.suconic Mar 19, 2009 9:19 PM (in response to clebert.suconic)
Just to complete the thread: the failure I was talking about is an intermittent failure that has happened on Hudson.
It doesn't fail when using real files (*probably* duplicate detection behaves differently when using real files).
This is the diff to replicate the issue.

Index: tests/src/org/jboss/messaging/tests/integration/cluster/failover/AutomaticFailoverWithDiscoveryTest.java
===================================================================
--- tests/src/org/jboss/messaging/tests/integration/cluster/failover/AutomaticFailoverWithDiscoveryTest.java (revision 6120)
+++ tests/src/org/jboss/messaging/tests/integration/cluster/failover/AutomaticFailoverWithDiscoveryTest.java (working copy)
@@ -63,6 +63,19 @@
    // Constructors --------------------------------------------------
 
    // Public --------------------------------------------------------
+
+   public void testRepeat() throws Exception
+   {
+      for (int i = 0; i < 100; i++)
+      {
+         if (i > 0)
+         {
+            tearDown();
+            setUp();
+         }
+         testFailover();
+      }
+   }
 
    public void testFailover() throws Exception
    {
@@ -173,7 +186,7 @@
    protected void setUp() throws Exception
    {
       super.setUp();
-      setupGroupServers(true, "bc1", 5432, groupAddress, groupPort);
+      setupGroupServers(false, "bc1", 5432, groupAddress, groupPort);
    }
 
    @Override
-
3. Re: Duplicated Messages during failover and NullPersistence
clebert.suconic Mar 20, 2009 11:36 AM (in response to clebert.suconic)
Actually, this doesn't have anything to do with Persistence vs. NullPersistence.
If I wait 17 ms (as done in MultiThreadfailoverTest), the test never fails.
-
4. Re: Duplicated Messages during failover and NullPersistence
clebert.suconic Mar 20, 2009 11:40 AM (in response to clebert.suconic)
I mean...
If I wait 17 ms between the backup and live start, this issue never happens.
-
5. Re: Duplicated Messages during failover and NullPersistence
clebert.suconic Mar 20, 2009 7:40 PM (in response to clebert.suconic)
The issue was definitely related to the time component of the IDs.
I added a test PreserveOrderDuringFailoverTest, which is based on AutomaticFailoverWithDiscoveryTest.
If you uncomment some code in PreserveOrderDuringFailoverTest, this issue will always happen:

// This test would fail if both servers have the same time component
// NullStorageManager storageManagerLive = (NullStorageManager)liveService.getServer().getStorageManager();
// TimeAndCounterIDGenerator idgeneratorlive = (TimeAndCounterIDGenerator)storageManagerLive.getIDGenerator();
//
// NullStorageManager storageManagerBackup = (NullStorageManager)backupService.getServer().getStorageManager();
// TimeAndCounterIDGenerator idgeneratorBackup = (TimeAndCounterIDGenerator)storageManagerBackup.getIDGenerator();
//
// idgeneratorBackup.setInternalDate(0);
// idgeneratorlive.setInternalDate(0);
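
For anyone following along, here is a rough sketch of why a shared time component could matter. It assumes an ID is built from a fixed time component in the high bits plus a counter in the low bits. The class name, the 20-bit counter width and the countCollisions helper are all made up for illustration; this is not the real TimeAndCounterIDGenerator, whose layout may differ.

```java
import java.util.HashSet;
import java.util.Set;

public class TimeCounterIdSketch {

    /** Hypothetical generator: high bits hold a fixed time component, low bits a counter. */
    static final class Generator {
        // Assumed counter width, chosen only for this illustration.
        private static final int COUNTER_BITS = 20;
        private static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

        private final long timeComponent;
        private long counter;

        Generator(long timeMillis) {
            // The time component is fixed once, when the generator is created.
            this.timeComponent = timeMillis << COUNTER_BITS;
        }

        long nextId() {
            return timeComponent | (counter++ & COUNTER_MASK);
        }
    }

    /** Counts how many of the backup's first n IDs were also handed out by the live generator. */
    public static int countCollisions(long liveTime, long backupTime, int n) {
        Generator live = new Generator(liveTime);
        Generator backup = new Generator(backupTime);

        Set<Long> liveIds = new HashSet<>();
        for (int i = 0; i < n; i++) {
            liveIds.add(live.nextId());
        }

        int collisions = 0;
        for (int i = 0; i < n; i++) {
            if (liveIds.contains(backup.nextId())) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Both generators created with the same time component (like setInternalDate(0) on both).
        System.out.println("same time component: " + countCollisions(0L, 0L, 100));
        // Generators created 17 ms apart.
        System.out.println("17 ms apart: " + countCollisions(0L, 17L, 100));
    }
}
```

Under these assumptions, two generators that start with the same time component hand out exactly the same sequence of IDs, while generators started a few milliseconds apart never overlap. That would be consistent with the wait hiding the problem, though the sketch says nothing about how the real servers use those IDs during failover.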
I would expect the IDs not to affect failover any more, so I would normally debug this, but since this part has already been changed in Tim's workspace, I will leave it alone.
Tim: could you please remove the wait in FailoverTestBase as part of your commit? Since you have changed the ID logic, we won't need the wait any more.

       backupService.start();
-      Thread.sleep(20);
       Configuration liveConf = new ConfigurationImpl();