4 Replies Latest reply on Jan 27, 2011 11:52 PM by jalandip

    java.lang.IllegalStateException: On loading Journal after restart

    jalandip

      Hi

      We are using JBoss 4.2.3 with HornetQ 2.1.2 in a two-node clustered setup with no backups. When one node is shut down, the live node detects the shutdown. When the remaining live node is then rebooted (a full system reboot; we are on Linux) and JBoss starts again, HornetQ does not initialize. It fails with an IllegalStateException while loading one of the journal files. The journal files are not shared among the cluster members, and each cluster member runs on a separate Linux server. On closer inspection, the exception happens when the journal storage manager is trying to set the scheduled delivery time on one of the messages. None of our message producers use the scheduled delivery time feature, so it is strange to see this exception. The server has multiple topics/queues and MDBs. Currently the only way to recover from this failure is to clean up the data directory.
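
      For reference, the cleanup could be done roughly as in the sketch below. It moves the data directory aside instead of deleting it, so the failing journal files stay available for diagnosis. This is only an illustration: the class name JournalArchiver, the backup naming scheme and the path passed on the command line are hypothetical, not part of HornetQ, and the server is assumed to be stopped before it runs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a HornetQ class: archives the data directory
// instead of deleting it, so the journal files can still be inspected.
public class JournalArchiver {

    /**
     * Renames dataDir to a sibling "<name>.bak-<timestamp>" and recreates
     * an empty dataDir. Run only while the server is stopped.
     */
    public static Path archive(Path dataDir, long timestamp) throws IOException {
        // A sibling path stays on the same filesystem, so this is a plain rename.
        Path backup = dataDir.resolveSibling(dataDir.getFileName() + ".bak-" + timestamp);
        Files.move(dataDir, backup);
        // Give the server a fresh, empty directory to start from.
        Files.createDirectories(dataDir);
        return backup;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java JournalArchiver <data-directory>");
            return;
        }
        // The data directory location is installation-specific (an assumption here).
        Path backup = archive(Paths.get(args[0]), System.currentTimeMillis());
        System.out.println("Archived journal files to " + backup);
    }
}
```

      Any messages still in the old journal are not carried over to the new empty directory, so this only preserves the files for analysis.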

       

      The exception is as follows.

       

      Problem starting service org.hornetq:service=HornetQJMSStarterService

         java.lang.IllegalStateException: Cannot find queue messages 1426

                at org.hornetq.core.persistence.impl.journal.JournalStorageManager.loadMessageJournal(JournalStorageManager.java:932)

                at org.hornetq.core.server.impl.HornetQServerImpl.loadJournals(HornetQServerImpl.java:1220)

                at org.hornetq.core.server.impl.HornetQServerImpl.initialisePart2(HornetQServerImpl.java:1070)

                at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:313)

                at org.hornetq.jms.server.impl.JMSServerManagerImpl.start(JMSServerManagerImpl.java:235)

                at org.hornetq.service.HornetQJMSStarterService.start(HornetQJMSStarterService.java:34.

       

      My questions are:

      1) Why is it failing at that point when we are not using the scheduled delivery time feature? Do the clustered core bridges use it?

      2) How can we recover from such failures?

       

      I can provide the journal files if needed.