-
1. Re: IllegalStateException when loading journal on startup
timfox May 25, 2010 5:46 AM (in response to dfisher)Are you sure you don't have more than one server instances pointing at the same data?
-
2. Re: IllegalStateException when loading journal on startup
dfisher May 25, 2010 12:25 PM (in response to timfox)There is only one java process running on the host.
We do use a separate partition for data, but I don't think that is related.
I've had a hard time reliably replicating this problem.
I think we've seen it once or twice a week over two months of testing.
The stacktrace seems to indicate that the journal loaded delivery count data but found no corresponding messages.
Perhaps this is expected behavior? with:
<persist-delivery-count-before-delivery>false</persist-delivery-count-before-delivery>
-
3. Re: IllegalStateException when loading journal on startup
clebert.suconic May 25, 2010 7:27 PM (in response to dfisher)Did you have any errors on the server before you shut the instance?
If you see it happening again.. it would be wise to keep the data directory. Maybe I could evaluate what happened by looking at the data.
-
4. Re: IllegalStateException when loading journal on startup
clebert.suconic May 25, 2010 7:54 PM (in response to dfisher)Did you delete any queues before shutdown?
Or did you delete the bindings journal while restarting it?
The message is saying it can't find the queue during startup:
if (queueMessages == null){throw new IllegalStateException("Cannot find queue messages " + encoding.queueID);}Maybe you could try replicating the procedure you did right before it failed.
-
5. Re: IllegalStateException when loading journal on startup
dfisher May 25, 2010 10:24 PM (in response to clebert.suconic)Did you delete any queues before shutdown?
No.
Or did you delete the bindings journal while restarting it?
No.
The message is saying it can't find the queue during startup:
if (queueMessages == null){throw new IllegalStateException("Cannot find queue messages " + encoding.queueID);}Specifically, doesn't the source indicate that it cannot find the queue in order to update the delivery count for a message?
Would this be indicative of a message having been ack'd, removed from the journal, and then shutdown before the delivery count could be updated?
Maybe you could try replicating the procedure you did right before it failed.
I'll continue trying to replicate this problem in a simplfied setup.
I'll be sure to watch logs before and after the event.
Regardless of what's causing this problem with the journal, is it reasonable to ignore this type of error on startup?
I certainly wouldn't mind discarding delivery count updates if it meant I could continue processing the journal and get the server started.
-
6. Re: IllegalStateException when loading journal on startup
clebert.suconic May 26, 2010 10:38 AM (in response to dfisher)I will make the whole switch to this (at the end of the post)...
You could try it out. However there should be other records also missing the queue that would be critical missing.
You could try trunk now if you want. It's almost CR1 at this point. (we should retag it this week.. planned for tomorrow ATM)
https://jira.jboss.org/browse/HORNETQ-394
case UPDATE_DELIVERY_COUNT: { long messageID = record.id; DeliveryCountUpdateEncoding encoding = new DeliveryCountUpdateEncoding(); encoding.decode(buff); Map<Long, AddMessageRecord> queueMessages = queueMap.get(encoding.queueID); if (queueMessages == null) { log.warn("Cannot find queue " + encoding.queueID + " to update delivery count"); } else { AddMessageRecord rec = queueMessages.get(messageID); if (rec == null) { throw new IllegalStateException("Cannot find message " + messageID); } rec.deliveryCount = encoding.count; } break; }
-
7. Re: IllegalStateException when loading journal on startup
clebert.suconic May 26, 2010 10:48 AM (in response to clebert.suconic)fyi: I have made another change on the method after I posted. Look at trunk if you want.
-
8. Re: IllegalStateException when loading journal on startup
timfox May 26, 2010 11:00 AM (in response to clebert.suconic)Can you explain what the problem is here first, before jumping in with the solutions?
-
9. Re: IllegalStateException when loading journal on startup
clebert.suconic May 26, 2010 12:24 PM (in response to timfox)We don't know what's the problem is yet... we're still investigating.
So far the only thing I'm doing is making a Critical error into a Warning. As we could just throw warnings on logs for UPDATE_DELIVERY_COUNT.
-
10. Re: IllegalStateException when loading journal on startup
timfox May 26, 2010 12:55 PM (in response to clebert.suconic)I don't get it.
How can you change the error to a warning if you don't know what the problem is? How is this appropriate. Sounds like you'rejust supressing the issue and masking the true problem.
-
11. Re: IllegalStateException when loading journal on startup
clebert.suconic May 26, 2010 1:05 PM (in response to timfox)We must find ACKs and ADD. We should interrupt the loading if that happened.
However on Update-Count... we can't ignore the error.. but we could throw a warning on logs and keep loading the rest of the information.
If the issue is just on update-count.. I don't see a reason to interrupt starting the server.
-
12. Re: IllegalStateException when loading journal on startup
dfisher May 26, 2010 4:30 PM (in response to clebert.suconic)I was able to find a backup of a corrupted journal.
Testing with latest trunk produced the following error for a journal created with 2.1.0Beta3:
Deployment "JMSServerManager" is in error due to the following reason(s): HornetQException[errorCode=6 message=Journal data belong to a different version]
I then applied your patch to Beta3 and confirmed that you are correct.
There are other critical records missing.
I then changed the other cases that throw IllegalStateException to log warnings instead and got the server to load the journal.
What we saw is a lot of logs indicating that delivery count could not be updated with some cases where the ACKNOWLEDGE_REF could not be found either.
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message 118 to update delivery count
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message 119 to update delivery count
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message 120 to update delivery count
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message (ACKNOWLEDGE_REF) 120
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message 120 to update delivery count
WARN [org.hornetq.core.persistence.impl.journal.JournalStorageManager] (main) Cannot find message 107 to update delivery countMy hope was that there was viable records in the journal that could be recovered, this turned out not to be the case.
However, once hornetq was started the journal was usable.
While I think the journal load on startup should be as robust as possible, I'll hold off on submitting a feature request until I can produce a journal that has recoverable records in it.
We'll continue trying to reproduce this....
-
13. Re: IllegalStateException when loading journal on startup
clebert.suconic May 26, 2010 5:00 PM (in response to dfisher)do you have confidential data on the journal?
Would you be ok on sharing it with me at least? I just wanted to take a peek on what could possibly cause it.
you could use my private email if you need to share it privately only. csuconic at redhat dot com
-
14. Re: IllegalStateException when loading journal on startup
clebert.suconic May 27, 2010 12:28 AM (in response to dfisher)What kind of load do you have?
what is the average size of your messages?
What partition type are you using where the journal is located?