3 Replies Latest reply on Sep 9, 2010 10:17 AM by adrian.trenaman

KahaDB Disaster Recovery

lufe May 31, 2010 9:01 AM

Some questions:

- What would be a likely scenario for KahaDB to become corrupted (i.e., partially or completely unrecoverable)?

- Assuming messages can be recovered and a KahaDB store holds 250,000 messages, how long would it take to recover them all?

- Any information or links to cases related to this topic would be greatly appreciated.

1. Re: KahaDB Disaster Recovery

ade Jun 3, 2010 6:41 AM (in response to lufe)
Hi there!

I'm not aware of any explicit scenarios that would cause KahaDB to fail its recovery. Here are some thoughts, though:

Writing is done atomically at the end of a journal file. If there is a sudden process crash, you should only lose information that has not yet been synced to the file system. If you need maximum reliability, you need to ensure that KahaDB is writing to the journal in sync mode; thankfully, this is the default with KahaDB in ActiveMQ 5.3.x.

KahaDB writes the journal in 'chunked' files; the default size of each journal file is 32 MB. So, if there were a physical problem with the disk, you would perhaps lose only one journal file. If, on the other hand, you lose all the files due to a disk crash, then you're out of luck. If your data is important, then you really need to make sure you're using good hardware - and that means reliable disks, e.g. RAID.
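To put the "only one journal file" point into numbers, here is a rough back-of-the-envelope sketch. The 32 MB file size comes from the paragraph above; the average record size is purely an assumption you would have to measure for your own messages, and the function names are made up for illustration. It ignores per-record overhead and acknowledgement records, so treat the result as an order of magnitude only.

```python
import math

# Default journal file size mentioned above (32 MB).
JOURNAL_FILE_BYTES = 32 * 1024 * 1024


def journal_files_needed(message_count, avg_record_bytes,
                         journal_file_bytes=JOURNAL_FILE_BYTES):
    """Rough number of journal files needed to hold message_count records.

    avg_record_bytes is an assumed average size per stored message;
    journal overhead and acknowledgement records are ignored.
    """
    if message_count < 0 or avg_record_bytes <= 0:
        raise ValueError("message_count must be >= 0 and avg_record_bytes > 0")
    total_bytes = message_count * avg_record_bytes
    return math.ceil(total_bytes / journal_file_bytes)


def records_per_journal_file(avg_record_bytes,
                             journal_file_bytes=JOURNAL_FILE_BYTES):
    """Rough number of records that fit in a single journal file."""
    if avg_record_bytes <= 0:
        raise ValueError("avg_record_bytes must be > 0")
    return journal_file_bytes // avg_record_bytes
```

With a hypothetical average of 1 KB per record, 250,000 messages would span about 8 journal files, so a single damaged file would put roughly 32,768 records at risk.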

I know that in AMQ Persistence the index file can always be recreated from scratch. I believe that this is also the case for KahaDB (you should check this!). As for how long it takes to recreate, I know of no published performance benchmarks. Perhaps you could run a simple experiment? For example: put 250,000 messages onto a queue, stop the broker, delete the index file, and then see how long the broker takes to recreate it on restart.
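One way to time such an experiment is sketched below. It only measures how long it takes until the broker accepts TCP connections again after you restart it. It assumes the broker listens on the default OpenWire port 61616 on localhost, and that the transport connector opens only once the store has been recovered; I have not verified the second point, so check it against your broker log. The function name is made up for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=600.0, poll_interval_s=0.5):
    """Return the seconds elapsed until a TCP connection to host:port succeeds.

    Raises TimeoutError if no connection succeeds within timeout_s.
    """
    start = time.monotonic()
    deadline = start + timeout_s
    while True:
        try:
            # Open and immediately close a connection; success means
            # something is listening on the port.
            with socket.create_connection((host, port), timeout=poll_interval_s):
                return time.monotonic() - start
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    "nothing listening on %s:%d after %.1f s" % (host, port, timeout_s)
                )
            time.sleep(poll_interval_s)
```

Start the broker and then immediately call wait_for_port("localhost", 61616); the return value is a rough upper bound on the recovery time. Comparing a run with the index intact against a run with the index removed should show the cost of the rebuild.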

Hope that helps.
2. Re: KahaDB Disaster Recovery

lufe Jun 8, 2010 4:35 AM (in response to ade)

Hi Adrian,

Thanks for the answer, it certainly helps. The only question we have is: which index file should we delete to run the test? It is not obvious from looking in the directory. Maybe you know, or know someone who could answer this question?
3. Re: KahaDB Disaster Recovery

adrian.trenaman Sep 9, 2010 10:17 AM (in response to lufe)

Sorry about the late reply.

The index file is the db.data file, and the other file related to indexing is db.redo. ActiveMQ can recover successfully if these files are deleted.
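If you want to try this without throwing the files away, a small sketch like the following moves them aside instead of deleting them. The two file names are the ones given above; the directory paths and the function name are placeholders for illustration, and the broker should be stopped before you run it.

```python
import shutil
from pathlib import Path

# Index-related files named in the reply above.
INDEX_FILES = ("db.data", "db.redo")


def set_aside_index_files(kahadb_dir, backup_dir):
    """Move the KahaDB index files out of kahadb_dir into backup_dir.

    Journal files are left untouched. Returns the names of the files
    that were actually moved. Run this only while the broker is stopped.
    """
    kahadb_dir = Path(kahadb_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for name in INDEX_FILES:
        src = kahadb_dir / name
        if src.is_file():
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved
```

Moving rather than deleting means you can put the original index back if the rebuild does not behave as expected.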
