
    Journal Cleanup and Journal Compactor

    clebert.suconic

      As documented, and as all the JBM developers know, the journal is append-only. When we delete a record we actually append a delete record. When every record in a file has been deleted (somewhere else in the journal), that file is marked as reclaimable. It can then be reused or deleted, depending on the number of files in use.
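      Just to illustrate the reclaim rule, here is a minimal sketch of the per-file bookkeeping (the class and method names are only illustrative, not the actual JBM classes):

      // Hypothetical sketch: each journal file tracks how many of its add
      // records are still "live" (not yet cancelled by a delete record,
      // which may sit in any other file).
      class JournalFileCounter
      {
         private int liveRecords;

         void recordAdded()      { liveRecords++; }

         // called when a delete record anywhere in the journal cancels
         // an add record stored in this file
         void recordDeleted()    { liveRecords--; }

         // once nothing in this file is live, it can be reused or deleted
         boolean isReclaimable() { return liveRecords == 0; }
      }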

      However, there is an issue when a record is not deleted for a long time. Say the consumer needs a couple of days to come and consume its messages (DLQs... I could list a few more use cases for this).

      If a file has any records that will never be deleted, or will take a long time to be deleted, the dependency between files creates what I call the "linked-list effect" on the journal.


      You would have something like:

      JournalFile 1:
      - AddRecord 1
      - AddRecord 2

      JournalFile 2:
      - DeleteRecord 2
      - AddRecord 3


      JournalFile 3:
      - DeleteRecord 3
      - AddRecord 4

      ....

      JournalFile 1000:
      - DeleteRecord 1000
      - AddRecord 1001



      We need to somehow remove the dependencies between the files when this happens. My current suggestion is to update JournalFile 1 in such a way that you don't need any records from other files to process the delete. (I call this journal cleanup.)

      So you would have:

      JournalFile 1:
      - AddRecord 1
      - AddRecord 2 (XXXXX... removed .... XXXX)


      By doing this, you are free to delete the 999 files that were dependent on JournalFile 1.
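      A minimal sketch of what that cleanup pass would look like (all the names here are hypothetical, just to show the idea):

      // Hypothetical sketch: rewrite one file, blanking out every add record
      // that already has a matching delete record somewhere else in the
      // journal, so the file no longer depends on any other file.
      void cleanupFile(JournalFile file, Set<Long> idsDeletedElsewhere) throws Exception
      {
         for (JournalRecord record : file.readRecords())
         {
            if (idsDeletedElsewhere.contains(record.getID()))
            {
               // overwrite the record with filler bytes; after this, the
               // delete record living in another file is no longer needed
               file.fillWithPadding(record.getPosition(), record.getSize());
            }
         }
      }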




      There is also another problem.

      Say that you now have 100 records pending, but only one record in each journal file. You would be using 1GiB of your disk to store just about 10KiB of real data. For that case I would need to compact those 100 records into a single journal file, saving space that way. (I call this journal compacting.)
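      (Just to make the numbers concrete: assuming a journal file size of around 10MiB, 100 files take roughly 100 x 10MiB = 1GiB of disk, while 100 records of about 100 bytes each amount to only about 10KiB of real data.)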



      You may think that journal compacting alone would solve the cleanup problem, but there are a few tricky situations:


      I - You may need to move more data than required.
      II - It doesn't solve the linked-list effect. As soon as you compact, the long-lived records will still be there, and the linked-list effect will happen again for sure. You would be in an infinite loop of compacting... compacting... compacting.




      So, my conclusion: both solutions are equally necessary.
      I - We need to fix the linked-list effect anyway. If not by updating the file, we need to find some other way.
      II - We need to fix the compacting issue. It's not acceptable to use 1GiB (or any number higher than that) to store just a few records.

        • 1. Re: Journal Cleanup and Journal Compactor
          clebert.suconic

          Also: it looks like compacting will be a heavier process than what cleanup would be.

          If we have both solutions in place, cleanup would save a few compactions, giving a performance boost.

          • 2. Re: Journal Cleanup and Journal Compactor
            timfox

            As far as I am concerned, compacting solves all these issues.

            We *definitely* need to do compacting, and *maybe* we need to do linked list (although I think that is doubtful).

            So we should do compacting first; then, if there are problems with that approach, we can look at the linked-list approach, but it is not a priority right now.

            Let me describe how I think the compacting should work.

            Let's say we have a set of files in the journal:

            F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10

            min files is set to 10

            The current file is F6, and F7, F8, F9, F10 are unused files waiting to be used.

            F0, F1, F2, F3, F4, F5 are files full of data and not open any more; however, data about their contents is stored in memory.

            We need a compacting thread that intermittently (say every 30 seconds) scans the set of journal files in memory.

            It scans F0... F5, and when it has completed that scan, it can decide which records are no longer needed in each file.

            A record is no longer needed in a file if it has already been deleted in any file F0.. F5.

            Once the scanner thread has computed which records are needed and which are not, it can compute what percentage of the total data is dead space.

            E.g. the scanner might compute that 72% of the data in files F0..F5 is dead space.

            We then have a parameter that the user can configure, compactorDeadSpaceThresholdPercentage, which might have a default of e.g. 75%.

            If the dead-space percentage computed by the scanner is >= compactorDeadSpaceThresholdPercentage, then the scanner will compact those files.
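            A minimal sketch of that decision (the method names and the 75% figure are just illustrative):

            // Hypothetical sketch: the scanner adds up total vs. dead bytes in
            // the full files F0..F5 and triggers a compact if the dead-space
            // percentage crosses the configured threshold.
            long totalBytes = 0;
            long deadBytes = 0;

            for (JournalFile file : fullFiles)   // F0..F5
            {
               totalBytes += file.getTotalRecordBytes();
               deadBytes  += file.getDeadRecordBytes();   // adds already cancelled by a delete
            }

            double deadSpacePercentage = totalBytes == 0 ? 0 : 100.0 * deadBytes / totalBytes;

            if (deadSpacePercentage >= compactorDeadSpaceThresholdPercentage)   // e.g. 75
            {
               compact(fullFiles);   // the copy/rename procedure described below
            }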

            The actual compacting approach goes as follows:

            The scanner knows how many new files it will need for the compacted records, let's say it needs two new files - it can get these from the unused files (e.g. F7, F8, F9 or F10) if they are available.

            It then opens the old files F0..F5, loads the wanted records into memory in blocks, and copies them to the new files.

            When this process is finished it will end up with, say, two new files containing the compacted records.

            We can then save a marker file (an empty file in the journal directory) which says we are starting the rename.

            The old files F0..F5 then need to be renamed so they are no longer used by the journal (but can be recovered in case of a crash).

            Then the two new files need to be renamed so they will be picked up by the journal.

            When that process completes the marker file can be deleted.

            If the server crashes after renaming the old files but before renaming the new files, then the marker file will still exist in the journal. The journal can detect this on startup, and finish the process.

            This ensures the journal will still start up after such a crash.
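            A minimal sketch of that startup check (the marker file name and helper method are only illustrative):

            // Hypothetical sketch: if the marker file is still present at
            // startup, a compact was interrupted by a crash, so finish (or
            // roll back) the renames before loading the journal.
            File marker = new File(journalDirectory, "compact.marker");

            if (marker.exists())
            {
               finishOrRollBackCompact(journalDirectory);

               if (!marker.delete())
               {
                  throw new IllegalStateException("Could not delete compact marker file");
               }
            }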

            The JournalFile objects also need to be updated in memory when the compact is complete.

            • 3. Re: Journal Cleanup and Journal Compactor
              clebert.suconic


              Most databases have an internalID and a userID (AKA primary key).

              If we change the journal to work this way, we could move IDs from one file to another.

              This would be a very simple solution, and very easy to implement. Also, we wouldn't need any external synchronization; everything would be controlled within the journal.


              This way, the compactor would just:

              
               // find the list the way you (Tim) specified, based on file usage
               List<InternalID> listOfIDs = ....;

               for (InternalID id : listOfIDs)
               {
                  // read the primary key and data of the record being moved,
                  // then move it inside one journal transaction:
                  journal.startTransaction();
                  journal.delete(id);                  // remove it under the old internal ID
                  journal.append(primaryKey, data);    // re-add it, getting a new internal ID
                  journal.commitTransaction();
               }



              Doing this, we would eliminate all the mostly empty files, as you said, and it would also fix the linked-list issue.

              • 4. Re: Journal Cleanup and Journal Compactor
                timfox

                Can you explain a bit more? I'm not really sure what you're proposing based on this explanation.

                I _hope_ you're not proposing we write a database! ;)

                • 5. Re: Journal Cleanup and Journal Compactor
                  clebert.suconic

                  No, I don't want to write a database :-)

                  Instead of using external files to control the compacting, I suggest we do it using the transaction support we already have in the journal.

                  One way of doing it would be to create a MOVE operation that would delete the record from its original file and re-add it to the file currently in use. That operation could be done inside a journal transaction, so we would get all the crash handling for free.

                  In my previous post I was thinking about changing the internal ID of the record when moving it, but I could do it without changing it.
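                  A minimal sketch of the MOVE operation, reusing the same pseudo-API as my earlier post (the names are only illustrative):

                  // Hypothetical sketch: delete the record from the file where it
                  // currently lives and re-append the same data, which lands in the
                  // file currently being written, keeping the same record ID. Doing
                  // it inside one journal transaction means a crash in the middle is
                  // covered by the normal transaction recovery we already have.
                  journal.startTransaction();

                  journal.delete(record.getID());
                  journal.append(record.getID(), record.getData());

                  journal.commitTransaction();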

                  You might say we would have an issue with ordering... but compacting the way you described would have the same issue, since the new file receiving the compacted records would get a new ID and be at the top of the list.

                  Maybe sorting by ID would fix the issue?

                  • 6. Re: Journal Cleanup and Journal Compactor
                    timfox

                     

                    "clebert.suconic@jboss.com" wrote:

                    You might say we would have an issue with ordering... but compacting the way you described would have the same issue, since the new file receiving the compacted records would get a new ID and be at the top of the list.


                    Compacting as I envision it would not change the order of any records; it simply removes dead records and pushes the remaining ones up to remove the dead space. No ordering changes occur.

                    • 7. Re: Journal Cleanup and Journal Compactor
                      clebert.suconic

                      What about a case like this?


                      File 1 - needs compacting
                      File 2 - full
                      File 3 - needs compacting
                      File 4 - full
                      ....
                      File (n-1) - needs compacting
                      File (n) - full
                      File (n+1) - in use



                      If we could change the physical order of the records on the journal, we would just move the records from files 1, 3, ..., n-1 to the top of the journal. Reclaiming would take care of those files after the records are moved.

                      I'm suggesting the logical order of the records could be different from the physical order.
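                      A minimal sketch of what I mean (the names are hypothetical): if the loader sorts records by their ID, the physical position on disk stops mattering.

                      // Hypothetical sketch: restore the logical order at load time by
                      // sorting the records read from all the files by record ID,
                      // regardless of which file they physically sit in.
                      List<RecordInfo> records = readAllRecords(journalFiles);

                      Collections.sort(records, new Comparator<RecordInfo>()
                      {
                         public int compare(RecordInfo r1, RecordInfo r2)
                         {
                            return r1.getID() < r2.getID() ? -1 : (r1.getID() > r2.getID() ? 1 : 0);
                         }
                      });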

                      • 8. Re: Journal Cleanup and Journal Compactor
                        clebert.suconic

                         

                        It then opens the old files F0..F5 loads the wanted records into memory in blocks and copies them to the new files.



                        This is what I'm calling moving here. However, when you move them to the new file, the new file will also be at the top of the list... so the order could change.

                        If you really want physicalOrder == logicalOrder, I could open a "new file" at an earlier position. However, during write time you would have two files being written, so the disk head would be jumping back and forth between these two positions.

                        Besides, the interleaved example would still suffer if you can't change the physical order, unless you also compact the full files in between (which would be a waste of processing).

                        I would still do the move operation, as I would have the transactional control within the journal files.

                        • 9. Re: Journal Cleanup and Journal Compactor
                          timfox

                           

                          "clebert.suconic@jboss.com" wrote:
                          It then opens the old files F0..F5 loads the wanted records into memory in blocks and copies them to the new files.



                          This is what I'm calling moving here. However, when you move them to the new file, the new file will also be at the top of the list... so the order could change.


                          The new files won't be at the top of the list, they'll be in the same place the old files were.

                          Like I say, no order will change.

                          • 10. Re: Journal Cleanup and Journal Compactor
                            clebert.suconic

                            What about the interleaved example?

                            File 1 - needs compacting
                            File 2 - full
                            File 3 - needs compacting
                            File 4 - full


                            Would you compact files 1, 2, 3 and 4?

                            Compacting 2 and 4 would be a waste of processing.

                            • 11. Re: Journal Cleanup and Journal Compactor
                              timfox

                              When you compact the journal you can't choose which files you compact; you either compact the whole journal or none of it.

                              If you chose to compact just one file, you would delete that file, but you would also be deleting the delete records it contains, so when you reload the journal the messages come back!
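                              A minimal example of what I mean: suppose File 1 contains "AddRecord 5" and File 2 contains "DeleteRecord 5". If you compact only File 2 and drop its delete record as dead space, on reload the journal only sees "AddRecord 5" in File 1, so record 5 comes back.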

                              • 12. Re: Journal Cleanup and Journal Compactor
                                clebert.suconic

                                I wouldn't do a full compacting during normal operation. Performance would be terrible.

                                Besides, how can we ensure new records can still be added while the compacting is in progress?

                                A delete record would also count in the accounting for compacting.

                                And besides, I still want to do the cleanup.

                                • 13. Re: Journal Cleanup and Journal Compactor
                                  clebert.suconic

                                  I mean, I would move the delete records also, and they would be accounted as free space or not. A valid delete record is a valid record from the point of view of the compacting calculation.

                                  • 14. Re: Journal Cleanup and Journal Compactor
                                    timfox

                                     Can you explain your move suggestion in more detail? Right now I don't think you have given us enough information to make any decisions.
