I will be removing cleanup from the journal...
I had to make a change to compacting. Previously, compacting converted any addTX, deleteTX or updateTX into a regular add, delete or update.
This was leading to duplicates when a transaction was rolled back and the same ID had been used on an add, and it always did so in update cases (think of an ACK and a rollback).
That created an edge case for cleanup, as I would need to read other files in order to do it properly.
But I can't just remove cleanup, as any long-lived records would force the journal to be constantly compacted (what I called the linked-list effect on the journal files).
I found a solution which is to mark how many times a record was compacted.
This way records that were never compacted before won't be in the same files as records that were previously compacted. That should solve the issue with the linked-list effect.
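To make the idea concrete, here is a minimal sketch of separating records by compact count. It is only an illustration under my own assumptions, not the journal code: the names CompactCountSketch, LiveRecord and compact are hypothetical, and "one output group per compact count" is just one way the separation could be done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from the journal implementation.
class CompactCountSketch {

    // A surviving record: its ID plus how many times it has already been compacted.
    static final class LiveRecord {
        final long id;
        final int compactCount;

        LiveRecord(long id, int compactCount) {
            this.id = id;
            this.compactCount = compactCount;
        }
    }

    // Bumps the compact count of every surviving record and groups the results
    // by the new count, so records compacted N times never share an output
    // group (standing in for a journal file) with records compacted M != N times.
    static Map<Integer, List<LiveRecord>> compact(List<LiveRecord> live) {
        Map<Integer, List<LiveRecord>> groups = new TreeMap<>();
        for (LiveRecord r : live) {
            LiveRecord bumped = new LiveRecord(r.id, r.compactCount + 1);
            groups.computeIfAbsent(bumped.compactCount, k -> new ArrayList<>()).add(bumped);
        }
        return groups;
    }
}
```

With this grouping, a long-lived record ends up next to other long-lived records, so freshly written files can be reclaimed without dragging it along each time.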
Also, I'm kind of settled on one thing: I don't want to change the file format. So I'm thinking about adding 3 bits at the front of the first byte of the records (where we identify whether it's an AddRecord, AddRecordTX... Commit or Rollback). This way we won't need to force users to change any format (which would be an issue with the EAP).
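A rough sketch of what that bit packing could look like, assuming the existing record-type values fit in the low 5 bits of the byte (I have not verified that here). The class and method names are hypothetical, and the type values in the usage note are placeholders rather than the real constants.

```java
// Hypothetical sketch of storing a 3-bit compact counter in the record-type byte.
class RecordTypeBits {

    static final int TYPE_BITS = 5;                     // low bits: record type
    static final int TYPE_MASK = (1 << TYPE_BITS) - 1;  // 0x1F
    static final int MAX_COUNT = 7;                     // largest value 3 bits can hold

    // Packs the counter into the top 3 bits; counts above 7 saturate at 7.
    static byte pack(int recordType, int compactCount) {
        if (recordType < 0 || recordType > TYPE_MASK) {
            throw new IllegalArgumentException("record type does not fit in 5 bits: " + recordType);
        }
        if (compactCount < 0) {
            throw new IllegalArgumentException("negative compact count: " + compactCount);
        }
        int capped = Math.min(compactCount, MAX_COUNT);
        return (byte) ((capped << TYPE_BITS) | recordType);
    }

    // Extracts the record type from the low 5 bits.
    static int recordType(byte b) {
        return b & TYPE_MASK;
    }

    // Extracts the counter from the top 3 bits (masking first to avoid sign extension).
    static int compactCount(byte b) {
        return (b & 0xFF) >>> TYPE_BITS;
    }
}
```

For example, pack(11, 0) yields the plain byte 11, so a record written before the change reads back as "never compacted", which is why this would not require a format change.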
Everyone: Let me know if you see any problems here.
I was actually concerned with the hack (using the recordType to hold the counter).
I will just change the data format. Users will be able to export and import their data before upgrading the version.