12 Replies Latest reply on Jul 31, 2008 9:43 AM by Tim Fox

    Sweet life on the journal...

    Clebert Suconic Master

      Life is good here... :-)

      I'm making some changes to the journal that should make recovery rock solid against failures, and that will avoid having to refill the file when we reuse it, which will bring us the performance we wanted.

      Every record will look similar to this layout:

      Byte - recordType
      Integer - journalSequenceID (the sequence ID used by the current journal)
      Integer - variableSizeLength (used on Add/Updates only)
      .... Body of the Record ....
      Integer - checkSize (number of bytes written in this record).
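
      The layout above could be sketched like this (a hypothetical illustration only, not the actual JBoss Messaging code; the field names follow the post):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the record layout described above:
// byte type | int journalSequenceID | int variableSizeLength | body | int checkSize.
public class RecordLayout {

    static ByteBuffer encode(byte recordType, int journalSequenceID, byte[] body) {
        int totalSize = 1 + 4 + 4 + body.length + 4; // type + seqId + length + body + checkSize
        ByteBuffer buf = ByteBuffer.allocate(totalSize);
        buf.put(recordType);            // Byte    - recordType
        buf.putInt(journalSequenceID);  // Integer - journalSequenceID
        buf.putInt(body.length);        // Integer - variableSizeLength
        buf.put(body);                  // ....    Body of the record ....
        buf.putInt(totalSize);          // Integer - checkSize (bytes written)
        buf.flip();
        return buf;
    }
}
```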


      The journalSequenceID is there to make sure the record is not garbage left over from a previous use of the file. This avoids having to refill the file. I have also made it a sequence, and an integer instead of a long as it was before (to save 4 bytes on every record).

      The checkSize is there to make sure the record is complete. In tests I have seen situations where the variableSizeLength got damaged, which made recovery impossible.


      I'm also changing how the load is done. In the event of any damage, I just skip that record and keep going until I find the next valid record in the file. I'm also adding some logic to throw away incomplete transactions.

      The numbers I'm getting are really impressive. As I don't need to fill the journals, I'm able to do 22K records/second on perfListener/perfSender (PERSISTENT), and the number of files never goes beyond 10.

        • 1. Re: Sweet life on the journal...
          Clebert Suconic Master

          Just an update...


          That's an increase of 40% in performance... Oh wow!

          I just ran the same test: before the changes it did 15K on my computer... after the changes, 22K.

          • 2. Re: Sweet life on the journal...
            Tim Fox Master

            Sounds great on the perf! :)

            Question: on the record sequence number - how can you be sure an old sequence number from a reclaimed file doesn't by chance match the correct sequence number? In that case you wouldn't be able to detect the failure, right?

            • 3. Re: Sweet life on the journal...
              Clebert Suconic Master


              Question: on the record sequence number - how can you be sure an old sequence number from a reclaimed file doesn't by chance match the correct sequence number? In that case you wouldn't be able to detect the failure, right?



              I keep a variable called nextOrderingId (on JournalImpl).

              When I reload the journal, I set nextOrderingId to the maximum value found, so new files will always continue after it.
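
              A minimal sketch of that idea (illustrative only; the real field lives in JournalImpl but this is not the actual code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the nextOrderingId logic described above:
// on reload, bump the counter to the max id found on disk, so ids handed
// out afterwards can never collide with an id from a reclaimed file.
public class OrderingIds {
    private final AtomicInteger nextOrderingId = new AtomicInteger(0);

    void onReload(int[] idsFoundOnDisk) {
        for (int id : idsFoundOnDisk) {
            nextOrderingId.accumulateAndGet(id, Math::max);
        }
    }

    // New files take the next id after anything ever seen.
    int idForNewFile() {
        return nextOrderingId.incrementAndGet();
    }
}
```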

              • 4. Re: Sweet life on the journal...
                Clebert Suconic Master

                Why I have removed the Reclaim Thread:

                Reclaim should always (even when you have a high number of files) finish much faster than a file can be consumed or produced.

                The needForFile event should always fire the reclaim (instead of a timer). That way the number of existing files never grows.

                If you create a new file when you could reuse one from reclaim, createFile will compete for disk resources with appendFile, which will increase latency and hurt performance while the 10M file is being created.

                Reusing more files will improve latency.
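
                The reclaim-on-demand idea could be sketched like this (a hypothetical illustration, assuming made-up names such as FilePool and needForFile; not the actual journal code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of reclaim-on-demand: when a new file is needed,
// prefer a reclaimed file over creating a fresh one, so file creation
// (an expensive 10M allocation) never competes with appends on disk.
public class FilePool {
    private final Deque<String> reclaimedFiles = new ArrayDeque<>();

    // Called by the reclaim pass for every file whose records are all gone.
    void reclaim(String file) {
        reclaimedFiles.addLast(file);
    }

    // needForFile: reuse a reclaimed file if one exists; create otherwise.
    String needForFile() {
        String reused = reclaimedFiles.pollFirst();
        return reused != null ? reused : createNewFile();
    }

    private String createNewFile() {
        return "journal-new.jbm"; // placeholder for the real 10M pre-allocation
    }
}
```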

                • 5. Re: Sweet life on the journal...
                  Clebert Suconic Master


                  Question: on the record sequence number - how can you be sure an old sequence number from a reclaimed file doesn't by chance match the correct sequence number? In that case you wouldn't be able to detect the failure, right?


                  Also... at the end of every record I'm also writing the used size, to validate the "health" of the record.

                  For a record to be considered healthy, the record type must be between 10 and 19, the currentFileId in the second position must match the currentFileId in use (validating that it is not from a previous usage), and the checkSize at the end must match the exact number of bytes written. The checkSize is also an extra check for APPEND and UPDATE records.

                  If those conditions are not met, the record is considered broken, and I keep moving byte by byte through the journal file until I find another byte between 10 and 19 where all of the above conditions are met.
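
                  The resynchronization scan could be sketched like this (a hypothetical illustration assuming the layout from the first post: byte type | int fileId | int bodyLength | body | int checkSize; not the actual code):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the byte-by-byte scan for the next healthy record.
public class RecordScanner {

    // Returns the offset of the next healthy record at or after 'start',
    // or -1 if none is found before the end of the file.
    static int nextHealthyRecord(ByteBuffer file, int start, int currentFileId) {
        for (int pos = start; pos + 13 <= file.limit(); pos++) {   // 13 = smallest possible record
            byte type = file.get(pos);
            if (type < 10 || type > 19) continue;                  // not a valid record type
            if (file.getInt(pos + 1) != currentFileId) continue;   // garbage from a previous usage
            int bodyLength = file.getInt(pos + 5);
            int totalSize = 1 + 4 + 4 + bodyLength + 4;
            if (bodyLength < 0 || pos + totalSize > file.limit()) continue;
            if (file.getInt(pos + totalSize - 4) != totalSize) continue; // checkSize mismatch
            return pos;                                            // all three conditions met
        }
        return -1;
    }
}
```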

                  One possibility: we could maybe replace the checkSize with a hash calculated over the byte-array content. That would eliminate any small possibility of coincidences, like messages being magically created from crashed files.
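
                  For example, a CRC32 over the body would fit in the same 4 bytes (java.util.zip.CRC32 is in the standard library; whether the journal actually adopted this is not decided in this thread):

```java
import java.util.zip.CRC32;

// Sketch of the checksum idea discussed above: store a CRC32 of the record
// body instead of (or in addition to) the plain byte count.
public class RecordChecksum {

    static int crcOf(byte[] body) {
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        return (int) crc.getValue();   // still fits in the 4 bytes used by checkSize
    }

    // On reload, a record is healthy only if the stored checksum matches.
    static boolean healthy(byte[] body, int storedCrc) {
        return crcOf(body) == storedCrc;
    }
}
```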

                  And BTW, one thing I was wondering... just fixing terminology: from now on I will say Reload for reading the journal files back into memory, and I will keep the word Recovery for transactions only.

                  • 6. Re: Sweet life on the journal...
                    Tim Fox Master

                    Can you write this up on a wiki page, since it's all getting quite complex? (alarm bells start to ring)

                    • 7. Re: Sweet life on the journal...
                      Clebert Suconic Master

                      Sure...


                      Before I do that... what do you think about the idea of a hash calculation on every record?

                      • 8. Re: Sweet life on the journal...
                        Clebert Suconic Master

                        http://wiki.jboss.org/wiki/JBossMessaging2Journal

                        It's still a work in progress, but that will be the page when I'm done.

                        • 9. Re: Sweet life on the journal...
                          Clebert Suconic Master

                          I was dealing with one of the most obscure bugs I have ever seen in my 20 years of programming. After I found it, as always, it was a dumb and obvious bug, but this one was a little hard to find:

                          When the journal created a ByteBuffer, it sent it to the JNI layer and immediately released its reference to the ByteBuffer. As the write was done asynchronously, the ByteBuffer could eventually be garbage collected at the same time it was being written, which was creating garbage in the journal.
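
                          The fix for that kind of bug could be sketched like this (an illustrative pattern only, with made-up names; not the actual JBoss Messaging code): keep a strong reference to the buffer until the asynchronous write confirms completion.

```java
import java.nio.ByteBuffer;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the fix for the GC bug described above: pin each ByteBuffer
// with a strong reference until the asynchronous (JNI/AIO) write reports
// completion, so the buffer cannot be collected mid-write.
public class PendingWrites {
    private final Set<ByteBuffer> inFlight = ConcurrentHashMap.newKeySet();

    void submit(ByteBuffer buffer) {
        inFlight.add(buffer);       // pin BEFORE handing the buffer to JNI
        nativeWrite(buffer);
    }

    // Callback invoked by the native layer when the write is done.
    void onWriteDone(ByteBuffer buffer) {
        inFlight.remove(buffer);    // now the buffer may be collected
    }

    int pendingCount() {
        return inFlight.size();
    }

    private void nativeWrite(ByteBuffer buffer) {
        // placeholder for the asynchronous JNI call
    }
}
```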

                          Well... at least it was a good exercise for my tests, so I'm pretty sure the journal loads data properly and deals with holes in transactions. I have written a few stress tests to catch this bug; I can't reproduce it in a unit test.

                          I will write a few more tests on Reload and incomplete transactions, and maybe make a few other improvements to the journal. (Maybe improve the collections in the journal so it uses less memory. I have a few ideas that I will post on the forums tomorrow.)

                          BTW: most of the information is on the WIKI already. I have made a few modifications to the code regarding the transaction counter to deal with holes, and I will put that on the WIKI tomorrow. (I'm writing the counter per journal file, as reclaiming could legitimately remove part of a transaction.)

                          • 10. Re: Sweet life on the journal...
                            Clebert Suconic Master

                            An almost accidental feature resulting from the loading work I'm doing (considering possible holes and losses) is that you can now reload the files with either journal implementation (AIO / NIO).

                            For example, it is possible to add records using AIO, and if you migrate your journal to NIO your messages are loaded without any problem
                            (for instance, if your backup box is a Windows (blah) machine).

                            • 12. Re: Sweet life on the journal...
                              Tim Fox Master

                              Good stuff Clebert! :)