13 Replies Latest reply on Dec 30, 2005 12:01 AM by Ovidiu Feodorov

    Messaging Core Transactional Behavior

    Ovidiu Feodorov Master

       


      Ok, it sounds like we are at cross purposes.
      But let me finish the argument before I explain why that is not necessarily so.

      The "transaction log" is the core of the system.

      If you don't get this correct, you might as well not bother and implement JMS with arraylists and threads :-)

      It exists as a server-wide object: you can't have two transaction logs and get reliability without doing external 2PC with an additional recovery log.

      There is no such thing as a subscription local transaction.

      There are such things as subscriptions (temporary destinations/non durable subscriptions) and
      messages (non-persistent) that don't have transaction reliability.

      i.e. they don't survive crashes and you don't need to do extra work for them.

      But if they are handled in a transaction they still have transaction semantics, i.e. they don't magically appear/disappear until any transaction commits.

      This transaction log should be (that is usually the best way) write ahead.

      i.e. It should say.

      * Write to disk - this is what I am going to do (prepare)
      * Commit the changes - i.e. ensure the info will survive a crash
      * Then actually do it (commit)

      This prepare/commit is regardless of any external 2PC.
      For local jms tx it can be optimized and there are all sorts of other tricks
      I want to discuss down the road, but it must behave like the protocol above.
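A minimal sketch of the three steps above, under stated assumptions: `WriteAheadLog` and all of its methods are hypothetical names (nothing here is JBoss Messaging code), and the `durable` list merely stands in for a file that a real log would force to disk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical write-ahead log. The "durable" list stands in for data that
// has been forced to disk; the "buffered" list is lost on a crash.
class WriteAheadLog {
    private final List<String> durable = new ArrayList<>();
    private final List<String> buffered = new ArrayList<>();

    // Step 1: write down what is going to be done (prepare). Nothing is applied yet.
    void prepare(String txId, String intent) {
        buffered.add(txId + " prepare " + intent);
    }

    // Step 2: make everything recorded so far survive a crash.
    void force() {
        durable.addAll(buffered);
        buffered.clear();
    }

    // Step 3: log the commit, force it, and only then actually do the work.
    void commit(String txId, Runnable action) {
        if (!isDurablyPrepared(txId)) {
            throw new IllegalStateException(txId + " has no durable prepare record");
        }
        buffered.add(txId + " commit");
        force();
        action.run();
    }

    // Simulated crash: whatever was not forced is gone.
    void crash() {
        buffered.clear();
    }

    List<String> durableRecords() {
        return new ArrayList<>(durable);
    }

    private boolean isDurablyPrepared(String txId) {
        for (String record : durable) {
            if (record.startsWith(txId + " prepare ")) {
                return true;
            }
        }
        return false;
    }
}
```

The point of the ordering is that `commit` refuses to run the action unless the intent has already been forced, so a crash at any moment leaves either nothing or a complete description of what was about to happen.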



        • 1. Re: Messaging Core Transactional Behavior
          Ovidiu Feodorov Master

          I will try to comment what you said in the last post line by line, and use code examples where possible. Since it's such an important subject, let's clarify it and bring ourselves on the same page.


          There is no such thing as a subscription local transaction.

          There are such things as subscriptions (temporary destinations/non durable subscriptions) and
          messages (non-persistent) that don't have transaction reliability.

          But if they are handled in a transaction they still have transaction semantics, i.e. they don't magically appear/disappear until any transaction commits.


          I don't like the terminology in the first place. We pushed transactional support down to the core level. We have a "Messaging Core" exactly to separate "generic messaging" from "JMS" (actually because Adrian said so :) ). The core doesn't know about subscriptions and temporary destinations. Let's discuss in terms of channels and channel recovery. If we're not able to formalize the problem using these concepts at this level, it means there is something fundamentally wrong with our architecture.

          To translate what you were saying:

          "There are such things as non-recoverable channels and unreliable messages that don't have transaction reliability, i.e. they don't survive crashes and you don't need to do extra work for them. But if they are handled in a transaction they still have transaction semantics, i.e. they don't magically appear/disappear until any transaction commits."

          Yes, that's exactly the case. If you take a look at ChannelSupport you'll see that we have cases for transactional and non-transactional handling.

          There are actually combinations of the following elements: recoverable/non-recoverable channel, reliable/non-reliable message, transactional/non-transactional context. In all, 8 cases. Some of them don't make sense; for example, a non-recoverable channel won't accept a reliable message, transacted or not (unless it is specifically configured to do so).

          For an unreliable message, if handle() is called in a transacted context (tx != null), the message reference is "registered" with the transaction via a callback, and it stays in limbo in memory until the transaction commits. When the transaction commits, the callback is activated and it adds the message reference to the "in-memory" channel state. No database access at all: we get atomicity, if not durability.
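To make the callback mechanism concrete, here is a minimal sketch of the unreliable-message case. `SketchTransaction` and `SketchChannelState` are hypothetical stand-ins, not the real core classes, and message references are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the core transaction: it collects callbacks
// and runs them only when commit() is called.
class SketchTransaction {
    private final List<Runnable> onCommit = new ArrayList<>();

    void addCommitCallback(Runnable callback) {
        onCommit.add(callback);
    }

    void commit() {
        for (Runnable callback : onCommit) {
            callback.run();
        }
        onCommit.clear();
    }

    // Dropping the callbacks is all a rollback needs for unreliable messages.
    void rollback() {
        onCommit.clear();
    }
}

// Hypothetical in-memory channel state holding message reference IDs.
class SketchChannelState {
    private final List<String> inMemory = new ArrayList<>();

    // tx == null means a non-transacted context: the reference is visible at once.
    // Otherwise the reference stays "in limbo" inside the transaction's callback.
    void add(String messageRef, SketchTransaction tx) {
        if (tx == null) {
            inMemory.add(messageRef);
            return;
        }
        tx.addCommitCallback(() -> inMemory.add(messageRef));
    }

    List<String> contents() {
        return new ArrayList<>(inMemory);
    }
}
```

Calling `state.add("ref-1", tx)` leaves `state.contents()` empty until `tx.commit()` runs; after `tx.rollback()` the reference never appears at all, which is the atomicity-without-durability behaviour described above.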

          For a reliable message, if handle() is called in a transacted context, the message reference is registered with the transaction via a callback, but while doing so, it is also persisted as "+", together with the ID of the transaction it belongs to. When the transaction commits, all message references belonging to that transaction transition from "+" to "C" and the transaction ID is removed from the database (in the same database transaction). Then the callback is activated, actually adding the message reference to the "in-memory" channel state, so even for a reliable message, the message reference will be cached in memory and will save database hits. See also
          "Selectors run in memory, not in the database http://www.jboss.org/index.html?module=bb&op=viewtopic&t=71499"


          This transaction log should be (that is usually the best way) write ahead. i.e. It should say.

          * Write to disk - this is what I am going to do (prepare)
          * Commit the changes - i.e. ensure the info will survive a crash
          * Then actually do it (commit)


          What's different in what I described above from what you say?



          • 2. Re: Messaging Core Transactional Behavior
            Ovidiu Feodorov Master

            Let's clarify this, and then I'll go back to MessageStore/CacheStore/PersistenceManager and the rest of your Wed Dec 7, 2005 19:36 PM post on http://www.jboss.org/index.html?module=bb&op=viewtopic&t=73301

            • 3. Re: Messaging Core Transactional Behavior
              Adrian Brock Master

               

              Yes, that's exactly the case. If you take a look at ChannelSupport you'll see that we have cases for transactional and non-transactional handling.


              It was Alex that was talking about making the channel the transactional repository. :-)


              There are actually combinations of the following elements: recoverable/non-recoverable channel, reliable/non-reliable message, transactional/non-transactional context. In all, 8 cases. Some of them don't make sense; for example, a non-recoverable channel won't accept a reliable message, transacted or not (unless it is specifically configured to do so).


              Your example doesn't make sense? Non-durable topic subscriptions can still
              accept persistent messages. They just don't get persisted (there is no point).

              "Reliability" requires the combination of both durable destination (it is still conceptually
              there even if the client or even the jms server is not)
              and persistent message (the message survives a crash).

              Maybe that is what you mean by the parenthetical comment?

              • 4. Re: Messaging Core Transactional Behavior
                Adrian Brock Master

                It sounds correct to me (as far as it goes).

                NOTE: This update is unnecessary:

                "When the transaction commits, all message references belonging to that transaction transition from "+" to "C""

                Just removing the transaction record it belonged to is enough.

                In old fashioned logging
                tx1 begin
                tx1 add m1
                tx1 add m2
                tx1 commit

                there is no need to also do
                tx1 confirm m1
                tx1 confirm m2

                In fact, you want the "commit" in the log to be as little work as possible.
                Because it is a lot harder to recover from problems during the commit phase
                (rather than prepare).
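A minimal sketch of why the single commit record is enough, assuming a hypothetical text log with records of the form "<tx> begin", "<tx> add <message>" and "<tx> commit" (the record format follows the example above; the `LogRecovery` class is an illustration, not existing code). On recovery, a message is live exactly when its transaction has a commit record.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical recovery pass over an old-fashioned transaction log.
class LogRecovery {

    // Returns the IDs of messages added by committed transactions, in log order.
    static List<String> recover(List<String> records) {
        // First pass: find every transaction that reached "commit".
        Set<String> committed = new HashSet<>();
        for (String record : records) {
            String[] parts = record.split(" ");
            if (parts.length == 2 && parts[1].equals("commit")) {
                committed.add(parts[0]);
            }
        }
        // Second pass: keep only the adds that belong to those transactions.
        List<String> live = new ArrayList<>();
        for (String record : records) {
            String[] parts = record.split(" ");
            if (parts.length == 3 && parts[1].equals("add") && committed.contains(parts[0])) {
                live.add(parts[2]);
            }
        }
        return live;
    }
}
```

No per-message "confirm" record is consulted anywhere: adds from a transaction without a commit record are simply ignored, so the commit itself stays a single small write.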

                • 5. Re: Messaging Core Transactional Behavior
                  Adrian Brock Master

                  "Then the callback is activated actually adding the message reference to the "in-memory" channel state,"

                  And I am saying that "in memory" state should be managed by the PM.

                  If the channel (or more accurately, the PM's list for the channel)
                  already has 1000 messages in memory with a marker on the end of list, the PM may just
                  commit to disk and drop the message from memory.

                  Think of the channel like a pipe. Stuff goes in one end and comes out the other
                  in different ways. As long as it behaves as per contract,
                  what happens inside the pipe is all implementation detail and voodoo ;-)

                  Do you know about Schrödinger's cat?

                  • 6. Re: Messaging Core Transactional Behavior
                    Adrian Brock Master

                    "Think of the channel like a pipe."

                    To (ab)use the analogy, you have an overflow valve for the pipe.
                    Let the end-user decide whether this goes to a separate holding tank or
                    gets flushed down the drain ;-)

                    And even at what point it should overflow, e.g. flow rates, level in the pipe, etc.

                    • 7. Re: Messaging Core Transactional Behavior
                      Ovidiu Feodorov Master

                       


                      Your example doesn't make sense? Non-durable topic subscriptions can still accept persistent messages. They just don't get persisted (there is no point).

                      "Reliability" requires the combination of both durable destination (it is still conceptually there even if the client or even the jms server is not) and persistent message (the message survives a crash).

                      Maybe that is what you mean by the parenthetical comment?


                      This is correct. This is exactly what I meant. By default, you want your infrastructure to be safe. So you don't want to let a non-recoverable channel handle a reliable message, because it simply cannot ensure its recoverability in case of failure. This is the default behavior. However, there are legitimate cases when it's acceptable for a non-recoverable channel to accept a reliable message: a non-durable subscription, for example. The "non-durable" semantics of the subscription is stronger than the "reliable" property of the message. For this specific case, we need a way to configure the channel "to accept reliable messages", hence Channel's
                      public boolean acceptReliableMessages() and State's public boolean acceptReliableMessages(). For a recoverable channel it always returns "true"; for a non-recoverable channel it may return "true" or "false", depending on how the channel was initialized.
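The acceptance rule can be sketched as follows. `SketchChannel`, its constructor flags and its simplified `handle(boolean)` are assumptions made for illustration; only the `acceptReliableMessages()` name comes from the discussion above.

```java
// Hypothetical channel that applies the acceptance rule described above.
class SketchChannel {
    private final boolean recoverable;
    // Only consulted for non-recoverable channels.
    private final boolean configuredToAcceptReliable;

    SketchChannel(boolean recoverable, boolean configuredToAcceptReliable) {
        this.recoverable = recoverable;
        this.configuredToAcceptReliable = configuredToAcceptReliable;
    }

    // A recoverable channel always accepts reliable messages; a non-recoverable
    // one accepts them only if it was initialized to do so.
    boolean acceptReliableMessages() {
        return recoverable || configuredToAcceptReliable;
    }

    // Simplified handle(): returns true if the channel takes the message.
    boolean handle(boolean reliableMessage) {
        return !reliableMessage || acceptReliableMessages();
    }
}
```

A non-durable subscription would then map to `new SketchChannel(false, true)`: it takes reliable messages but makes no attempt to recover them.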



                      • 8. Re: Messaging Core Transactional Behavior
                        Ovidiu Feodorov Master

                         



                        When the transaction commits, all message references belonging to that transaction transition from "+" to "C"


                        Just removing the transaction record it belonged to is enough.


                        Correct. This is a minor change that will be tracked by http://jira.jboss.org/jira/browse/JBMESSAGING-194


                        • 9. Re: Messaging Core Transactional Behavior
                          Ovidiu Feodorov Master

                           



                          Then the callback is activated actually adding the message reference to the "in-memory" channel state,

                          And I am saying that "in memory" state should be managed by the PM.

                          If the channel (or more accurately, the PM's list for the channel) already has 1000 messages in memory with a marker on the end of list, the PM may just commit to disk and drop the message from memory.


                          We have a naming confusion here. I see the PersistenceManager as a dumb layer that takes method invocations, translates them into JDBC statements and executes them transactionally. The PersistenceManager *does not* maintain state. It will eventually be replaced by a Hibernate layer.

                          The channel state maintains state. This is the "PersistenceManager" as you understand it.

                          To translate into Mess speak what you were saying, "If the channel (or more accurately the channel state list for the channel) already has 1000 messages in memory, the channel state may just commit it to disk and drop the message from memory"

                          This could be implemented relatively easily, since channel state is implemented as a hierarchy of two classes: org.jboss.messaging.core.NonRecoverableState / org.jboss.messaging.core.RecoverableState. I will only need to add logic to RecoverableState.add(MessageReference ref, Transaction tx) that performs whatever checks are needed, then dumps messages to disk (delegating this to the "dumb" PersistenceManager) and clears memory buffers. All this happens in a transactional context, since I get the transaction as an argument of the method call.
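A rough sketch of what such spill-over logic might look like, under several assumptions: `SketchStore`, `ListStore` and `SpillingState` are hypothetical names, the memory limit is a plain constructor argument, references are strings, and the transaction argument and the reloading of spilled references are left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the "dumb" persistence layer: it stores what it
// is told to store and keeps no channel logic of its own.
interface SketchStore {
    void write(String channelId, String messageRef);
}

// Trivial store used only to make the sketch observable.
class ListStore implements SketchStore {
    final List<String> written = new ArrayList<>();

    public void write(String channelId, String messageRef) {
        written.add(channelId + ":" + messageRef);
    }
}

// Hypothetical channel state that decides when to overflow, and where.
class SpillingState {
    private final String channelId;
    private final int maxInMemory;
    private final SketchStore store;
    private final Deque<String> inMemory = new ArrayDeque<>();
    private int spilled;

    SpillingState(String channelId, int maxInMemory, SketchStore store) {
        this.channelId = channelId;
        this.maxInMemory = maxInMemory;
        this.store = store;
    }

    // Keep the reference in memory while there is room; otherwise hand it to
    // the store and do not hold on to it.
    void add(String messageRef) {
        if (inMemory.size() >= maxInMemory) {
            store.write(channelId, messageRef);
            spilled++;
        } else {
            inMemory.addLast(messageRef);
        }
    }

    int inMemoryCount() {
        return inMemory.size();
    }

    int spilledCount() {
        return spilled;
    }
}
```

The decision sits in the state and the store only executes it, which matches the split of responsibilities argued for here. A real version would also have to preserve delivery order when spilled references are loaded back.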

                          We probably need to change the name of the PersistenceManager, as it is confusing and it looks like nobody likes it :)

                          • 10. Re: Messaging Core Transactional Behavior
                            Ovidiu Feodorov Master

                             


                            Think of the channel like a pipe. Stuff goes in one end and comes out the other in different ways. As long as it behaves as per contract, what happens inside the pipe is all implementation detail and voodoo ;-)




                            This is exactly how we have it now. Our pipe is currently a very simple pipe, that can get clogged and it doesn't have an overflow valve, but it's relatively straightforward to add the overflow functionality we need. The code must be added to the channel state.


                            To (ab)use the analogy, you have an overflow valve for the pipe. Let the end-user decide whether this goes to a separate holding tank or gets flushed down the drain ;-)

                            And even at what point it should overflow, e.g. flow rates, level in the pipe, etc.


                            Yes, it all clicks into place. It's just that we're not calling the flow control mechanism PersistenceManager but channel State. The state decides when to overflow, and where. If you want, we can call it something other than "state", but I like it as it is, it's short.

                            It is very important to stress though that while the channel accepts Messages (it actually accepts Routables, which can be Messages or MessageReferences), the state *only* maintains MessageReferences! So, somewhere in between, inside the channel, a Message must be turned into a MessageReference, and this is where the MessageCache comes into play.

                            I won't discuss MessageCache on this thread, I will wait until we all agree on what we've discussed so far, and I'll return to: "Context of MessageStore design task" http://www.jboss.org/index.html?module=bb&op=viewforum&f=153

                            • 11. Re: Messaging Core Transactional Behavior
                              Adrian Brock Master

                               

                              "ovidiu.feodorov@jboss.com" wrote:

                              We probably need to change the name of the PersistenceManager, as it is confusing and it looks like nobody likes it :)


                              I hate names :-)
                              They are a shortcut to actually thinking about what is actually going on!
                              And often confusing.

                              To play "Humpty Dumpty"
                              Perhaps it should be called "TransactionLogDelegate" or something? :)

                              • 12. Re: Messaging Core Transactional Behavior
                                Adrian Brock Master

                                 

                                "adrian@jboss.org" wrote:

                                To play "Humpty Dumpty"
                                Perhaps it should be called "TransactionLogDelegate" or something? :)


                                But even then there will be additional methods for it to implement when it is
                                acting as a "live source" of data for the lazy behaviour.

                                // Sketch only: one Hibernate-backed class playing both roles.
                                public class HibernateLazyDelegate implements TransactionLogDelegate, LazyLoadingDelegate
                                {
                                }
                                


                                If you even want to create a public/pluggable LazyLoading abstraction?

                                • 13. Re: Messaging Core Transactional Behavior
                                  Ovidiu Feodorov Master

                                  We seem to converge.

                                  This is the summary of the thread, so far:

                                  A channel (org.jboss.messaging.core.Channel) is a transactional mechanism that receives and reliably forwards messages to arbitrary receivers. The channel's responsibilities are to provide atomicity (the channel atomically accepts or discards a set of messages), isolation (it is thread-safe) and, most important, durability for reliable messages, which ensures that a reliable message can be recovered even if the channel instance crashes. Ideally, a channel should implement this behavior in a smart way, i.e. not wasting extra effort to ensure recoverability for non-reliable messages, for performance's sake.

                                  A channel is responsible for internally managing its state. Because of that, it can apply whatever optimizations it finds fit (spill over, lazy loading, etc., more about this later). The channel's internal component responsible with maintaining state is named, naturally enough, state (org.jboss.messaging.core.State).

                                  The channel state only maintains message references. A message reference is a lightweight representative of the message that has a much smaller footprint because it does not embed the message payload. The message reference does, however, maintain message attributes such as the reliability flag, timestamp, redelivery count, etc. and the full set of headers. The rationale behind this arrangement is to allow the channel to free up memory by dumping message bodies to disk when it runs tight on resources, while still maintaining the message representatives (live data) in memory.
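As an illustration of the message/reference split, here is a minimal sketch. `SketchMessage` and `SketchMessageReference` are hypothetical classes whose fields follow the attributes listed above; they are not the real Message and MessageReference types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical full message: attributes, headers and the (potentially large) payload.
class SketchMessage {
    final String id;
    final boolean reliable;
    final long timestamp;
    final Map<String, String> headers;
    final byte[] payload;

    SketchMessage(String id, boolean reliable, long timestamp,
                  Map<String, String> headers, byte[] payload) {
        this.id = id;
        this.reliable = reliable;
        this.timestamp = timestamp;
        this.headers = new HashMap<>(headers);
        this.payload = payload;
    }
}

// Hypothetical lightweight representative: everything except the payload.
class SketchMessageReference {
    final String messageId;
    final boolean reliable;
    final long timestamp;
    final Map<String, String> headers;
    int redeliveryCount;

    private SketchMessageReference(SketchMessage message) {
        this.messageId = message.id;
        this.reliable = message.reliable;
        this.timestamp = message.timestamp;
        this.headers = new HashMap<>(message.headers);
        this.redeliveryCount = 0;
    }

    // The payload is deliberately not copied, so the body can live elsewhere
    // (in a cache or on disk) while the reference stays in the channel state.
    static SketchMessageReference of(SketchMessage message) {
        return new SketchMessageReference(message);
    }
}
```

Because the reference carries the reliability flag and the headers, the channel state can keep making decisions about a message whose body is no longer held in memory.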

                                  The channel state is maintained in memory. However, the state implementation must have access to transactional storage for at least four reasons:

                                  1. It must maintain a transactional log to write ahead message references as "prepared" and then switch them to a "committed" state when the core transaction commits.
                                  2. It must persist reliable messages (message references from the state and their corresponding message bodies) to ensure recoverability.
                                  3. It may need to persist non-reliable message bodies and dump them from memory in case of memory shortage.
                                  4. Ultimately, it may need to persist even parts of the state itself (message references, reliable and non-reliable) if faced with large quantities of messages. This would allow a channel to maintain an "unlimited" number of messages, regardless of memory constraints (obviously provided that it has access to "unlimited" storage).

                                  The channel and the channel state access the transactional storage via a soon-to-be-renamed abstraction called the persistence manager. The persistence manager implementation maintains no state. Whatever the final name of this abstraction will be, it must offer an interface that gives access to the functionality described above: transactional log, mandatory storage/retrieval of reliable message bodies (backup data), on-demand storage/retrieval of non-reliable message bodies (spillover data) and storage/retrieval of channel state for eager dumping/lazy loading (live data).

                                  The current implementation spreads this behavior across two interfaces (org.jboss.messaging.core.PersistenceManager and org.jboss.messaging.core.MessageStore). This situation will change. Adrian started the process by proposing two new interfaces (TransactionLogDelegate and LazyLoadingDelegate). How this component should actually look is the subject of http://www.jboss.org/index.html?module=bb&op=viewtopic&t=73301 which I will soon return to.

                                  However, I won't close this thread until everybody agrees with the summary that I just laid out (or another one resulting from additional iterations).