1 2 Previous Next 20 Replies Latest reply on Sep 1, 2005 3:21 PM by adrian.brock

    JBAS-1445 & JBAS-2156 Discussion thread

      There is already a discussion(s) somewhere in the user forums.

      Summary (from what I can remember):
      1) The expiry processing should be a plugin (i.e. what do you want to do?) rather than a hardwired DLQ style handling.
      2) To make this even work under some plugins/policies, we need to remove the expiry processing from the client code
      3) This feature needs to include basic redelivery DLQ hooks as well.

      Jira references:
      http://jira.jboss.com/jira/browse/JBAS-2156
      http://jira.jboss.com/jira/browse/JBAS-1445

        • 1. Re: JBAS-1445 & JBAS-2156 Discussion thread
          genman

           

          "adrian@jboss.org" wrote:
          There is already a discussion(s) somewhere in the user forums.

          Summary (from what I can remember):
          1) The expiry processing should be a plugin (i.e. what do you want to do?) rather than a hardwired DLQ style handling.


          It makes the most sense to have expired messages simply moved to another queue. If JBoss wants to provide some sort of logging facility (ala WebLogic's logging) for a named queue which has an MDB, that would be okay. However, the typical use case is you want to "do something", for example notify a client that their transaction ended. An example MDB should be provided as documentation.


          To make this even work under some plugins/policies, we need to remove the expiry processing from the client code


          I did a "grep" for expired, it seems like "isOutdated" is another variation of this and is used by SpyMessageConsumer. I was optimistically hoping this was taken care of already...

          It seems like correct solution, would be to nack with a special expiration NACK. (Otherwise, the server would attempt to resend the message again.) As for serialization compatibility, currently a boolean is used as the flag, a byte flag can be used instead.


          This feature needs to include basic redelivery DLQ hooks as well.


          I do think that JBoss *should* come with some sort of default DLQ "plug-in" (MDB, I would imagine) that would allow for redelivery, log before removal, etc., messages in the DLQ state.

          A typical lifecycle of a message might be:

          0) Message fails N times, goes to DLQ. Typical failure scenario is a web site down, DB offline, etc.
          1) Message in the queue get an expiration set to something reasonable (24 hours), so that the user doesn't walk away from the system for a month and have it fill up with junk.
          2) Message gets put back onto the same queue, say every 30 minutes (schedule it) so that the odds are better the message will work.
          3) Message eventually expires, it gets copied to some offline storage, a message is logged in some sort of expiration logfile.

          I'd be happy to write such a plug-in.


          • 2. Re: JBAS-1445 & JBAS-2156 Discussion thread
            ovidiu.feodorov
            • 3. Re: JBAS-1445 & JBAS-2156 Discussion thread

               

              "genman" wrote:


              To make this even work under some plugins/policies, we need to remove the expiry processing from the client code


              I did a "grep" for expired, it seems like "isOutdated" is another variation of this and is used by SpyMessageConsumer. I was optimistically hoping this was taken care of already...


              Optimistic programming ;-)

              The main reason I can think for having a DLQ for expired messages is to
              have the guarantee that the message is either delivered or it is in the DLQ.
              If the client code is just ACKing the expired message without delivering it, that
              guarantee is gone.


              It seems like correct solution, would be to nack with a special expiration NACK. (Otherwise, the server would attempt to resend the message again.) As for serialization compatibility, currently a boolean is used as the flag, a byte flag can be used instead.


              No.

              We are not going to change the wire format across versions in incompatible ways.

              If you go back to the old discussion I had before, you'll see that I said trying to NACK
              the message from the client side just leads to cpu spin when the client/server clocks
              are out-of-sync.

              I see no reason why the server can't just be used as the "standard candle"
              which gives one clock to do the expiry processing. We could also update the
              "send time" of the message on the serverside during the add().
              i.e. if the server attempts delivery, it doesn't think the message expired
              and therefore it didn't.

              • 4. Re: JBAS-1445 & JBAS-2156 Discussion thread

                There are two other problems with this patch:

                OFF-TOPIC
                I'm ignoring some of the more obvious (unintended?) breakage,
                like removing the check that a property is java identifier.
                And the fact there are no tests for the basic behaviour, let alone error conditions and edge cases.

                1) The expiry is done inside the synchronized lock of the queue. This is just going to cause the queue to stall during an expiry.
                2) There is no attempt to make the expiry an atomic process. i.e. telling the persistence manager that the expiry is an add/remove in the same internal transaction.

                On (2) the MDB DLQ processing uses a similar routine to the patch because there is no
                access to the internal atomicity of the jms server.
                And you cannot send to the DLQ inside the MDB's JTA transaction. It would defeat the point of it.

                • 5. Re: JBAS-1445 & JBAS-2156 Discussion thread
                  genman

                   

                  "adrian@jboss.org" wrote:
                  There are two other problems with this patch:

                  OFF-TOPIC
                  I'm ignoring some of the more obvious (unintended?) breakage,
                  like removing the check that a property is java identifier.
                  And the fact there are no tests for the basic behaviour, let alone error conditions and edge cases.


                  The java identifier check was moved below, not removed.

                  I also fixed the client expiration nacking. (It's also backwards compatible with older clients, the wire format is compatible.)

                  If you're (JBoss) is interested in this patch at all (hard to say, Adrian you're hardly an affectionate person), I'd be happy to add testing, and error cases, etc. The unit tests are fairly elaborate to set up.


                  1) The expiry is done inside the synchronized lock of the queue. This is just going to cause the queue to stall during an expiry.


                  This can be easily fixed, either by doing it asynchronously or expiring outside of the blocks.


                  2) There is no attempt to make the expiry an atomic process. i.e. telling the persistence manager that the expiry is an add/remove in the same internal transaction.


                  This I need help with.

                  ...I'll upload the latest version so you can tear it up again...

                  • 6. Re: JBAS-1445 & JBAS-2156 Discussion thread

                     

                    "genman" wrote:

                    If you're (JBoss) is interested in this patch at all (hard to say, Adrian you're hardly an affectionate person), I'd be happy to add testing, and error cases, etc. The unit tests are fairly elaborate to set up.


                    I didn't know affection was a qualification for a programmer.
                    People skills are, but then not everybody's perfect. :-)

                    It is a "well known" secret that the best way to get something done in open source
                    is to post a patch for a feature. The maintainer of the project will see the patch,
                    understand where it is wrong and fix it properly.

                    But, given that JBossMQ is in bug fix mode (for me) at the moment. I'm not going to do
                    that, so I'm going to help you understand all the consequences of the patch instead.

                    It doesn't help when I've already done all this analysis before for somebody
                    who then fell off the face of the earth (probably with his own private patch that he didn't contribute back), which happens a lot.

                    • 7. Re: JBAS-1445 & JBAS-2156 Discussion thread

                       

                      "genman" wrote:
                      The unit tests are fairly elaborate to set up.


                      In my book, the tests are the most important piece.

                      Not tested == doesn't work (Adrian Brock's 2nd law of programming).

                      I usually write the tests first when it makes sense,
                      otherwise how do you know the tests are actually testing the "broken" behaviour.

                      • 8. Re: JBAS-1445 & JBAS-2156 Discussion thread

                        So given that JBossMQ is only in bug fix mode,
                        what I don't want to do is have a patch for something that doesn't work
                        and I'm left to answer all the questions and fix the problems.

                        And even worse breaks stuff (including adding overhead) for
                        people that aren't even interested in this feature.

                        That's why your comment about "optimistic" or "let's change this and *hope* we don't
                        break something", doesn't work for me.

                        • 9. Re: JBAS-1445 & JBAS-2156 Discussion thread

                           

                          "genman" wrote:



                          1) The expiry is done inside the synchronized lock of the queue. This is just going to cause the queue to stall during an expiry.


                          This can be easily fixed, either by doing it asynchronously or expiring outside of the blocks.


                          2) There is no attempt to make the expiry an atomic process. i.e. telling the persistence manager that the expiry is an add/remove in the same internal transaction.


                          This I need help with.

                          ...I'll upload the latest version so you can tear it up again...


                          So the fundamental issue is that we have (in pseudo code)

                          queue.logicallyRemove(message);
                          plugin.expireMessage(message);
                          // PROBLEM HERE
                          physicallyRemove(message);
                          


                          The problem is that a crash at this point leads to the message going to the DLQ twice.
                          Once before the crash, the second after a reboot when the recovery also expires the
                          message.

                          So a better approach is:

                          Tx tx = startTx();
                          queue.logicallyRemove(message, tx);
                          plugin.expireMessage(message, tx);
                          tx.commit();
                          


                          Of course, if expireMessage doesn't understand how to do transactions,
                          e.g. it is just doing some logging
                          then it will happen twice anyway, but at least once and only once is possible.

                          • 10. Re: JBAS-1445 & JBAS-2156 Discussion thread
                            genman


                            The move is now atomic and executes in a separate thread pool. I had to play around with the various thread pools to add a configurable thread pool to the destination manager, which now will use it to handle timeouts and client processing. (For UIL2, I thought I could pool the read/write threads, but they're long-lived.) Consequently, I had to re-write TimeoutFactory, and in a number of places expose the ThreadGroup from the BasicThreadPool.

                            I added a unit test to the patch.

                            The existing expiration unit tests stay the same and work.

                            I know you're in "bug fix mode" for the current JbossMQ, but I at least wanted to come up with a useful feature I might be able to use in the very near future.

                            The JBoss Messaging project is a bit mysterious to me, I think as soon as I see persistence (say, with Hibernate), transactions, and clustering I'll take a look at it. To me, it seems somewhat glacially paced.

                            • 11. Re: JBAS-1445 & JBAS-2156 Discussion thread

                              I don't mean to be a pain, but this patch is getting way out of control.

                              I see:
                              1) A complete re-write of TimeoutFactory (so much so that there is more diff than code)
                              this is used in many places throughout JBoss, not just JBossMQ.

                              2) Changes to default configuration

                              3) Changes to public methods: SpyMessage.isOutdated() -> isExpired()
                              What happens to user code that used that method?

                              4) Changes to the wire format (EACK)
                              Again, what happens to user code that doesn't expect this, e.g. in an interceptor?

                              5) Changing to threadpooling in UIL2 (something I already tested and reverted because such a trivial change introduces problems).

                              5) The expiry patch

                              6) Anything else?

                              • 12. Re: JBAS-1445 & JBAS-2156 Discussion thread

                                Let's split this up into its incremental parts.

                                1) Remove isOutdated() processing from client side
                                2) Changes to TimeoutFactory (I'm not sure what you are doing here?
                                If TimeoutFactory is not the tool for the job, we should write/use something that is! Not change it.)
                                3) Changes/Introduction of thread pooling config in JBossMQ
                                4) The expiry patch
                                5) Changes to UIL2 to use a configured threadpool.

                                Incidently, on (5). The main reason why UIL2 uses a thread pool is because of a potential distributed deadlock relating NACKing messages when the client does a receive() and then stops/closes the receiver before it gets the message.
                                I don't know if this deadlock still exists since we recently made all UILClientIL methods asynchronous?
                                I would rather look at fixing the deadlock and removing the transport level threadpool.
                                I just haven't had time to do this.

                                • 13. Re: JBAS-1445 & JBAS-2156 Discussion thread

                                  I can see I'll have to get a lot more involved in this work.

                                  I believe you have cvs r/w correct?

                                  Let's apply the patches I mentioned above one-by-one to jboss-head.
                                  We can't "play" with jboss-4.0 at the moment, we are too close to the 4.0.3 release for
                                  this change.

                                  • 14. Re: JBAS-1445 & JBAS-2156 Discussion thread
                                    genman

                                     

                                    "adrian@jboss.org" wrote:
                                    Let's split this up into its incremental parts.

                                    1) Remove isOutdated() processing from client side
                                    2) Changes to TimeoutFactory (I'm not sure what you are doing here?
                                    If TimeoutFactory is not the tool for the job, we should write/use something that is! Not change it.)
                                    3) Changes/Introduction of thread pooling config in JBossMQ
                                    4) The expiry patch
                                    5) Changes to UIL2 to use a configured threadpool.


                                    The reason I got carried away with this was because... Basically, to process expiration asynchronously, I needed Yet Another Thread Pool. And so I though I'd have the destination manager hold on to one. Incidentally, this is was let me to change...

                                    The reason I changed TimeoutFactory was to isolate the thread pool from timeout factory. Therefore, I had to break the singleton design, and once broken, then needed a way to stop/shutdown the timers. If you apply the results of the patch, what I did was have it wrap a java.util.Timer instance. This allows the timeouts to be cleared if the destination manager is restarted.

                                    I had to segregate the execution target pools, mainly done so that the TransactionManager would have a separate thread pool than JBossMQ. So, once I added such a thing, then it made sense to clean up more at the transport layer, etc.

                                    Methods in SpyMessage (like isOutdated) should never have been public, because they're not really part of the JMS API.

                                    Sorry to make you get involved in a big feature, but when you end up fixing one thing, one other thing leads to another, etc. I'll see about getting this into CVS head at least.


                                    1 2 Previous Next