17 Replies Latest reply on May 14, 2008 3:40 AM by timfox

    JBoss 2.0 Thread model

    timfox

      OOOPS! Sorry I accidentally deleted this thread while doing some forum housekeeping! :)

      Here it is in an ugly form I'm afraid:

      jmesnil said:


      JBM 2 is using MINA as its remoting API.
      All JMS connections and their children (session, producer, consumer) to the same JMS server use a single MINA NioSocketConnector (i.e. one single TCP socket).
      This means that all the JMS resources associated to the same JMS server share the same MINA IoSession.
      In turn, this means that the messages sent by 2 JMS Sessions created from the same JMS Connection are ordered globally. This is a performance killer and less than optimal: ordering needs to be ensured at the JMS Session level only.

      One potential solution is to use 1 MINA IoSession per JMS Session + 1 MINA IoSession for each JMS connection.
      This solution is simple to implement.
      However this means having many TCP sockets open between a JMS client and a server...

      The other solution is to introduce a customized ExecutorFilter to process MINA messages in other threads than the I/O process thread and still ensure that messages associated to a JMS Session and its children are treated by the same thread.

      How to implement this ExecutorFilter?

      The filter will have a ThreadFactory.
      When a MINA message is sent/received, we look for a thread associated to the message target (AbstractPacket.targetID).
      We need to correlate the session ID based on this targetID (the target can be a JMS Producer or Consumer)
      Once we have the JMS Session ID, we use it as a key to look up a Thread in a Map.
      If there is such a thread, we use it.
      Otherwise, we get a new thread and associate it to the JMS Session ID.
      We can use a WeakValueHashMap (from jboss-common) with the targetID as the key and the thread as the weak value.

      Thread life cycle

      We get a new thread when we see a new JMS Session ID.
      However, we do not know a priori when all messages targeted to a JMS Session have been sent or received.
      We could parse every message to find the one that closes the JMS Session, but parsing every message just to identify these few is not a good idea.
      Instead, the threads can have a keep-alive time (e.g. 60 seconds) before being reclaimed (and thus being removed from the WeakValueHashMap).
      If a thread associated to a JMS Session ID is reclaimed while the Session is still open, we create a new one next time we receive a message with this ID.
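
A minimal sketch of the keep-alive idea above, using only java.util.concurrent. It is an illustration, not JBM or MINA code: the class name `PerKeyExecutors` and its methods are hypothetical, and it differs from the proposal in one respect that matters: the idle thread is reclaimed after the keep-alive time, but the map entry itself stays (a plain ConcurrentHashMap stands in for the WeakValueHashMap).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: one single-threaded executor per key (e.g. a JMS Session ID).
final class PerKeyExecutors {
    private final ConcurrentMap<String, ThreadPoolExecutor> executors =
            new ConcurrentHashMap<String, ThreadPoolExecutor>();
    private final long keepAliveSeconds; // must be > 0 for allowCoreThreadTimeOut

    PerKeyExecutors(long keepAliveSeconds) {
        this.keepAliveSeconds = keepAliveSeconds;
    }

    // Returns the executor for this key, creating it the first time the key is seen.
    Executor forKey(String key) {
        return executors.computeIfAbsent(key, k -> {
            // At most one worker thread, so tasks for one key run in submission order.
            ThreadPoolExecutor e = new ThreadPoolExecutor(
                    1, 1, keepAliveSeconds, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>());
            // Let the single thread die when idle; a new one is started on the next task.
            e.allowCoreThreadTimeOut(true);
            return e;
        });
    }

    // Stops accepting new tasks on all executors; queued tasks still run.
    void shutdown() {
        for (ThreadPoolExecutor e : executors.values()) {
            e.shutdown();
        }
    }
}
```

With a 60 second keep-alive, `new PerKeyExecutors(60).forKey(sessionId).execute(task)` would give the behaviour described: the thread goes away when the session is quiet and comes back on the next message.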

      We also need to have threads to execute JMS Connection messages.

      ID correlation

      For now, each JMS resource has a random UUID.
      We need to be able to deduce a JMS Session ID from the ID of one of its children (Producer, Consumer, QueueBrowser).
      Their IDs could be prepended by a JMS Session ID (e.g. <session ID>/<resource ID>).
      Or we can add a new attribute to the Packet interface (e.g. resourceID) which will be set either to a JMS Connection ID (for JMS Connections) or a JMS Session ID (for JMS Sessions, Producers, Consumers & QueueBrowsers).
      This resourceID will be used as the key to the WeakValueHashMap.
      The latter solution avoids parsing text for every message but adds a bit to the message size.
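
For illustration, the first option (prefixing child IDs with the session ID) could look like the sketch below. The class `ResourceIds` and its method names are hypothetical; the only things taken from the post are the `<session ID>/<resource ID>` layout and the use of UUIDs, which never contain a `/`.

```java
// Hypothetical helper illustrating the "<session ID>/<resource ID>" scheme.
final class ResourceIds {
    private ResourceIds() {
    }

    // Builds the ID of a child resource (producer, consumer, browser) of a session.
    static String childId(String sessionId, String resourceId) {
        return sessionId + "/" + resourceId;
    }

    // Recovers the session ID from either a session ID or a child ID.
    static String sessionIdOf(String id) {
        int slash = id.indexOf('/');
        return slash < 0 ? id : id.substring(0, slash);
    }
}
```

This is the per-message text parsing that the second option (a separate resourceID attribute on the packet) trades for a slightly larger message.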

      wdyt?


        • 1. Re: JBoss 2.0 Thread model
          timfox

          dmlloyd said:


          "jmesnil" wrote : All JMS connections and their children (session, producer, consumer) to the same JMS server use a single MINA NioSocketConnector (i.e. one single TCP socket).

          This is very similar to what I'm doing with Remoting 3 ("R3"), and I've run into the same issues as you.

          In R3, the protocol consists of message datagrams tunneled over a single TCP connection. We have the additional issue that certain messages must be well-ordered with respect to certain other messages (though this isn't always the case; some messages can be processed in any order). In order to maximize throughput, we want to process messages in parallel whenever it's OK to do so.

          To achieve this, I've done two things. First, every incoming message is assigned to an Executor. Second, I've got a simple class called OrderedExecutorFactory which can produce Executor instances which execute in order with respect to that Executor, but in parallel with respect to other ordered Executors and unordered tasks. (see http://tinyurl.com/33b4gq)

          As an aside, by using Executors rather than a ThreadFactory, I can push the responsibility of maintaining a thread pool off to someone else; yet a simple implementation is still available in the standalone case by doing Executors.newCachedThreadPool() for example.

          Since the ordering of messages is a detail of the protocol, the messages have to be at least partially decoded before they can be assigned to an Executor. Therefore I don't bother with IoFilters or anything like that - I've found that it's simpler to just have an IoHandler and do all the message decoding and Executor delegation within the handler.

          I also hope to add support for SCTP in the future, which allows multiple independent streams within a single connection (among other cool features), reducing this "head-of-line" contention issue. However, while I've heard that there is an implementation in the works at Sun, it's still probably a ways off yet (unless one wants to use APR, which adds other dependencies as well).

          One thing I was thinking about as a possible performance enhancement would be to have more than one connection - not one per session necessarily, but maybe like 2-4 connections, using some type of load-balancing among them. This should give SCTP-like behavior (well, to a limited extent) with reduced head-of-line contention, but without a ridiculous explosion of connections.


          • 2. Re: JBoss 2.0 Thread model
            dmlloyd

            "accidentally", yeah right :-)

            • 3. Re: JBoss 2.0 Thread model
              timfox

              Jeff - I think it's best for your class to have no knowledge of what a "jms session" is.

              Instead, you can add a new attribute on Packet - executorID which just takes an int value (say).

              This value is used in the MINA handler to look up the "thread" (or executor or however it is done) for processing the packet after receipt.

              To your class it's just an id - it knows nothing about jms session ids.

              Before sending a packet you just set this value at our level. You make sure you set the same value for all requests on the same core session, and if the requests are on a connection or other object you can use its id.
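
A rough sketch of that separation, under stated assumptions: `TaggedPacket` and `PacketDispatcher` are made-up names (the real Packet interface has more to it), and how the Executor for a new ID is created is left to a factory supplied by the caller. The dispatcher only ever sees an int.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical, reduced view of a packet: just the opaque executor ID.
interface TaggedPacket {
    int getExecutorID();
}

// Hypothetical dispatcher that would be called from the MINA handler after receipt.
final class PacketDispatcher {
    private final Map<Integer, Executor> executors =
            new ConcurrentHashMap<Integer, Executor>();
    private final Function<Integer, Executor> executorFactory;

    PacketDispatcher(Function<Integer, Executor> executorFactory) {
        this.executorFactory = executorFactory;
    }

    // Runs the processing for this packet on the executor registered for its ID,
    // creating that executor the first time the ID is seen.
    void dispatch(TaggedPacket packet, Runnable processing) {
        Executor executor =
                executors.computeIfAbsent(packet.getExecutorID(), executorFactory);
        executor.execute(processing);
    }
}
```

The sending side stays responsible for choosing the value, for example the same int for every request on one core session.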

              • 4. Re: JBoss 2.0 Thread model
                timfox

                 

                "dmlloyd" wrote:

                One thing I was thinking about as a possible performance enhancement would be to have more than one connection - not one per session necessarily, but maybe like 2-4 connections, using some type of load-balancing among them. This should give SCTP-like behavior (well, to a limited extent) with reduced head-of-line contention, but without a ridiculous explosion of connections.


                For JBM this is not really an issue. Our invocations are all guaranteed to take no longer than some small finite amount of time, unlike remoting, where it depends on the remoting user.

                • 5. Re: JBoss 2.0 Thread model
                  timfox

                  I have seen this problem come up many times in different guises.

                  What we're really talking about here is some kind of "sticky" thread pool executor.

                  Where you have a pool of threads to service requests, but consecutive invocations can have an affinity for a particular thread depending on some key (this corresponds to the "session id").

                  I'm kind of surprised j2se doesn't already contain something that can do this.
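
One simple way to get this kind of key affinity with only the JDK is striping: hash the key onto a fixed array of single-threaded executors. This is a sketch of that swapped-in technique, not something proposed in the thread, and `StickyExecutor` is a hypothetical name. Its known drawback is that unrelated keys landing on the same stripe queue up behind each other.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical "sticky" executor: tasks with the same key always run on the same thread.
final class StickyExecutor {
    private final ExecutorService[] stripes;

    StickyExecutor(int threadCount) {
        stripes = new ExecutorService[threadCount];
        for (int i = 0; i < threadCount; i++) {
            stripes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Equal keys hash to the same stripe, so their tasks run in submission order.
    void execute(Object key, Runnable task) {
        int index = Math.floorMod(key.hashCode(), stripes.length);
        stripes[index].execute(task);
    }

    // Stops accepting new tasks; tasks already queued still run.
    void shutdown() {
        for (ExecutorService stripe : stripes) {
            stripe.shutdown();
        }
    }
}
```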

                  • 6. Re: JBoss 2.0 Thread model
                    timfox

                    One way of implementing this is having a set of BlockingQueues one for each "id" value.

                    Then you have a pool of workers that poll on these queues.

                    This is basically how a ThreadPoolExecutor works, but instead of a single queue feeding it, you have many queues.

                    I'd take a look at what Trustin has done in org.apache.mina.filter.executor.OrderedThreadPoolExecutor (which works similarly to how I've described)

                    You want to avoid having one thread per id value since you may have many thousands of id values.

                    • 7. Re: JBoss 2.0 Thread model
                      timfox

                      You can also add optimisations like caching the last blocking queue used and id seen to avoid a look up every time, since in many cases consecutive packets will be for the same value of "id".
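
A small sketch of that caching optimisation. It assumes the lookup is only ever called from one thread (for example the I/O thread handing packets off), since the cached fields are not synchronized. `QueueLookup` and the `mapLookups` counter are hypothetical and exist only to make the effect visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical per-id queue registry with a one-entry cache. Not thread-safe:
// queueFor is assumed to be called from a single thread.
final class QueueLookup {
    private final Map<Integer, BlockingQueue<Runnable>> queues =
            new HashMap<Integer, BlockingQueue<Runnable>>();
    private int lastId;
    private BlockingQueue<Runnable> lastQueue; // null until the first lookup
    private int mapLookups;                    // counts cache misses, for illustration

    BlockingQueue<Runnable> queueFor(int id) {
        if (lastQueue != null && id == lastId) {
            return lastQueue; // same id as the previous packet: skip the map
        }
        mapLookups++;
        BlockingQueue<Runnable> queue =
                queues.computeIfAbsent(id, k -> new LinkedBlockingQueue<Runnable>());
        lastId = id;
        lastQueue = queue;
        return queue;
    }

    int mapLookups() {
        return mapLookups;
    }
}
```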

                      • 8. Re: JBoss 2.0 Thread model
                        dmlloyd

                         

                        "timfox" wrote:
                        I have seen this problem come up many times in different guises.

                        What we're really talking about here is some kind of "sticky" thread pool executor.

                        Where you have a pool of threads to service requests, but consecutive invocations can have an affinity for a particular thread depending on some key (this corresponds to the "session id").

                        I'm kind of surprised j2se doesn't already contain something that can do this.


                        Yeah - well technically it doesn't have to be the same thread. The executor just has to guarantee that B "happens-after" A, in the same thread or a different one. I agree on the j2se statement (although it's really a trivial amount of code to implement; see the above link).

                        • 9. Re: JBoss 2.0 Thread model
                          dmlloyd

                           

                          "timfox" wrote:
                          One way of implementing this is having a set of BlockingQueues one for each "id" value.

                          Then you have a pool of workers that poll on these queues.

                          This is basically how a ThreadPoolExecutor works, but instead of a single queue feeding it, you have many queues.

                          I'd take a look at what Trustin has done in org.apache.mina.filter.executor.OrderedThreadPoolExecutor (which works similarly to how I've described)

                          You want to avoid having one thread per id value since you may have many thousands of id values.


                          This is essentially what I've done. My OrderedExecutorFactory is much simpler than the one in MINA (there's some IoSession-specific stuff in theirs so you can't really use it verbatim; mine is generic).

                          Since the link seems to have gotten "killed" in your "editing" session, here it is again: http://tinyurl.com/33b4gq

                          • 10. Re: JBoss 2.0 Thread model
                            timfox

                             

                            "david.lloyd@jboss.com" wrote:

                            This is essentially what I've done. My OrderedExecutorFactory is much simpler than the one in MINA (there's some IoSession-specific stuff in theirs so you can't really use it verbatim; mine is generic).

                            Since the link seems to have gotten "killed" in your "editing" session, here it is again: http://tinyurl.com/33b4gq


                            That's the kind of thing. Although I'd use a ConcurrentLinkedQueue or LinkedBlockingQueue to prevent having to synchronize on the whole list.

                            Also I'd caution against the "execute on current thread if queue is empty" optimisation since if the current thread is the thread that does IO from the selector this can keep that thread tied up for a long time, during which time it can't service any more IO requests.

                            • 11. Re: JBoss 2.0 Thread model
                              timfox

                              David - can you provide an example of how your OrderedExecutorFactory would be used in practice?

                              • 12. Re: JBoss 2.0 Thread model
                                dmlloyd

                                It's pretty simple. Every message is associated with an Executor. You have a "main" Executor, which is typically a plain thread pool, which does any work that can be done in an unordered fashion. New messages that aren't related to any previous messages would be processed by this executor directly. Then you create a single OrderedExecutorFactory with your "main" Executor as its parent.

                                Now if you know that all messages tagged with some ID must be processed in order, you get a new Executor from the factory and associate it with this ID. Then all your message processing for that ID is done sequentially. If there are some asynchronous things that can be done, you can always submit another task to the main Executor.

                                "timfox" wrote:
                                Also I'd caution against the "execute on current thread if queue is empty" optimisation since if the current thread is the thread that does IO from the selector this can keep that thread tied up for a long time, during which time it can't service any more IO requests.


                                Not sure what you're getting at here. There's no such optimization - basically the code says, "if this is the first time a task was added to the queue, then this executor is not running, so run it", and on removal, "if this was the last task removed from the queue, no need to run anymore".

                                As far as using a concurrent queue - perhaps you could squeeze out a tiny bit more performance, but unless it shows up as a blip on a profiler, I'm not going to bother because frankly I doubt that there will be significant (unwanted) contention for the lock. (Remember that the purpose is to serialize access, after all)
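
For readers without access to the linked code, here is a minimal sketch of an ordered executor reconstructed from the description in this post. It is not the actual OrderedExecutorFactory source: the class name `OrderedExecutor` and all details are assumptions. It follows the two rules quoted above (start running when the first task is queued, stop when the queue drains) and uses a plain lock on the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

// Hypothetical ordered executor: tasks run one at a time, in submission order,
// on whatever thread the parent executor provides.
final class OrderedExecutor implements Executor, Runnable {
    private final Executor parent;
    private final Queue<Runnable> tasks = new ArrayDeque<Runnable>();
    private boolean running; // guarded by the lock on tasks

    OrderedExecutor(Executor parent) {
        this.parent = parent;
    }

    @Override
    public void execute(Runnable task) {
        boolean start;
        synchronized (tasks) {
            tasks.add(task);
            start = !running; // not running yet, so this call must start the drain loop
            running = true;
        }
        if (start) {
            // Never run on the caller's thread; a rejection by the parent is not handled here.
            parent.execute(this);
        }
    }

    @Override
    public void run() {
        for (;;) {
            Runnable task;
            synchronized (tasks) {
                task = tasks.poll();
                if (task == null) {
                    running = false; // queue drained: no need to run anymore
                    return;
                }
            }
            try {
                task.run();
            } catch (RuntimeException e) {
                // Report the failure but keep draining, so one bad task cannot stall the rest.
                Thread t = Thread.currentThread();
                t.getUncaughtExceptionHandler().uncaughtException(t, e);
            }
        }
    }
}
```

In use, the "main" Executor would be something like `Executors.newCachedThreadPool()`, with one `new OrderedExecutor(main)` kept per ID whose messages must be processed in order. Consecutive tasks may run on different pool threads, but the lock on the queue still guarantees that each task happens after the previous one.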

                                • 13. Re: JBoss 2.0 Thread model
                                  timfox

                                   

                                  "david.lloyd@jboss.com" wrote:

                                  Not sure what you're getting at here. There's no such optimization - basically the code says, "if this is the first time a task was added to the queue, then this executor is not running, so run it", and on removal, "if this was the last task removed from the queue, no need to run anymore".


                                  My mistake - I read the code too fast :)


                                  • 14. Re: JBoss 2.0 Thread model
                                    trustin

                                    We also need to think from the MINA transport standpoint. MINA could provide a meta-transport which maps an arbitrary number of connections onto an arbitrary number of sessions. Of course, there should be an underlying protocol implementation that manages the mapping, but I think it's trivial.

                                    Once such a transport is provided, both Remoting and Messaging could simplify their implementation. WDYT?
