10 Replies Latest reply on Oct 22, 2006 9:53 PM by ovidiu.feodorov

    Remoting - Server-side Thread-per-connection Model

    ovidiu.feodorov

      The ServerInvoker has a bounded pool of "ServerThread"s, so the number of invocations that can be handled simultaneously is limited by the invoker's maxPoolSize configuration parameter. Once a "ServerThread" is allocated to a client connection, it stays associated with that connection until the client closes the socket or a socket timeout occurs. This could lead to thread starvation in the presence of a large number of clients, and increasing the number of server threads would degrade performance because of the large number of context switches.
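
      To make the model concrete, here is a minimal sketch of a bounded thread-per-connection server (hypothetical names and numbers; the actual ServerInvoker is considerably more involved):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPerConnectionServer {

    // analogous to the invoker's maxPoolSize parameter
    private static final int MAX_POOL_SIZE = 300;

    public static void main(String[] args) throws IOException {
        // Bounded pool: at most MAX_POOL_SIZE connections are served at once;
        // further connections queue up, which is where starvation shows up.
        ExecutorService workers = Executors.newFixedThreadPool(MAX_POOL_SIZE);
        ServerSocket server = new ServerSocket(9999);
        while (true) {
            final Socket socket = server.accept();
            socket.setSoTimeout(60000); // release the worker on idle timeout
            workers.execute(new Runnable() {
                public void run() {
                    // This thread stays bound to the connection until the
                    // client closes the socket or the read times out.
                    try {
                        InputStream in = socket.getInputStream();
                        OutputStream out = socket.getOutputStream();
                        byte[] buf = new byte[8192];
                        int n;
                        while ((n = in.read(buf)) != -1) {
                            // handle the invocation and reply on this same thread
                            out.write(buf, 0, n);
                        }
                    } catch (IOException e) {
                        // timeout or disconnect frees this worker
                    } finally {
                        try { socket.close(); } catch (IOException ignored) {}
                    }
                }
            });
        }
    }
}
```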

      A non-blocking NIO model would probably be more appropriate here. Other messaging projects already use this approach.


      Tom Elrod wrote:

      So I guess it would be good to cover the scenarios and the best approaches for them. The first scenario is having a limited set (in the low hundreds) of clients that each regularly make invocations. In this scenario, optimal throughput is probably the desired behavior, and the current remoting code is designed for this. At the other end of the spectrum is having a large number of clients (in the thousands) that each make periodic requests (meaning not in a tight loop that continually makes invocations). The best approach for this is to use non-blocking i/o. The main reason is that with blocking i/o (besides running out of resources) you have a thread per request, and it will be slow as balls because of all the thread context switching for all the concurrent requests. Then there are the scenarios in between (btw, we can add config so that remoting does not hold the connection after the invocation on the server side, freeing worker threads faster; I am pretty sure this is already in jira).


      Bill wrote:

      IIRC, Tom took a lot from the PooledInvoker. The PooledInvoker had an LRU queue that closed idle connections when one was needed, so there was no starvation. Also, the PooledInvoker tried to avoid a large number of context switches by associating the client connection with a specific server thread; therefore, no thread context switches were required. Of course, this design was predicated on "keep alive" clients, so if the use case was very "keep alive" oriented, it performed really well.
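
      For illustration, the LRU behavior Bill describes could be approximated with something like this (a sketch only, not the actual PooledInvoker code):

```java
import java.io.IOException;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.Map;

// Access-ordered map: the eldest entry is the least recently used connection.
// When the pool is full and a new connection needs a slot, the idle one is
// closed instead of letting new clients starve.
public class LruConnectionPool extends LinkedHashMap<String, Socket> {

    private final int maxConnections;

    public LruConnectionPool(int maxConnections) {
        super(16, 0.75f, true); // true = access order, i.e. LRU iteration
        this.maxConnections = maxConnections;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Socket> eldest) {
        if (size() > maxConnections) {
            try {
                eldest.getValue().close(); // close the idle connection
            } catch (IOException ignored) {
            }
            return true; // and evict it from the pool
        }
        return false;
    }
}
```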


      Bill and Tom raised good issues. Where is the cut-off line? At what number of clients does an asynchronous non-blocking server-side approach become more efficient than the current model?

      Intuitively, one can imagine (and it could probably be proven very easily) that if only two clients connect to the server, the thread-per-connection model is as efficient (if not more efficient) than the asynchronous approach. That probably no longer holds true for 1000 clients.

      Moreover, the usage pattern for a typical JMS application is to open a connection and keep it open for the lifetime of the client instance (unless one uses the anti-pattern of creating a Connection for each message sent). This seems to me a "keep alive" type of client.

      We could poll our users for usage patterns; that would be interesting data.

      So, in the end, it's not a matter of beliefs, but statistics. You optimize your implementation for the most common usage pattern. If the average usage pattern of a typical JMS application is to create 10-100 clients that keep sending messages periodically over a long-running Connection, I don't think it makes sense to even think about NIO at this point. Add to this that with the new distributed destinations, you can "spread" processing across nodes, so if you have 1000 clients and 4 machines, that yields 250 clients/node.

      The "right" long-term solution would be Remoting to support both patterns and make this configurable. That'll make everyone happy.

      Bill wrote:

      I think you should make a few prototypes and bench.

      Tim wrote:

      At the end of the day, we will be benchmarked against QPid, ActiveMQ and others, and the bottom line is that it doesn't matter how wonderful our internal architecture is; we won't be able to touch them, because the benchmarks will be decided by how fast we can push bytes over sockets, and right now we will lose. Period.


      Very good points. One more reason to start benchmarking during the early implementation phase, so we won't have surprises at the end. I am totally with you both on this.

      Tim wrote:

      Much as I hate to say it, our competition has it right when it comes to their approach to remoting; actually, they all seem to do it pretty much the same way (apart from us).


      You're probably right. Do you have numbers? Maybe they're right, maybe they aren't. Would you bet your house without seeing numbers?

      Scott wrote:

      I agree that there is a conflict in terms of timeframe and existing features. I don't see any inherent conflict between an efficient messaging transport and an RPC transport; it's just an issue of correctly layering the concepts. The only question in my mind is whether there are sufficient issues in the tradeoff between performance and the overhead of a layered transport implementation. At this point I don't see that we can make that call.


      Not exactly sure what you mean by "layered transport implementation". Could you please clarify?


        • 1. Re: Remoting - Server-side Thread-per-connection Model
          tom.elrod



          The "right" long-term solution would be Remoting to support both patterns and make this configurable. That'll make everyone happy.


          Absolutely


          Not exactly sure what you mean by "layered transport implementation". Could you please clarify?


          Remoting currently views things in terms of the Object world, meaning that at the highest API level it expects to be dealing with objects. As you dive deeper into the guts of remoting, where we have transport and marshalling, the same theme still exists (although not as tightly constrained). I think what Scott is talking about is structuring the transport layer so that the lowest level deals only with raw data, then passes that data up to a higher level, where it would be converted into whatever format (an Object, for example), then pushed up.

          This way, it would be possible to just use that low-level transport layer when dealing only with raw data, instead of having the "everything is an object" type of view pushed on you by the higher-level API.

          If this is what Scott is saying (and I hope it is), then I agree.



          • 2. Re: Remoting - Server-side Thread-per-connection Model
            starksm64

             

            "tom.elrod@jboss.com" wrote:

            This way, it would be possible to just use that low-level transport layer when dealing only with raw data, instead of having the "everything is an object" type of view pushed on you by the higher-level API.

            If this is what Scott is saying (and I hope it is), then I agree.



            Yes it is. The lowest level is a raw msg that works with the asynch I/O frameworks (NIO, APR, ...), another layer is the payload format, and another layer is marshalling/unmarshalling to objects. One needs to be able to plug in both the raw msg handling and the payload format so that complete wire specifications like AMQP and IIOP (we talked about this long ago) can be handled.
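
            To make the layering concrete, it might look something like this (interface names are hypothetical, purely for illustration):

```java
import java.nio.ByteBuffer;

// Layer 1: raw message transport -- moves opaque bytes, and is the level
// that would sit directly on an async I/O framework (NIO, APR, ...).
interface RawTransport {
    void write(ByteBuffer bytes);
    void setReadHandler(ReadHandler handler); // invoked from the I/O event loop
}

interface ReadHandler {
    void bytesRead(ByteBuffer bytes);
}

// Layer 2: pluggable payload format -- frames the byte stream into discrete
// messages; a complete wire specification like AMQP or IIOP plugs in here.
interface PayloadFormat {
    ByteBuffer frame(byte[] payload);
    byte[] unframe(ByteBuffer bytes);
}

// Layer 3: marshalling -- converts payloads to and from objects, giving the
// object-oriented view only to the clients that want it.
interface Marshaller {
    byte[] marshal(Object o);
    Object unmarshal(byte[] payload);
}
```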


            • 3. Re: Remoting - Server-side Thread-per-connection Model
              timfox

               

              "scott.stark@jboss.org" wrote:

              Yes it is. The lowest level is a raw msg that works with the asynch I/O frameworks (NIO, APR, ...), another layer is the payload format, and another layer is marshalling/unmarshalling to objects. One needs to be able to plug in both the raw msg handling and the payload format so that complete wire specifications like AMQP and IIOP (we talked about this long ago) can be handled.


              I like this idea.

              For messaging we would probably just use the very lowest level and forget about the rest.

              However, is it realistic to get it implemented this way in our timescales?

              • 4. Re: Remoting - Server-side Thread-per-connection Model
                timfox

                 

                "ovidiu.feodorov@jboss.com" wrote:

                Bill and Tom raised good issues. Where is the cut-off line? At what number of clients does an asynchronous non-blocking server-side approach become more efficient than the current model?


                If we want to compete in the "enterprise" market (large telcos / banks), I suggest we need to support thousands of connections, in which case I very much doubt a blocking approach is going to work well.

                If we're happy living in the small-office / hobby-user market, then sticking with blocking might be an acceptable route IMHO.


                • 5. Re: Remoting - Server-side Thread-per-connection Model
                  timfox

                  There is also another issue here:

                  Currently the channels are architected in a SEDA-ish style.

                  All the message adding and delivery operations for a particular channel are handled by the same thread; this reduces context switches, because there aren't multiple threads blocking and unblocking to do operations on the channel.

                  Each channel has its own "event queue" where delivery or message handling operations can be deposited for execution on the queue's thread. (Actually, different queues can share the same thread, but that is not relevant to this discussion.)

                  Currently we only have a partial SEDA-ish approach, since remoting does not allow us to write back to the socket using a different thread.

                  E.g. ideally we want to do something like this (using NIO):

                  Have a selector loop which responds to NIO events on the NIO channels; then, depending on which queue the event is destined for (the event could be a send, an acknowledge or a cancel, for instance), the selector loop thread would look up the correct queue and deposit the event on that queue's event queue.

                  For those events which can be asynchronous (e.g. ack, cancel), the request eventually gets processed on the event queue thread, and that is the end of that.

                  For synchronous operations, e.g. send, the event queue thread would write the response back to the channel (non-blocking) when the send was complete.

                  This gives a very nice usage of threads, with a minimal number of forced context switches; the only ones are in transferring the event from the selector thread to the event queue thread.
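
                  A rough sketch of that selector-loop dispatch (Event, readEvent() and process() are hypothetical stand-ins for the wire decoding and channel logic):

```java
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ExecutorService;

public class SelectorDispatch {

    static class Event {
        final String queueName; // which channel/queue the event is destined for
        Event(String queueName) { this.queueName = queueName; }
    }

    private final Selector selector;
    // One single-threaded executor per queue plays the role of its "event queue"
    private final Map<String, ExecutorService> eventQueues;

    public SelectorDispatch(Selector selector, Map<String, ExecutorService> eventQueues) {
        this.selector = selector;
        this.eventQueues = eventQueues;
    }

    public void loop() throws Exception {
        while (true) {
            selector.select(); // one thread waits for events on all connections
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (!key.isReadable()) continue;
                final SocketChannel channel = (SocketChannel) key.channel();
                final Event event = readEvent(channel); // a send, ack, cancel, ...
                // The only forced context switch: hand the event from the
                // selector thread to the destination queue's event queue thread.
                eventQueues.get(event.queueName).execute(new Runnable() {
                    public void run() {
                        process(event, channel); // for a send, the response is
                                                 // written back non-blocking here
                    }
                });
            }
        }
    }

    Event readEvent(SocketChannel channel) { return new Event("queue1"); } // decoding omitted
    void process(Event event, SocketChannel channel) { /* channel logic omitted */ }
}
```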

                  Currently, since we're using blocking IO and remoting needs to write the response back on the same thread as the request, the following happens in the case of a send:

                  The server socket thread unblocks as the invocation arrives and the event is added to the queue's event queue; then the server socket thread needs to block again until the send has been processed on the event queue (it uses a future for that). Therefore we have an extra context switch.

                  If we have a lot of requests then we have a lot of threads blocking/unblocking.

                  I suspect this would have performance implications.
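
                  In code, that extra switch is essentially this pattern (a sketch only; the real implementation differs in the details):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// The server socket thread cannot hand the response off, so it submits the
// send to the channel's event queue thread and then blocks on a Future.
public class BlockingSendPath {

    private final ExecutorService channelEventQueue; // single thread per channel

    public BlockingSendPath(ExecutorService channelEventQueue) {
        this.channelEventQueue = channelEventQueue;
    }

    public byte[] handleSend(final byte[] message) throws Exception {
        Future<byte[]> result = channelEventQueue.submit(new Callable<byte[]>() {
            public byte[] call() {
                return processSend(message); // runs on the event queue thread
            }
        });
        // Extra context switch: this thread blocks until the event queue
        // thread finishes, then writes the response back itself.
        return result.get();
    }

    byte[] processSend(byte[] message) { return message; } // delivery logic omitted
}
```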

                  • 6. Re: Remoting - Server-side Thread-per-connection Model
                    ovidiu.feodorov

                     

                    Tom wrote:

                    Remoting currently views things in terms of the Object world, meaning that at the highest API level it expects to be dealing with objects. As you dive deeper into the guts of remoting, where we have transport and marshalling, the same theme still exists (although not as tightly constrained). I think what Scott is talking about is structuring the transport layer so that the lowest level deals only with raw data, then passes that data up to a higher level, where it would be converted into whatever format (an Object, for example), then pushed up.


                    How about exposing access to raw data from the top-most level, specifically to allow speed optimizations like sending fast, high-throughput acknowledgments (meaning no serialization or marshalling), something we need in Messaging, for example?
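
                    For example (a purely hypothetical API shape, just to illustrate the idea):

```java
import java.nio.ByteBuffer;

// Hypothetical: the top-level API exposes a raw path next to the usual
// object path, so acknowledgments can skip serialization entirely.
interface RemotingClient {

    // high-level path: the invocation is marshalled/unmarshalled as objects
    Object invoke(Object invocation) throws Exception;

    // fast path: pre-encoded bytes go straight to the transport layer
    void sendRaw(ByteBuffer bytes) throws Exception;
}
```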

                    Tim wrote:

                    For messaging we would probably just use the very lowest level and forget about the rest.


                    No. We also need invocations (high level). See http://www.jboss.org/index.html?module=bb&op=viewtopic&t=92954

                    Tim wrote:

                    However, is it realistic to get it implemented this way in our timescales?


                    I don't think it's that complicated. The implementation doesn't need to be perfect on the first try; we only need to get the API right.

                    Tim wrote:

                    If we want to compete in the "enterprise" market (large telcos / banks), I suggest we need to support thousands of connections, in which case I very much doubt a blocking approach is going to work well.
                    If we're happy living in the small-office / hobby-user market, then sticking with blocking might be an acceptable route IMHO.


                    How about we have both? And a huge red double-pole-service-disconnect kind of breaker saying "Enterprise" on one side and "Hobby User" on the other? :)
                    Seriously now, I don't disagree at all that the high end goes with NIO. The doubts I have are related to how NIO performs when you have just two clients that keep sending messages for a month over a "keep-alive" connection. But I think we already agreed that we need to bench.


                    • 7. Re: Remoting - Server-side Thread-per-connection Model
                      timfox

                       

                      "ovidiu.feodorov@jboss.com" wrote:

                      How about we have both? And a huge red double-pole-service-disconnect kind of breaker saying "Enterprise" on one side and "Hobby User" on the other? :)


                      Well, we could have both, but it seems like a waste of effort to me.

                      AFAIK the performance of NIO for small numbers of connections shouldn't be any different from blocking IO.

                      I don't think there's anything intrinsic to NIO that would make it slower. Perhaps the scare stories in this area are related to applications making poor use of NIO rather than NIO itself.

                      Personally, I would just concentrate on NIO.

                      • 8. Re: Remoting - Server-side Thread-per-connection Model
                        ovidiu.feodorov

                        The thing is, in the short term we have a working blocking implementation, and if we get the API right, we can replace it with NIO later.
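
                        As a sketch of what "getting the API right" could mean: a callback-style contract that a blocking transport can satisfy today (the worker thread invokes the handler) and an NIO transport can satisfy later (the selector loop dispatches it), with no change to callers. Names are hypothetical:

```java
// Hypothetical transport-neutral contract: callers never assume that a
// thread blocks waiting for the response, so blocking and non-blocking
// implementations can be swapped behind it.
interface Transport {
    void send(byte[] request, ResponseHandler handler);
}

interface ResponseHandler {
    void onResponse(byte[] response);
    void onFailure(Throwable cause);
}
```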

                        • 9. Re: Remoting - Server-side Thread-per-connection Model
                          timfox

                          Is it possible to just swap it out?

                          Surely the semantics of blocking and non-blocking would imply a different API?

                          • 10. Re: Remoting - Server-side Thread-per-connection Model
                            ovidiu.feodorov

                            This is the wiki document that contains proposed Remoting API extensions: http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossMessagingRemotingAPIExtensions

                            Once materialized, it should isolate Messaging from future Remoting implementation changes.

                            Please comment.