7 Replies Latest reply on Sep 14, 2006 1:56 PM by tom.elrod

Have remoting support a non invocation based model

timfox Sep 12, 2006 6:22 PM

Hi Tom et al

As we move further ahead with JBoss Messaging, we're going to need to provide a non blocking approach to processing requests on the server.

I.e. on the server side:

Read some data (non blocking) from the socket and pass it to JBM to process it.

Some time later (if the request requires a response), and probably on a different thread, JBM needs to write some data onto the socket.

The reads and writes need to be decoupled, so we can provide SEDA or related style processing models.

Currently, remoting seems to exclusively work with a blocking, invocation based model.

What I mean by this is that a thread blocks until the request arrives, the request is executed and a response is written back, all using the same thread. So the read and write are coupled (this is the "invocation").

Similarly on the client side, for many of our use cases we simply want to write to the socket and return immediately - we do not want to send an "invocation" and wait for a response.

The current remoting client model does not seem to support this.

I believe the request to extend the Client API to support an asynchronous send has already been logged with a JIRA task.

In order to provide the desired server side functionality, I guess we would need the remoting API to be extended so that we can write onto the socket, decoupled from the invocation.

Also we would need to provide an invocation type that didn't write a response back onto the socket.

Wdyt?

1. Re: Have remoting support a non invocation based model

ovidiu.feodorov Sep 12, 2006 6:41 PM (in response to timfox)

It is my understanding that Remoting team has this on their agenda. For the time being we are using a request-response model, and we will migrate to an asynchronous model as soon as it becomes available.

This is a local change; it won't affect our functional test suite (or it shouldn't, at least). It can only improve performance.
2. Re: Have remoting support a non invocation based model

timfox Sep 13, 2006 5:18 AM (in response to timfox)

"ovidiu.feodorov@jboss.com" wrote:
It is my understanding that Remoting team has this on their agenda.

Is there a JIRA task for this?
3. Re: Have remoting support a non invocation based model

timfox Sep 13, 2006 5:38 AM (in response to timfox)

BTW - I am not talking about the task for implementation of the APR transport.

I'm talking about extending the remoting API so it can deal with any abstract non invocation based model - concrete implementations of this could use APR, Java NBIO or whatever.

In my mind the base abstraction for this would be a bidirectional "channel", either end of which can have bytes read from/written to in a non blocking fashion.

This is similar to what ActiveIO and Apache MINA do (although MINA wraps it with a lot more abstractions).
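
To make the "channel" idea concrete, here is a minimal Java sketch. It is only an illustration: `ChannelEnd` and `InMemoryChannel` are hypothetical names, not part of remoting, ActiveIO or MINA, and two in-memory queues stand in for a real socket.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical interface: one end of a bidirectional channel.
interface ChannelEnd {
    // Queues bytes for the peer and returns immediately.
    void write(byte[] data);

    // Returns the next chunk sent by the peer, or null if nothing is
    // available yet. Never blocks.
    byte[] read();
}

// In-memory stand-in for a socket, used only to illustrate the contract.
final class InMemoryChannel {
    private final ConcurrentLinkedQueue<byte[]> aToB = new ConcurrentLinkedQueue<>();
    private final ConcurrentLinkedQueue<byte[]> bToA = new ConcurrentLinkedQueue<>();

    ChannelEnd endA() { return end(aToB, bToA); }

    ChannelEnd endB() { return end(bToA, aToB); }

    private static ChannelEnd end(final ConcurrentLinkedQueue<byte[]> out,
                                  final ConcurrentLinkedQueue<byte[]> in) {
        return new ChannelEnd() {
            public void write(byte[] data) { out.add(data.clone()); }
            public byte[] read() { return in.poll(); }
        };
    }
}
```

Either end can write without waiting for the other to read, and a read with nothing pending returns null rather than blocking, which is the decoupling asked for above.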
4. Re: Have remoting support a non invocation based model

tom.elrod Sep 13, 2006 3:35 PM (in response to timfox)

Below is list of possible scenarios for making remote calls through remoting.

Legend:
thread blocks = --|
thread returns = -->

1. synchronous call

caller thread -- remoting client --| -- NETWORK -- pooled processing thread -- handler

Calling thread goes through remoting client call stack till makes network call and blocks for response. The pooled processing thread will call on the handler, get the response from the handler and write it back to the network where will be picked up by the blocking caller thread.

2. asynchronous call - client side

caller thread --> remoting client (worker thread) --| -- NETWORK -- pooled processing thread -- handler

Calling thread makes invocation request and returns before making network invocation. Remoting client pooled thread takes invocation and makes call over the network. From here is same as case 1, but response is just thrown away.

3. asynchronous call - server side

caller thread -- remoting client --| -- NETWORK -- pooled processing thread --> pooled async processing thread -- handler

Calling thread goes through remoting client call stack till makes network call and blocks for response. The pooled processing thread will hand off invocation to a pooled async processing thread and will return (thus unblocking the calling thread on the client). The pooled async processing thread will call on handler, get response, and throw it away.

4. non-blocking asynchronous call

caller thread -- remoting client --> -- NETWORK -- pooled processing thread -- handler

Calling thread goes through remoting client call stack till makes network call where will only write to network, but not wait (block) for server response (see http://jira.jboss.com/jira/browse/JBREM-548). The pooled processing thread will call on the handler, get the response from the handler and write it back to the network. Not sure yet what will have to be done for this implementation as don't know if will be a problem with pooled processing thread sending data back to network with no one on the other side.

Note: in the above scenarios, there is actually an accept thread on server that gets socket from server socket and passes onto a pooled processing thread and goes back to listening for next socket request. Have removed it from thread stack diagram to make easier to read.

1 - 3 are already available within remoting today. 4 is scheduled to be implemented. For 2 - 4, only getting request to server is covered. Getting response back to client will require callbacks. Also important to remember that remoting has one API that all the transport implementations support. In order to change that API for new desired behavior, all the transports must be able to support it (how it supports it is an implementation detail).

As for sending raw data, this is possible on the client side by using the Client.RAW property in the invocation metadata map (which sends only the raw payload object and does not wrap it in an InvocationRequest object). However, this will only be honored by the remoting http server (CoyoteInvoker). The other remoting servers (i.e. socket server invoker) will throw an exception when they do not get a payload of type InvocationRequest. I can change the code for the other server invokers to behave like the CoyoteInvoker so they can accept raw payload objects. The only issue with this is that you then lose all the extra remoting info stored within the remoting InvocationRequest (such as client session id and subsystem). This means you won't be able to a) have multiple handlers registered with a connector, as there is no way to determine which subsystem to route the call to, or b) determine within the handler which client made the call based on the InvocationRequest passed. You would also need to make sure not to use Client.RAW with the addCallbackListener() method, or you will lose the client session id and have no way to set up callbacks on the server side. When using RAW for http, we currently work around this by using data from the http header to provide the client session id and subsystem.

The same argument goes for the server response being of type InvocationResponse, in that it contains extra metadata that can be used on the client side. However, I can change the other transports to behave like the http server (in that if the request payload is not of type InvocationRequest, then it will not wrap the handler's response object in an InvocationResponse).
5. Re: Have remoting support a non invocation based model

starksm64 Sep 13, 2006 6:25 PM (in response to timfox)

At some level we should align with the java.util.concurrent.Executor/ExecutorService/Future/CompletionService/ as a simple api for asynchronous invocations. This still applies to the invocation based model.
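
As a rough illustration of what aligning with java.util.concurrent could look like, here is a small sketch. `AsyncInvocationSketch`, `invokeRemote` and `invokeAsync` are hypothetical names, and `invokeRemote` is just a local stand-in for a remote call; only the `ExecutorService`, `Callable` and `Future` types come from the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class AsyncInvocationSketch {
    // Hypothetical stand-in for a blocking remote invocation.
    static String invokeRemote(String request) {
        return "echo:" + request;
    }

    // Submits the invocation to the pool and returns at once. The caller
    // decides when (or whether) to wait on the Future.
    static Future<String> invokeAsync(ExecutorService pool, final String request) {
        return pool.submit(new Callable<String>() {
            public String call() {
                return invokeRemote(request);
            }
        });
    }

    // Waits for the result, wrapping the checked exceptions of Future.get.
    static String await(Future<String> future) {
        try {
            return future.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    static String demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> future = invokeAsync(pool, "ping");
            // The calling thread is free to do other work here.
            return await(future);
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this is still invocation based: each request has exactly one response, delivered through the Future.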

I view what Tim is asking for as more of an expansion of the transport channel abstraction spi that essentially allows for more interception of raw transport packets.
6. Re: Have remoting support a non invocation based model

timfox Sep 14, 2006 5:17 AM (in response to timfox)

"tom.elrod@jboss.com" wrote:
Below is list of possible scenarios for making remote calls through remoting.

Legend:
thread blocks = --|
thread returns = -->

1. synchronous call

caller thread -- remoting client --| -- NETWORK -- pooled processing thread -- handler

Calling thread goes through remoting client call stack till makes network call and blocks for response. The pooled processing thread will call on the handler, get the response from the handler and write it back to the network where will be picked up by the blocking caller thread.

2. asynchronous call - client side

caller thread --> remoting client (worker thread) --| -- NETWORK -- pooled processing thread -- handler

Calling thread makes invocation request and returns before making network invocation. Remoting client pooled thread takes invocation and makes call over the network. From here is same as case 1, but response is just thrown away.

If you are going to throw it away, why write it on the server side in the first place? Seems wasteful.

3. asynchronous call - server side

caller thread -- remoting client --| -- NETWORK -- pooled processing thread --> pooled async processing thread -- handler

Calling thread goes through remoting client call stack till makes network call and blocks for response. The pooled processing thread will hand off invocation to a pooled async processing thread and will return (thus unblocking the calling thread on the client). The pooled async processing thread will call on handler, get response, and throw it away.

Again - why write it in the first place?

4. non-blocking asynchronous call

caller thread -- remoting client --> -- NETWORK -- pooled processing thread -- handler

Calling thread goes through remoting client call stack till makes network call where will only write to network, but not wait (block) for server response (see http://jira.jboss.com/jira/browse/JBREM-548). The pooled processing thread will call on the handler, get the response from the handler and write it back to the network.

Again, writing the response seems pointless. Only causes extra traffic and context switches.

Not sure yet what will have to be done for this implementation as don't know if will be a problem with pooled processing thread sending data back to network with no one on the other side.

So why send it?

Note: in the above scenarios, there is actually an accept thread on server that gets socket from server socket and passes onto a pooled processing thread and goes back to listening for next socket request. Have removed it from thread stack diagram to make easier to read.

1 - 3 are already available within remoting today. 4 is scheduled to be implemented. For 2 - 4, only getting request to server is covered. Getting response back to client will require callbacks. Also important to remember that remoting has one API that all the transport implementations support. In order to change that API for new desired behavior, all the transports must be able to support it (how it supports it is an implementation detail).

If the API is extended, surely existing transports are unaffected; they can throw UnsupportedOperationException for any new methods, which won't be called by old user code anyway.
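
A sketch of what I mean, with made-up names (`TransportSketch`, `LegacyTransport`, `NonBlockingTransport` and `sendOneWay` are hypothetical and are not the real remoting classes or methods):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class, not the actual remoting transport API.
abstract class TransportSketch {
    // Existing invocation-style operation that every transport implements.
    abstract String invoke(String request);

    // Newly added operation. Transports that predate it inherit this
    // default and reject the call.
    void sendOneWay(String request) {
        throw new UnsupportedOperationException(
                "one-way send not supported by " + getClass().getSimpleName());
    }
}

// An existing transport: compiles and behaves exactly as before.
final class LegacyTransport extends TransportSketch {
    String invoke(String request) {
        return "reply:" + request;
    }
}

// A newer transport that opts in to the added operation.
final class NonBlockingTransport extends TransportSketch {
    final List<String> sent = new ArrayList<>();

    String invoke(String request) {
        return "reply:" + request;
    }

    @Override
    void sendOneWay(String request) {
        sent.add(request); // a real transport would write to the socket here
    }
}
```

Old user code only ever calls invoke(), so it never sees the exception. The cost is that callers of the new method have to know which transport they are talking to.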

I guess I should clarify where we are coming from here.

In messaging we need to provide very high throughput, in a different league to your normal ejb installation. We're talking up to 100s of thousands of messages per sec (depending on network) and these are 1 or 2K messages.

When we are benchmarked against our competitors it doesn't really matter how much our core code is optimised if we aren't efficiently handling the network transport. This is where we will be killed.

Any extra reads or writes, or threads blocking when they don't need to, will contribute to that.

A request-response model is great for RPC style usage patterns, but IMHO doesn't really suit what we need in messaging.

Also in the future we probably need to provide wire format compatibility with other protocols so our requirements are very specific:

For our socket transport we need a single TCP connection that can be read from and written to in a non blocking fashion from both ends.

On the server side we want to do something like the following:

1. Data is read (non blocking) from the channel by the acceptor thread and work is handed off to a worker.
2. The work may be passed between one or more worker threads, each of which is specialised for a particular type of work. Each worker thread basically goes around in its own loop. This is basically a SEDA style model and gives us great throughput and scalability w.r.t. number of concurrent "requests" since there is no thread per request and there are far fewer context switches. We already have the SEDA machinery in place in JBM; the last piece of the puzzle that is missing is the support from remoting for the non blocking functionality.
3. For some incoming data it may be necessary to write some outgoing data back to the socket. Note this is done on a completely different thread to the acceptor thread, and the acceptor thread may have served much more incoming data in the meantime.
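
The steps above can be sketched roughly as follows. This is not the JBM SEDA machinery, just a toy with hypothetical names (`StagedPipelineSketch`, `startStage`) in which each stage is a thread looping over its own queue and the final queue stands in for the socket write.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class StagedPipelineSketch {
    // Sentinel telling a stage to shut down and pass the signal on.
    // (A real system would not reserve a payload value for this.)
    private static final String STOP = "__stop__";

    // Starts one stage: a thread that takes items from its input queue,
    // does its piece of work, and hands the result to the next queue.
    static Thread startStage(final String name,
                             final BlockingQueue<String> in,
                             final BlockingQueue<String> out) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        String item = in.take();
                        if (STOP.equals(item)) {
                            out.put(STOP);
                            return;
                        }
                        out.put(item + ">" + name);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Pushes one piece of "incoming data" through two stages and returns
    // what would be written back to the socket.
    static String run(String incoming) {
        BlockingQueue<String> readQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> workQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> writeQueue = new LinkedBlockingQueue<>();
        startStage("decode", readQueue, workQueue);
        startStage("route", workQueue, writeQueue);
        try {
            readQueue.put(incoming); // step 1: acceptor hands off the data
            readQueue.put(STOP);
            return writeQueue.poll(5, TimeUnit.SECONDS); // step 3: outgoing data
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

The thread that puts data on the first queue never waits for the result, and the thread that produces the outgoing data is not the one that read the incoming data.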

It's also crucial to us that we only use a single TCP connection and concurrent requests are multiplexed over that -i.e. we don't want a socket pool as is currently the case with the socket transport.

Actually, if the channel abstraction is bidirectional then multiplexing becomes straightforward too. You simply need to wrap the data requests and responses in a packet with a header identifying the "logical" connection and correlate them on receipt.
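
For illustration only, here is one way such a packet could be framed. The layout (4-byte logical connection id, 4-byte payload length, then the payload) and the names `MuxFrameSketch`, `encode` and `decode` are assumptions made for this sketch, not an existing wire format.

```java
import java.nio.ByteBuffer;

final class MuxFrameSketch {
    // A decoded packet: which logical connection it belongs to, plus data.
    static final class Frame {
        final int connectionId;
        final byte[] payload;

        Frame(int connectionId, byte[] payload) {
            this.connectionId = connectionId;
            this.payload = payload;
        }
    }

    // Assumed layout: [int connectionId][int length][payload bytes].
    static byte[] encode(int connectionId, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(8 + payload.length);
        buf.putInt(connectionId).putInt(payload.length).put(payload);
        return buf.array();
    }

    // Decodes one frame from the buffer's current position. Returns null,
    // leaving the position untouched, if a whole frame has not arrived yet.
    static Frame decode(ByteBuffer buf) {
        if (buf.remaining() < 8) {
            return null;
        }
        buf.mark();
        int connectionId = buf.getInt();
        int length = buf.getInt();
        if (length < 0) {
            throw new IllegalStateException("negative frame length: " + length);
        }
        if (buf.remaining() < length) {
            buf.reset(); // wait for the rest of the payload
            return null;
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return new Frame(connectionId, payload);
    }
}
```

The receiver would look up the connection id in a map of logical connections and hand the payload to whichever one matches, so many logical connections can share the one TCP connection.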

I don't think it would be hard to write such a transport (in fact some tools such as Apache MINA make it almost trivial - although I am not sure we should be using that library) and we in JBM would love to do so and contribute it back to remoting.

The problem I have right now is that remoting's conceptual model of invocations seems so far removed from the model we require that I don't know how we could shoe-horn it in to fit.

There is some analogy to the servlet API here. The servlet API was designed a long time ago when everyone was using the blocking IO api (there was no Java NBIO of course) to write server applications that had the classic thread per request, blocking on accept, thread pool model.

As we all know, since the servlet API basically assumes a request/response model it makes it very hard (impossible) to reconcile with a decoupled request/response approach. So basically any servlet applications are doomed to not really benefit from non blocking IO.

Remoting also assumes an invocation based model, so IMHO suffers from the same problems.

My personal opinion is that you guys should build on the great work you have done with remoting to date by extending the API to support the newer approach; otherwise it's going to be hard for high performance applications like ours to use it. And this will be more so going ahead, as more people throw out their blocking IO.
7. Re: Have remoting support a non invocation based model

tom.elrod Sep 14, 2006 1:56 PM (in response to timfox)

If you are going to throw it away, why write it on the server side in the first place? Seems wasteful.

Correct, is wasteful. Is this way because was quickest way to get it done and have confidence would work correctly instead of introducing new code and execution path. Can be remedied as part of JBREM-548.

If the API is extended, surely existing transports are unaffected; they can throw UnsupportedOperationException for any new methods, which won't be called by old user code anyway.

If the API is extended, the transports need to support that change in that code base (this is the point I was making). Correct about previous versions being fine. But if we add an API that is not supported by all transports, then it defeats one of the main purposes behind remoting (which is a common API regardless of how it is implemented in the different transports).

In regards to how server side works (at least for socket server invoker) you should really read http://labs.jboss.com/portal/jbossremoting/docs/guide/ch05.html#d0e1120

It's also crucial to us that we only use a single TCP connection and concurrent requests are multiplexed over that -i.e. we don't want a socket pool as is currently the case with the socket transport.

This is controlled by clientMaxPoolSize property (see http://labs.jboss.com/portal/jbossremoting/docs/guide/ch05.html#d0e1052). If only want to allow one connection per client, then can set to 1 (although think code currently allows value greater than 1, but easy to fix that).

As for blocking IO, not too difficult to update remoting so calling threads return immediately for one-way (async) calls and make it so server side throws away result before even reaching network. However, for async callbacks, this is more difficult in that more infrastructure is required as have to know how/where to make the callback and how to deliver it once there (which is what remoting callbacks are for). If you have any suggestions for improving the callback part, feel free to express them. Just remember that I need to be able to support the API/model over all the different transports.

As for multiplex channel, if want to write your own, use MINA or whatever, be my guest.