Use of JGroups shared transport in AS 5
brian.stansberry Mar 19, 2008 4:00 PM
First in a series of posts re: use of the JGroups "shared transport" in AS 5 instead of the JGroups multiplexer. There's been some discussion of this on the public jbosscache-dev mailing list, and too many private discussions, but this needs to go in front of a wider audience.
JIRA for this is http://jira.jboss.com/jira/browse/JBAS-5329
Shared JGroups Resources
The purpose of both the multiplexer and the shared transport is to make it possible for different services that need to use JGroups to share some of the resources JGroups uses. In AS 4 and earlier, each clustered service needed to open its own JGroups channel; no resources were shared. This has become a bigger and bigger issue as the number of clustered services has grown.
The resources most desirable for sharing are:
1) Network sockets. Sharing these between services simplifies configuration and administration and saves memory (i.e. network buffers).
2) Threads. A JGroups channel creates a thread pool for passing incoming messages up to the clustered service. A pool that is shared across services can more effectively manage the number of threads in use.
The JGroups multiplexer was the original way JGroups sought to provide sharable resources. Basically, an entire underlying JChannel was shared, with an adapter (Multiplexer+MuxChannel) on top that multiplexed/demultiplexed messages and passed them to the appropriate service. See http://www.jgroups.org/javagroupsnew/docs/manual/html_single/index.html#d0e2203 for details.
The shared transport was added in JGroups 2.6.2. Here the shared object is not an entire JChannel, but rather the transport protocol (UDP or TCP) that makes up the bottommost element in its protocol stack. The network sockets and the thread pool are all managed by the transport protocol, so just sharing this protocol lets us achieve the most desirable sharing. See http://www.jgroups.org/javagroupsnew/docs/manual/html_single/index.html#d0e2325 for more.
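To make the sharing model concrete, here is a minimal toy sketch, not JGroups code. Every class and method name in it is hypothetical. It assumes transports are looked up by a name, loosely modeled on the `singleton_name` transport setting described in the JGroups manual, and it treats the rest of each channel's stack as an opaque label (the stack strings are made up for illustration).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these classes are JGroups classes. */
final class SharedTransportSketch {

    /** Stands in for a transport protocol (UDP or TCP) that owns the sockets and thread pool. */
    static final class Transport {
        final String name;
        int attachedChannels; // number of channels stacked on this transport

        Transport(String name) {
            this.name = name;
        }
    }

    /** Stands in for a channel: a private upper stack on top of a possibly shared transport. */
    static final class Channel {
        final String upperStack;
        final Transport transport;

        Channel(String upperStack, Transport transport) {
            this.upperStack = upperStack;
            this.transport = transport;
            transport.attachedChannels++;
        }
    }

    private final Map<String, Transport> transportsByName = new HashMap<>();

    /** Reuses the transport registered under transportName, creating it on first use. */
    Channel createChannel(String transportName, String upperStack) {
        Transport transport = transportsByName.computeIfAbsent(transportName, Transport::new);
        return new Channel(upperStack, transport);
    }

    public static void main(String[] args) {
        SharedTransportSketch factory = new SharedTransportSketch();
        // Hypothetical stack labels; only the transport name has to match.
        Channel cache = factory.createChannel("udp", "PING:NAKACK:GMS:FLUSH");
        Channel ha = factory.createChannel("udp", "PING:NAKACK:GMS");
        System.out.println(cache.transport == ha.transport);   // true
        System.out.println(cache.transport.attachedChannels);  // 2
    }
}
```

The point of the sketch is that the two channels share only the bottom element; everything above it stays private to each channel.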
With both approaches, a key goal is that an application using JGroups does not need to know if it is using shared resources or not. The application codes to the abstract org.jgroups.Channel class' API; whether that API is implemented using a JChannel+Multiplexer+MuxChannel or by a shared transport JChannel or just by a plain JChannel should be transparent to the application.
Advantages of Shared Transport over Multiplexer
The shared transport has a number of major advantages over the multiplexer:
1) No "impedance mismatch". A number of JChannel behaviors, particularly around view management, need to be massaged/hidden from the application if the Multiplexer+MuxChannel is used. E.g. the underlying JChannel sees the group membership as {A, B, C}. But, if Service1 connects a MuxChannel on nodes A and B but not C, Service1 should get a view of just {A, B}. The Multiplexer+MuxChannel needs to have some pretty complex (and fragile) logic to resolve the impedance mismatch between what the JChannel says is the view and what the application needs to see as the view.
2) Testability. A transport protocol has a much more limited set of behaviors than a full JChannel. It also has a known and limited set of configuration options, whereas a JChannel is infinitely configurable. As a result, rigorous testing of sharing behavior is much more manageable with a transport protocol.
3) Greater configuration independence. Different services that wish to share a transport protocol need only agree on the configuration of that transport protocol. The rest of their protocol stack can be completely different. It was clear that getting agreement on a multiplexed channel's protocol stack for all AS 5 services was either not going to happen or would force a kind of lowest common denominator config.
4) FLUSH. The FLUSH protocol stops all activity on a channel for a period. This mostly happens around view changes and state transfer, and it is somewhat disruptive to the service using the channel. With the multiplexer, a FLUSH initiated by Service1 also affects Service2, Service3 and Service4, with no benefit to those other services. With a shared transport, a FLUSH on Service1's channel is transparent to the channels used by Service2, Service3 and Service4.
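The view translation described in item 1 can be sketched in a few lines. This is a toy model with hypothetical names, not the Multiplexer's actual logic. It takes the set of nodes where the service is connected as given; keeping that set consistent across the cluster as services start and stop is presumably where the real complexity lies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of the view translation a multiplexer has to perform; not JGroups code. */
final class ServiceViewSketch {

    /**
     * Derives the view one service should see: members of the channel view,
     * in channel order, restricted to nodes where that service is connected.
     */
    static List<String> serviceView(List<String> channelView, Set<String> nodesRunningService) {
        List<String> view = new ArrayList<>();
        for (String node : channelView) {
            if (nodesRunningService.contains(node)) {
                view.add(node);
            }
        }
        return view;
    }

    public static void main(String[] args) {
        // The underlying channel sees {A, B, C}; Service1 is connected on A and B only.
        List<String> channelView = List.of("A", "B", "C");
        System.out.println(serviceView(channelView, Set.of("A", "B"))); // prints [A, B]
    }
}
```

With a shared transport no such translation layer is needed: each service has its own channel, and that channel's view already contains only the nodes where the service is connected.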
Advantages of Multiplexer over Shared Transport
The one advantage of the multiplexer is improved startup times. Until a few members have joined a JChannel's group, the discovery sequence takes a few seconds. With the multiplexer there is only one JChannel so the discovery sequence only occurs once per VM. With shared transport, it occurs once per service.
I want to do some work to help mask this, particularly lazily initializing the AS's clustered caches. This will speed up the standard start time; the cost of discovery is only incurred if the user deploys something that needs the clustered caches.
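A minimal sketch of the lazy initialization idea, assuming a hypothetical `LazyService` wrapper (not an AS or JGroups API). The expensive startup step, which here stands in for connecting a channel and running discovery, is deferred until something first asks for the service.

```java
import java.util.function.Supplier;

/** Toy sketch of lazy initialization; the names are hypothetical, not AS or JGroups APIs. */
final class LazyService<T> {

    private final Supplier<T> starter; // expensive startup, e.g. connecting a channel
    private T instance;

    LazyService(Supplier<T> starter) {
        this.starter = starter;
    }

    /** Starts the service on the first call only; later calls return the same instance. */
    synchronized T get() {
        if (instance == null) {
            instance = starter.get();
        }
        return instance;
    }

    synchronized boolean isStarted() {
        return instance != null;
    }

    public static void main(String[] args) {
        int[] discoveryRuns = {0};
        LazyService<String> cache = new LazyService<>(() -> {
            discoveryRuns[0]++; // stands in for the multi-second discovery sequence
            return "clustered-cache";
        });
        System.out.println(discoveryRuns[0]); // 0: nothing is paid at server start
        cache.get();
        cache.get();
        System.out.println(discoveryRuns[0]); // 1: paid once, on first use
    }
}
```

The trade-off is that the first deployment that needs the cache pays the discovery delay instead of server startup.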
Bottom Line
I think the advantages of the shared transport far outweigh the disadvantages, and intend to use it in AS 5. Bela strongly agrees. We both lack confidence in the multiplexer as a reliable solution.