JBAS-2957 prescribes the use of JGroups as the underlying communication mechanism for HA-JNDI AutoDiscovery.
For design background, the current implementation works as follows. The NamingContext attempts to discover an HA-JNDI server via multicast request using multicast client code located in the NamingContext class. The configurable elements include the partitionName, the multicast address and port, the binding port, timeout, and binding address. The NamingContext class has hard-coded defaults that can be overridden via the client's jndi.properties file. On the server side, the multicast responder is located in DetachedHANamingService. The configuration properties are similar and are located in the cluster.xml configuration file.
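For reference, the client-side overrides described above live in the client's jndi.properties file. A minimal sketch of such a file is shown below; the jnp.discoveryXXX property names follow the convention mentioned later in this thread, but the exact key names should be verified against the NamingContext source:

```properties
java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
java.naming.factory.url.pkgs=org.jboss.naming:org.jnp.interfaces
# AutoDiscovery overrides (defaults are hard-coded in NamingContext)
jnp.partitionName=DefaultPartition
jnp.discoveryGroup=230.0.0.4
jnp.discoveryPort=1102
jnp.discoveryTimeout=5000
```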
The purpose of the redesign is to replace the multicast code with JGroups code and to allow users to override the JGroups configuration as needed. One design assumption is that we don't want to introduce direct dependencies on JGroups into the NamingContext module. Note that it will be necessary to add jgroups.jar and jbossha.jar to client runtime classpaths for clients that seek to use AutoDiscovery in a cluster environment (assuming an alternative implementation isn't used).
The new design is as follows, subject to configuration changes once JGroups multiplexing becomes available for use.
On the client side, the NamingContext will dynamically load a class implementing the org.jboss.naming.NamingDiscovery interface. The default implementation will be org.jboss.ha.jndi.HADiscovery. A new jndi.properties attribute will be available to allow for different (e.g., non-JGroups) implementations. No alternative implementation will be provided. Any jndi.properties attributes associated with AutoDiscovery (e.g., jnp.discoveryXXX) will be passed to the NamingDiscovery constructor in a Hashtable.
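The dynamic-loading step described above can be sketched as follows. Only the interface and default implementation class names come from the thread; the NamingDiscovery method signature, the property key name, and the FixedDiscovery demo class are assumptions for illustration:

```java
import java.lang.reflect.Constructor;
import java.util.Hashtable;

// Hypothetical shape of org.jboss.naming.NamingDiscovery; the real method
// signature is not shown in the thread.
interface NamingDiscovery {
    String discoverServer(long timeoutMs) throws Exception;
}

// Demo implementation standing in for org.jboss.ha.jndi.HADiscovery.
class FixedDiscovery implements NamingDiscovery {
    public FixedDiscovery(Hashtable env) {}
    public String discoverServer(long timeoutMs) { return "192.168.0.10:1100"; }
}

final class DiscoveryLoader {
    // Property key for overriding the implementation (name is an assumption).
    static final String KEY = "jnp.discoveryImplClass";
    static final String DEFAULT_IMPL = "org.jboss.ha.jndi.HADiscovery";

    static NamingDiscovery load(Hashtable<String, String> env) {
        String impl = env.getOrDefault(KEY, DEFAULT_IMPL);
        try {
            Class<?> clazz = Class.forName(impl);
            // Per the design, all jnp.discoveryXXX properties are passed to
            // the implementation's constructor in a Hashtable.
            Constructor<?> ctor = clazz.getConstructor(Hashtable.class);
            return (NamingDiscovery) ctor.newInstance(env);
        } catch (Exception e) {
            throw new RuntimeException("cannot load discovery impl " + impl, e);
        }
    }
}
```

Passing the whole environment Hashtable to the constructor keeps the NamingContext free of any compile-time dependency on the JGroups-based implementation.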
On the server side, DetachedHANamingService will start a thread running the JGroups-based server. The JGroups configuration will be configurable via cluster.xml.
One design issue is how to ensure that the HADiscovery client uses the same JGroups channel configuration that's used by the server. This will probably be accomplished by having it access DetachedHANamingServiceMBean via the MBean server. A client will also be able to override the JGroups stack on the client side via an entry in the jndi.properties file.
In the current multicast implementation, the AutoDiscovery configuration can be modified on the server side via various HAJNDI MBean attributes. When this is done, AutoDiscovery will not work unless the client adds the same changes to its jndi.properties file; otherwise, the client will issue its multicast requests using values hard-coded into the NamingContext class. Since the attributes are simple (e.g., port and address), these changes are easy to incorporate into the client's jndi.properties.
With the new implementation, the primary server side attribute is a JGroups stack configuration. On the client side, a matching stack is currently hardcoded into the client class and can be overridden by adding the stack to jndi.properties.
Since a JGroups stack placed in a properties variable is easy to misconfigure, it would be preferable to have the client use the same stack as the server (i.e., instead of the hard-coded stack) unless a specific override is made in the properties file. To accomplish this, the client would need to access the AutoDiscovery server's stack attribute via the HAJNDI MBean.
There are several issues with this alternative.
1) Performance will be affected. (needs to be measured)
2) It's not possible to access the MBean via JNDI lookup if the client doesn't configure a provider URL, since the client would then fall back to AutoDiscovery to locate the MBean, which is exactly what AutoDiscovery itself needs in order to work.
Any thoughts on whether it would be acceptable to require clients to override their JGroups stack if they override the server side configuration? Note that this task will probably be simplified when multiplexing is available and AutoDiscovery doesn't require its own stack.
Any other easy/fast way to access an MBean's attribute from a client without using JNDI?
This is a chicken-and-egg problem. We can't expect the client to contact the cluster to get the protocol stack; the purpose of AutoDiscovery is to find the cluster in the first place.
I think we need to shoot for a situation where 99.99% of the time the default hard-coded stack is used. To me this means that on the server side we shouldn't be trying to piggyback on the multiplexed channel used by the other services; that channel is by its very nature going to have a highly variable configuration, whereas the channel needed for AutoDiscovery should be very simple.
For the use cases where another protocol stack is needed, how about we add a property that specifies a URL of a protocol stack file? We would need to handle classpath resources as well. The contents of this file would have to match the protocol stack on the server side; that's just a requirement.
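The URL-or-classpath lookup proposed above could be sketched like this. The "classpath:" prefix convention and the class name are assumptions for illustration; the point is that one property value can name either an external URL or a bundled resource:

```java
import java.io.InputStream;
import java.net.URL;

// Sketch: resolve a protocol-stack location that is either a classpath
// resource (prefix convention assumed) or a regular URL (file:, http:, ...).
final class StackConfigLocator {
    static final String CLASSPATH_PREFIX = "classpath:";

    static boolean isClasspath(String location) {
        return location.startsWith(CLASSPATH_PREFIX);
    }

    static InputStream openStack(String location) throws Exception {
        if (isClasspath(location)) {
            String resource = location.substring(CLASSPATH_PREFIX.length());
            InputStream in = StackConfigLocator.class.getClassLoader()
                    .getResourceAsStream(resource);
            if (in == null)
                throw new IllegalArgumentException("resource not found: " + resource);
            return in;
        }
        return new URL(location).openStream();
    }
}
```

As noted above, the contents of the referenced file would still have to match the server-side stack; the locator only addresses where the file lives.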
From an e-mail:
I'd like to resolve the configuration issue as it's holding up completion of JBAS-2957/JBCLUSTER-43.
Is it important to expose the JGroups stack for AutoDiscovery? If I do expose the stack, users who override the server side stack in cluster-service.xml will need to provide a corresponding stack in the client's jndi.properties file. There are probably ways to facilitate this such as providing a client-side stack file that can be modified and then referenced via the properties file but it's not foolproof.
If we don't need to expose the stack, I can leave the existing AutoDiscovery attributes in place (e.g., Address, Port, BindingAddress) and just apply them to the hard-coded stack. In this scenario, it's also necessary for users to modify settings in two places but they already need to do this and the settings are simpler (e.g., port number vs. stack).
If I don't use an exposed stack, it's unclear whether I can use the multiplexer. I guess I can't use a multiplexed stack definition from the client in any case unless the stack definition is available on the client.
Looking at JBCLUSTER-43, there are 3 reasons listed for making this change:
1) existing impl has been source of bugs; JGroups discovery is well tested.
2) Allows more config options.
3) Remove code dupl.
Now, with the current impl we have something that in most use cases requires nothing at all from the end user, and in the more complex cases requires setting just 2 or 3 config options on each side (address, port, and occasionally partition name).
I don't think the benefits listed above outweigh the disadvantage of requiring a more complex config for the users who are adequately served by the existing config options. So I'm definitely -1 on something that doesn't support taking the existing properties (or the default values if not set) and using them to create a channel.
As to how to achieve benefit #2, alternate configs, I don't think there is any way to find out an alternate config from the server that doesn't defeat the purpose of autodiscovery. So that implies the end user having to provide the details of the alternate config on the client side.
When you mention providing a protocol stack file on the client side and then referencing it via jndi.properties not being foolproof, do you mean it's easy to screw up or that there are cases where it just won't work? If the former, I definitely agree and would love to find a better solution. But, OTOH people who would use this feature would likely have a more sophisticated environment and could be expected to deal with complexity.
Re: the multiplexer, I definitely want to get Bela's inputs on this. But, when I think of the use of the multiplexer on the server side, I see it being used for a shared channel for *inter-server* communications. With AutoDiscovery, we have clients communicating with servers. Very different use case and I think sharing the same channel for both is likely to cause problems. Having the ability to use a shared channel is fine as an option, but IMHO this will not be the typical use case. For a default, I'd like to see a hard-coded protocol stack where we substitute in the existing address and port properties, and use the partition name property to create the channel.
I want to be sure I understand how this is meant to work. Here is what I imagine as I write this; please advise where I'm wrong :)
Client side:
1) When discovery is needed, a JChannel is created and configured.
2) Channel connects to the group, joins the group.
3) Application uses channel to make a request to X (the coordinator??) to find out who is providing HA-JNDI.
4) Channel is disconnected, therefore leaves group.
5) Address/port of HA-JNDI server is returned to NamingContext.
Server side:
1) List of who is providing HA-JNDI is maintained using existing mechanisms.
2) Special channel is created for listening for and responding to discovery requests. It is *not* used for maintaining the list of servers providing HA-JNDI.
A question is, who does a client ask to find out who is providing HA-JNDI? The coordinator is not safe, as it is technically possible that the coordinator is another client! (Actually, we should probably configure client-side GMS so that doesn't happen). I guess if we can ensure no client will be coordinator, we can use state transfer to provide the list of servers providing HA-JNDI. If we can't rule out clients as coordinators, probably need to make an RPC call to all group members and integrate the responses.
Your description of the implementation is close. On the client side, the client multicasts the request to any server providing HA-JNDI. So it's not addressed to a coordinator and it's not going to be handled by another client process unless the process is a HA-JNDI server.
On the server side, a thread in DetachedHANamingService is listening for discovery requests. When it receives a request, it responds with its own HA-JNDI address. On the client side, the client accepts the first response and then stops waiting for further responses.
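The responder logic on the server side can be sketched independently of the transport. The type tags and wire layout below are assumptions, not the actual JBoss classes; the key property is that only request messages get a reply, so a response (or anything else) is never answered:

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch of the server-side discovery responder. A REQUEST
// yields a RESPONSE carrying this server's HA-JNDI address; any other
// message type is ignored.
final class DiscoveryResponder {
    static final byte REQUEST = 1;   // assumed type tag
    static final byte RESPONSE = 2;  // assumed type tag

    private final String hajndiAddress;  // e.g. "192.168.0.10:1100"

    DiscoveryResponder(String hajndiAddress) { this.hajndiAddress = hajndiAddress; }

    /** Returns a RESPONSE payload for a REQUEST, or null (ignore) otherwise. */
    byte[] handle(byte[] msg) {
        if (msg == null || msg.length == 0 || msg[0] != REQUEST) return null;
        byte[] addr = hajndiAddress.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[addr.length + 1];
        out[0] = RESPONSE;
        System.arraycopy(addr, 0, out, 1, addr.length);
        return out;
    }
}
```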
Be sure to test scenarios where there's more than one client in the group at the same time. Maybe make a mock client that is as much as possible like the normal one except it doesn't disconnect from the group, thus letting you see what happens if it is in the group and receives the multicast request.
The request and response messages are typed so that a recipient can distinguish between them.
Currently a client won't respond to another client's request because it only issues requests and then blocks for a response. It will ignore non-response message types. I need to check the client implementation and ensure that it will continue listening for a response (up to the timeout threshold) if it receives another message type. Good point!
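The fix described above, continuing to wait out the timeout instead of giving up on the first non-response message, can be sketched like this. The message source is abstracted as a poll function, and the type tags are assumptions:

```java
import java.util.function.LongFunction;

// Sketch: keep polling for a typed RESPONSE until the deadline, ignoring
// any other message type (e.g. another client's REQUEST seen on the group).
final class DiscoveryClient {
    static final byte REQUEST = 1;   // assumed type tag
    static final byte RESPONSE = 2;  // assumed type tag

    /**
     * poll.apply(remainingMs) returns the next message, or null if nothing
     * arrived within the remaining time. Returns the first RESPONSE, or
     * null if the overall timeout elapses first.
     */
    static byte[] awaitResponse(LongFunction<byte[]> poll, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        long remaining;
        while ((remaining = deadline - System.currentTimeMillis()) > 0) {
            byte[] msg = poll.apply(remaining);
            if (msg == null) return null;                  // poll timed out
            if (msg.length > 0 && msg[0] == RESPONSE) return msg;
            // Any other type: ignore it and keep waiting for a response.
        }
        return null;
    }
}
```

Note that the deadline is computed once up front, so a stream of ignored messages cannot extend the overall wait beyond the configured timeout.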
Definitely gotta have unit tests of that kind of thing. Otherwise, once you think of everything and get it all correct, a year from now I'll come along, make some tiny tweak and break something very subtle. :-)
BTW, just to document this thought in this thread -- earlier I mentioned ensuring the client is not the coordinator and then using state transfer. GMS has a disable_initial_coord flag that could be set to ensure a client isn't the coordinator. But this is an overly fragile approach, e.g. if someone uses a custom protocol stack and copies the same stack on client and server side, they won't have this flag set anymore. Your approach is better.
Here's a revised proposal.
1) Store the JGroups stack in a static variable so that it's accessible from the client and server implementation classes.
2) Leave the existing variables (multicast port/address, binding address) as is
so that they can be applied to the static stack when configuring the channel. This will give users the same view of AutoDiscovery that they currently have while replacing the implementation under the covers.
3) Once the multiplexer is available, review how it might be incorporated as an option. For example, it might be utilized in an alternative configuration scheme where the user can specify a multiplexed stack on the server side. In this case, the user would probably need to specify a comparable stack in the jndi.properties file using a new variable.
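Items 1 and 2 of the proposal can be sketched as follows. The stack string is a plausible plain-string JGroups UDP configuration, not the actual one, and the placeholder names are assumptions; the point is that the existing address/port/binding properties are substituted into a single static template shared by client and server:

```java
// Sketch: a static hard-coded JGroups stack template into which the
// existing AutoDiscovery attributes (multicast address/port, binding
// address) are substituted when configuring the channel.
final class DiscoveryStack {
    static final String TEMPLATE =
        "UDP(mcast_addr=${mcastAddr};mcast_port=${mcastPort};bind_addr=${bindAddr}):" +
        "PING:MERGE2:FD:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:" +
        "pbcast.GMS";

    static String configure(String mcastAddr, int mcastPort, String bindAddr) {
        return TEMPLATE
            .replace("${mcastAddr}", mcastAddr)
            .replace("${mcastPort}", Integer.toString(mcastPort))
            .replace("${bindAddr}", bindAddr);
    }
}
```

Because both sides build their channel from the same template and the same three simple properties, a user who changes only the port or address sees exactly the configuration surface they have today.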
Bottom line here - JGroups-based AutoDiscovery would expose the same properties as the current implementation, so the user's configuration options would be unchanged. New client and server variables offering the use of a JGroups stack instead of the default configuration would be documented but not configured by default.
I'll proceed with the static variable stack and substitution of the existing configuration parameters as described. When the multiplexer is available, I'll revisit the implementation and revive this discussion if warranted.
On the testing side, there are already two small sets of tests for AutoDiscovery. The Naming module provides single node tests that exercise AutoDiscovery while the Clustering module provides a different set that uses multiple nodes. I'll see if it's possible to introduce a test where a client receives a request while waiting for a response.
The legacy auto discovery also needs to be implemented if this is going to make it into 4.0, and should be done as a reference anyway.
I don't see how this JGroups implementation is an immediate improvement in terms of ease of use and reliability. That needs to be proven.
We also need an integration with the remoting discovery capabilities since this is really going to be the mechanism for all proxies, including naming.