11 Replies Latest reply on Jun 23, 2006 1:35 PM by Tom Elrod

    General Discovery Service in Clustering

    Brian Stansberry Master

      The Remoting project includes a service discovery framework (Detectors). AS Clustering also includes a simple service discovery mechanism in HA-JNDI AutoDiscovery. As discussed at
      http://www.jboss.com/index.html?module=bb&op=viewtopic&t=80096, HA-JNDI needs to use at least the same abstractions used by Remoting.

      Tom Elrod made the following comment re: discovery in Remoting:

      My preference would be for this to live within the clustering project (which would become separate from the JBossAS code base). Remoting would then depend on that project and *all* discovery implementation code would be removed from the remoting project. I would be happy to migrate any remoting discovery stuff from remoting if you think it would be of value, but the end goal is for the remoting project to be void of it.


      The purpose of this thread is to discuss the use cases of and design for the discovery component. I'll open a separate thread to discuss packaging issues related to having a clustering-based discovery separate from both the Remoting and JBossAS code bases.

        • 1. Re: General Discovery Service in Clustering
          Tom Elrod Master

          Assuming that we move ahead with this discovery project, one of the features I would like to see added to it would be the ability to distinguish between a new server coming online and a crashed server coming back online. This would require having some way to persist the identity of the crashed server so that when it starts back up, it can use the same identity (see http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3939003#3939003 for more info as related to remoting). Of course, we would need to make it so this feature could be turned on/off.
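To illustrate the idea (this is a hypothetical sketch, not any actual Remoting code), persisting a stable identity could be as simple as a load-or-create file; the file location and UUID format here are assumptions:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Minimal sketch of the "stable identity" idea: on first start, generate an
// identity and persist it; on restart (including after a crash) reload the
// same value so peers can tell a rejoining server from a brand-new one.
class ServerIdentity {

    // Load the identity from disk if present, otherwise create and save one.
    static String loadOrCreate(Path file) throws IOException {
        if (Files.exists(file)) {
            return Files.readString(file).trim();
        }
        String id = UUID.randomUUID().toString();
        Files.writeString(file, id);
        return id;
    }
}
```

The on/off switch Tom mentions would just be whether this file is consulted at all; without it, a fresh identity per boot gives today's "every restart is a new server" behavior.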

          • 2. Re: General Discovery Service in Clustering
            Jerry Gauthier Apprentice

            Which modules require a discovery mechanism and how do they currently implement discovery? I've listed those that have been mentioned in this and other threads. Please correct any mistakes that I've made in describing the existing implementations.

            1) Naming Service. Uses its own discovery implementation to locate an HA-JNDI server when AutoDiscovery is enabled and a remote client doesn't specify an HA-JNDI server in its configuration. This implementation uses IP multicast with the client code located in the Naming module and the server code located in the Clustering module. This implementation uses independent configuration of the multicast address:port on the client and server sides. It also uses independently defined default values for cases where the user doesn't specify an address/port. This discovery is limited to a single HA-JNDI server; after discovery, the smart proxy implementation (HA?) is responsible for locating other HA-JNDI servers and directing requests to them.

            2) Server Module RetryInterceptor. Uses JNDI to reestablish a proxy. If a naming server can't be located, it relies on the Naming Service to locate and use an HA-JNDI server. This appears to be the same use case as the one described above for the Naming Service.

            3) JBoss Operations Network. This product needs to discover all nodes in a cluster. I'm not familiar with it so I can't comment on how it currently achieves this.

            4) Remoting Service. Uses its own implementation when required to discover remoting servers. Two implementations are available: one multicast-based, the other JNDI-based. The client and server also require the use of JMX in this service's discovery implementation. This implementation is more comprehensive than the HA-JNDI one as it allows clients to locate all remoting servers and track changes in the Remoting network topology. The client and server can be configured independently but default values are defined in a common class.
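For the Naming Service case in (1), the client typically enables AutoDiscovery simply by omitting java.naming.provider.url. A sketch of the client-side configuration follows; the property names and default values (230.0.0.4:1102) are as I understand them from the AS documentation, so verify against your version:

```properties
# jndi.properties for a remote client relying on HA-JNDI AutoDiscovery.
# With no java.naming.provider.url set, the client multicasts a discovery
# request instead of contacting a fixed server.
java.naming.factory.initial=org.jnp.interfaces.NamingContextFactory
java.naming.factory.url.pkgs=org.jboss.naming:org.jnp.interfaces
# Optional overrides; client and server must agree on group and port.
jnp.discoveryGroup=230.0.0.4
jnp.discoveryPort=1102
jnp.discoveryTimeout=5000
# jnp.partitionName=DefaultPartition   # restrict discovery to one partition
```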

            • 3. Re: General Discovery Service in Clustering
              Jerry Gauthier Apprentice

              I know this is low priority right now (at least for Clustering work) but I'm trying to characterize the basic use cases for discovery.

              If I have a separate client process that uses Naming or Remoting AND I'm running in a cluster environment, I may not know the location of a server offering Naming or Remoting services since the network topology would be dynamic.

              One simple way to identify a server of interest would be to broadcast a request using multicast. This would require both client and server to agree on the multicast address:port but wouldn't require further common configuration details (if I recall correctly). Of course, the servers would need to be actively listening for such requests.
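A minimal sketch of that client side in plain Java, assuming a made-up text protocol ("host:port" replies) rather than the real HA-JNDI or Remoting wire format; the group address and port are illustrative:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical client side of multicast discovery: broadcast a query to an
// agreed-upon group address:port, then block until the first server replies
// with its own address.
class DiscoveryClient {

    static final String GROUP = "230.0.0.4"; // must match the servers' config
    static final int PORT = 1102;            // must match the servers' config

    // Parse a "host:port" reply into a socket address.
    static InetSocketAddress parseReply(String reply) {
        int colon = reply.lastIndexOf(':');
        return new InetSocketAddress(reply.substring(0, colon),
                Integer.parseInt(reply.substring(colon + 1)));
    }

    static InetSocketAddress discoverFirst(int timeoutMillis) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            byte[] query = "WHO_HAS_HAJNDI".getBytes();
            socket.send(new DatagramPacket(query, query.length,
                    InetAddress.getByName(GROUP), PORT));
            socket.setSoTimeout(timeoutMillis); // give up if nobody answers
            byte[] buf = new byte[256];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            socket.receive(reply); // first responder wins
            return parseReply(new String(reply.getData(), 0, reply.getLength()));
        }
    }
}
```

The only shared configuration is the group and port, which matches the "no further common configuration" observation above.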

              JGroups could be used to provide this service but would be heavier weight than basic multicast code; it would presumably be a more hardened implementation and possibly less efficient due to the additional overhead. In theory, JGroups could be interchanged with a pure multicast offering by hardcoding the stack and only exposing the multicast address:port.

              The Naming and Remoting services utilize their own multicast implementations to provide discovery services for their clients. One difference in the implementations is that the Naming service client only cares about the first server to respond while the Remoting service client is designed to maintain a list of available servers. The Naming service client doesn't explicitly maintain a list of available servers as this is handled transparently (to the Naming service) through smart proxies.

              Another difference is that the Remoting service's detection code is modularized while the Naming service's discovery code is embedded in the client and server classes.

              The Remoting service also offers JNDI based discovery. I couldn't find an example of how this worked and I was unsuccessful in getting it to work. I suspect I either didn't understand how it works or I missed something in configuring it. If I understand the basic premise correctly, it seems that the use case here would be that a remote client running in a cluster was aware of a Naming server but not of a Remoting server. In this case, the client could look up the server's identity rather than issue a multicast request for it. (Tom - is this correct?)

              • 4. Re: General Discovery Service in Clustering
                Tom Elrod Master

                My ideal would be a discovery service where the mechanism used (JGroups, JNDI, etc.) could be plugged in via configuration. If no configuration is supplied, it would use simple multicast as the default. The server might even leave the multicast default enabled, even if configured otherwise, in case a default (non-configured) client tries to do discovery.

                Whether discovery stops at the first remote target found or waits to get all of them could be configurable as well (maybe with first-found as the default).

                As for Naming, I would like to rewrite it to use Remoting as the transport (or the unified invoker at the very least). The steps would then be: 1. look up remote targets using discovery, then 2. use the selected target for making naming calls (basically just like now, but more generic).
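As a sketch of what such a pluggable API might look like (all names here are hypothetical, not an existing JBoss interface), with first-found vs. gather-all expressed as a policy flag:

```java
import java.util.List;

// Hypothetical pluggable discovery API: the mechanism (multicast, JGroups,
// JNDI, ...) hides behind one interface; whether discovery stops at the
// first response or gathers all of them is a policy flag.
interface Discovery {

    enum Mode { FIRST_FOUND, ALL } // first-found as the proposed default

    // Returns the addresses (e.g. InvokerLocator strings) of servers
    // offering the named service, according to the requested mode.
    List<String> discover(String serviceName, Mode mode, long timeoutMillis);
}

// In-memory stand-in with a fixed server list, useful for testing callers.
class StaticDiscovery implements Discovery {
    private final List<String> servers;

    StaticDiscovery(List<String> servers) { this.servers = servers; }

    public List<String> discover(String service, Mode mode, long timeoutMillis) {
        return mode == Mode.FIRST_FOUND ? servers.subList(0, 1) : servers;
    }
}
```

A multicast, JGroups, or JNDI implementation would each sit behind the same interface, selected by configuration, which is the interchangeability Bela argues for below in the WAN case.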

                For a Remoting JNDI detector example, you can check out the JBossRemoting project (at HEAD); there is an example under the org.jboss.remoting.samples.detection.jndi package.

                • 5. Re: General Discovery Service in Clustering
                  Bela Ban Master

                  I think a pluggable mechanism is very much what is needed. For example, IP multicast will usually not be permitted for a client looking up a server across a WAN, and not all clients and servers are on the same network.
                  Making this pluggable allows the user to pick the right mechanism.

                  • 6. Re: General Discovery Service in Clustering
                    Brian Stansberry Master

                     

                    "tom.elrod@jboss.com" wrote:

                    Whether discovery stops at first found remote target or waits to get all could be configurable as well (maybe first found being the default).


                    I think there are 2 issues.

                    1) What does the client using discovery expect to get back when he asks for who is providing a service? I would generally think that would be a list of all known providers.

                    2) How that list is provided is an implementation detail of the particular discovery impl. I'm imagining a couple flavors of that:

                    a) servers are responsible for tracking the cluster topology for the service (a la the current DRM). A discovery impl designed for this kind of env would return the list provided by the first server that responded.

                    b) servers only know about themselves. A discovery impl designed for this kind of env would need to aggregate the responses from the various servers. AIUI the multicast discovery in Remoting works like this. The JNDI-based discovery in Remoting is kind of a hybrid; the client aggregates the list of who provides the services by knowing how to query the JNDI server; the servers help out by publishing data in the correct location.

                    The current HA-JNDI discovery is a bit different from the above. It only returns a single server address, rather than a list. But it could have returned a list if we'd written it that way; the server side knows all the servers that are providing HA-JNDI.
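The two flavors could be sketched as follows, assuming for illustration that each raw reply is a plain "host:port" string; flavor (a) trusts the first server's full topology list, flavor (b) aggregates self-only replies:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the two response-handling flavors described above. Names and
// wire format are hypothetical, not actual DRM or Remoting code.
class ResponseHandling {

    // (a) topology-aware servers: the first reply already contains the
    // full provider list, so later replies can be ignored.
    static List<String> firstServerList(List<List<String>> replies) {
        return replies.isEmpty() ? List.of() : replies.get(0);
    }

    // (b) self-only servers: merge one-entry replies, dropping duplicates
    // while preserving the order in which servers responded.
    static List<String> aggregate(List<String> replies) {
        Set<String> seen = new LinkedHashSet<>(replies);
        return new ArrayList<>(seen);
    }
}
```

Flavor (a) can return as soon as one reply arrives, while flavor (b) has to wait out a timeout window, which is why the choice is an implementation detail of each discovery impl rather than part of the client-facing contract.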

                    • 7. Re: General Discovery Service in Clustering
                      Tom Elrod Master

                      So it looks like this could be broken out into two different modes. The first would be where the discovery client does a one-time ping for servers (meaning it would start, try to find one or more discovery servers, return the result, and shut down). The second would be where discovery remains running in the background, monitoring topology changes (it could reuse the API from the previous mode, except it stays running and puts results into an internal registry that can be queried). Sounds like it would be good to start putting a requirements doc somewhere?

                      Another issue, which is more troublesome, is what the result returned from a server discovery will be. If bandwidth/performance weren't an issue, it would be great if it were an instance containing the full tree of clustered components living on that particular server. For example, it might look something like:

                      DiscoveryServerDetection
                       - Identity
                       - InetAddress
                       - PartitionInfo
                       - HashMap<String, List>
                          - "ejb", HATargets
                          - "remoting", InvokerLocators
                          - "jms", [whatever]
                          ...
                      


                      This would basically be a mapping of the clustered components registered in the discovery server. Of course there are some issues here, such as the client needing to know what the value types are for each of the sub-components (i.e. the "remoting" value containing a List of InvokerLocators). There is also the issue of this list potentially being *very* big, containing complex, large objects. Maybe it could also be configurable whether you get just the root detection message containing only the Identity or get everything.
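One possible Java rendering of that detection message (a hypothetical sketch, not an existing class); the only-the-Identity optimization is left out and just the shape is shown:

```java
import java.io.Serializable;
import java.net.InetAddress;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the detection message outlined above: identity, address,
// partition info, and a per-component map whose value type the client must
// know per key (e.g. "remoting" -> List of InvokerLocator strings).
class DiscoveryServerDetection implements Serializable {

    private final String identity;       // stable server identity
    private final InetAddress address;
    private final String partitionName;  // stand-in for PartitionInfo
    private final Map<String, List<?>> components = new HashMap<>();

    DiscoveryServerDetection(String identity, InetAddress address,
                             String partitionName) {
        this.identity = identity;
        this.address = address;
        this.partitionName = partitionName;
    }

    void register(String component, List<?> targets) {
        components.put(component, targets);
    }

    String getIdentity() { return identity; }
    Map<String, List<?>> getComponents() { return components; }
}
```

The size concern above would show up as the serialized weight of the components map; a root-only variant would carry just the identity field.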

                      • 8. Re: General Discovery Service in Clustering
                        Bela Ban Master

                        +1 on capturing the requirements in a document somewhere. The wiki maybe ?

                        • 9. Re: General Discovery Service in Clustering
                          mazz Master

                          Just wanted to chime in here. The new ON infrastructure is currently using the Remoting discovery mechanism and so would need to migrate over to this new mechanism. Make sure the APIs are as compatible as possible :-)

                          As for requirements we need: the only thing we need today is that we are told when a new server *invoker* comes online and offline and what its InvokerLocator is. We don't need to know anything else other than that.

                          Today we listen for the network registry's notifications (ADDED, REMOVED, UPDATED). If the semantics of the NetworkRegistry notifications don't change, then the ON code probably won't have to change much.
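For illustration, a generic JMX NotificationListener in the spirit of what ON does; the type strings and the use of the user-data field to carry the locator are assumptions for this sketch, since the real NetworkRegistry notification carries more structure:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import javax.management.Notification;
import javax.management.NotificationListener;

// Sketch of an ON-style topology listener: react to server-added /
// server-removed notifications and maintain the set of known locators.
class TopologyListener implements NotificationListener {

    // Assumed type strings; the real constants live on the Remoting classes.
    static final String ADDED = "server.added";
    static final String REMOVED = "server.removed";

    private final Set<String> locators = new LinkedHashSet<>();

    @Override
    public void handleNotification(Notification n, Object handback) {
        String locator = String.valueOf(n.getUserData());
        if (ADDED.equals(n.getType())) {
            locators.add(locator);
        } else if (REMOVED.equals(n.getType())) {
            locators.remove(locator);
        } // UPDATED and other types ignored in this sketch
    }

    Set<String> getLocators() { return locators; }
}
```

If the new discovery service keeps firing notifications with equivalent add/remove semantics, a consumer like this stays unchanged regardless of which project hosts the registry.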

                          Tom - is it your thought that NetworkRegistry, MulticastDetector, etc. are going away or just moving to a new project? I would assume, for backward compatibility, they will live somewhere (whether in the remoting project or not doesn't matter).

                          • 10. Re: General Discovery Service in Clustering
                            Tom Elrod Master

                            My initial thinking is that at least the core registry of remoting servers would be maintained within this new discovery service (meaning that when a new remoting server comes online it would be registered, and when it goes away it would be unregistered). The code that converts and fires these events, indicating in the current format that a remoting server has come online or died, would probably live within the remoting project.

                            • 11. Re: General Discovery Service in Clustering
                              Tom Elrod Master

                              Brian and I spent a few hours talking through this at JBossWorld. We came to the conclusion that it is not worth the effort to build the discovery service. The main reason is that we would basically have to rewrite most of the clustering code. Currently a great deal of the behavior that is needed is implicitly supplied by JGroups, mainly ensuring the registry views are in sync and versioned. To allow this in a generic manner would require adding some transport mechanism outside the actual discovery, which would basically emulate what JGroups is doing now.

                              The only thing that will probably come out of this idea is allowing the same concept of initial discovery within HA-JNDI when Remoting is used to replace HA-JNDI's current default transport. That will then allow the use of Remoting discovery to find the initial target HA-JNDI servers.