21 Replies Latest reply on Apr 17, 2008 5:01 PM by galder.zamarreno

    JBAS-4919 - ha singletons in heterogenous topologies

    galder.zamarreno

      Re: http://jira.jboss.com/jira/browse/JBAS-4919 and
      http://www.jboss.com/index.html?module=bb&op=viewtopic&t=127194

      For the moment, I'd like to focus the discussion of this JIRA in the context of JBoss 5, where we can still change the API for HASingletonElectionPolicy. As far as ha singleton election is concerned, as Brian suggested,

      boolean isElectedMaster(HAPartition partition);

      should change to
      boolean isElectedMaster(List<ClusterNode> newReplicants);


      This is the method that HASingletonSupport.partitionTopologyChanged(List newReplicants, int newViewID) calls to decide the elected master.

      I'm a bit concerned however about the suitability of other internal singleton selection related methods in HASingletonElectionPolicy's public API:

      ClusterNode pickSingleton();
      ClusterNode pickSingleton(HAPartition partition);
      boolean isElectedMaster();


      As far as I can see from the code, these are only used by the convenience HASingletonElectionPolicyBase class, which I assume was created to make life easier for people wanting to provide their own election policies. Users could choose not to go down this route and implement HASingletonElectionPolicy directly, which is why I don't think we should force them to implement these 3 methods, as they will never be called. IMO, they belong in HASingletonElectionPolicyBase rather than HASingletonElectionPolicy. The only relevant method users should implement is
      boolean isElectedMaster(List<ClusterNode> newReplicants);


      Regardless of whether these 3 methods live in HASingletonElectionPolicy or HASingletonElectionPolicyBase,

      pickSingleton(HAPartition partition);


      needs changing in the same way that isElectedMaster(HAPartition partition); did to

      pickSingleton(List<ClusterNode> newReplicants);


      Thoughts?
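
A minimal sketch of the split proposed above: the interface exposes only the method that external callers use, and the convenience base class keeps pickSingleton as an internal hook. `ClusterNode` here is a one-method stand-in for the real JBoss type, the constructor taking the local node is an assumption (in JBoss the local node would come from the HAPartition), and `FirstNodePolicy` is a hypothetical policy used only for illustration.

```java
import java.util.List;

// Stand-in for the JBoss ClusterNode type; only getName() is assumed here.
interface ClusterNode {
    String getName();
}

// Proposed public contract: a single method, driven by the replicant list.
interface HASingletonElectionPolicy {
    boolean isElectedMaster(List<ClusterNode> newReplicants);
}

// Convenience base class: pickSingleton becomes an internal hook, not public API.
abstract class HASingletonElectionPolicyBase implements HASingletonElectionPolicy {
    // Assumption: passed in directly here; the real class would ask its HAPartition.
    private final ClusterNode localNode;

    protected HASingletonElectionPolicyBase(ClusterNode localNode) {
        this.localNode = localNode;
    }

    protected abstract ClusterNode pickSingleton(List<ClusterNode> newReplicants);

    @Override
    public boolean isElectedMaster(List<ClusterNode> newReplicants) {
        ClusterNode elected = pickSingleton(newReplicants);
        return elected != null && elected.equals(localNode);
    }
}

// Hypothetical policy: elect the first node that has the singleton deployed.
class FirstNodePolicy extends HASingletonElectionPolicyBase {
    FirstNodePolicy(ClusterNode localNode) {
        super(localNode);
    }

    @Override
    protected ClusterNode pickSingleton(List<ClusterNode> newReplicants) {
        return newReplicants.isEmpty() ? null : newReplicants.get(0);
    }
}
```

Someone implementing HASingletonElectionPolicy directly would then only have to write isElectedMaster(List<ClusterNode>).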

        • 1. Re: JBAS-4919 - ha singletons in heterogenous topologies
          brian.stansberry

          Yes, makes sense to have the interface conform to what is actually used by external callers, with pickSingleton taking the newReplicants list.

          First, have a look at the original discussion on http://www.jboss.com/index.html?module=bb&op=viewtopic&t=84307 to see if anything there triggers more thinking. From that, I think I see what happened. My original idea was the policy would have a ref to HAPartition and would know the target service name (i.e. what key the replicants are under in DRM). With that a call to a simple no-args isElectedMaster() method could allow the policy to access the DRM (via HAPartition) to get the current replicants. But I think the idea that the selection was based on the replicant list rather than the view got lost.

          Anyway, passing the replicant list as a method param makes more sense. If a policy impl doesn't want to use the param it doesn't have to.

          • 2. Re: JBAS-4919 - ha singletons in heterogenous topologies
            galder.zamarreno

            Thanks for the link, that explains better the role of pickSingleton and isElectedMaster.

            "bstansberry@jboss.com" wrote:
            Yes, makes sense to have the interface conform to what is actually used by external callers, with pickSingleton taking the newReplicants list.


            Hmmm, do you wanna keep

            boolean isElectedMaster(List<ClusterNode> newReplicants);

            or

            ClusterNode pickSingleton(List<ClusterNode> newReplicants);

            or both?

            In the thread you linked, you mention:

            "bstansberry@jboss.com" wrote:
            I like getting Address back instead of boolean -- that way the singleton knows who the master is. I'd keep a boolean version as well though, as a convenience, i.e.
            public boolean isElectedSingleton(List nodes) {
               return pickSingleton(nodes).equals(partition.getLocalAddress());
            }


            HASingletonElectionPolicy should probably only select which node should run the singleton; that's its job. HASingletonSupport can then easily decide whether the selected node is the cluster node where the HASingletonSupport code is running.

            I'm not sure about the convenience of keeping the boolean version. By defining the boolean version in the interface as well, you are forcing people to implement it when you already know what the implementation will be. IOW, someone implementing this could provide a valid pickSingleton() implementation and an invalid implementation for isElectedMaster(). Sounds like a potential source for inconsistencies in implementations and double work vs avoiding an equals() call in HASingletonSupport.
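
The division of labour argued for here could look roughly like the sketch below: the policy only picks a node, and the support class does the equals() comparison itself. `Node`, `ElectionPolicy` and `SingletonSupportSketch` are simplified stand-ins for ClusterNode, HASingletonElectionPolicy and HASingletonSupport, not the real JBoss classes.

```java
import java.util.List;

// Stand-in for ClusterNode; nodes are identified by name only.
final class Node {
    final String name;

    Node(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Node && ((Node) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

// The policy answers one question only: which node runs the singleton?
interface ElectionPolicy {
    Node pickSingleton(List<Node> replicants);
}

// Simplified stand-in for HASingletonSupport.
class SingletonSupportSketch {
    private final Node localNode;
    private final ElectionPolicy policy;
    private boolean master;

    SingletonSupportSketch(Node localNode, ElectionPolicy policy) {
        this.localNode = localNode;
        this.policy = policy;
    }

    // Called on every topology change with the nodes hosting the singleton.
    void partitionTopologyChanged(List<Node> replicants) {
        Node elected = policy.pickSingleton(replicants);
        // The single equals() call that replaces a boolean method on the policy.
        master = localNode.equals(elected);
    }

    boolean isMasterNode() {
        return master;
    }
}
```

With this shape there is no second method on the policy that could disagree with pickSingleton().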

            "bstansberry@jboss.com" wrote:
            First, have a look at the original discussion on http://www.jboss.com/index.html?module=bb&op=viewtopic&t=84307 to see if anything there triggers more thinking. From that, I think I see what happened. My original idea was the policy would have a ref to HAPartition and would know the target service name (i.e. what key the replicants are under in DRM). With that a call to a simple no-args isElectedMaster() method could allow the policy to access the DRM (via HAPartition) to get the current replicants. But I think the idea that the selection was based on the replicant list rather than the view got lost.


            I see. So, maybe we're creating more problems for ourselves by having those setHAPartition/getHAPartition and setManagedSingleton/getManagedSingleton? Getting the List<ClusterNode> might be all we need to actually decide the singleton to run in, unless there's information that is available through DRM and not through ClusterNode that could potentially be used to make such decision at some point.

            WDYT?

            • 3. Re: JBAS-4919 - ha singletons in heterogenous topologies
              brian.stansberry

               

              "galder.zamarreno@jboss.com" wrote:

              I'm not sure about the convenience of keeping the boolean version. By defining the boolean version in the interface as well, you are forcing people to implement it when you already know what the implementation will be. IOW, someone implementing this could provide a valid pickSingleton() implementation and an invalid implementation for isElectedMaster(). Sounds like a potential source for inconsistencies in implementations and double work vs avoiding an equals() call in HASingletonSupport.


              OK, makes sense to me.

              I see. So, maybe we're creating more problems for ourselves by having those setHAPartition/getHAPartition and setManagedSingleton/getManagedSingleton? Getting the List<ClusterNode> might be all we need to actually decide the singleton to run in, unless there's information that is available through DRM and not through ClusterNode that could potentially be used to make such decision at some point.


              Those are needed. A design goal here is to make it possible for a custom policy impl to base its decision on factors beyond the simple topology info:

              1) Can even call into cluster, for example to have some sort of vote. Need HAPartition.
              2) Can use some arbitrary info from the singleton service itself.

              My expectation here is any custom impl will be based off the base class, so we're handling the trivial stuff of storing the partition and singleton service refs. And if someone makes their own impl not using the base class and can't get that part right, well ....


              Another thing to be aware of: the arg passed to HASingletonSupport.partitionTopologyChanged is really List<Object> newReplicants, where the list elements are *not* instances of ClusterNode. They are whatever "replicant" was registered for the service (i.e. an invoker stub, or in this case probably a String). There needs to be a translation into List<ClusterNode>. Looks like the way to do this is via DRM.lookupReplicantsNodeNames(String).


              BTW, I downgraded this back to Major as the standard deploy-hasingleton in 4.2 is not using HASingletonElectionPolicy.

              • 4. Re: JBAS-4919 - ha singletons in heterogenous topologies
                brian.stansberry

                Crap. I see the following didn't make it into the interface:

                /**
                 * Called by the HASingleton to set the name with which the singleton
                 * service is registered with the HAPartition.
                 */
                public void setServiceName(String serviceName);
                
                public String getServiceName();
                


                Those need to be there, otherwise an impl has no idea how to interact with the DRM.

                • 5. Re: JBAS-4919 - ha singletons in heterogenous topologies
                  galder.zamarreno

                   

                  "bstansberry@jboss.com" wrote:
                  My expectation here is any custom impl will be based off the base class, so we're handling the trivial stuff of storing the partition and singleton service refs. And if someone makes their own impl not using the base class and can't get that part right, well ....


                  I see. I've been looking around for a wiki on ha singleton election policy, recommended way to create one...etc but couldn't find anything. I think we need one. Also, the 4.2.2.beta docu seems to have a section on this (see http://labs.jboss.com/file-access/default/members/jbossas/freezone/docs/Clustering_Guide/beta422/html/ch05s11s04.html), but there's no mention of the HA singleton election policies.

                  "bstansberry@jboss.com" wrote:
                  Another thing to be aware of: the arg passed to HASingletonSupport.partitionTopologyChanged is really List<Object> newReplicants, where the list elements are *not* instances of ClusterNode. They are whatever "replicant" was registered for the service (i.e. an invoker stub, or in this case probably a String). There needs to be a translation into List<ClusterNode>. Looks like the way to do this is via DRM.lookupReplicantsNodeNames(String).


                  Thanks for the heads up :).

                  "bstansberry@jboss.com" wrote:
                  BTW, I downgraded this back to Major as the standard deploy-hasingleton in 4.2 is not using HASingletonElectionPolicy.


                  Ok.

                  • 6. Re: JBAS-4919 - ha singletons in heterogenous topologies
                    galder.zamarreno

                     

                    "bstansberry@jboss.com" wrote:
                    Crap. I see the following didn't make it into the interface:

                    /**
                     * Called by the HASingleton to set the name with which the singleton
                     * service is registered with the HAPartition.
                     */
                    public void setServiceName(String serviceName);
                    
                    public String getServiceName();


                    Those need to be there, otherwise an impl has no idea how to interact with the DRM.


                    I guess that's what setManagedSingleton/getManagedSingleton was trying to achieve:

                    /**
                     * Called by the HASingleton to provide the election policy a reference to
                     * itself. A policy that was designed to elect a particular kind of singleton
                     * could downcast this object to a particular type and then access the
                     * singleton for state information needed for the election decision.
                     */
                     void setManagedSingleton(Object singleton);
                    
                     Object getManagedSingleton();


                    Or maybe not. Judging from the javadoc, it seems like the aim is different. This method is actually not used anywhere in the code. Shall we swap it for setServiceName/getServiceName? Or shall we keep it and add service name get/set?

                    • 7. Re: JBAS-4919 - ha singletons in heterogenous topologies
                      brian.stansberry

                      We need all 3 properties: partition, service name, singleton.

                      The singleton quite likely has no idea what its "service name" is, which is what the HASingletonSupport that's controlling it uses to register it w/ DRM.

                      Without the service name the policy can't invoke most methods on the DRM if it needs to. It's the param that gets passed to a lot of DRM methods.

                      Re: docs, yes they are needed. When doing the 4.2 stuff I'd found JBAS-4919 so I made the docs a low priority until it was fixed.

                      • 8. Re: JBAS-4919 - ha singletons in heterogenous topologies
                        galder.zamarreno

                        I've switched to pickSingleton(List<ClusterNode>) and managed to get the singleton election tests running successfully again. A few notes/questions:

                        1.- I could do with having a method in DRM like this:

                        public List<ClusterNode> lookupReplicantsNodes(String key);

                        And deprecate lookupReplicantsNodeNames() as it returns a List of String names, which could still be accessible via ClusterNode.getName().

                        Otherwise, building the List<ClusterNode> is inefficient. I have to take the singleton name, look up the List of cluster node names, and with that list, take the ClusterNode list from the partition and match them to create a brand new ClusterNode list with the nodes where the singleton is running. ugh

                        2.- The code did not inject the ha partition and singleton name into the election policy. I have done this by overriding HASingletonSupport.startService() and assign before super.startService(). I have my doubts whether this is the best place to do this, so feel free to flame me:

                        @Override
                        protected void startService() throws Exception
                        {
                           // Inject partition and singleton name before the superclass starts the service
                           if (mElectionPolicyMB != null)
                           {
                              mElectionPolicyMB.setHAPartition(getHAPartition());
                              mElectionPolicyMB.setSingletonName(getServiceHAName());
                           }

                           super.startService();
                        }


                        3.- HASingletonSupport does not have access to the singleton object, but we could override startService in HASC again and do something like:

                        @Override
                        protected void startService() throws Exception
                        {
                           if (getElectionPolicy() != null)
                           {
                              getElectionPolicy().setSingletonObject(mSingleton);
                           }

                           super.startService();
                        }


                        4.- TODO: Might be worth creating a test with 3 cluster nodes and have a singleton deployed in two.
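
The matching step described in point 1 can be sketched as below. This is an illustration only: `ClusterNode` is a one-method stand-in for the JBoss type, `ReplicantNodes.resolve` is a hypothetical helper, and the two inputs stand for what DRM.lookupReplicantsNodeNames(key) returns and for the partition's current node list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in for the JBoss ClusterNode type; only getName() is assumed here.
interface ClusterNode {
    String getName();
}

final class ReplicantNodes {
    private ReplicantNodes() {
    }

    /**
     * Keeps only the partition nodes whose name appears in the replicant
     * name list, preserving the order of the partition view.
     */
    static List<ClusterNode> resolve(List<String> replicantNodeNames,
                                     List<ClusterNode> partitionNodes) {
        // A set makes each membership check O(1) instead of scanning the name list.
        Set<String> names = new HashSet<>(replicantNodeNames);
        List<ClusterNode> result = new ArrayList<>();
        for (ClusterNode node : partitionNodes) {
            if (names.contains(node.getName())) {
                result.add(node);
            }
        }
        return result;
    }
}
```

A DRM method returning List<ClusterNode> directly, as suggested in point 1, would make this extra pass unnecessary.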

                        • 9. Re: JBAS-4919 - ha singletons in heterogenous topologies
                          brian.stansberry

                          Following your points:

                          1) +1. Please open a JIRA.

                          2) Better in createService() so the policy has it available in its own startService(). (The policy is injected into HASS, so it will go through create/start first). Keep doubting; if your gut keeps telling you it's wrong we can discuss more. If I decide it's wrong I won't flame you, as I've just now agreed. :)

                          The fact that this will be done should be documented in HASingletonSupport(MBean).setElectionPolicy().

                          3) That's ugly. That complexity is enough to convince me that the managedSingleton property should come out of HASingletonElectionPolicy.

                          4) Two semi-conflicting thoughts:

                          On one hand, that's a lot of overhead. E.g. to run the testsuite we would need 3 IP addresses available, with multicast working between all 3. So, for now I'd say no, unless we can use mocks to simulate the cluster in the test driver's VM.

                          On the other hand, there are lots of other areas where this would be useful. But there's too much other stuff on our plates; if we can't do it in the test driver's VM, let's not do it at all for now.
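
A rough sketch of why point 2 prefers createService(). It assumes the container finishes the create phase of the singleton support before it runs the policy's start phase, which is one reading of the comment above and not a statement about the actual JBoss lifecycle. `PolicySketch`, `SupportSketch`, `LifecycleDemo` and the name "example-singleton" are all hypothetical.

```java
// Stand-in for an election policy service.
class PolicySketch {
    private String singletonName;
    private boolean nameAvailableAtStart;

    void setSingletonName(String singletonName) {
        this.singletonName = singletonName;
    }

    void start() {
        // A real policy might query the DRM here, which requires the name.
        nameAvailableAtStart = (singletonName != null);
    }

    boolean wasNameAvailableAtStart() {
        return nameAvailableAtStart;
    }
}

// Stand-in for HASingletonSupport with the injection moved to the create phase.
class SupportSketch {
    private final PolicySketch policy;
    private final String serviceHAName;

    SupportSketch(PolicySketch policy, String serviceHAName) {
        this.policy = policy;
        this.serviceHAName = serviceHAName;
    }

    void create() {
        if (policy != null) {
            policy.setSingletonName(serviceHAName);
        }
    }

    void start() {
        // Election would begin here in the real class.
    }
}

final class LifecycleDemo {
    private LifecycleDemo() {
    }

    // Runs the assumed phase order and reports what the policy saw in start().
    static boolean run() {
        PolicySketch policy = new PolicySketch();
        SupportSketch support = new SupportSketch(policy, "example-singleton");
        support.create(); // create phase: injection happens here
        policy.start();   // start phase: the policy can already use the name
        support.start();
        return policy.wasNameAvailableAtStart();
    }
}
```

If the injection stayed in the support's start phase, the policy's own start() would run before the name was set under the same assumed ordering.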

                          • 10. Re: JBAS-4919 - ha singletons in heterogenous topologies
                            galder.zamarreno

                            Re 1) Done. http://jira.jboss.com/jira/browse/JBAS-5155

                            Re 2) Yeah, makes more sense. Generally you call super.start/create and then do your job, but I had to put the election policy injections before super.startService(), which looked awkward to me. Putting them in a createService() override makes more sense.

                            "bstansberry@jboss.com" wrote:
                            The fact that this will be done should be documented in HASingletonSupport(MBean).setElectionPolicy().


                            I'll document it there and also in the HASingletonElectionPolicy javadoc for setHAP and setSN so that people understand when these methods are called.

                            Re 3) I'm Ok with that too.

                            Re 4) I'll look into the test/mock drivers to see if I can replicate something along these lines. You're right in that my machine can hardly cope with 2 AS 5 nodes running a clustered test. It takes 10 minutes to run a single cluster test!

                            • 11. Re: JBAS-4919 - ha singletons in heterogenous topologies
                              galder.zamarreno

                              Re: http://jira.jboss.com/jira/browse/JBAS-5155

                              The other day I was looking into the possibility of having this fixed in AS 4.x. There are 2 (well, 4 really) ways this can be handled:

                              1.- Add new API that allows either: a) passing the singleton name to the HASEP so that DRM can be queried, or b) passing a List<ClusterNode> to pickSingleton() with the ClusterNodes where the singleton is deployed.

                              2.- Use setManagedSingleton() as a way to pass the String version of the singleton name, which is later used to query the DRM. It's got HACK written all over it, but we can do an instanceof for String within it in case someone out there calls this. One thing is for sure, a HASingleton can't be an instanceof String. This saves us from having to add new API in 4.x and all the trouble associated with it.

                              3.- Is it worth fixing this? Hmmmm, we haven't advertised this much which is good in some ways, but any day, someone running 4.x could come and report this to us in a support case.

                              4.- I guess we could treat the HA singleton election policy functionality as experimental and hence not supported in 4.x. This could save us a lot of headaches.

                              My preference right now is with either 2 or 4. Thoughts?

                              • 12. Re: JBAS-4919 - ha singletons in heterogenous topologies
                                brian.stansberry

                                LOL. I pick #1 or #3. :-)

                                #4 isn't an option -- the code's in EAP; it's supported.

                                #3 seems to be along the lines of the standard policy in EAP; we don't just fix everything we think of, since the fix could introduce other issues. We are conservative.

                                Would #1 be that bad? You create ExtendedHASingletonElectionPolicy with the missing methods from the AS 5 API. If the injected policy implements Extended, you invoke on it the way AS 5 does. Otherwise it gets used as it is now. In ExtendedHASingletonElectionPolicy javadoc, any HASingletonElectionPolicy methods that don't get called anymore should have comments noting that fact.

                                Don't get me wrong; I can see how it would get messy. #3 is OK.

                                I know you did a lot of refactoring to tighten things up in trunk, i.e. with HASingletonElectionPolicyBase; that doesn't need to happen in 4.x, and *shouldn't* happen if it might break some custom subclass. HASingletonElectionPolicySimple just needs to work correctly.
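
The #1 approach outlined above could be sketched as follows. The types are reduced stand-ins: `HASingletonElectionPolicy` is cut down to its no-arg pickSingleton(), `ExtendedHASingletonElectionPolicy` carries only one of the AS 5-style methods, and `ElectionDispatch` is a hypothetical helper showing the instanceof check that HASingletonSupport would perform.

```java
import java.util.List;

// Stand-in for the JBoss ClusterNode type; only getName() is assumed here.
interface ClusterNode {
    String getName();
}

// Stand-in for the existing 4.x contract, reduced to one method.
interface HASingletonElectionPolicy {
    ClusterNode pickSingleton();
}

// Hypothetical extension adding the AS 5-style, replicant-based method.
interface ExtendedHASingletonElectionPolicy extends HASingletonElectionPolicy {
    ClusterNode pickSingleton(List<ClusterNode> newReplicants);
}

final class ElectionDispatch {
    private ElectionDispatch() {
    }

    // Uses the replicant-based method when available, else the legacy one.
    static ClusterNode elect(HASingletonElectionPolicy policy,
                             List<ClusterNode> replicants) {
        if (policy instanceof ExtendedHASingletonElectionPolicy) {
            return ((ExtendedHASingletonElectionPolicy) policy).pickSingleton(replicants);
        }
        // Existing custom policies keep their current behaviour untouched.
        return policy.pickSingleton();
    }
}
```

Custom 4.x policies that only implement the old interface would be called exactly as before, which is the compatibility property the reply above asks for.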

                                • 13. Re: JBAS-4919 - ha singletons in heterogenous topologies
                                  smeng1

                                  Just to post my 2 cents on this matter, my company has some customers who really need HASingletonElectionPolicy functionality and are unlikely to upgrade to JBoss5 anytime soon. So we could really really do with this fix in AS 4.x

                                  Sorry if I sound a bit aggressive but I'm just being anxious. Thank you for all your hard work.

                                  • 14. Re: JBAS-4919 - ha singletons in heterogenous topologies
                                    galder.zamarreno

                                     

                                    "smeng@unique" wrote:
                                    Just to post my 2 cents on this matter, my company has some customers who really need HASingletonElectionPolicy functionality and are unlikely to upgrade to JBoss5 anytime soon. So we could really really do with this fix in AS 4.x

                                    Sorry if I sound a bit aggressive but I'm just being anxious. Thank you for all your hard work.


                                    Please do not hesitate to vote for the JIRA to get it fixed :)

                                    http://jira.jboss.com/jira/browse/JBAS-5155?vote=vote
