25 Replies Latest reply on Apr 24, 2014 4:02 PM by jorgemoralespou_2

    Wanting to contribute

    jorgemoralespou_2

      Hi,

      I was thinking of contributing to the project, so that I can solve some issues that my customer expects to be fixed and also improve my understanding of SwitchYard (it has been proven that I'm not so into it yet).

      I was thinking of working on [SWITCHYARD-2009] Make SCAInvoker being able to invokeLocal if the target node is "this" node on clustered invocations -… .

      For this, I am thinking of the following approach:

       

      1. Change the SCAInvoker to check not only for the clustered configuration but also whether the service is available locally (this probably needs an extension of the registry)
      2. Extend the registry so that it can return whether a service is available locally
      3. If the service is available locally, call the invokeLocal method; if it is not, call the ClusteredInvoker in invokeRemote

       

      How can the registry know if the service is locally available? Create a local cache in the ISPN registry: whenever a service is registered (addEndpoint) it is added to this cache, whenever it is unregistered (removeEndpoint) it is removed from it, and "isServiceLocal(...)" checks whether the service is in this local cache. There is no need to register cache listeners for this cache.
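
The local-cache idea above could be sketched roughly as follows. This is only an illustration under assumptions: `LocalServiceIndex`, `serviceAdded` and `serviceRemoved` are hypothetical names, not part of the SwitchYard code base, and keying services by `QName` is also an assumption.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.namespace.QName;

// Hypothetical helper (not a SwitchYard class): tracks the services
// that this node itself has registered.
class LocalServiceIndex {
    // Concurrent set, so registrations from different threads are safe.
    private final Set<QName> localServices = ConcurrentHashMap.newKeySet();

    // Intended to be called from the registry's addEndpoint path.
    void serviceAdded(QName serviceName) {
        localServices.add(serviceName);
    }

    // Intended to be called from the registry's removeEndpoint path.
    void serviceRemoved(QName serviceName) {
        localServices.remove(serviceName);
    }

    // True if this node currently has the service registered.
    boolean isServiceLocal(QName serviceName) {
        return localServices.contains(serviceName);
    }
}
```

Because the set is only written by the local node, no cluster-wide cache listener is involved in this sketch.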

       

      Also related to this is [SWITCHYARD-2010] Provide with same LoadBalanceStrategies as there are but with +WithAffinity - JBoss Issue Tracker, which can benefit from this behaviour by providing affinity load balancing algorithms, but the load balance strategy needs to be promoted to the SCAInvoker.

       

      The problem is that I do not know how it will work in Karaf (this is probably the main reason why the SCA binding is not yet working on Karaf, due to class loading problems for local calls), so I can go ahead with it until you decide how to proceed with the Karaf implementation.

       

      Any comments?

        • 1. Re: Wanting to contribute
          jorgemoralespou_2

          Hi,

          I've been thinking more about this while doing some coding/tests, and I have come up with another option that I wanted to discuss:

           

          All services should always be defined as clustered (removing the clustering option from the SwitchYard configuration), and there would be additional LoadBalanceStrategy implementations that support affinity: such a strategy always tries to make the request locally if the service is registered locally and, if not, applies the load balance strategy (Random, RoundRobin) to the remaining available RemoteEndpoints. One of these strategies can be the default one, so that "by default" a service behaves like "not clustered" if it is deployed on the same EAP instance, and behaves "clustered" otherwise. This avoids the need to define explicitly whether the service is clustered, and simplifies configuration.
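
A rough sketch of the selection rule described above, with plain node names standing in for RemoteEndpoints. `PreferLocalRoundRobin` is a hypothetical class written only for illustration; it is not an existing SwitchYard strategy and does not use the real LoadBalanceStrategy interface.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: node names stand in for remote endpoints.
class PreferLocalRoundRobin {
    private final String thisNode;
    private final AtomicInteger next = new AtomicInteger();

    PreferLocalRoundRobin(String thisNode) {
        this.thisNode = thisNode;
    }

    // Returns the node to invoke, or null when no endpoint is registered.
    String select(List<String> endpointNodes) {
        if (endpointNodes.isEmpty()) {
            return null;
        }
        // Affinity: the local node wins whenever it hosts the service.
        if (endpointNodes.contains(thisNode)) {
            return thisNode;
        }
        // Otherwise fall back to plain round robin over the registered nodes.
        int index = Math.floorMod(next.getAndIncrement(), endpointNodes.size());
        return endpointNodes.get(index);
    }

    public static void main(String[] args) {
        PreferLocalRoundRobin onNode2 = new PreferLocalRoundRobin("node2");
        // Service deployed on both nodes: the call stays local.
        System.out.println(onNode2.select(Arrays.asList("node1", "node2")));
        // Service undeployed from node2: the same call now goes remote.
        System.out.println(onNode2.select(Arrays.asList("node1")));
    }
}
```

The decision is re-evaluated on every call, so an undeploy on the local node changes the target without any configuration change.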

           

          Of course, the registry still needs to be extended so that SCAInvoker knows whether the call should be local. And more logic needs to be placed in SCAInvoker rather than in ClusteredInvoker.

           

          Any thoughts?

          • 2. Re: Wanting to contribute
            kcbabo

            Contributions are welcome!  :-)

             

            RemoteEndpoints stored in the registry already have a node name associated with them, so it should just be a matter of checking whether there is a remote endpoint in the registry which has the same node name as the invoker instance.  This should definitely not be the default behavior as always selecting the local endpoint defeats the entire purpose of load balancing.  I can definitely see it as a config option for the existing balancers though.  I suspect that the prime use case here is if you want to deploy multiple applications on a single box and then later move some of the apps off the box and not have to reconfigure the application to communicate with the (now remote) endpoints. If there's another case that I'm missing please let me know.

            • 3. Re: Wanting to contribute
              kcbabo

              -1 to marking all services as clustered.  We deliberately made this an option so that users can choose whether an endpoint is for the local environment only or is eligible for clustering.

              • 4. Re: Wanting to contribute
                jorgemoralespou_2

                Hi Keith,

                The main use case is that every customer I've worked for, now and before, mainly telco operators, with this or other product suites, has always wanted to have services deployed clustered but to invoke them via a local call if they are co-located, rather than via a remote call, which introduces a big overhead.

                When services are co-located and invoked via SCA, which is "synchronous", it is possible, and usually preferred, to run them in the same container (via a local call). But these services come and go as they are operated (upgraded, deployed, distributed, ...), so there is no need to reconfigure the SwitchYard application in this case; it will apply the clustered load balance policy.

                And if, for any reason, the user wants remote invocations, just using the policy without affinity will always use remote calls.

                 

                From my point of view it is a major advantage over the current setup, but of course all of you have probably thought about this a lot. I would like to see other opinions on this apart from yours.

                 

                I jumped into this, not because I liked it, but because my customer is really unhappy with how it works right now.

                 

                As for the tip about the caches, once I started working on it I realized the same, but thanks for the tip.

                 

                Waiting for more comments.

                • 5. Re: Wanting to contribute
                  kcbabo

                   

                  The main use case is that every customer I've worked for, now and before, mainly telco operators, with this or other product suites, has always wanted to have services deployed clustered but to invoke them via a local call if they are co-located, rather than via a remote call, which introduces a big overhead.

                  When services are co-located and invoked via SCA, which is "synchronous", it is possible, and usually preferred, to run them in the same container (via a local call). But these services come and go as they are operated (upgraded, deployed, distributed, ...), so there is no need to reconfigure the SwitchYard application in this case; it will apply the clustered load balance policy.

                  Sounds like you are confirming that the use case I described is the one you are after.

                   

                  From my point of view it is a major advantage over the current setup, but of course all of you have probably thought about this a lot. I would like to see other opinions on this apart from yours.

                   

                  There are many advantages to clustering.  Dynamic provisioning and reorganization is definitely a useful advantage, but not the only one.  I would say the primary drivers we see are load distribution and liveliness (instance failure doesn't break consumers).

                   

                  This all really boils down to the nature of the application and deployment environment.  In certain application architectures, the overhead of going out of process (network + serialization) is huge relative to processing time, so keeping invocations local is important.  In other architectures, service execution time is lengthy relative to remoting expense which means it makes more sense to distribute load.


                  All these use cases are important and valid. 

                   

                  I jumped into this, not because I liked it, but because my customer is really unhappy with how it works right now.

                   

                  I'm sorry you have to work on something you don't like.

                  • 6. Re: Re: Wanting to contribute
                    jorgemoralespou_2

                    Hi Keith,

                    We seem to be thinking the same but somehow disagreeing :-D

                    Of course, I think that invocations should be local if the services are collocated, and remote otherwise.

                    But I think a better option is that, rather than having a property at the domain level that has to be changed from clustered to non-clustered and vice versa, this selection is made by the LoadBalanceStrategy.

                     

                    I'll try to be more explicit:

                    Service A       | Service B       | Status                         | LoadBalanceStrategy       | Invocation to Service A in Node 1                       | Invocation to Service A in Node 2
                    Node 1 + Node 2 | Node 1          | Service co-located (in Node 1) | RandomAffinity            | Local                                                   | Remote (to Node 1)
                    Node 1 + Node 2 | Node 1          | Service co-located (in Node 1) | Random (without affinity) | Remote if request to Node 2, local if request to Node 1 | Remote (to Node 1)
                    Node 1 + Node 2 | Node 3          | Service not co-located         | RandomAffinity            | Remote (to Node 3)                                      | Remote (to Node 3)
                    Node 1 + Node 2 | Node 3          | Service not co-located         | Random (without affinity) | Remote (to Node 3)                                      | Remote (to Node 3)
                    Node 1 + Node 2 | Node 1 + Node 2 | Service co-located             | RandomAffinity            | Local                                                   | Local
                    Node 1 + Node 2 | Node 1 + Node 2 | Service co-located             | Random (without affinity) | Remote if request to Node 2, local if request to Node 1 | Remote if request to Node 1, local if request to Node 2

                     

                    With this reasoning, it is the LoadBalanceStrategy that determines whether the call is local or remote. So if, for some reason, in the 5th case in the table above, Service B gets undeployed on Node 2, an invocation of Service A on Node 2 will make a remote call to Service B in Node 1 (and not a local call as documented), without any configuration change.

                     

                    This way all SCA calls will behave the same (no "clustering" configuration, only load balancing), and the user will determine whether they prefer affinity, which should be the default. (Maybe this should not be a behavior of the load balancing algorithm but an attribute of the SCA reference, like clustered, which might look better.)

                     

                    Right now, any SwitchYard user with a homogeneous distribution of services will have to reconfigure their services when changing to a heterogeneous distribution.

                    Also, when services are updated (Service B in our example above is redeployed, which takes some time), there will be service loss on the node where the service is being redeployed (if clustered is not selected), while with my proposal there will be no service loss.

                     

                    Can I convince you?

                     

                    Keith Babo wrote:

                     

                    I'm sorry you have to work on something you don't like.

                     

                    Don't take my words the wrong way. I meant that I didn't choose this topic because of "extreme excitement about doing it" but just because I think it is a really important feature. I like FSW, SwitchYard, and what I do very much, even though I get bitten very often due to my lack of knowledge.

                     

                     

                    Thanks for your comments,

                    • 7. Re: Re: Wanting to contribute
                      kcbabo

                      Just to be clear, when I say "config option" I mean a configuration on the load balancer definition on binding.sca.  We can do that by adding another attribute or by using a different name for the load balance policy as you've done above.  In that scenario, a user has complete control over the policy they use and they can use different policies for different services if they like.

                       

                      Does that match your expectations/requirements?

                      • 8. Re: Re: Wanting to contribute
                        jorgemoralespou_2

                        Hi Keith,

                        Just to be clear: does it mean that whether calls are local if the service is on the same machine, or remote if it is not, is a "config option", and that you are OK with my approach, at least until you see the pull request?

                         

                        Thanks for helping so much.

                        • 9. Re: Wanting to contribute
                          kcbabo

                          I didn't quite catch what you are asking for confirmation on, so let me play it back to you to make sure we are on the same page:

                          • ClusteredInvoker should be aware of whether a service being invoked is available locally.
                          • There should be a way to specify that ClusteredInvoker should prefer local invocations over remote invocations.  The existing implementations should not change to make this default behavior, however.

                          I agree with both those points.  Some implementation notes:

                          • I think this change should be fairly localized - probably just to the load balancer strategy implementation itself.
                          • Node name should be optional for cases of remote clients executing outside the container, so keep a version of the constructor which doesn't take it and don't rely on it being non-null.
                          • We can start with just calling these different load balance strategy names.  I have a feeling that approach won't scale particularly well as we will end up with lots and lots of strategy names to configure behavior.  Unfortunately, the SCA schema boxes us into this by only allowing extension attributes and not elements.  I say start with the approach of using a different strategy name and we'll see how it turns out.
                          • I'm not particularly fond of the term affinity for this, but I don't have a better suggestion at the moment that's not super long (e.g. RandomPreferLocalStrategy).  The reason I don't like affinity is that it can actually mean a number of different things.   It definitely could mean prefer local, but it could also mean partitioning workload based on some other criteria (priority, classification, geography, function, etc.).  This isn't a huge deal as it's easy to change names as we develop, but figured I would throw it out there for discussion.
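
The note about keeping the node name optional could look roughly like this. `NodeAwareStrategy` and `isLocal` are hypothetical names used only for this sketch, and node names are reduced to plain strings as an assumption.

```java
// Hypothetical base class; names are illustrative and not taken from SwitchYard.
class NodeAwareStrategy {
    // Null for remote clients that run outside the container.
    private final String nodeName;

    // Kept for clients that have no node identity.
    NodeAwareStrategy() {
        this(null);
    }

    NodeAwareStrategy(String nodeName) {
        this.nodeName = nodeName;
    }

    // Null-safe check: without a node name, nothing is ever considered local.
    boolean isLocal(String endpointNodeName) {
        return nodeName != null && nodeName.equals(endpointNodeName);
    }
}
```

With the no-argument constructor every endpoint is treated as remote, so a client outside the container simply gets the existing load balancing behavior.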

                          • 10. Re: Wanting to contribute
                            kcbabo

                            Actually, let's debate whether this needs a new strategy name for a moment.  I think it may be best to just update our existing strategy implementations to take a little bit of extra config from the binding.sca definition.  Here are the extra attributes defined right now:

                             

                            core/config/src/main/resources/org/switchyard/config/model/switchyard/v2/switchyard_2_0.xsd at master · jboss-switchyard…

                             

                            We could add an additional attribute called "preferLocal" or something to the schema and feed that to the strategy when we create it:

                            https://github.com/jboss-switchyard/components/blob/master/sca/src/main/java/org/switchyard/component/sca/SCAInvoker.java#L276

                             

                            With something like setPreferLocal(boolean preferLocal) here:

                            core/remote/src/main/java/org/switchyard/remote/cluster/LoadBalanceStrategy.java at master · jboss-switchyard/core · Git…
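
As a rough illustration of the setter idea, here is a simplified stand-in for a strategy with a `preferLocal` flag. `RandomNodeStrategy` is hypothetical and deliberately does not implement the real LoadBalanceStrategy interface; node names stand in for endpoints, and only the `setPreferLocal(boolean preferLocal)` signature comes from the suggestion above.

```java
import java.util.List;
import java.util.Random;

// Hypothetical, simplified strategy; the real LoadBalanceStrategy interface differs.
class RandomNodeStrategy {
    private final String thisNode;
    private final Random random = new Random();
    // Defaults to false, so behavior is unchanged unless explicitly enabled.
    private boolean preferLocal;

    RandomNodeStrategy(String thisNode) {
        this.thisNode = thisNode;
    }

    void setPreferLocal(boolean preferLocal) {
        this.preferLocal = preferLocal;
    }

    boolean isPreferLocal() {
        return preferLocal;
    }

    // Picks a node that hosts the service, or null if there is none.
    String select(List<String> endpointNodes) {
        if (endpointNodes.isEmpty()) {
            return null;
        }
        // Only short-circuit to the local node when the flag is set.
        if (preferLocal && thisNode != null && endpointNodes.contains(thisNode)) {
            return thisNode;
        }
        return endpointNodes.get(random.nextInt(endpointNodes.size()));
    }
}
```

Keeping the flag on the existing strategy avoids a second strategy name per balancing algorithm, which is the scaling concern raised earlier in the thread.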

                            • 11. Re: Wanting to contribute
                              jorgemoralespou_2

                              Hi Keith,

                              Thanks for the tips. I'm already getting excited about getting this done. Maybe I'm not that good, but I'll try anyway.

                               

                              Does "The existing implementations should not change to make this default behavior, however" mean that we should keep the clustered configuration element and add the new behavior, so that the new behavior is used if the schema is defined in switchyard:2 and the old one if it is defined in switchyard:1?

                              • 12. Re: Wanting to contribute
                                jorgemoralespou_2

                                I buy the preferLocal option, sounds fine.

                                • 13. Re: Wanting to contribute
                                  jorgemoralespou_2

                                  Keith,

                                  Is there a tutorial on how to build and test all the required repositories in WildFly/EAP? When finished, I would like to test in a real installation, not just with JUnit tests.

                                  • 14. Re: Wanting to contribute
                                    kcbabo

                                    Does "The existing implementations should not change to make this default behavior, however" mean that we should keep the clustered configuration element and add the new behavior, so that the new behavior is used if the schema is defined in switchyard:2 and the old one if it is defined in switchyard:1?

                                     

                                    Yes, this will go in the 2.0 version of the schema.  Since it's a boolean field, it will default to false in the config model so you shouldn't have to check explicitly in your strategy logic whether the underlying switchyard.xml is 1.x or 2.0.
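
The "defaults to false" point can be illustrated with a small sketch. A plain `Map` stands in for the binding's extension attributes here; `PreferLocalConfig` is hypothetical, the attribute name `preferLocal` is the tentative one from this thread, and the real SwitchYard config model API is not shown.

```java
import java.util.Map;

// Hypothetical reader for an optional boolean attribute.
class PreferLocalConfig {
    // An absent attribute yields null, and Boolean.parseBoolean(null) is false,
    // so a 1.x switchyard.xml without the attribute behaves as before.
    static boolean isPreferLocal(Map<String, String> attributes) {
        return Boolean.parseBoolean(attributes.get("preferLocal"));
    }
}
```

With this shape the strategy logic never needs to know which schema version the underlying file uses.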
