
    profile service, farming

    starksm64

      In going through the work to add arbitrary deployment content via the ManagementView of the profile service, it seems I should add an APPLICATION_CLUSTERED phase:

   /** The class of deployment */
   public enum DeploymentPhase {
      /** A deployment loaded during the server bootstrap phase */
      BOOTSTRAP,
      /** An mc/service deployment for a Deployer to be loaded after the BOOTSTRAP phase */
      DEPLOYER,
      /** Any deployment content to be loaded after the DEPLOYER phase */
      APPLICATION,
      /** Any deployment content to be loaded across a 'cluster' after the APPLICATION phase */
      APPLICATION_CLUSTERED,
   };
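
      For illustration, client usage might look something like the following. This is only a sketch: the distribute()/start() signatures here are my guess at the shape of the DeploymentManager api based on the discussion below, not a confirmed contract, and the stand-in types just keep the example self-contained.

       import java.net.URL;

       // Illustrative stand-ins so the sketch is self-contained; the real types
       // live in the profile service spi.
       interface DeploymentProgress { void run(); }
       interface DeploymentManager {
          // Assumed phase-based signatures; not the confirmed api.
          DeploymentProgress distribute(String name, DeploymentPhase phase, URL content);
          DeploymentProgress start(String name, DeploymentPhase phase);
       }

       class ClusteredDeployExample {
          static void deployClustered(DeploymentManager manager, URL content) {
             // Push the content to every target in the APPLICATION_CLUSTERED phase...
             manager.distribute("my-app.ear", DeploymentPhase.APPLICATION_CLUSTERED, content).run();
             // ...then start it cluster-wide once distribution has completed.
             manager.start("my-app.ear", DeploymentPhase.APPLICATION_CLUSTERED).run();
          }
       }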
      


      I need such a notion to properly expose which targets would be part of the APPLICATION_CLUSTERED phase, and this also gets to the issue of not having a farming-type service available in JBoss 5.

      I'm going to add this phase and start looking at what the implications are for the DeploymentRepository SPI that describes the contents of a profile. This work is going on as part of JBAS-5370. I imagine we'll need to make some changes to properly support a clustered-app notion, so I'll create another JIRA for that.



        • 1. Re: profile service, farming
          brian.stansberry

          Bit of a brain dump here. Any comments on my wrongheadedness would be much appreciated. :)

          When I think about what the clustering services inside the AS will be doing with clustered deployments, I see two main things:

          1) Keeping repositories in sync if people are using individual local filesystem repos on each node. This task is minor/optional/secondary, but I include it for completeness.

          2) Coordinating deployments around the cluster as the ProfileService running on each node becomes aware of changes in the repository: ensure each node is aware of the change, use a configurable policy to execute the deployment (e.g. sequentially or simultaneously), and extend any 2PC deployment capability the profile service exposes into a cluster-wide operation. This is the more significant task.

          Keeping repositories in sync
          ------------------------------------

          Again, this is IMHO the less meaningful task, but one that seems pretty straightforward. It's only relevant if the repository is the local filesystem, and it wouldn't be enabled by default (even in an 'all' config).

          1. As with deployers/ and deploy/, there is a scanner that reads a set of URIs (e.g. farm/) and presents deployments to the ProfileService.
          2. On startup, before presenting deployments to the ProfileService, it reconciles the state of farm/ with the cluster: it brings over added deployments not available locally, brings over modified deployments, and removes deployments previously removed from the cluster but still present locally (see the sketch after this list).
          3. After startup, it no longer presents things to the ProfileService; the profile service is responsible for scanning for changes itself.
          4. After startup, however, it continues to scan the farm/ dir for purposes of noticing changes and replicating them around the cluster.
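
          A rough sketch of step 2's reconciliation, with timestamps as a naive change marker; every name here is hypothetical:

           import java.io.File;
           import java.util.Map;

           // Hypothetical startup reconciliation of a local farm/ dir against a
           // cluster-wide view of deployment name -> last-modified timestamp.
           class FarmReconciler {
              private final File farmDir;

              FarmReconciler(File farmDir) { this.farmDir = farmDir; }

              void reconcile(Map<String, Long> clusterView, ContentTransfer transfer) {
                 // Pull anything the cluster has that we lack, or hold a stale copy of.
                 for (Map.Entry<String, Long> entry : clusterView.entrySet()) {
                    File local = new File(farmDir, entry.getKey());
                    if (!local.exists() || local.lastModified() < entry.getValue()) {
                       transfer.pull(entry.getKey(), farmDir);
                    }
                 }
                 // Remove deployments dropped from the cluster but still present locally.
                 File[] localFiles = farmDir.listFiles();
                 if (localFiles != null) {
                    for (File local : localFiles) {
                       if (!clusterView.containsKey(local.getName())) {
                          local.delete();
                       }
                    }
                 }
              }

              // Hypothetical transport for fetching content from a peer node.
              interface ContentTransfer {
                 void pull(String deploymentName, File intoDir);
              }
           }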

          Coordinating deployments
          ----------------------------------

          Here things get more fuzzy. "Coordinating" basically means either a) controlling the initiation of the deployment process on each node, or b) somehow pausing it in the middle and exchanging messages around the cluster until the cluster-wide state is appropriate and then resuming, or c) some combination of both.

          To me using a) implies some variant of the HDScanner concept -- i.e. something that periodically queries the Profile for modified deployments and then invokes on MainDeployer to undeploy the old version and deploy the new version. Looking at the MainDeployer API, it looks like it exposes enough methods for pretty fine-grained control of this; e.g. the change(String, DeploymentStage) method implies the ability to walk things through deploy/undeploy step by step.
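
          To make a) more concrete, something along these lines; change() is the existing MainDeployer method, but the cluster barrier is a purely hypothetical hook:

           import org.jboss.deployers.client.spi.main.MainDeployer;
           import org.jboss.deployers.spi.deployer.DeploymentStages;

           // Sketch of an external controller walking a deployment through stages,
           // pausing at cluster sync points. ClusterBarrier is hypothetical.
           class SteppedDeploymentController {
              private final MainDeployer mainDeployer;
              private final ClusterBarrier barrier;

              SteppedDeploymentController(MainDeployer mainDeployer, ClusterBarrier barrier) {
                 this.mainDeployer = mainDeployer;
                 this.barrier = barrier;
              }

              void deployStepwise(String name) throws Exception {
                 // Bring the deployment to DESCRIBE, then wait for every node to do the same.
                 mainDeployer.change(name, DeploymentStages.DESCRIBE);
                 barrier.await("describe:" + name);
                 // Everyone validated the deployment; proceed to fully installed.
                 mainDeployer.change(name, DeploymentStages.INSTALLED);
                 barrier.await("installed:" + name);
              }

              interface ClusterBarrier {
                 void await(String point) throws Exception;
              }
           }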

          Using b) implies writing a specialized Deployer (or Deployers) that does coordination in the deploy()/undeploy() methods. That seems conceptually wrong to me; i.e. a bit outside the scope of what a deployer should be doing.

          An advantage of doing it in a Deployer is the possibility of using metadata that an earlier deployer could make available. For example, a bean annotated with @Farm or a war with a special tag in jboss-web.xml could be cluster-deployed without needing to be associated with DeploymentPhase.APPLICATION_CLUSTERED.

          • 2. Re: profile service, farming
            starksm64

            Just some follow-up comments, since I'm currently testing the DeploymentManager API for adding raw deployment content to a repository.

            So the first issue is that hot deployment is no longer really a scanner service as it was in the past. Such a service does exist, but it's nothing more than a thread/executor that calls into the profile for the modified deployments. So 'hot deployment' is purely a profile implementation detail, and how we reconcile local repositories across a cluster would be part of the org.jboss.profileservice.spi.DeploymentRepository.getModifiedDeployments() implementation. As we build out different repository implementations, maybe we'll need to add more policy plugins for things like cluster synchronization.
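
            Purely illustratively (the real DeploymentRepository contract is wider than this), a cluster-synchronizing repository could reconcile with its peers before reporting modifications:

             import java.util.Collection;

             // Minimal stand-in for the idea that cluster reconciliation lives
             // inside getModifiedDeployments(); not the real spi.
             interface MinimalDeploymentRepository {
                Collection<String> getModifiedDeployments() throws Exception;
             }

             // Hypothetical policy plugin for cluster synchronization.
             interface RepositorySynchronizer {
                void reconcile() throws Exception;
             }

             class ClusterSyncRepository implements MinimalDeploymentRepository {
                private final MinimalDeploymentRepository local;
                private final RepositorySynchronizer synchronizer;

                ClusterSyncRepository(MinimalDeploymentRepository local,
                                      RepositorySynchronizer synchronizer) {
                   this.local = local;
                   this.synchronizer = synchronizer;
                }

                public Collection<String> getModifiedDeployments() throws Exception {
                   // Reconcile the local store with the cluster first, so remotely
                   // originated changes surface here as local modifications.
                   synchronizer.reconcile();
                   return local.getModifiedDeployments();
                }
             }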

            Synchronization of multiple local file-based repositories as you describe is difficult for something outside of the profile service, as the exact state of a deployment is the raw deployment content plus any admin edits that may have occurred. Raw content sitting on the disk may not be in the profile, depending on what removing a deployment from a profile actually does.

            The "Coordinating deployments" point is a bit confused in my view, as there are a number of different aspects to a clustered application.
            1) The simple notion of having a deployment available across all nodes in the cluster: the DeploymentPhase.APPLICATION_CLUSTERED notion of the DeploymentRepository. There could be an annotation for this down to a bean level rather than at a coarse deployment level. We could support this using the MetaDataRepository levels and have a clustered MetaDataRepository where we just distribute the BeanMetaData (for example) for the annotated bean. Provided the deployers/deployment attachments were properly integrated with the MetaDataRepository, it should just work, but there is a dependency issue. I'm wondering if there really is a @Farmed vs. a more expressive @Clustered notion, though. I would think @Clustered could be a superset of @Farmed, where @Farmed covers the case where I only want distribution of a component across the cluster without any cluster-aware aspects applied to the component. That type of farming/clustering seems outside the scope of the profile service deployment repository; it's just a metadata/metadata repository issue (a hypothetical annotation sketch follows this list).
            2) Cluster-wide dependencies between beans. Dependencies on components that are marked as @Clustered need a cluster-aware dependency implementation. Custom dependencies are supported by the mc, but the component deployers generating the component metadata have to generate the correct dependency implementation.
            3) Cluster-aware aspects. Caches, proxies, and cluster-wide singletons are behaviors that you potentially want cluster-aware implementations of for components marked as @Clustered.
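
            Here's a hypothetical shape for such an annotation; names and attributes are illustrative only, nothing like this exists yet:

             import java.lang.annotation.ElementType;
             import java.lang.annotation.Retention;
             import java.lang.annotation.RetentionPolicy;
             import java.lang.annotation.Target;

             /**
              * Hypothetical metadata sketch: @Clustered as a superset of farming.
              * distributeOnly = true asks only for cluster-wide distribution of the
              * component (the old @Farmed notion); the default also allows
              * cluster-aware aspects (caches, proxies, singletons) to be applied.
              */
             @Retention(RetentionPolicy.RUNTIME)
             @Target(ElementType.TYPE)
             @interface Clustered {
                boolean distributeOnly() default false;
             }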

            Clustering, like security, is a cross-cutting notion that we need the proper metadata for, and hooks to which cluster-aware variations of the aspects can be applied. I agree that cluster-aware deployers should not be needed. Rather, the deployer should be driven off of the metadata and component aspects to create clustered components. Identifying these aspects and making sure deployers just work when the cluster-aware aspects are configured is the main task.

            • 3. Re: profile service, farming
              brian.stansberry

              Been digesting this for a bit. :)

              Synchronization of multiple local file-based repositories as you describe is difficult for something outside of the profile service, as the exact state of a deployment is the raw deployment content plus any admin edits that may have occurred. Raw content sitting on the disk may not be in the profile, depending on what removing a deployment from a profile actually does.


              I wasn't clear about one thing when I described that concept -- I was only thinking of that approach in terms of the basic.ProfileServiceImpl; i.e. an equivalent to VFSDeploymentScannerImpl that handles DeploymentPhase.APPLICATION_CLUSTERED. This is one reason I considered it low priority. It doesn't fit with the full profile service impl, which works differently.

              So 'hot deployment' is purely a profile implementation detail, and how we reconcile local repositories across a cluster would be part of the org.jboss.profileservice.spi.DeploymentRepository.getModifiedDeployments() implementation.


              This part isn't clear to me. I certainly see how keeping different repositories in sync across a cluster is a detail of the repository implementation. And I can see how a cluster-aware DeploymentRepository instance could somewhat control when a ProfileImpl is aware of a change; e.g. don't make the profile aware of the change until all the repository instances in the cluster are aware of it.

              But that doesn't get to controlling how the profile changes get reflected in the runtime. That's a task of the MainDeployer and the deployers.

              I'll talk about a specific ideal scenario:

              An ear is deployed on all 4 nodes of a cluster. A new version of the ear is deployed. The goal is that the ear be brought to a certain deployment stage (DeploymentStages.REAL?) on all nodes in the cluster such that we know the deployment will work on all nodes -- a 2PC "prepare". At that point a cluster-wide "commit" is executed, the deployments are brought to the final stage where they handle requests, and the old version is removed. If there is a failure during the "prepare", the new version is rolled back and the old version is left in place.
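
              In rough pseudocode, that flow looks like this; every type here is hypothetical:

               import java.util.List;

               // Illustrative 2PC redeploy: prepare the new version everywhere,
               // commit only if every node prepared, otherwise roll back.
               class TwoPhaseDeployCoordinator {
                  void redeploy(List<NodeDeployer> nodes, String newVersion, String oldVersion) {
                     try {
                        // Phase 1: bring the new version to the "prepared" stage on all nodes.
                        for (NodeDeployer node : nodes) {
                           node.prepare(newVersion);
                        }
                     } catch (Exception prepareFailure) {
                        // Any failure: discard the new version, leave the old one running.
                        for (NodeDeployer node : nodes) {
                           node.rollback(newVersion);
                        }
                        return;
                     }
                     // Phase 2: commit everywhere, then retire the old version.
                     for (NodeDeployer node : nodes) {
                        node.commit(newVersion);
                        node.undeploy(oldVersion);
                     }
                  }

                  interface NodeDeployer {
                     void prepare(String version) throws Exception;
                     void commit(String version);
                     void rollback(String version);
                     void undeploy(String version);
                  }
               }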

              There can be other variations on the above, but the main point is that there is a multistep deployment process that requires intra-cluster communication at various points. Who controls that process is my question -- it doesn't seem like it's a concern of the DeploymentRepository, and it also doesn't seem like a proper concern of a deployer. My last post mentioned "some variant of the HDScanner concept", but that's not it either; you're right, HDScanner is just a trivial link between the profile and the MainDeployer and shouldn't be made into something else. Seems like this is at least partly a concern of a cluster-aware MainDeployer.

              The simple notion of having a deployment available across all nodes in the cluster: the DeploymentPhase.APPLICATION_CLUSTERED notion of the DeploymentRepository. There could be an annotation for this down to a bean level rather than at a coarse deployment level. We could support this using the MetaDataRepository levels and have a clustered MetaDataRepository where we just distribute the BeanMetaData (for example) for the annotated bean. Provided the deployers/deployment attachments were properly integrated with the MetaDataRepository, it should just work, but there is a dependency issue.


              This is the part I need to dig into more to get a better understanding of what you mean. Perhaps that will answer my question above. :)

              Cluster-wide dependencies between beans. Dependencies on components that are marked as @Clustered need a cluster-aware dependency implementation. Custom dependencies are supported by the mc, but the component deployers generating the component metadata have to generate the correct dependency implementation.


              Good point. This will be a nice thing to have.

              This again has a coordination aspect; e.g. bean A on node 1 expresses a dependency on bean B that will be deployed on node 2. If both A and B are known, cluster-wide, to the repository, you don't want A's deployment to fail with a missing dependency just because node 2's deployers haven't processed B yet.
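
              Illustratively, the dependency would resolve against a cluster-wide registry rather than only the local controller. Both types below are hypothetical stand-ins, not the mc SPI:

               // Hypothetical cluster-wide dependency: resolves if any node in the
               // cluster knows the demanded bean, not just the local deployers.
               class ClusterWideDependency {
                  private final String demandedBean;
                  private final ClusterRegistry registry;

                  ClusterWideDependency(String demandedBean, ClusterRegistry registry) {
                     this.demandedBean = demandedBean;
                     this.registry = registry;
                  }

                  boolean resolve() {
                     return registry.isKnownAnywhere(demandedBean);
                  }

                  // Hypothetical cluster-wide view of known beans.
                  interface ClusterRegistry {
                     boolean isKnownAnywhere(String beanName);
                  }
               }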

              Clustering, like security, is a cross-cutting notion that we need the proper metadata for, and hooks to which cluster-aware variations of the aspects can be applied.


              Agreed. Right now the clustering metadata that exists is scattered, and it only exists for EJBs and web sessions.

              When I think in terms of priority order though, I'm thinking somewhat differently. To me, it's

              1) Restoring some sort of ability to have repository information synchronized. This is really the only thing the old farming did, in a half-assed way ;). I'd like to have this for 5.0.0.GA, as I don't like taking something away, even if it was half-assed.

              2) Sorting out the "coordination" issue I've been talking about. The lack of that kind of coordination IMHO has always been the biggest weakness in the old FarmService.

              3) Cluster-wide dependencies. This could be done earlier if our solution for 1) ensures, following my example, that node 2's ProfileImpl knows about bean B before node 1's ProfileImpl knows about bean A.

              4) Adding cluster-aware aspects to beans other than the existing JEE ones. Includes refactoring existing JEE clustered aspect impls to use as much of a common code base as possible.

              Creating proper clustering metadata is an underlying task that occurs throughout the above.

              • 4. Re: profile service, farming
                starksm64


                "bstansberry@jboss.com" wrote:

                So 'hot deployment' is purely a profile implementation detail, and how we reconcile local repositories across a cluster would be part of the org.jboss.profileservice.spi.DeploymentRepository.getModifiedDeployments() implementation.


                But that doesn't get to controlling how the profile changes get reflected in the runtime. That's a task of the MainDeployer and the deployers.

                Ok, that is true. Additions/removals to the profile do need to be passed through the MainDeployer to bring a running system in sync with the profile.

                "bstansberry@jboss.com" wrote:

                An ear is deployed on all 4 nodes of a cluster. A new version of the ear is deployed. The goal is that the ear be brought to a certain deployment stage (DeploymentStages.REAL?) on all nodes in the cluster such that we know the deployment will work on all nodes -- a 2PC "prepare". At that point a cluster-wide "commit" is executed, the deployments are brought to the final stage where they handle requests, and the old version is removed. If there is a failure during the "prepare", the new version is rolled back and the old version is left in place.

                Ok, DeploymentStages.REAL would be too far, as real runtime components would start to be created. Maybe we would just need to run it through the DESCRIBE stage, maybe PRE_REAL. The main problem is the disconnect between a system that looks valid in terms of everything being there vs. actually having all of the runtime components in place.

                "bstansberry@jboss.com" wrote:

                There can be other variations on the above, but the main point is that there is a multistep deployment process that requires intra-cluster communication at various points. Who controls that process is my question -- it doesn't seem like it's a concern of the DeploymentRepository, and it also doesn't seem like a proper concern of a deployer. My last post mentioned "some variant of the HDScanner concept", but that's not it either; you're right, HDScanner is just a trivial link between the profile and the MainDeployer and shouldn't be made into something else. Seems like this is at least partly a concern of a cluster-aware MainDeployer.

                There really is no one controlling entity other than the admin layer. The admin layer calling the profile service API to do the deployment is what triggers everything, dealing with failures and rolling back to the previous version, but all participants need to be properly written, with dependencies in place, to allow a failure to be unwound.

                "bstansberry@jboss.com" wrote:

                This is the part I need to dig into more to get a better understanding of what you mean. Perhaps that will answer my question above. :)

                It won't. The metadata repository is just a hierarchical source of metadata, with scopes from which server- or cluster-wide defaults can be picked up. It has to fit into any cluster-wide notions, but it solves nothing in and of itself.


                • 5. Re: profile service, farming
                  starksm64


                  "bstansberry@jboss.com" wrote:

                  This again has a coordination aspect; e.g. bean A on node 1 expresses a dependency on bean B that will be deployed on node 2. If both A and B are known, cluster-wide, to the repository, you don't want A's deployment to fail with a missing dependency just because node 2's deployers haven't processed B yet.

                  So by running the deployments across the cluster to the DeploymentStages.DESCRIBE phase, we know whether or not all dependencies can be satisfied. The only possible coordinator is the admin layer driving the profile service API usage. Maybe you're driving at: do we want this to be an old farming deployment type of service?

                  "bstansberry@jboss.com" wrote:

                  When I think in terms of priority order though, I'm thinking somewhat differently. To me, it's

                  1) Restoring some sort of ability to have repository information synchronized. This is really the only thing the old farming did, in a half-assed way ;). I'd like to have this for 5.0.0.GA, as I don't like taking something away, even if it was half-assed.

                  2) Sorting out the "coordination" issue I've been talking about. The lack of that kind of coordination IMHO has always been the biggest weakness in the old FarmService.

                  So I think we need to look at a FarmServiceUnitTestCase that uses the profile service DeploymentManager to validate the various types of things that can happen with, say, a two-node deployment, and figure out what will/won't work for the initial release.



                  • 6. Re: profile service, farming
                    brian.stansberry


                    DeploymentStages.REAL would be too far, as real runtime components would start to be created. Maybe we would just need to run it through the DESCRIBE stage, maybe PRE_REAL. The main problem is the disconnect between a system that looks valid in terms of everything being there vs. actually having all of the runtime components in place.


                    Yes, REAL is too far, at least right now. :) I said REAL based on this comment in DeploymentStages:

                     /** The installed stage - could be used to provide valve in future? */
                     DeploymentStage INSTALLED = new DeploymentStage("Installed", REAL);


                    where I interpreted the "valve" as being the notion Adrian described in an old thread of bringing a deployment all the way to being ready to handle requests, and then at the last stage switching JNDI refs, connectors, etc. to use the new version. But we're clearly not ready for that yet. :)

                    There really is no one controlling entity other than the admin layer. The admin layer calling the profile service API to do the deployment is what triggers everything, dealing with failures and rolling back to the previous version, but all participants need to be properly written, with dependencies in place, to allow a failure to be unwound.

                    So by running the deployments across the cluster to the DeploymentStages.DESCRIBE phase, we know whether or not all dependencies can be satisfied. The only possible coordinator is the admin layer driving the profile service API usage. Maybe you're driving at: do we want this to be an old farming deployment type of service?


                    No, I was being muddleheaded. In Vegas we agreed that it's going to be the admin layer that drives this. In the 2 years since, I've just been looking at MainDeployer and the deployers and got my thinking in a muddle. I need to look more at the DeploymentManager API; it sounds like that's where the ability to do things like "running the deployments across the cluster to the DeploymentStages.DESCRIBE phase" will be exposed.

                    So I think we need to look at a FarmServiceUnitTestCase that uses the profile service DeploymentManager to validate the various types of things that can happen with, say, a two-node deployment, and figure out what will/won't work for the initial release.


                    Sounds good; that will be my focus.

                    • 7. Re: profile service, farming
                      brian.stansberry

                      BTW, will the basic profile service impl still exist in 5.0.0.GA?

                      • 8. Re: profile service, farming
                        brian.stansberry


                        I need to look more at the DeploymentManager API; it sounds like that's where the ability to do things like "running the deployments across the cluster to the DeploymentStages.DESCRIBE phase" will be exposed.


                        The current DeploymentManager SPI of distribute/start/redeploy/stop/undeploy doesn't expose a method to let a client bring a deployment to DESCRIBE or PRE_REAL. Were you thinking in terms of adding such a method? E.g.:

                         /**
                          * Bring a previously distributed deployment through the DESCRIBE stage.
                          *
                          * @param name the unique name of the previously distributed deployment
                          * @param phase the DeploymentPhase the content was distributed to
                          * @return a DeploymentProgress for running and tracking the operation
                          * @throws Exception for any failure initiating the operation
                          */
                         public DeploymentProgress describe(String name, DeploymentPhase phase)
                            throws Exception;


                        Another possibility is that this step could be an internal detail of DeploymentManager.start() and redeploy(); i.e. DeployHandler.invoke() could handle a "describe" invocation. In that case, an impl of DeploymentProgress would handle the coordination task by looping through the targets and calling "describe", "start", etc. on the DeployHandler, roughly as sketched below.
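
                        Something like this, with TargetHandler as a hypothetical stand-in for the per-target DeployHandler invocation:

                         import java.util.List;

                         // Sketch: a DeploymentProgress impl that hides the describe
                         // step inside the start operation. All names hypothetical.
                         class CoordinatedStartProgress {
                            private final String deploymentName;
                            private final List<TargetHandler> targets;

                            CoordinatedStartProgress(String deploymentName, List<TargetHandler> targets) {
                               this.deploymentName = deploymentName;
                               this.targets = targets;
                            }

                            void run() throws Exception {
                               // First pass: "describe" on every target; fail before any
                               // runtime components are created if a target can't comply.
                               for (TargetHandler target : targets) {
                                  target.invoke("describe", deploymentName);
                               }
                               // Second pass: all targets validated, so start everywhere.
                               for (TargetHandler target : targets) {
                                  target.invoke("start", deploymentName);
                               }
                            }

                            interface TargetHandler {
                               void invoke(String operation, String deploymentName) throws Exception;
                            }
                         }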

                        A problem with all of this is that a DeploymentProgress instance in a remote client would have no way to know about cluster topology changes that occur in the middle of an operation (e.g. a new node joins in the middle of a deploy; that node never gets the update).

                        • 9. Re: profile service, farming
                          starksm64

                          Yes, I think we'll have to add a describe/prepare method as the caller needs to be able to receive meaningful information about what is wrong if a deployment cannot be described.

                          In terms of being aware of cluster membership changes, that still needs to be added, and how it is handled needs to be defined. There could be more interaction between the server and client side DeploymentProgress, but it's possible that the DeploymentManager is not talking to live servers.

                          I would view the cluster as locked at the start of the deployment op; new additions would need to synchronize their repository view once the op is complete.

                          "bstansberry@jboss.com" wrote:

                          BTW, will the basic profile service impl still exist in 5.0.0.GA?

                          Yes, I expect so.


                          • 10. Re: profile service, farming
                            brian.stansberry


                            So I think we need to look at a FarmServiceUnitTestCase that uses the profile service DeploymentManager to validate the various types of things that can happen with, say, a two-node deployment, and figure out what will/won't work for the initial release.


                            I've added infrastructure for clustered profile service tests (i.e. start two AS instances configured for clustering and the full profile service, then execute a set of tests):

                            ./build.sh tests-clustered-profileservice


                            Currently this target isn't part of the overall testsuite run, as right now there's no real point.

                            The only test right now is o.j.t.cluster.defaultcfg.profileservice.test.ClusteredDeployUnitTestCase. Right now this is 98% just a copy of your DeployUnitTestCase that uses DeploymentPhase.APPLICATION_CLUSTERED, with a couple of extra assertions thrown in (e.g. that any DeploymentProgress.getDeploymentTargets() call returns a list with 2 targets); a trimmed sketch of that check follows. As we go along I'll fill this out.
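
                            The extra check is roughly this (a trimmed, paraphrased rendering, not the actual test code; the getDeploymentManager()/getContentURL() helpers are hypothetical):

                             // Distribute to the APPLICATION_CLUSTERED phase, then assert
                             // that both nodes of the two-node cluster appear as targets.
                             public void testClusteredDeployTargets() throws Exception
                             {
                                DeploymentProgress progress = getDeploymentManager().distribute(
                                   "testapp.ear", DeploymentPhase.APPLICATION_CLUSTERED,
                                   getContentURL("testapp.ear"));
                                progress.run();
                                assertEquals(2, progress.getDeploymentTargets().size());
                             }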

                            • 11. Re: profile service, farming
                              brian.stansberry

                              I'm starting to wonder whether it would make more sense to just have a JGroups Channel (actually, a more limited version of HAPartition [1]) available during the bootstrap of a clustered ProfileService. I've been looking at FileProfileRepository, thinking about how a cluster-aware variant of it could be created. But you quickly run into the DeploymentRepository.load() method, which indirectly gets called during the profile service bootstrap. A proper clustered DeploymentRepository would need to reconcile at least some of its state with other cluster members during load(), but there's no Channel available. Getting around that would require hacky API changes all the way up to the ProfileService.getProfile(ProfileKey key) method.

                              I had been thinking the services of the regular HAPartition would be made available to the profile service as part of DeploymentPhase.APPLICATION. But that just looks wrong: a service depending on its client to provide necessary core functionality.

                              [1] A more limited HAPartition would be a bean that implements an interface w/o ancillary HAPartition stuff like getDistributedState(). The regular HAPartition in deploy/ would probably just have this bean injected into it and delegate to it for its core functions. This way we'd avoid creating an extra Channel.
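
                              E.g. an interface along these lines (all names here are hypothetical):

                               import java.util.List;

                               // Sketch of [1]: just the core group-communication functions,
                               // without ancillary HAPartition pieces like getDistributedState().
                               interface GroupCommunication {
                                  // Name of the group/partition this node belongs to.
                                  String getGroupName();

                                  // Current members, so a repository can reconcile state during load().
                                  List<String> getCurrentMembers();

                                  // Invoke a method on all members, gathering their responses.
                                  List<Object> callMethodOnCluster(String service, String method,
                                                                   Object[] args, Class<?>[] types,
                                                                   boolean excludeSelf) throws Exception;

                                  // Register a handler for incoming cluster RPCs.
                                  void registerRPCHandler(String service, Object handler);
                               }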

                              • 12. Re: profile service, farming
                                starksm64

                                It sounds correct to make the cluster-aware feature an implementation detail that requires injection of the clustering function into the profile impl / profile repository impl.

                                • 13. Re: profile service, farming
                                  brian.stansberry

                                  OK, good. That solves a lot of conceptual problems.