11 Replies Latest reply on Mar 8, 2010 10:20 AM by emuckenhuber

    Use of names in DeploymentRepository SPI

    brian.stansberry

      As I'm working to clean up issues arising from the VFS3 switch, I'm finding the "name" concept in org.jboss.profileservice.spi.DeploymentRepository confusing and error prone. There are several different name concepts in the SPI and it's unclear how they should relate.

       

      In my next post I'll get to conclusions, suggestions. People might want to skip the rest of this first post; in a lot of ways it's detailed notes for myself as I analyzed the SPI docs and the actual usage of it trying to clarify things for myself.

       

      I see 3 different "name" concepts in the javadoc, deriving them from similar language in the javadoc describing params or return values:

       

      1. There's a "repostory name" notion, defined as "the unique name of the deployment in the repository". This language is used for
        1. the return value from addDeploymentContent(...)
        2. the "name" param passed to getDeploymentContent(...)
        3. the elements in the String[] returned by getRepositoryNames(....)
      2. A "vfs path" notion, defined as "the full vfs path of the deployment". This "vfs path" language is used for:
        1. the "name" param passed to addDeploymentContent(...)
        2. the "name" param passed to getDeploymentContent(...)
        3. the "name" param passed to (un)lockDeploymentContent(...
        4. the "name" param passed to the various xxxDeploymentContentFlags(...) methods
      3. A "deployment name" or "name of deployment" notion. This is used in various places, although it's unclear it's all the same notion:
        1. the elements in the Set<String> returned by getDeploymentNames()
        2. the "names" vararg passed to getRepositoryNames(...)
        3. the "name" param passed to addDeployment(...)
        4. the "name" param passed to getDeployment(...)
        5. the "name" param passed to removeDeployment(...)
        6. the value returned by ProfileDeployment.getName(), where ProfileDeployment is
          1. the "deployment" param passed to addDeployment(...)
          2. returned directly or indirectly from getDeployment, removeDeployment, getDeployments, getModifiedDeployments

       

      That's looking at it from the SPI language point of view. Walking through some use cases it's clear that the above groupings based on common language are incorrect:

       

      • User calls DeploymentManager.distribute("foo.war", someURL, true) and then DeploymentProgress.run();
        • eventually calls server-side AbstractDeployHandler.handleStream()
          • calls DeploymentRepository.addDeploymentContent("foo.war", ....) -- so the "vfs path" notion here is "foo.war". This call is really the inital entry point into DeploymentRepository usage, so for me this usage fixes the "vfs path" notion in the DeploymentRepository javadoc as meaning the same thing as the name param passed to DeploymentManager.distribute.
            • the repository is dealing with multiple URIs so it has to decide where to place "foo.war". It decides on the URI ending with "farm/".  So the "repository name" notion becomes "farm/foo.war". That's an implementation detail though; the repository could call it "bob" though if it wants
          • calls DeploymentRepository.getDeploymentContent("farm/foo.war") to get a VirtualFile for the added content
          • creates a "deploymentName" basically via a call to VirtualFile.toURI(). Why? Why not use the repositoryName? What if the repository called it "Bob" which has no relationship to VirtualFile.toURI()?
          • this "deploymentName" becomes the "name" property of a new ProfileDeployment
          • calls DeploymentRepository.addDeployment(deploymentName, ProfileDeployment)
          • calls DeploymentRepository.lockDeployment(deploymentName). Seems 2.3 above is wrong; this param is deploymentName, not the initial vfs path
          • returns the repositoryName back to the user
      • User calls DeploymentManager.start(repositoryName) i.e. start("farm/foo.war")
        • eventually calls server-side AbstractDeployHander.start(DeploymentID);
          • calls DeploymentRepository.getDeploymentNames() and validates it contains "farm/foo.war". Seems 3.1 above is wrong; that Set<String> returned by getDeploymentNames() is the repositoryName notion, not deploymentName
          • calls DeploymentRepository.getDeployment(repositoryName). Seems 3.4 above is wrong; that param is a repositoryName not a deploymentName! It also means the DeploymentRepository.addDeployment(...) call above must use repositoryName, not deploymentName.
          • calls DeploymentRepository.unlockDeployment(deploymentName) Seems 2.4 above is wrong; this param is deploymentName, not the initial vfs path
      • User calls DeploymentManager.remove(repositoryName)
        • eventually calls server-side AbstractDeployHander.start(DeploymentID);
          • calls DeploymentRepository.getDeploymentNames() and validates it contains "farm/foo.war".
          • calls DeploymentRepository.removeDeployment(repositoryName). Seems 3.5 above is wrong.

      • User calls DeploymentManager.distribute("foo.war", someURL, false) and DeploymentProgress.run();
        • eventually calls server-side DeployHandler.distribute()
          • creates a VirtualFile from someURL
          • creates deploymentName from vf.toURI() -- so someURL essentially becomes deploymentName
          • adds ProfileDeployment to transient profile
          • returns deploymentName as repositoryName
      • Usage of DeploymentManager.getRepositoryNames()
        • there are a couple places in the test suite where jar names are passed to DeploymentManager.distribute()
          • which eventually passes the jar names to DeploymentRepository.addDeploymentContent()
        • and then later the same jar names are passed to DeploymentManager.getRepositoryNames(...)
          • which eventually passes them to DeploymentRepository.getRepositoryNames(...). Oops, seems 3.2 above is wrong; this param is the "vfs path" notion.
        • 1. Re: Use of names in DeploymentRepository SPI
          brian.stansberry

          Now, suggestions.

           

          Can we reduce things in the SPI to 2 concepts:

           

          1. The initial end user input; what gets passed as the "name" param to DeploymentManager.distribute(String name, URL contentURL, boolean copyContent). Used in
            • the "name" param to addDeploymentContent(...)
            • the "names" param in getRepositoryNames(...)
          2. The "repository name" concept, which is initially returned by addDeploymentContent() and then used everywhere else.

           

          I think, besides changing javadoc, this could partly be accomplished by not creating a deployment name when we already have the repository name:

           

          Index: src/main/java/org/jboss/profileservice/management/upload/remoting/AbstractDeployHandler.java
          ===================================================================
          --- src/main/java/org/jboss/profileservice/management/upload/remoting/AbstractDeployHandler.java    (revision 101736)
          +++ src/main/java/org/jboss/profileservice/management/upload/remoting/AbstractDeployHandler.java    (working copy)
          @@ -296,7 +296,7 @@
                 // FIXME 
                 ProfileKey key = deploymentTarget.getProfile();
                 String profileName = key.getName();
          -      String deploymentName = createDeploymentName(vf);
          +      String deploymentName = repositoryName;
                 // Don't try to re-mount
                 if(deploymentRepository.getDeploymentNames().contains(deploymentName) == false)
                 {
          
          

           

          I'm sure  there's more to it than that.

           

          Where things get a bit dicey is with the MutableProfile.addDeployment/removeDeployment methods, which in AbstractProfile call through to DeploymentRepository.addDeployment, using ProfileDeployment.getName() as the name param. If that code path were actually used, who knows what the ProfileDeployment.getName() value would be. But the DeploymentRepository impl could always throw an ISE or IAE or something if the name conflicted with some already existing deployment. And AFAICT that code path isn't actually used.

          • 2. Re: Use of names in DeploymentRepository SPI
            brian.stansberry

            Ok, I see why the vfs path of the file is being used in a lot of places instead of the repository name; using it to register deployments makes it easier to ignore already known deployments and identify new ones during hot deployment scanning.

             

            I still think the SPI could be cleaner (or maybe just clearer?) but it's not so urgent now. Using the vfs path was leading to problems with the ClusteredDeploymentRepository following the VFS3 upgrade, but I've found a reasonable solution.

            • 3. Re: Use of names in DeploymentRepository SPI
              emuckenhuber
              Yeah, i agree the SPI could be clearer (and cleaner). In general the most important thing is that profileDeployment.name == vdfDeployment.name == managedDeployment.name - this is basically holding everything together. With VFS3 - the deployment.name has to be always the URI.toString() without the trailing "/", since once mounted the URI could change.

              Basically i'm going to deprecate the whole DeploymentRepository stuff, it is too close to a profile. I'm leaving the current implementation, so that you can use it until we find a better soultion for handling deployment contents.

              • 4. Re: Use of names in DeploymentRepository SPI
                brian.stansberry

                Where this got tricky is the ClusteredDeploymentRepository needs to have its own mount of the URIs it's managing, so it can scan everything and replicate changes to any file to the cluster. It can't use the real filesystem URI for this, since children could end up pointing to a copy made by a deployer, not to the orginal file. This is different from HotDeploymentRepository, which only needs to detect new deployments and can hand off a backup VF copy of already known deployments to StructureModificationChecker. So, for a particular file I was ending up with 3 different VFS URIs -- one that matches the real filesystem, one in ProfileDeployment.getRoot() that AutoUnmounter created, and one based on my own internal mount.

                 

                The key thing is the URI of the VirtualFile returned by DeploymentRepository.getDeploymentContent() is what generates the "vfs path" that is used in tons of other places. A DeploymentRepository impl needs understand that and be able to internally translate between that "vfs path" and any URIs it uses internally. I believe I've got that sorted now.

                 

                I was going to tweak the DeploymentRepository javadocs a bit, but if it's going to be dumped maybe I shouldn't bother?

                • 5. Re: Use of names in DeploymentRepository SPI
                  emuckenhuber

                  Hmm, i see. Good to know. I need to look more into how we can improve handling this backup thingy then. Especially a deterministic way to access the originals - since ProfileService should not care about the mounted stuff at all. Maybe we just mount all roots in a virtual location and do all operations based on that.

                  • 6. Re: Use of names in DeploymentRepository SPI
                    brian.stansberry

                    ProfileDeployment.getRoot() is actually returning the mounted originals, so that's deterministic. but:

                     

                    1) It feels a bit strange that the URI of that VF is the mount rather than the normal "real filesystem" path. That's exposing an internal detail that could confuse some user of ProfileDeployment.getRoot(). IIRC for example there's some management view or something that displays the URIs of the ProfileDeployments -- that will end up displaying that mount URI.

                     

                    2) The URI of that VF is not usable by ClusteredDeploymentRepository, because:

                    a) a repository can manage multiple root URIs and that VF's URI doesn't include information to allow the repo to determine which root is involved

                    b) it also omits any path information between the root URI and the file name.

                     

                    But, with the work we want to do on replacing all this, I'm not sure how worth it is to improve it? I've been able to solve my problems; my voluminous posting here are partly me just keeping careful notes and partly documenting issues so we can consider them in the replacement version.

                    • 7. Re: Use of names in DeploymentRepository SPI
                      emuckenhuber

                      Yeah, i don't see much value on improving this piece of code either. I was more referring to better handling of VFS3 in general. Since this problem will remain for hot-deployment in future as well.

                       

                      So maybe we can remove profileDeployment.getRoot() completely and provide additional mechanism to get a hold of the correct files. We only need the VF for creating the actual VDF deployment - and the original (backup) one for hot-deployment checking and removing deployments. Maybe this can be handled more transparently, i need to think about that.

                      • 8. Re: Use of names in DeploymentRepository SPI
                        brian.stansberry
                        AFAICT the only place that needs to see the mounted VF is HotDeploymentRepository.getModifiedDeployments(), which passes it to StructureModificationChecker. So could HotDeploymentRepository.addDeployment() be responsible for the mounting, keeping a cache Map<String realFilesystemPath, VirtualFile backup> to find the backup file for StructureModificationChecker?
                        • 9. Re: Use of names in DeploymentRepository SPI
                          brian.stansberry
                          Ah, missed your statement about needing the backup for removing deployments; yeah that would have to be handled too.
                          • 10. Re: Use of names in DeploymentRepository SPI
                            emuckenhuber

                            Actually we also have to pass the backup root into the StructureModificationChecker, otherwise it could check the temp files when mounted. So we only need the mounted file when creating the actual VDF deployment, for all operations ProfileService should just care about the unmounted view.

                             

                            So i'm thinking of having an abstraction, where we always backup the complete roots and perform all operations based on that - only when creating an deployment we get the mounted root (representing the actual location). I need to see how that could look like.

                            • 11. Re: Use of names in DeploymentRepository SPI
                              emuckenhuber

                              FYI - actually hot-deployment is currently broken, because of not doing all operations based on a backup root: https://jira.jboss.org/jira/browse/JBAS-7790 - So i guess i have to look at fixing some parts in the DeploymentRepository...