5 Replies Latest reply on Mar 5, 2013 9:48 AM by kurtstam

    Versioning artifacts in S-RAMP

    eric.wittmann

      The S-RAMP specification is somewhat silent when it comes to artifact versioning.  There is a core property (attribute) on all artifacts named "version", but it is a simple free-form field that can be set by the client.  The intention appears to be to defer artifact versioning to clients.

       

      That said, there are two major aspects of versioning that we need to solve:

      • Using the S-RAMP features to appropriately model artifact versioning
      • Identifying when an artifact is unique vs. a new version of an existing artifact

       

      For the first point, here is a simple proposal:

       

      • Store the version number in the "version" core artifact property (duh)
      • Each artifact instance can have "previousVersion" and "nextVersion" relationships (min-cardinality 0, max-cardinality 1)
        • If no "previousVersion" relationship exists, then this is the first version
        • If no "nextVersion" relationship exists, then this is the most recent version
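
      A minimal sketch of the model in this list, written in Python for brevity. The Artifact class, its field names, and the helper functions are hypothetical stand-ins, not part of the S-RAMP API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity comparison only; the links form a cycle
class Artifact:
    """Hypothetical stand-in for an S-RAMP artifact (assumption, not the real API)."""
    name: str
    version: str                                   # free-form "version" core property
    previous_version: Optional["Artifact"] = None  # min-cardinality 0, max-cardinality 1
    next_version: Optional["Artifact"] = None      # min-cardinality 0, max-cardinality 1

    def is_first(self) -> bool:
        # No "previousVersion" relationship means this is the first version.
        return self.previous_version is None

    def is_latest(self) -> bool:
        # No "nextVersion" relationship means this is the most recent version.
        return self.next_version is None


def add_version(current: Artifact, new: Artifact) -> None:
    """Link `new` as the successor of `current`, which must be the latest version."""
    if not current.is_latest():
        raise ValueError("can only append after the most recent version")
    current.next_version = new
    new.previous_version = current


def latest(artifact: Artifact) -> Artifact:
    """Follow "nextVersion" links until the most recent version is reached."""
    while artifact.next_version is not None:
        artifact = artifact.next_version
    return artifact
```

      With this shape, finding the newest version means walking the chain, and inserting a version in the middle means updating two relationships on each neighbour.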

       

      We'll need to work through the requirements and use-cases to see if this model is appropriate.

       

      On the second point (unique vs. new version of existing), things get a little tricky.  Here is one proposal mentioned by Kurt:

      "We should probably default to looking for "artifact in the repository" by checking the name, 
      file size, version (if available) and hash key. If all three match, then link to that artifact rather 
      than adding a new one."

      This seems like a great starting point, but there are some considerations.  I thought about these:

       

      1. If an artifact changes in a non-functional way (e.g. white-space) then we would see this as a new unique artifact.
      2. The name core property (attribute) is mutable, so if a user changes the name then the "new version detection" algorithm would break.
      3. The algorithm might need to be customized based on artifact type.  For example, we probably want to use the targetNamespace property when dealing with WSDLs and XSDs.
      4. Should we use other contextual information such as the Maven GAV info, when available?
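
      Kurt's matching rule quoted above can be sketched roughly as follows. This is an illustration under assumptions: the Repository and StoredArtifact classes are hypothetical, and SHA-256 is chosen only as an example hash.

```python
import hashlib
import uuid
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (name, file size, version if available, content hash)
Fingerprint = Tuple[str, int, Optional[str], str]


@dataclass(frozen=True)
class StoredArtifact:
    """Hypothetical repository record (assumption, not the S-RAMP API)."""
    artifact_id: str
    name: str
    version: Optional[str]


def fingerprint(name: str, content: bytes, version: Optional[str] = None) -> Fingerprint:
    """Combine the fields named in the quoted rule into one lookup key."""
    return (name, len(content), version, hashlib.sha256(content).hexdigest())


class Repository:
    def __init__(self) -> None:
        self._by_fingerprint: Dict[Fingerprint, StoredArtifact] = {}

    def add_or_link(
        self, name: str, content: bytes, version: Optional[str] = None
    ) -> Tuple[StoredArtifact, bool]:
        """Return (artifact, created); created is False when an existing artifact is reused."""
        key = fingerprint(name, content, version)
        existing = self._by_fingerprint.get(key)
        if existing is not None:
            return existing, False  # everything matched: link, do not add
        artifact = StoredArtifact(str(uuid.uuid4()), name, version)
        self._by_fingerprint[key] = artifact
        return artifact, True
```

      Considerations 1 and 2 show up directly in this sketch: a one-byte whitespace change or a renamed artifact produces a different key, so the upload is treated as a brand-new artifact.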

       

      NOTE: In some cases, the UI can handle adding new versions via an explicit gesture in the interface.  This discussion focuses on the more common case of attempting to automatically detect new versions of the same artifact.

        • 1. Re: Versioning artifacts in S-RAMP
          kurtstam

          The server should not create a version (unless it is a true derivedArtifact; and thus readonly to the user).

          • Each artifact instance can have "previousVersion" and "nextVersion" relationships (min-cardinality 0, max-cardinality 1)

          Can't we simply add an "isVersionOf" relationship? The user can sort by createDate?

           

          ad 1. Yes, you should only be allowed to upload the artifact once. If you want to change it, it needs to be a new version. This includes whitespace changes. We *could* allow the user to set a "logicalMatchRelation". Not sure if this will cause more confusion than it will solve.

           

          ad 2. Seems ok to me to allow a name change; we should have no "version detection".

           

          ad 3. Yeah maybe we can allow "logicalMatch", see also https://community.jboss.org/message/800641#800641. Not sure we should go there.

           

          ad 4. I think only as a validation step to refuse an upload.

          • 2. Re: Versioning artifacts in S-RAMP
            eric.wittmann

            Oh I really like "isVersionOf" as the relationship instead of next and previous.  Good idea.

             

             

            1. I agree that you can only upload an artifact once.  I guess I'm asking: when an artifact is uploaded, do we need to detect previous versions of the same artifact?  We can clearly use the content hash to determine if someone is uploading a duplicate artifact, but what about automatically detecting that a new version is being uploaded (so that we can create the "isVersionOf" relationship appropriately)?  That's the use-case where I don't think contentHash, file name, etc. help.

             

            Ultimately, I think there are two use-cases we need to address:

             

            A. Arlen Architect uploads a JAR that contains A.wsdl, B.wsdl, and C.xsd.  Debbie Developer uploads a JAR that contains a JAX-WS implementation of A.wsdl, but she includes A.wsdl as part of that JAR.  In this use-case we do not want to add A.wsdl a second time.  So we use a contentHash to determine that it's a duplicate, and simply link to it rather than add it again.

             

            B. Arlen Architect uploads a JAR that contains A.wsdl, B.wsdl, and C.xsd.  Later, Arlen makes some changes to A.wsdl and uploads a new version of the JAR, which still contains A.wsdl, B.wsdl, and C.xsd, but A.wsdl is now different.  For B.wsdl and C.xsd, we use contentHash to determine that they haven't changed, so perhaps we just link to them.  However, A.wsdl is different, so we add it as an artifact.  But we should also recognize that not only is it new, but it's also a new version of A.wsdl (and thus we should create the "isVersionOf" relationship between the two A.wsdl artifacts).

             

            Right?
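
            Use-cases A and B can be sketched as a single upload routine. This is a rough illustration, not a proposed implementation: the Repo class is hypothetical, archive entries are passed as a plain name-to-bytes mapping, and the earlier version is found by entry name, which is exactly the part that may not be reliable in practice.

```python
import hashlib
from typing import Dict, List, Tuple


class Repo:
    """Hypothetical in-memory repository (assumption, not the S-RAMP API)."""

    def __init__(self) -> None:
        self.by_hash: Dict[str, str] = {}  # content hash -> artifact id
        self.by_name: Dict[str, str] = {}  # entry name -> id of first artifact with that name
        self.is_version_of: List[Tuple[str, str]] = []  # (newer id, root id)
        self._next_id = 0

    def upload_archive(self, entries: Dict[str, bytes]) -> Dict[str, str]:
        """Process each archive entry and report what happened to it."""
        outcome: Dict[str, str] = {}
        for name, content in entries.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest in self.by_hash:
                # Identical content already stored: link instead of adding again.
                outcome[name] = "linked"
                continue
            self._next_id += 1
            artifact_id = f"artifact-{self._next_id}"
            self.by_hash[digest] = artifact_id
            root = self.by_name.get(name)
            if root is None:
                self.by_name[name] = artifact_id
                outcome[name] = "added"
            else:
                # Same name, different content: record an "isVersionOf" relationship.
                self.is_version_of.append((artifact_id, root))
                outcome[name] = "new version"
        return outcome
```

            Uploading the original JAR contents and then the JAR with a modified A.wsdl reports A.wsdl as "new version" and the two unchanged files as "linked".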

            • 3. Re: Versioning artifacts in S-RAMP
              objectiser

              Hi

               

              I also like the concept of "isVersionOf" as it would potentially allow a version tree to be maintained.

               

              In terms of identifying exact and modified versions of the same artifact, I think using the hash initially is good as it is simple. However it doesn't cater for minor changes, such as the whitespace that Eric mentioned.

               

              The alternative is to have artifact-specific comparisons, which may be ok when dealing with high-level info (e.g. target namespace), but I'm not sure we want to implement something that does a complete comparison - although that would be an option.

               

              So was thinking maybe the approach should be:

               

              (1) hash comparison to identify exact match

               

              (2) use artifact type specific identity algorithm to identify 'isVersionOf' relationship - using target namespace in wsdl, xsd.

               

              (3) enable appropriate user to be able to refactor repository to indicate that an artifact is actually the same as another, and cause any metadata and relationships on the source artifact to be moved to the 'isVersionOf' artifact?

               

               

              Main reason for (3) is that I think users wouldn't like to have multiple artifacts that are actually the same, but with minor formatting issues, and so having a 'repair' capability would be the next best thing.
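
              Steps (1) and (2) might look roughly like this. It is only a sketch: the function names and the shape of the identity key are assumptions, SHA-256 is an example hash, and the repair capability in (3) is not covered.

```python
import hashlib
import xml.etree.ElementTree as ET
from typing import Set, Tuple

IdentityKey = Tuple[str, str]


def content_hash(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def identity_key(name: str, content: bytes) -> IdentityKey:
    """Type-specific identity: targetNamespace for WSDL/XSD, file name otherwise."""
    for suffix in (".wsdl", ".xsd"):
        if name.endswith(suffix):
            try:
                namespace = ET.fromstring(content).get("targetNamespace")
            except ET.ParseError:
                namespace = None
            if namespace:
                return (suffix, namespace)
    return ("name", name)  # fallback when no type-specific rule applies


def classify(
    name: str, content: bytes, known_hashes: Set[str], known_keys: Set[IdentityKey]
) -> str:
    # (1) hash comparison identifies an exact match
    if content_hash(content) in known_hashes:
        return "exact match"
    # (2) type-specific identity suggests an "isVersionOf" relationship
    if identity_key(name, content) in known_keys:
        return "isVersionOf candidate"
    return "new artifact"
```

              Because the WSDL identity comes from the targetNamespace rather than the file name, a renamed or reformatted WSDL is still recognised as a candidate version of the original.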

               

              Regards

              Gary

              • 4. Re: Versioning artifacts in S-RAMP
                eric.wittmann

                I also like the concept of "isVersionOf" as it would potentially allow a version tree to be maintained.

                 

                Kurt, was your concept of the "isVersionOf" relationship that there would be a single root and all versions of that root would point back to it?  Or would it be more of a tree structure?  I was assuming the former (no support for a tree of versions) to keep things simple.

                • 5. Re: Versioning artifacts in S-RAMP
                  kurtstam

                  Right, no tree structure. We'd point back to the artifact with the oldest creation date.
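
                  A small sketch of this flat model, with every version pointing back at the oldest artifact and ordering recovered from the creation date. The Artifact class and function names are hypothetical, not S-RAMP API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(eq=False)  # compare by identity
class Artifact:
    """Hypothetical stand-in for an S-RAMP artifact (assumption)."""
    name: str
    created: datetime
    is_version_of: Optional["Artifact"] = None


def link_versions(artifacts: List[Artifact]) -> Artifact:
    """Point every artifact at the one with the oldest creation date and return that root.

    Expects a non-empty list.
    """
    root = min(artifacts, key=lambda a: a.created)
    for artifact in artifacts:
        artifact.is_version_of = None if artifact is root else root
    return root


def versions_in_order(artifacts: List[Artifact]) -> List[Artifact]:
    """Version order is derived by sorting on creation date, not by walking links."""
    return sorted(artifacts, key=lambda a: a.created)
```

                  There is a single root and no tree: a version never points at an intermediate version, only at the root.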