2 Replies Latest reply on Nov 20, 2011 5:42 PM by bwallis42

    Federation projection overlaps

    bwallis42

      I'm a newbie to ModeShape, but we are currently using Jackrabbit in a development project, so I am not new to the JCR concept and API.

       

      I'm looking at the suitability of ModeShape for a large-scale document repository that will eventually hold more than 10M documents and 20 TB of storage. The current implementation of this system just dumps all the documents into one or more filesystems, using a date-based directory structure and a simple strategy to avoid having more than 10,000 documents in a single directory. It also partitions the storage across multiple filesystem mount points (again based on creation date).

       

      I'm looking at the Federation connector as a means of:

      1. mapping our current repository directories into a JCR repository (read-only)
      2. storing new data in a different type of persistent store (RDBMS or other possibilities)

      while having a consistent view of the whole repository via the JCR system.

       

      A number of questions come to mind, but the first is: what happens if I map two overlapping repositories to the root path, i.e. both have the following projection?

       

      {code}/ => /{code}

       

      Repository one has the following contents:

       

      {code}

      /a/2011/10/4/filea.jpg

      /a/2011/10/4/fileb.jpg

      /a/2011/10/5/filec.jpg

      {code}

       

      Repository two has the following contents:

       

      {code}

      /a/2011/10/4/filed.jpg

      /a/2011/10/5/filee.jpg

      {code}

       

      What do I see in the federated repository? Is it the following?

       

      {code}

      /a/2011/10/4/filea.jpg

      /a/2011/10/4/fileb.jpg

      /a/2011/10/4/filed.jpg

      /a/2011/10/5/filec.jpg

      /a/2011/10/5/filee.jpg

      {code}

       

      What happens if repository one also has the path

       

      {code}

      /a/2011/10/4/filed.jpg

      {code}

       

      Do I then get

       

      {code}

      /a/2011/10/4/filea.jpg

      /a/2011/10/4/fileb.jpg

      /a/2011/10/4/filed.jpg[0]

      /a/2011/10/4/filed.jpg[1]

      /a/2011/10/5/filec.jpg

      /a/2011/10/5/filee.jpg

      {code}

       

      assuming same-name siblings are allowed?

       

      thanks

      brian wallis...

        • 1. Re: Federation projection overlaps
          rhauch

          Unfortunately, the federation capabilities in ModeShape 2.x do not support what you're trying to do. Basically, federation allows you to have multiple projections, where each projection places an entire subgraph into a particular location within your unified repository. At this time, a projected subgraph cannot overlap other projected subgraphs.
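
          To make "overlap" concrete, here is a minimal sketch of a check on the federated-side paths of a set of projections. It is not ModeShape code: the class and method names are hypothetical, and it assumes two projections overlap when their target paths are equal or one is an ancestor of the other.

```java
import java.util.List;

public class ProjectionOverlapSketch {
    // Appends a trailing slash so that "/a" is not mistaken for an ancestor of "/ab".
    public static String withTrailingSlash(String path) {
        return path.endsWith("/") ? path : path + "/";
    }

    // True when the two federated paths are equal or one is an ancestor of the other
    // (the assumed meaning of "overlap" in this sketch).
    public static boolean overlaps(String pathA, String pathB) {
        String a = withTrailingSlash(pathA);
        String b = withTrailingSlash(pathB);
        return a.startsWith(b) || b.startsWith(a);
    }

    // True when any pair of projected paths overlaps.
    public static boolean anyOverlap(List<String> projectedPaths) {
        for (int i = 0; i < projectedPaths.size(); i++) {
            for (int j = i + 1; j < projectedPaths.size(); j++) {
                if (overlaps(projectedPaths.get(i), projectedPaths.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

          Under that assumption, two sources both projected to '/' overlap, whereas sources projected to separate locations (say, the made-up paths '/archive' and '/current') do not.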

           

          Most of the challenges with supporting the capability you describe above involve identifying the proper behaviors. When merging two independent subgraphs that have nodes with similar paths, when and which nodes should be merged and treated as a single node (e.g., '/a/2011/10') and which should be kept distinct (e.g., '/a/2011/10/4/filed.jpg')? Inferring the behavior for files and folders is pretty easy, but it gets more complicated with other node types. Perhaps we could use extended node type attributes, or pre-defined behavior for built-in types and pre-defined mixins for use with custom types.
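
          As a rough illustration of "pre-defined behavior for built-in types", the sketch below maps a node's primary type to a merge decision. Nothing here exists in ModeShape: the enum, the class, and the choice to keep unknown types distinct by default are all assumptions made for the example. Only 'nt:folder' and 'nt:file' are real built-in JCR node types.

```java
public class MergePolicySketch {
    // Hypothetical outcomes when two sources contain a node at the same path.
    public enum MergeBehavior { MERGE, KEEP_DISTINCT }

    // Chooses a behavior from the node's primary type.
    public static MergeBehavior behaviorFor(String primaryType) {
        switch (primaryType) {
            case "nt:folder":
                // Folders with the same path are treated as one node, e.g. '/a/2011/10'.
                return MergeBehavior.MERGE;
            case "nt:file":
                // Files with the same path stay separate, e.g. as same-name siblings.
                return MergeBehavior.KEEP_DISTINCT;
            default:
                // Assumption: anything without an explicit rule is kept distinct.
                return MergeBehavior.KEEP_DISTINCT;
        }
    }
}
```

          Custom node types are where a table like this stops being obvious, which is the part that still needs a specification.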

           

          Another challenge is what to do when the user creates content. For example, consider a new file node being placed under '/a/2011/10/4'. Both sources contain that folder, so in which source should the content be placed? Some simple rules can be used (e.g., the first source declared), but there are other complications, such as if the node cannot be placed in the first source (e.g., due to node type constraints, or because it doesn't support same-name siblings).
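
          The "first source declared" rule, with a fallback when that source cannot take the node, might look something like the sketch below. The Source class and its flags are hypothetical stand-ins rather than ModeShape types, and the two conditions checked (read-only sources and same-name sibling support) are only examples of the constraints that could apply.

```java
import java.util.List;
import java.util.Optional;

public class WriteRoutingSketch {
    // Hypothetical description of a federated source; not a ModeShape type.
    public static final class Source {
        public final String name;
        public final boolean readOnly;
        public final boolean allowsSameNameSiblings;

        public Source(String name, boolean readOnly, boolean allowsSameNameSiblings) {
            this.name = name;
            this.readOnly = readOnly;
            this.allowsSameNameSiblings = allowsSameNameSiblings;
        }
    }

    // Returns the first source, in declaration order, that can accept the new node,
    // or an empty Optional when no source can.
    public static Optional<Source> chooseTarget(List<Source> declaredOrder,
                                                boolean wouldBeSameNameSibling) {
        for (Source source : declaredOrder) {
            if (source.readOnly) {
                continue; // cannot write here at all
            }
            if (wouldBeSameNameSibling && !source.allowsSameNameSiblings) {
                continue; // this source would reject the duplicate name
            }
            return Optional.of(source);
        }
        return Optional.empty();
    }
}
```

          Even this simple rule leaves open what the client should see when no source qualifies, and whether silently falling through to a later source is what the user expects.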

           

          Beyond these behavioral and specification challenges, the federation connector in ModeShape 2.x doesn't leave much room for richer behaviors. We think, however, that the architectural changes in 3.x will allow us to do a lot more, including the use case you've outlined here.

           

           

          BTW, your last example shows same-name-sibling indexes, but in JCR these start at '1' and not '0'. And in fact, every node has a SNS index; it's just that SNS indexes with a value of '1' are not printed in the paths or names. So your last example would look more like this:

           

          {code}
          /a/2011/10/4/filea.jpg
          /a/2011/10/4/fileb.jpg
          /a/2011/10/4/filed.jpg
          /a/2011/10/4/filed.jpg[2]
          /a/2011/10/5/filec.jpg
          /a/2011/10/5/filee.jpg
          {code}
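
          The numbering rule can be sketched in a few lines. This is a standalone illustration with hypothetical names, not part of the JCR API: it takes the child names of one parent in order and renders their paths, starting same-name-sibling indexes at 1 and leaving the index off when it is 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnsPathSketch {
    // Renders the paths of a parent's children, in order, with SNS indexes.
    public static List<String> childPaths(String parentPath, List<String> childNames) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> paths = new ArrayList<>();
        for (String name : childNames) {
            // First occurrence of a name gets index 1, the next gets 2, and so on.
            int index = seen.merge(name, 1, Integer::sum);
            // An index of 1 is implied and not printed.
            String suffix = (index == 1) ? "" : "[" + index + "]";
            paths.add(parentPath + "/" + name + suffix);
        }
        return paths;
    }
}
```

          Calling childPaths("/a/2011/10/4", ...) with the names filea.jpg, fileb.jpg, filed.jpg, filed.jpg gives the first four paths of the listing above.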

          • 2. Re: Federation projection overlaps
            bwallis42

            Thanks for the answer.

            I am currently investigating what is possible and what is not, and this helps a lot.

             

            regards,

            brian...