4 Replies Latest reply on Jul 15, 2013 11:32 AM by nl

    Best practise: ModeShape in a multi-client environment

    nl

      Hi,

       

      I am thinking about the best way to use ModeShape 3 in a multi-client environment. Assume an enterprise application that supports different clients by using a dedicated column in its database tables (xxx_client_id), with ModeShape used as storage for related documents (e.g. contracts). The business object (contract) references its documents via the unique identifier (UUID) from MS. For example, a table CONTRACT has a column (xxx_document_id) that points to the node in MS containing the PDF file.

       

      I currently see 3 options:

       

      a) Use the client id as a property on document nodes.  

      b) Create a workspace for each client

      c) use a dedicated repository for each client

       

      My evaluation on drawbacks (so far):

      a) Visibility needs to be implemented manually, which can be a lot of effort. Also, since documents should be stored in a structured way (e.g. /files/contracts/contract-123.pdf), a user might get errors when using the very same location as another user of a different client, because of unwanted same-name siblings. Search results would also have to be filtered...

      c) A lot of configuration overhead. I need to create JSON and cache files for each client. I also expect problems with using a database binary store. From what I see, the table name used is fixed, so I would need different schemas for different repositories.

      b) No drawbacks found so far. Therefore this is my favored approach.

       

      What would you recommend?

       

      Also, migration might be a problem. I am currently using MS 2.8.2 with approach c) and only one workspace for each rep. I know that a tool will be provided to export the old structure into a format which can be imported by MS 3. But I assume that it will not be possible to import the content of workspace "default" from rep "A" into the workspace "client_A" on rep "repository for all clients"?

      Right?

       

      Thanks for any advice,

       

      Niels

        • 1. Re: Best practise: ModeShape in a multi-client environment
          rhauch

          Re option c), the repository-per-client option: this is really the option if you need complete isolation of client content. It's analogous to a separate database (installation) for each client, and thus it's not really practical or scalable.

           

          Re option b): this very nicely separates all of the client content, but the viability of this option depends very much on the number of clients and how frequently they're added/removed. ModeShape was not designed to have hundreds of workspaces (remember the "/jcr:system" content is shared amongst all workspaces), and you really don't want to be creating them very frequently. Overall, this isn't a bad option, but kind of an anti-pattern with JCR.

           

          Re option a): this does add a bit of overhead to your applications.

           

           

          Have you considered a fourth option of using a hierarchy to separate the content for the different clients? For example, the path to a specific client area might be "/clients/{clientId}", and so the path to the contract document used above might be "/clients/{clientId}/files/contracts/contract-123.pdf". Each client area could be structured independently of all other client areas (thus no conflicts in naming/organization), although by policy you could certainly create similar (or even identical) structures. This approach is probably the most commonly used way of organizing and separating content, because it's actually using the hierarchical strength of ModeShape. It also doesn't have the overhead of option c, and is far more scalable than option b. (If you're going to have many 10s of thousands of clients, you probably want to break the "{clientId}" layer into multiple layers.)
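To make the hierarchy option concrete, here is a minimal sketch of a path helper, assuming the "/clients/{clientId}" layout described above. The class name `ClientPaths`, its method names, and the two-character shard prefix are hypothetical illustrations (the shard scheme is just one possible way to "break the {clientId} layer into multiple layers"); none of this is ModeShape API. The name check rejects the characters that JCR does not allow in a local name.

```java
final class ClientPaths {
    private static final String ROOT = "/clients";
    // Characters that are not permitted in a JCR local name.
    private static final String FORBIDDEN = "/:[]|*";

    private ClientPaths() {}

    /** Flat layout: /clients/{clientId} */
    static String clientRoot(String clientId) {
        return ROOT + "/" + requireName(clientId);
    }

    /** Sharded layout (assumed scheme): /clients/{first two chars}/{clientId} */
    static String shardedClientRoot(String clientId) {
        String id = requireName(clientId);
        String shard = id.substring(0, Math.min(2, id.length()));
        return ROOT + "/" + shard + "/" + id;
    }

    /** Absolute path of a document inside one client's area. */
    static String documentPath(String clientId, String relativePath) {
        if (relativePath == null || relativePath.isEmpty() || relativePath.startsWith("/")) {
            throw new IllegalArgumentException("expected a non-empty relative path");
        }
        return clientRoot(clientId) + "/" + relativePath;
    }

    private static String requireName(String name) {
        if (name == null || name.isEmpty() || name.equals(".") || name.equals("..")) {
            throw new IllegalArgumentException("invalid node name: " + name);
        }
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                throw new IllegalArgumentException("invalid character in node name: " + name);
            }
        }
        return name;
    }
}
```

With this helper, `ClientPaths.documentPath("acme", "files/contracts/contract-123.pdf")` yields a path inside the "acme" area only, so two clients can use the same relative location without same-name-sibling conflicts.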

           

          Querying is pretty efficient, too, since JCR-SQL2 has an ISDESCENDANTNODE constraint that limits the results to only those nodes below some path. For example, here's a query that would find all nodes below the "/clients/someClientId" node:

           

           SELECT * FROM [nt:base] AS nodes WHERE ISDESCENDANTNODE(nodes,'/clients/someClientId')
          

           

          Of course, this is a very general query, whereas your queries would likely be much more specific.
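A more specific query could be assembled per client along these lines. This is only a sketch: `ClientQueries` and `descendantsOf` are hypothetical names, and the doubling of single quotes is the usual way to escape a JCR-SQL2 string literal. The node type is passed in so the query can be narrowed from `nt:base` to, say, `nt:file`.

```java
final class ClientQueries {
    private ClientQueries() {}

    /**
     * Builds a JCR-SQL2 statement selecting all nodes of the given type
     * that live somewhere below the given absolute path.
     */
    static String descendantsOf(String nodeType, String absPath) {
        if (absPath == null || !absPath.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path");
        }
        // Escape single quotes so the path cannot break out of the literal.
        String escaped = absPath.replace("'", "''");
        return "SELECT * FROM [" + nodeType + "] AS nodes "
             + "WHERE ISDESCENDANTNODE(nodes, '" + escaped + "')";
    }
}
```

The resulting string would then be handed to the standard JCR `QueryManager.createQuery(statement, Query.JCR_SQL2)` call on the session's workspace.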

           

          You've not mentioned needing to authorize content by client, but if you do need that then you could look at the JCR 2.0 access control support that we'll be adding in 3.4 (see MODE-1920 for details).

           

           

          Also, migration might be a problem. I am currently using MS 2.8.2 with approach c) and only one workspace for each rep. I know that a tool will be provided to export the old structure into a format which can be imported by MS 3. But I assume that it will not be possible to import the content of workspace "default" from rep "A" into the workspace "client_A" on rep "repository for all clients"?

          Right?

          Yes, migration would be difficult going from one 2.x repository to one 3.x repository. You could always implement your own migration application that connects to both and copies content from old to new, using whatever logic you wanted.

           

          Hope this helps!

          • 2. Re: Best practise: ModeShape in a multi-client environment
            nl

            Hi Randall,

             

            In our business area we typically have hundreds of installations with at most 10 clients each, rather than one installation with hundreds of clients. Also, clients are added or removed only rarely.

             

            To sum up, I have two options: I can go on with option b) or d) (which I truly missed). Honestly, I still prefer option b), as it is a nice tradeoff between total isolation and the one-for-all approach.

            Regarding authorization: since our customers have different rules (due to different processes), we use Drools for the authorization. Whenever a user wants to manipulate the documents, a request is created and passed to Drools for validation, and the request ends up either granted or rejected. The rules themselves are also stored in ModeShape in another branch (same workspace, same rep).

             

            Regarding the migration: in either case, b) or d), I need to customize the migration. Got it. When you say "migration application that connects to both and copies content from old to new", what would it look like?

            1. Export old repository

            2. Import old repository in a new (temp) repository

            3. Create new (final) repository

            4. Connect both temp and final and perform the copy

             

            Many thanks,

             

            Niels

            • 3. Re: Best practise: ModeShape in a multi-client environment
              rhauch

              In our business area we typically have hundreds of installations with at most 10 clients each, rather than one installation with hundreds of clients. Also, clients are added or removed only rarely.

               

              To sum up, I have two options: I can go on with option b) or d) (which I truly missed). Honestly, I still prefer option b), as it is a nice tradeoff between total isolation and the one-for-all approach.

              Okay, given that the number of clients (and thus workspaces) would be on the order of a dozen (or on that magnitude) and that clients are rarely added/removed, then IMO option b) is viable.

               

              However, I still think that option d) is the idiomatic JCR and ModeShape approach, even if there are fewer than 100 clients in a given installation. Of course, you know far more about your own applications, so ultimately it is up to you.

               

               

              Regarding authorization: since our customers have different rules (due to different processes), we use Drools for the authorization. Whenever a user wants to manipulate the documents, a request is created and passed to Drools for validation, and the request ends up either granted or rejected. The rules themselves are also stored in ModeShape in another branch (same workspace, same rep).

              Great!

               

              Regarding the migration: in either case, b) or d), I need to customize the migration. Got it. When you say "migration application that connects to both and copies content from old to new", what would it look like?

              1. Export old repository

              2. Import old repository in a new (temp) repository

              3. Create new (final) repository

              4. Connect both temp and final and perform the copy

              That would certainly work (and it might be nice and flexible), or you might be able to skip the temp if the old and new repositories are accessible directly at the same time. Honestly, I'd try and do whatever is easiest.

               

              BTW, you might be able to use JCR export. That often does not work when migrating whole repositories, since per the JCR spec you can't easily "export everything but /jcr:system", and you can't import "a whole file except for the /jcr:system content." But it might work just great to export a customer's content to an XML file and then import that into the appropriate directory. Honestly, I'd probably try that, since it'd be more straightforward. Just be sure to use the system view form of export, not the document view form.
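Whichever transfer mechanism is used (a custom copy, or the standard JCR `Session.exportSystemView` and `Session.importXML` calls), the migration needs a rule for where each old node ends up. Below is a minimal sketch of such a rule, assuming the option d) layout "/clients/{clientId}" and one old repository per client. `MigrationPaths` and `targetPath` are hypothetical names; the sketch only computes paths and does not touch a repository. It skips "/jcr:system", in line with the export caveat above.

```java
import java.util.Optional;

final class MigrationPaths {
    private static final String SYSTEM = "/jcr:system";

    private MigrationPaths() {}

    /**
     * Maps an absolute path in a client's old repository to its location in
     * the shared repository. Returns empty for system content, which should
     * not be copied.
     */
    static Optional<String> targetPath(String clientId, String oldAbsPath) {
        if (oldAbsPath == null || !oldAbsPath.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path");
        }
        if (oldAbsPath.equals(SYSTEM) || oldAbsPath.startsWith(SYSTEM + "/")) {
            return Optional.empty();
        }
        String clientRoot = "/clients/" + clientId;
        // The old root node maps onto the client's area itself.
        return Optional.of(oldAbsPath.equals("/") ? clientRoot : clientRoot + oldAbsPath);
    }
}
```

For option b) the same idea applies with the workspace name (for example "client_A") as the target and the path left unchanged.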

              • 4. Re: Best practise: ModeShape in a multi-client environment
                nl

                Sounds promising. 1000x thanks for your support!!!

                 

                Niels