11 Replies Latest reply on May 1, 2014 9:44 AM by Randall Hauch

    Clustering for Horizontal Scalability

    Andrew Woods Newbie

  I am trying to determine whether it is possible to achieve higher overall ingest throughput by adding ModeShape servers to a cluster. I have successfully set up ModeShape clusters that address high availability, but I have yet to find a configuration that delivers horizontal scalability.

       

      For testing purposes, my base starting points have been:

      - https://docs.jboss.org/author/display/MODE/Installing+ModeShape+into+EAP and

      - https://docs.jboss.org/author/display/AS71/Getting+Started+Guide

       

      My attempts so far have consisted of both:

      - standing up multiple clustered ModeShapes on separate servers (with Infinispan replicated as well as distributed clustering), and creating nodes and datastreams across the servers simultaneously

      - putting multiple clustered ModeShapes on separate servers behind a load balancer (with Infinispan replicated as well as distributed clustering)

       

      All to no avail.

       

      If you have a suggestion for a direction to investigate, that would be ideal.

      If it would be helpful to share details of my configuration and system, I can do that as well. Attached is my "standalone-modeshape-ha.xml"

       

      Thanks in advance.

        • 1. Re: Clustering for Horizontal Scalability
          Randall Hauch Master

          Have you looked at our clustering quickstart?

           

          - standing up multiple clustered ModeShapes on separate servers (with Infinispan replicated as well as distributed clustering), and creating nodes and datastreams across the servers simultaneously

          You probably want to try configuring the Infinispan caches in "invalidation" or "replicated" mode before trying "distributed". You can read more about the differences in our clustering overview.

           

          Are the processes not joining the cluster? There are a couple of things to look out for:

           

          1. ModeShape should be clustered. This allows the events describing changes on one process to be sent to all other processes in the cluster. This is how event listeners, indexes, and internal services are made aware of changes originating on other processes. Without this, ModeShape will not behave correctly.
          2. Infinispan should be clustered. This is how the persisted content is made available to all processes in the cluster. When Infinispan is clustered properly, a change saved on one process in the cluster will be (almost immediately) visible to sessions in other processes in the cluster.
          3. Even under normal/default log configuration, ModeShape and Infinispan will both output messages saying when they join the cluster. This should help you identify whether clustering is being achieved.
          4. Clustering will not necessarily help (and might actually hinder) performance when you have clients competing for write locks on the same node. ModeShape is consistent, and it achieves this by using Infinispan's locks to implement node-level write locks. When multiple sessions are competing for those locks, a cluster might not scale very well. Of course, when sessions on different processes in the cluster are writing to different areas of the repository (or only reading anywhere), then clustering will indeed help performance a great deal.
          5. If you're using a load balancer, make sure that each request uses a single session and then closes that session.
          6. I'd suggest that each process maintains its own query indexes (each updated based upon events) rather than using a single master and multiple slaves (with JMS). The latter is a much more complicated configuration, has difficulty should the master go down, and results in lags between the master index being updated and the slaves seeing those changes in the indexes. By having each process own its own indexes, they each have to keep them up-to-date based upon events (this does have additional overhead during writes), but it means that each of the processes is independent, and the indexes are updated significantly faster.
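           To illustrate points 1 and 2, here is a sketch of the two layers of configuration involved in a standalone*.xml. The cluster-name/cluster-stack attribute names on the repository element are my recollection of the 3.x HA kit and are an assumption; verify them against your modeshape subsystem schema.

           ```xml
           <!-- Layer 1: cluster ModeShape itself so events reach every process (point 1). -->
           <!-- Attribute names here are an assumption; check your modeshape_*.xsd. -->
           <subsystem xmlns="urn:jboss:domain:modeshape:1.0">
               <repository name="sample" cache-name="sample" cache-container="modeshape"
                           cluster-name="modeshape-repo-cluster" cluster-stack="tcp"/>
           </subsystem>

           <!-- Layer 2: cluster the backing Infinispan cache so content is shared (point 2). -->
           <!-- A clustered cache-container must declare a <transport/>; without one, the caches are local. -->
           <cache-container name="modeshape" default-cache="sample">
               <transport lock-timeout="60000"/>
               <replicated-cache name="sample" mode="SYNC"/>
           </cache-container>
           ```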

           

          Regarding your configuration, I did notice this:

           

               <replicated-cache name="sample" mode="SYNC" batching="true">

                  <!-- <transaction mode="NON_XA"/> -->

                  <file-store relative-to="jboss.server.data.dir" path="modeshape/store/sample-${jboss.node.name}" passivation="false" purge="false"/>

               </replicated-cache>

           

          You didn't say which version of EAP (or Wildfly) you're using, but we have seen problems in Wildfly 8 when we used "batching" and tried to enable transactions. You've commented out the transactions, but IIUC you really want to take out the "batching" attribute and include the "NON_XA" transaction mode. (It's not clear to me whether batching makes use of transactions, and ModeShape requires transactions for the content cache.)
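           Applied to the snippet above, the suggested change would look roughly like this (same file-store as the original; only the batching attribute and the transaction element change):

           ```xml
           <replicated-cache name="sample" mode="SYNC">
              <!-- "batching" attribute removed; explicit non-XA transactions enabled instead -->
              <transaction mode="NON_XA"/>
              <file-store relative-to="jboss.server.data.dir"
                          path="modeshape/store/sample-${jboss.node.name}"
                          passivation="false" purge="false"/>
           </replicated-cache>
           ```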

           

          Hope this helps. If you have any questions, just ask here and we'll try to be as specific as we can.

          • 2. Re: Re: Clustering for Horizontal Scalability
            Andrew Woods Newbie

            Thank you for giving this some consideration, Randall.

            I have re-run my tests using the modeshape-clustering standalone-modeshape-slave.xml configuration (see attached diff for local updates).

            https://github.com/ModeShape/quickstart/blob/modeshape-3.7.1.Final/modeshape-clustering/standalone-modeshape-slave.xml

             

            I am testing on:

            - Amazon EC2 instances (hence the S3_PING) with

            - EAP 6.1.0 Final (http://www.jboss.org/jbossas/downloads.html) and

            - modeshape-3.7.1.Final-jbosseap-61-dist.zip (http://www.jboss.org/modeshape/downloads/downloads3-7-1-final.html)

             

            I have also set up an Apache2 webserver with mod_proxy_cluster.

            https://access.redhat.com/site/documentation/en-US/JBoss_Enterprise_Application_Platform/6.1/html/Administration_and_Configuration_Guide/Install_the_mod_cluster_Module_Into_Apache_HTTPD_or_Enterprise_Web_Server_HTTPD1.html

             

            The start-up server.log for one of the servers (node0) is attached. That log file shows node0 being clustered with three other servers (node1 - 3).

             

            My test is fairly simple:

            - Start four (node0 - 3) EC2 instances with EAP/ModeShape running on each

            - Cluster the four servers behind Apache and via the standalone-modeshape-slave.xml configuration

            - Run a multi-threaded load-test tool against the clustered, Apache endpoint

            ** Record the total throughput

            - Turn off one server and clear the repository (now only three servers are in the cluster)

            - Re-run test, recording throughput

            - Turn off another server and clear the repository (now only two servers are in the cluster)

            - Re-run test...

            - Turn off another server and clear repository (now only one server is in the cluster)

            - Re-run test...

             

            As a note, the load-test tool creates objects and binaries named with generated UUIDs.

             

            The results do not show the expected throughput gains from more servers behind the load balancer. On the contrary, as the number of servers behind the load balancer decreases, the throughput increases. In the end, a single ModeShape repository in this configuration has approximately twice the throughput of four clustered ModeShape repositories.

             

            I am very motivated to determine a means of achieving horizontal scalability with ModeShape. Thank you for your input, and please let me know what additional clues I can provide.

            • 3. Re: Re: Clustering for Horizontal Scalability
              Randall Hauch Master

              Did you remove the "batching" attribute from the cache definition? Did you try an invalidation cache in addition to a replicated one? The file cache store is one of the slowest in Infinispan 5.x; have you considered others?

              • 4. Re: Re: Clustering for Horizontal Scalability
                Andrew Woods Newbie

                Hello Randall,

                I have re-run the same tests as described above with the options you recommended.

                When I use "invalidation-cache" instead of "replicated-cache", the tests start failing with 500 errors.

                 

                I also ran the tests with the "batching" attribute removed from "invalidation-cache" (falling back on the default of "false"). The performance is basically the same as with batching=true, although there are some start-up errors that were not there before.

                 

                In any case, the performance trend is the same: more servers in the cluster amounts to degraded throughput.

                 

                I have not swapped in an alternate cache store for two reasons. The primary reason is that right now we are interested in the potential of ModeShape/ISPN clustering as it relates to scalability in relative terms: we want to demonstrate that adding more servers to a cluster provides an incremental boost in throughput. If the overall performance in these tests is only "fine", that is ok. The objective is to see performance improve with increasingly larger clusters.

                 

                The second reason I did not swap in another cache store is that the jboss-as-infinispan_1_4.xsd [1] schema appears to support the following choices:

                - file store

                - custom store

                - a few different jdbc stores

                - remote store

                 

                [1] https://github.com/wildfly/wildfly/blob/master/build/src/main/resources/docs/schema/jboss-as-infinispan_1_4.xsd#L154

                 

                Given the primary objective stated above, it is not clear to me that coming up with a jdbc, custom, or remote store would be worth the effort.

                 

                However, if I am missing something about the inherent characteristics of these other stores as it relates to clustering scalability, I would be more than delighted to give it a shot.

                 

                All of that said, my suspicion is that there may be a more fundamental configuration that I am not getting quite right. Based on the server.log (previously attached), it would seem that ISPN clustering is working. Also, it would appear that the Apache2 mod_proxy_cluster is working. Is there another level of "ModeShape clustering" that may not be in place?

                 

                Horizontal scalability is a critical capability for most modern repository systems, and ModeShape/ISPN has all of the pieces to achieve that potential, but I do not actually see concrete descriptions or examples of this in ModeShape's documentation. Taking a step back, has horizontal scalability ever been demonstrated (as far as you know) with ModeShape? If the answer is "no", I/we would be very interested in working with you to put in the technical effort required to make that answer a "yes".

                • 5. Re: Clustering for Horizontal Scalability
                  Horia Chiorean Master

                  My "2 cents" on this topic regarding the different ISPN clustering topologies:

                   

                  • Distribution Mode - unless you're planning to store a "gazillion" JCR nodes, which would exceed the storage capacity of any single server node, this is by far the worst option: there is no configurable way for ModeShape to tell ISPN which keys to store on which server node, which means it's very likely (especially for parents with lots of children) that you will end up with the parent document on one server node while the children are split across others. In the worst case the overhead for both reading and writing would be significant, as many network trips would be required. It also means that any "split-brain" scenario (which ModeShape / ISPN doesn't handle at the moment) would leave the data highly inconsistent
                  • Replicated Mode - this is the workhorse of clustering from a topology perspective and should work fine as long as the overall number of JCR nodes (data) doesn't exceed the capacity of any single server in the cluster. Performance-wise, this should work fine with a good load balancing configuration and a good cache store. As Randall mentioned above, the performance of the FileCacheStore in ISPN 5.x is pretty much as bad as it gets, so if you can, you should stay away from it (BDB or H2-file should outperform it). A split-brain scenario would still be problematic, but could be partially mitigated by making the load balancer avoid one side of the "brain" and keep feeding data to/from the other side.
                  • Invalidation Mode - this could potentially outperform Replicated Mode, but if and only if a) each server node has enough memory to store lots of data and b) a shared cache store is used. By shared cache store I mean anything that acts as the "central data hub" - like a DB (and definitely not a FileCacheStore). Storing lots of data in memory (by tuning the eviction & expiration params) means read access should be really fast, and only writes would invalidate the data & cause a network trip to the "central cache store". IMO this configuration will work best & outperform Replicated Mode if your system is more read-intensive than write-intensive.
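                   A sketch of what this invalidation-plus-shared-store setup might look like in the EAP Infinispan subsystem. The datasource JNDI name and the eviction numbers are placeholders; element and attribute names follow the jboss-as-infinispan schema, but verify them against your version:

                   ```xml
                   <invalidation-cache name="sample" mode="SYNC">
                      <transaction mode="NON_XA"/>
                      <!-- keep lots of hot data in memory so most reads never leave the local node -->
                      <eviction strategy="LRU" max-entries="100000"/>
                      <!-- shared="true": every cluster node reads/writes the same central DB store -->
                      <string-keyed-jdbc-store datasource="java:jboss/datasources/ModeShapeDS"
                                               shared="true" passivation="false" purge="false"/>
                   </invalidation-cache>
                   ```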

                   

                  Regardless of the above clustering topologies, if your context tolerates write-behind (i.e., eventual data consistency), enabling it should make the system much more responsive.
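                   In the same subsystem schema, write-behind is enabled per cache store; a sketch (the queue and pool sizes are illustrative only, and the datasource name is a placeholder):

                   ```xml
                   <!-- writes are queued and flushed to the store asynchronously instead of
                        blocking the caller; data is eventually consistent with the store -->
                   <string-keyed-jdbc-store datasource="java:jboss/datasources/ModeShapeDS"
                                            shared="true" passivation="false" purge="false">
                      <write-behind modification-queue-size="1024" thread-pool-size="4"/>
                   </string-keyed-jdbc-store>
                   ```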

                  • 6. Re: Clustering for Horizontal Scalability
                    Richard Lucas Apprentice

                    I too am looking at clustering ModeShape both for high availability and scalability and am in the process of determining how best to configure ModeShape and Infinispan to meet my needs.

                     

                    The main difference between myself and the OP is that I am experimenting on ModeShape 4.0.0-Alpha2 on Wildfly 8.0 which in turn uses Infinispan 6.

                     

                    (Please let me know if I should create a new post for this.)

                     

                    I have read https://docs.jboss.org/author/display/MODE40/Clustering and looked at the latest version of the clustering quick start but still have a number of questions around best practices.

                     

                    From a deployment perspective I am looking at the following:

                     

                       * Clustering 2 to 5 ModeShape nodes in AWS with a load balancer sitting in front of them

                       * An application which would have roughly a 20% write and 80% read split

                       * The number of JCR nodes should not grow above the capacity of a single server

                     

                    Given this I am looking at using replication as my clustering mode.

                     

                    The questions I have are as follows:

                     

                    1. Is the ModeShape clustering quick start example still relevant for ModeShape 4?

                     

                    2. The clustering quickstart uses the terminology 'Master' and 'Slave'. I can see how this is relevant in the JMS example, but is there a master/slave relationship when not using JMS?

                     

                    3. The clustering quickstart README states "In a production environment where a Highly Available & Resilient setup is desired, it is recommended that the configuration used is the distributed JMS configuration, with the JMS provider configured for High Availability." This differs from Randall's comments above, where he recommends not using JMS due to its complexity, fragility, and lag. Is this a case of the README being outdated, or are there scenarios where it is better to use JMS and a Master/Slave relationship between nodes?

                     

                    4. In Randall's comments above he recommends using a 'NON_XA' transaction mode instead of batching, but currently the clustering quickstart example has batching enabled and the transaction configuration commented out. Is this again just a case of the quickstart being out of date?

                     

                    5. I am fairly new to Infinispan and am looking for recommendations on which cache store to use.

                     

                    Many thanks in advance.

                    • 7. Re: Clustering for Horizontal Scalability
                      Randall Hauch Master

                      1. Is the ModeShape clustering quick start example still relevant for ModeShape 4?

                      Yes. Just be sure that you're on the 'master' branch of the quickstart, and not the '3.x' branch. There is at least one issue (MODE-2198) where the configurations were not updated for Wildfly 8.0.0.Final.

                      2. The clustering quickstart uses the terminology 'Master' and 'Slave'. I can see how this is relevant in the JMS example, but is there a master/slave relationship when not using JMS?

                       

                      No, there is no difference between them if every repository has a local index (which is actually the recommended practice; see next answer).

                      3. The clustering quickstart README states "In a production environment where a Highly Available & Resilient setup is desired, it is recommended that the configuration used is the distributed JMS configuration, with the JMS provider configured for High Availability." This differs from Randall's comments above, where he recommends not using JMS due to its complexity, fragility, and lag. Is this a case of the README being outdated, or are there scenarios where it is better to use JMS and a Master/Slave relationship between nodes?

                       

                      Yes and no. :-) For performance, we tend to recommend that each repository maintain its own indexes, in which case you would not use JMS and every process updates its local indexes for every change made anywhere in the cluster. Where this approach has drawbacks is in how it handles temporary disconnects: when a process disconnects from the cluster (either intentionally or unintentionally), it no longer receives the events describing changes. When the process comes back up, it can't simply resume receiving events and updating the indexes: the events it never saw will not be processed. The correct way to handle this is to remove the local indexes and force a rebuild. That is workable, especially for planned disconnects.

                       

                      With JMS the configuration is different: there is a single master writer that is updating the indexes. Should this process go down, simply restarting it would continue processing the enqueued events, so nothing is lost. This configuration also makes it easier to bring up (or take down) slave processes, as they only read the (shared/copied) indexes. The major downside to this approach is that the slaves' indexes are updated periodically, so there may be a lag between when a change is made and when those changes are reflected in the indexes on the slaves. This may or may not be acceptable.

                       

                      Basically, you have to decide which approach best satisfies your "fault tolerant" and "high availability" requirements. There is no single solution that will work for everyone.

                       

                      Now, we understand the difficulties with these approaches: really, neither is ideal nor as easy as we'd like. This is why we're focusing on making it far easier to cluster ModeShape 4.0. We've already implemented some of this: every process maintains a local journal of events, and when a process joins the cluster (either when it is new or after it has been disconnected for a while) its local journal is quickly brought in sync with the other journals. It does this by iteratively asking the other journals for the history it missed; it's iterative because, after the first step is done, new events might have arrived that were not in the first "batch", so the process repeats until there are no new events beyond what the journal has already seen. What does this mean for you?

                       

                      Another significant change in 4.0 is how indexes work (or will work, as this work is not yet completed in the codebase). ModeShape will no longer index everything: instead, you will explicitly define all indexes and what properties should be included and where that index is to be stored. When queries are planned, ModeShape will automatically pick and use the index that best meets the need of each part of the query. If no index will work, ModeShape will still properly answer the query, but it may take significantly longer to do this because it probably has to scan all nodes in the workspace.


                      We'll offer several options for index storage, but we expect to eventually offer two options that are managed by ModeShape (local file system, local Lucene) and several that update external systems (e.g., Solr, ElasticSearch). You'll be able to define your own storage providers, too. You will even be able to mix and match by using one provider for one index and another provider for another index.


                      Defining all indexes to be stored locally (e.g., on the local file system) will actually be very similar to ModeShape 3.x defined with local indexes: whenever a change is made, that change is broadcast to all processes in the cluster, and each process uses those events to update the indexes. But 4.0 will have a significant improvement over 3.x, because with 4.0 the process will be able to update the indexes using only the events and will not need to load the node. This should be a lot faster and should (dramatically) reduce the overall load on the cluster and persistent storage.


                      Defining an index to use an external system simply means that, when a change is made on one process in the cluster, only that process will handle the event and try to update the (external) indexes. That's because all of the processes will share the external index inside the Solr or ElasticSearch installation. BTW, one major benefit of this approach is that your applications will be able to query through ModeShape or issue queries directly to Solr and ElasticSearch. There are lots of exciting possibilities here.

                       

                      4. In Randall's comments above he recommends using a 'NON_XA' transaction mode instead of batching, but currently the clustering quickstart example has batching enabled and the transaction configuration commented out. Is this again just a case of the quickstart being out of date?

                      This was a mistake that has not yet been corrected; see MODE-2198.

                      5. I am fairly new to Infinispan and am looking for recommendations on which cache store to use.

                       

                      For ModeShape 4.0 and Infinispan 6, I'd recommend considering the LevelDB cache store. There are a few others, but it is generally thought to be the best all-around implementation. Perhaps check with the Infinispan forums for their latest thoughts.
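                      For reference, the WildFly 8 Infinispan subsystem exposes this as a leveldb-store element; a sketch (the element and attribute names are my assumption based on the urn:jboss:domain:infinispan:2.0 schema, so verify against your WildFly version):

                      ```xml
                      <replicated-cache name="sample" mode="SYNC">
                         <transaction mode="NON_XA"/>
                         <!-- LevelDB-backed persistent store, local to each cluster node -->
                         <leveldb-store path="modeshape/store/sample" relative-to="jboss.server.data.dir"
                                        passivation="false" purge="false"/>
                      </replicated-cache>
                      ```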

                      • 8. Re: Clustering for Horizontal Scalability
                        Randall Hauch Master

                        I have re-run the same tests as described above with the options you recommended.

                        When I use "invalidation-cache" instead of "replicated-cache", the tests start failing with 500 errors.

                         

                        This doesn't seem right. Replicated is supposed to behave the same as invalidation, except that replication automatically loads updated nodes, whereas invalidation simply purges them from the cache.

                         

                        In any case, the performance trend is the same: more servers in the cluster amounts to degraded throughput.

                        There are a couple of things to talk about. The most important is how you're defining throughput, or rather what you're doing when you measure it. Read throughput should absolutely increase with more servers; if it doesn't, then something is definitely wrong.

                         

                        Writing, on the other hand, is far more tricky. Remember that every time a node is created, its parent is affected as well. That means that if your performance test is frequently adding nodes under a single parent or the same parents, then you will have a lot of write contention for those parent nodes. Earlier in the conversation you said this:

                         

                        As a note, the load-test tool creates objects and binaries named with generated UUIDs.

                        What is the node structure that you're using? How are the nodes named? Are all these new nodes (objects) being added under a single parent node? If so, how many child nodes are there? Remember, ModeShape can handle large numbers of children, but it is still hierarchical, and adding/removing children from parents with very large numbers of children is not the fastest. As always, we recommend a well-shaped hierarchy that is neither flat and wide nor deep and narrow.

                         

                        Even if you are creating intermediate nodes to give yourself more of a hierarchy, are these intermediate nodes created up-front, or only as needed? If the latter, then you will still almost certainly have write contention on the parent of the intermediate nodes.

                         

                        I have not swapped in an alternate cache store for two reasons. The primary reason being that right now we are interested in the potential of ModeShape/ISPN clustering as it relates to scalability in relative terms. By that I mean that we want to be able to demonstrate that adding more servers to a cluster provides an incremental boost in throughput performance. If the overall performance in these tests is only "fine", that is ok. The objective is to see performance improvement with increasingly larger clusters.

                         

                        The second reason that I did not swap out another cache store was because the jboss-as-infinispan_1_4.xsd [1] schema appears to support the following choices:

                        - file store

                        - custom store

                        - a few different jdbc stores

                        - remote store

                         

                        [1] https://github.com/wildfly/wildfly/blob/master/build/src/main/resources/docs/schema/jboss-as-infinispan_1_4.xsd#L154

                         

                        Given the primary objective stated above, it is not clear to me that coming up with a jdbc, custom, or remote store would be worth the effort.

                        Sounds good, and looks like you are thinking along the same lines (re relative comparisons) that I would expect.

                        However, if I am missing something about the inherent characteristics of these other stores as it relates to clustering scalability, I would be more than delighted to give it a shot.

                        No, I was just worried that you were not comparing various configurations and were merely trying to eke out the fastest performance for a given topology or configuration. It's happened before.

                        All of that said, my suspicion is that there may be a more fundamental configuration that I am not getting quite right. Based on the server.log (previously attached), it would seem that ISPN clustering is working. Also, it would appear that the Apache2 mod_proxy_cluster is working. Is there another level of "ModeShape clustering" that may not be in place?

                         

                        Horizontal scalability is a critical capability for most modern repository systems. ModeShape/ISPN has all of the pieces to achieve that potential. I do not actually see concrete descriptions or examples of this in ModeShape's documentation. Taking it back a step, has horizontal scalability every been demonstrated (as far as you know) with ModeShape? If the answer is "no", I/we would be very interested in working with you to put in the technical effort required to make that answer a "yes".

                        As I mentioned before, the biggest bottleneck for ModeShape is updates, and whether those updates result in contention across the cluster. Part of this is because ModeShape is strongly consistent, so when there is contention across the cluster to update the same node, those operations by definition have to be serialized. Not surprisingly, when this happens in a cluster, the network cost of that synchronization adds overhead compared to doing all the updates in a single process. The best approach is therefore to minimize this contention as much as possible. There are multiple techniques, ranging from improving the hierarchy design to using larger-grained transactions (especially for "bulk load" operations) to pre-creating intermediate nodes. Of course, not all will be applicable to every scenario. Feel free to give more detail, and we can try to help you improve the write performance.

                        • 9. Re: Clustering for Horizontal Scalability
                          Richard Lucas Apprentice

                          Thanks for answering my questions, Randall. This clarifies several things but also raises a few more questions around architecting a solution based on the new 4.0 feature set. Rather than hijack this thread, I am going to start a new one around clustering best practices in 4.0.

                          • 10. Re: Clustering for Horizontal Scalability
                            Andrew Woods Newbie

                            Thanks for the feedback, rhauch.

                            The test case I have been focused on is the "write" scenario. And although the test nodes are being created hierarchically under the root repository node, there is probably still contention for the root node in creating the hierarchies at runtime/test-time.

                             

                            It seems, however, that the fundamental issue is that all of the servers in the cluster are replicating state, and as more servers are added to the cluster, more overhead is incurred in replicating writes/updates across it. The Infinispan "distributed mode" appears to address the problem directly, by only replicating across X number of caches, where X is less than the number of server nodes. It would appear, however, that ModeShape is unable to take advantage of this underlying capability.

                             

                            The remarks you made in reply to ma6rl regarding the upcoming ModeShape 4.0 release, and its support for external indexes (Solr and ElasticSearch), seem promising. If we can architect around contention for updating the same nodes while writing within a cluster, and maintain a single external index, I can imagine some possibilities for horizontal scalability under massive write loads.

                             

                            I would appreciate any thoughts you may have on the matter.

                            • 11. Re: Clustering for Horizontal Scalability
                              Randall Hauch Master

                              The Infinispan "distributed mode" appears to address the problem directly, by only replicating across X number of caches, where X is less than the number of server nodes. It would appear, however, that ModeShape is unable to take advantage of this underlying capability.

                              Yes, distributed mode does work that way and would reduce the load/effort required to replicate content changes by reducing the number of caches to which those changes have to be sent/coordinated. But ModeShape can use Infinispan in distributed mode; we just don't normally suggest it for a couple of reasons:

                               

                              1. If you are using a distributed cache with X copies, then this will require the same amount of state transfer as a replicated cache of size X. With distribution, X is generally at least 2 but probably 3 or more; the value of X depends a great deal on whether the cache is using a shared cache store: without a cache store, X needs to be more than 1 for safety. Many ModeShape cluster topologies use cache stores and don't involve large cluster sizes, so with a cluster size of 2 there is no difference between replicated and distributed (X=2). Only at larger cluster sizes does distribution actually become advantageous. (Note that if you have a shared cache store, you could actually use distributed mode with X=1; we've not done any testing on that.)
                              2. ModeShape uses locking, and this is likely one of the biggest bottlenecks. (ModeShape stores each node as a single entry in ISPN, which means that each entry has lots of relationships to other entries. We have to be able to update all of those related entries in a strongly-consistent manner, and ISPN locking provides a great way for us to achieve that. Locking is at the entry/node level, so there is only locking overhead when there are concurrent attempts to update the same nodes; there is essentially no overhead when concurrently updating unrelated nodes. There is also no overhead when reading nodes, even those that are locked for writing, as we use MVCC to provide the readers the consistent views they need.) Many other users of ISPN caches do not have all these dependencies between their entries and thus don't need to use locking; this gives them more options for their topology and also makes scaling much easier for them.
                              3. Most ModeShape topologies seem to need/want a shared cache store. This provides a single point of contention as well, although it may very well not be the bottleneck you are seeing.
                              4. ModeShape 3.x indexing does add some overhead. As I mentioned earlier, I generally prefer to have each process own its own indexes, but that means that when a process receives an event describing a change made in another process, it has to (in 3.x) load the updated node so that it can be indexed. It's tough to say how much overhead this requires compared to the overhead of state transfer and locking, since the amount of indexing and time to load depends on the density of the changes (when a node is changed, how much of that node changed) and whether your process will end up wanting to load those updated nodes anyway (in which case the indexing is doing work you'd normally do anyway). Anyway, it's easy to see how much overhead this is causing your application: simply load test with indexing, then run the same load test but with queries disabled. (The only thing your application can't do during the test is query; you do want the application to read content as usual.)

                               

                              So feel free to give distributed mode a try. And it might even make sense for you with a small cluster size.

                               

                              The remarks you made in reply to Richard Lucas regarding the upcoming ModeShape 4.0 release, and the support for external indexes (Solr and ElasticSearch), seem promising. If we can architect around contention for updating the same nodes while writing within a cluster, and maintain a single external index, I can imagine some possibilities for horizontal scalability for massive write events.

                               

                              I agree. With 4.0, indexing will only use the information in the events and will NOT require loading any updated nodes. This will reduce the load on the cluster and should not be the bottleneck for massive writes, since updating an index will not require using the cache in any way. (We've also overhauled the event bus mechanism to handle listeners far better than previously.) I do think that support for external indexes in Solr and ElasticSearch will be great for you guys, since IIRC you're already using one of them (Solr?). Essentially, for you, ModeShape would never really "own" its indexes but would tell your Solr what's changed and could use your Solr for querying. Plus, the fact that you could also go directly to Solr would be a great feature.

                               

                              BTW, when using external indexes, you'll still have to create index definitions that describe the kinds of indexes that are available. This is used in the query system so that ModeShape can pick appropriate indexes for a given set of criteria. But interestingly, indexing doesn't need to care about these index definitions: when an index provider is registered, it is given the opportunity to register one or more ChangeSetListener objects that will receive all internal events in bundles we call ChangeSet. Our internal index providers will use a ChangeSetListener for each index, and that listener will look for changes in only those properties that make up the index. The external index providers could do the same thing, but they could also just register a single listener and then consume all the changes and index everything -- ModeShape won't care.
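                              The per-index filtering idea can be illustrated with a small sketch. The types below are hypothetical (the real ModeShape ChangeSetListener SPI differs), but the essence is the same: each listener only re-indexes a node when a change touches one of the properties its index covers:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a per-index change listener. The actual ModeShape
// SPI types differ; this just shows the filtering idea: react only when a
// change involves a property that this particular index covers.
public final class PropertyFilteringListener {

    private final Set<String> indexedProperties;
    private final Map<String, Map<String, String>> index = new HashMap<>();

    public PropertyFilteringListener(Set<String> indexedProperties) {
        this.indexedProperties = indexedProperties;
    }

    // changedProperties maps property name -> new value for one changed node
    public void notify(String nodeKey, Map<String, String> changedProperties) {
        Map<String, String> relevant = new HashMap<>();
        for (Map.Entry<String, String> e : changedProperties.entrySet()) {
            if (indexedProperties.contains(e.getKey())) {
                relevant.put(e.getKey(), e.getValue());
            }
        }
        if (!relevant.isEmpty()) {
            index.computeIfAbsent(nodeKey, k -> new HashMap<>()).putAll(relevant);
        }
    }

    public Map<String, String> indexedValues(String nodeKey) {
        return index.getOrDefault(nodeKey, Map.of());
    }
}
```

An external provider that wants to "index everything" would simply pass an all-inclusive property set (or skip the filter entirely) and forward every change to Solr or ElasticSearch.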

                               

                              Why would an external index provider want to do this? Consider using Solr. If the index provider indexes most/all content, then not only can ModeShape queries use any indexes defined for this provider, but your application can query Solr directly. And if Solr contains a lot of the content, then your Solr queries can be quite powerful.

                               

                              BTW, we've had some people want to use Solr (or ElasticSearch) to store information that is structured in far more business-oriented ways. This would still be possible to do, but may require reading nodes to get more of the business-oriented context. If this is the case, then it may be easier to use a traditional JCR EventListener mechanism, where your listener can load nodes. Or for really intensive scraping operations, you could use the Event Journal feature (already added in 4.0.0.Alpha2) and have the "scraper" do its work against specific time periods, allowing it to operate more slowly than "real time".
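                              The journal-scraping pattern can be sketched in plain Java (hypothetical types; the real 4.0 Event Journal API differs): the scraper keeps a cursor and repeatedly processes all events in a bounded time window, so it can safely lag behind the writers:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of journal-style scraping: instead of reacting to
// events in real time, the scraper repeatedly asks for all events in a
// window [lastProcessed, upTo) and then advances its cursor.
public final class JournalScraper {

    public record JournalEvent(long timestampMillis, String path) {}

    private final List<JournalEvent> journal; // stands in for the event journal
    private long lastProcessed;               // inclusive lower bound of next window

    public JournalScraper(List<JournalEvent> journal, long startMillis) {
        this.journal = journal;
        this.lastProcessed = startMillis;
    }

    // Return all events with lastProcessed <= timestamp < upTo, then advance.
    public List<JournalEvent> scrapeWindow(long upTo) {
        List<JournalEvent> window = new ArrayList<>();
        for (JournalEvent e : journal) {
            if (e.timestampMillis() >= lastProcessed && e.timestampMillis() < upTo) {
                window.add(e);
            }
        }
        lastProcessed = upTo;
        return window;
    }
}
```

Because the cursor only advances after a window is handed off, the scraper can run minutes behind real time, doing expensive business-oriented reads for each batch without slowing down the writers.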