I am currently working on a product that has been running in production for more than 5 years and has a DMS module that uses JSR-170 and JSR-283 features. For this, Apache Jackrabbit 2.2.7 is bundled within the application’s EAR package and deployed on JBoss 4.0.1, so the repository instance is started and stopped with the containing application (that is, the application not only connects to the repository but is also in charge of starting and stopping it). The product’s DMS module suffers a performance penalty because of the way Jackrabbit is integrated and deployed, and this cannot be solved in my case. In short, Jackrabbit is easy to use and configure for clustering, but it has become a bottleneck for meeting the growing business needs of the product.
Overall, Modeshape looks like a very promising JCR implementation. I am evaluating it for two scenarios: (1) to provide JCR features when launching the product for any new customer, and (2) to replace Jackrabbit for existing customers (so I mainly need to think about a strategy for migrating existing content from Jackrabbit to Modeshape). Answers to the specific questions below would greatly help my decision making.
(1) I understand that if we want to use all JCR (1.0 and 2.0) features, Modeshape must run in the same JVM instance as the application, because the WebDAV/REST option offers only limited JCR features. Upgrading JBoss 4.0.1 is not possible for me. So please let me know whether there are any limitations when deploying Modeshape on JBoss 4.0.1 by following the steps described at http://docs.jboss.org/modeshape/latest/manuals/reference/html/configuration.html#deloying_modeshape_to_jbossas.
(2) Although I do not need it right now, I would like to know: is it possible to run Modeshape in standalone mode on a plain JVM (without deploying it on JBoss/Tomcat)?
(3) I only need to ensure scalability of the product; I am not considering high availability. So please confirm that enabling Modeshape clustering (http://docs.jboss.org/modeshape/latest/manuals/reference/html/configuration.html#clustering_configuration) does not require JBoss clustering (http://docs.jboss.org/jbossas/jboss4guide/r4/html/cluster.chapt.html). For example, suppose 3 separate JBoss instances are running, with the product and Modeshape deployed on each. We do not want to cluster JBoss itself; however, we will need to enable Modeshape clustering so that content stays synchronized across the products on all 3 instances.
(4) With Jackrabbit, performance degrades if a node has too many child nodes (http://wiki.apache.org/jackrabbit/Performance). Does Modeshape have any inherent limitation like this, or any other limitation related to how the content repository model is structured, that would prevent consistent performance?
(5) Jackrabbit says that if you need to write to the same node concurrently, you need to use multiple sessions and use JCR locking to ensure there is no conflict (http://wiki.apache.org/jackrabbit/QuestionsAndAnswers#Concurrency). This holds only when a single Jackrabbit instance is running. With Jackrabbit clustering, horizontal scaling works well for reads only (practically zero overhead on read access) and not so well for heavy concurrent writes, because a write takes an exclusive lock over the whole cluster (writes are synchronized across the entire cluster). I need to support both heavy concurrent reads and heavy concurrent writes in a cluster; however, Jackrabbit serializes write operations. For example, while a 50 MB PDF file is being written to Jackrabbit, all other operations wait for that period (I confirmed this behavior by taking Java thread dumps). Where can I find details on how Modeshape behaves under heavy concurrent reads and writes when Modeshape clustering is enabled?
(6) Is there any option for migrating repository content from Jackrabbit to Modeshape, if we decide to switch live systems from Jackrabbit to Modeshape? If not, can you suggest how expensive it would be to develop a migration process/script?
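To make question (4) more concrete: one common way to keep any single node from accumulating too many children is to spread documents over intermediate "bucket" nodes derived from a hash of the name. The sketch below only computes such a path with plain Java. `BucketedPathSketch`, `bucketedPath`, the `/documents` parent and the two-level layout are all hypothetical names and choices for illustration; nothing here comes from the JCR API, Jackrabbit or Modeshape, and whether Modeshape needs such a layout at all is exactly what I am asking.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: not part of the JCR API, Jackrabbit, or Modeshape.
class BucketedPathSketch {

    /**
     * Builds a path of the form parent/xx/yy/name, where xx and yy are the
     * first two bytes (in hex) of an MD5 digest of the node name. Each bucket
     * level holds at most 256 entries, so documents are spread over up to
     * 65,536 leaf buckets instead of sitting under one parent.
     */
    static String bucketedPath(String parentPath, String nodeName) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5")
                    .digest(nodeName.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        String level1 = String.format("%02x", digest[0] & 0xff);
        String level2 = String.format("%02x", digest[1] & 0xff);
        return parentPath + "/" + level1 + "/" + level2 + "/" + nodeName;
    }

    public static void main(String[] args) {
        // Prints the bucketed path for an example (made-up) document name.
        System.out.println(bucketedPath("/documents", "invoice-2012-0001.pdf"));
    }
}
```

The same name always maps to the same path, so a document can be located again without a query; the cost is that the path no longer reflects any business hierarchy.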
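Regarding the thread dumps mentioned in question (5), the kind of check I did by hand can also be sketched programmatically with the standard `java.lang.management` API. The example below is a simulation under stated assumptions, not Jackrabbit code: `BlockedWriterSketch`, the thread names and the plain `Object` used as a shared write lock are hypothetical stand-ins for a repository-wide write lock. Note that threads waiting on `java.util.concurrent` locks show up as WAITING rather than BLOCKED, so a check against a real repository may need to include that state as well.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical simulation: no Jackrabbit or Modeshape classes are involved.
class BlockedWriterSketch {

    /** Lists threads that are BLOCKED on a monitor, with the lock owner if known. */
    static List<String> blockedThreads() {
        List<String> result = new ArrayList<String>();
        ThreadInfo[] dump = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : dump) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                result.add(info.getThreadName() + " blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return result;
    }

    /** Simulates one slow writer holding a shared lock while a second writer waits. */
    static List<String> demo() {
        final Object sharedWriteLock = new Object(); // stand-in for a repository-wide write lock
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch finishWrite = new CountDownLatch(1);

        Thread slowWriter = new Thread(new Runnable() {
            public void run() {
                synchronized (sharedWriteLock) {
                    lockHeld.countDown();
                    try {
                        finishWrite.await(); // pretend a large file is being written
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }, "writer-large-pdf");
        Thread waitingWriter = new Thread(new Runnable() {
            public void run() {
                synchronized (sharedWriteLock) { /* a small write would happen here */ }
            }
        }, "writer-small-doc");

        try {
            slowWriter.start();
            lockHeld.await();                 // the slow writer now owns the lock
            waitingWriter.start();
            while (waitingWriter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(5);              // wait until the second writer is stuck
            }
            return blockedThreads();          // snapshot taken while the lock is still held
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            finishWrite.countDown();          // let both writers finish
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

If many writer threads appear in such a list, all waiting on the same lock, writes are being serialized, which is the pattern I saw in the Jackrabbit thread dumps.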