6 Replies Latest reply on Dec 9, 2014 5:15 AM by mustafamizrak

    ModeShape External Resources Adding Node Performance Problem

    mustafamizrak

      Hi all,

       

      I configured ModeShape (3.8.x) with external resources. Files are written to external resources.

      I tested the application with 2000 sample files of 2 KB each. For the first 300 files, each add-node call takes no more than 200 ms. After that, the time to add a node increases dramatically (for example, adding the 400th file takes 3000 ms for that single node).

       

      The attached image shows the response (completion) time over time.

       

      image_0L.png

       

      Even writing the files directly to disk takes less time. What causes this with external resources, and how can I address it?


      Any help is appreciated!

       

      Best regards,

        • 1. Re: ModeShape External Resources Adding Node Performance Problem
          hchiorean

          You should profile locally and investigate where the bottleneck comes from (since ModeShape 3.x is not supported in the community anymore, try testing with ModeShape 4.1 if possible)

          • 2. Re: Re: ModeShape External Resources Adding Node Performance Problem
            mustafamizrak

            Hi,

             

            I tested with ModeShape 4.1.0.Final, using an internal resource as well as an external resource. The JMeter test cases and their results are as follows:

             

            1. Using a single external resource: There are 10 threads. Each one adds 800 small (1 KB) samples to the /files_ext_resource directory on the external resource via HTTP POST. As the number of files/nodes in the directory grows, adding a node takes longer. After about 300 nodes have been added to the external resource, adding a new node takes an enormous amount of time. The results are shown below:
            • Number of samples versus add-node time

                 Graph Results.png

            • Response Times Distribution versus number of samples

              Response Times Distribution.png

             

            • Response Time Over Time
              Bytes Throughput Over Time.png
            2. Using an internal resource: There are 10 threads. Each one adds 800 small (1 KB) samples to the /files_int_resource directory on the internal resource. As the number of files/nodes in the directory grows, adding a node takes longer. After about 300 nodes have been added to the internal resource, adding a new node takes an enormous amount of time. The results are shown below:
              • Response Times Over Time
                Response Times Over Time.png

             

            My question is this: with either the internal or the external resource, the response rate decreases after some time. When existing nodes are deleted (i.e. the number of files in the directory is reduced), the time to add new nodes drops and the response rate rises, and then it saturates again. As the number of nodes grows, the time to add a new node increases dramatically. Our aim is to use ModeShape to store a very large number of files/nodes, but it does not seem suited to that, because it saturates beyond some point.

             

            Is there any way to handle this problem?

             

            Regards.

            • 3. Re: Re: ModeShape External Resources Adding Node Performance Problem
              hchiorean

              First and foremost, you need to be aware of the fact that any kind of performance issue is very much context dependent: on your node structure, API usage pattern (how your code interacts with the JCR API) and 1000+ other factors. There is no single magic switch or bug for that matter that will make your use-case very fast all of a sudden.

              Also, please read Large numbers of child nodes - ModeShape 4 - Project Documentation Editor and the attached links which may or may not apply to your use case, but which could shed light on some issues.
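One common way to keep the child count of any single parent small is to avoid placing every file directly under one folder and to spread files over intermediate folders instead. The following is only a minimal sketch in plain Java, assuming that two hash-derived folder levels are acceptable for the application and that the flat directory is in fact the bottleneck, which only profiling can confirm. The class name `BucketedPath` and the method `pathFor` are hypothetical and are not part of ModeShape or the JCR API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: derives a nested target path from a hash of the file name. */
final class BucketedPath {

    private BucketedPath() {}

    /**
     * Returns root/xx/yy/fileName, where xx and yy are the first two bytes of the
     * SHA-1 digest of fileName, written as two hex digits each.
     * Two levels give 256 * 256 = 65,536 leaf folders.
     */
    static String pathFor(String root, String fileName) {
        byte[] digest = sha1(fileName);
        String level1 = String.format("%02x", digest[0] & 0xff);
        String level2 = String.format("%02x", digest[1] & 0xff);
        return root + "/" + level1 + "/" + level2 + "/" + fileName;
    }

    private static byte[] sha1(String text) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

The same file name always maps to the same path, so the location can be recomputed on read without a lookup table. With 65,536 leaf folders, a few million files would average a few dozen per folder, assuming the hash spreads names evenly.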

              ...Adding large number of files/nodes into the directory path via http post

              If you're measuring performance "remotely" the network round-trip cost also comes into play, not just "local server" performance.

               

              To give you some more insights into the 2 different node types:

              1. "external" nodes - i.e. nodes managed through the FS connector - are not stored (persisted) by ModeShape. Instead they are read from and written to the file system and only held in memory (cached) by ModeShape via a small Infinispan cache. You can control how long they are cached by ModeShape via the "cacheTtlSeconds" property - File system connector - ModeShape 4 - Project Documentation Editor

              2. "internal" nodes - are stored & managed via the (main) Infinispan cache. The main performance factors here are a) the type of the Infinispan cache store you're using and b) the max number of entries you want the cache to store in memory at any given time via the <eviction maxEntries ...> setting. It's imperative that this setting is present (otherwise you'll run out of memory) but the actual value depends very much on your use case and you have to try different values out.

               

              All of the above are general performance considerations but to really solve your issue (if possible) you have to locally profile your application/use case. By that I mean looking at whatever performance problem you're seeing (e.g. throughput) via a profiler: VisualVM, JProfiler, YourKit, Mission Control etc. Only after you do this will you be able to tell what exactly the problem is.
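Alongside a profiler, it can help to record per-call timings inside the server process itself, so that network round trips are excluded from the numbers. The following is a minimal sketch, assuming the add-node call can be wrapped in a `Runnable`. `AddNodeTimer` is a hypothetical helper, not a ModeShape or JMeter API, and it only shows when latency grows, not why.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: records per-call latencies so early and late calls can be compared. */
final class AddNodeTimer {

    private final List<Long> nanos = new ArrayList<>();

    /** Runs one operation and records how long it took. */
    void time(Runnable operation) {
        long start = System.nanoTime();
        operation.run();
        record(System.nanoTime() - start);
    }

    /** Records an already measured duration in nanoseconds. */
    void record(long durationNanos) {
        nanos.add(durationNanos);
    }

    int count() {
        return nanos.size();
    }

    /** Mean latency in milliseconds over samples [from, to), in recording order; requires from < to. */
    double averageMillis(int from, int to) {
        long sum = 0;
        for (long n : nanos.subList(from, to)) {
            sum += n;
        }
        return sum / 1_000_000.0 / (to - from);
    }

    /** Nearest-rank percentile (0 < p <= 100) in milliseconds over all samples. */
    double percentileMillis(double p) {
        if (nanos.isEmpty()) {
            throw new IllegalStateException("no samples recorded");
        }
        List<Long> sorted = new ArrayList<>(nanos);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1) / 1_000_000.0;
    }
}
```

Comparing, say, `averageMillis(0, 300)` with the average over a later window would show whether latency really grows with the number of nodes when measured locally, before deciding where to point the profiler.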

              • 4. Re: Re: Re: ModeShape External Resources Adding Node Performance Problem
                mustafamizrak

                Horia Chiorean wrote:


                Also, please read Large numbers of child nodes - ModeShape 4 - Project Documentation Editor and the attached links which may or may not apply to your use case, but which could shed light on some issues.

                 

                I tried applying that, but I got:

                 

                16:16:35,448 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-8) MSC000001: Failed to start service jboss.modeshape.sample.repository: org.jboss.msc.service.StartException in service jboss.modeshape.sample.repository: org.modeshape.jcr.ConfigurationException: The configuration for the 'sample' repository has problems: ERROR: Error at storage.optimization : The 'optimization' field on 'storage' is not defined in the schema and the schema does not allow additional properties.
                    at org.modeshape.jboss.service.RepositoryService.start(RepositoryService.java:193)
                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1948) [jboss-msc-1.2.2.Final.jar:1.2.2.Final]
                    at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1881) [jboss-msc-1.2.2.Final.jar:1.2.2.Final]
                    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_20]
                    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_20]
                    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_20]
                Caused by: org.modeshape.jcr.ConfigurationException: The configuration for the 'sample' repository has problems: ERROR: Error at storage.optimization : The 'optimization' field on 'storage' is not defined in the schema and the schema does not allow additional properties.
                    at org.modeshape.jcr.ModeShapeEngine.deploy(ModeShapeEngine.java:480)
                    at org.modeshape.jcr.ModeShapeEngine.deploy(ModeShapeEngine.java:452)
                    at org.modeshape.jboss.service.RepositoryService.start(RepositoryService.java:191)
                    ... 5 more
                

                 

                Am I missing something?

                Regards.

                • 5. Re: Re: Re: Re: ModeShape External Resources Adding Node Performance Problem
                  hchiorean

                  The optimization feature should be triggered by enabling the document-optimization-child-count-target setting in the WildFly configuration. If that's what you tried and you're getting this exception, it's probably a bug. Feel free to log a JIRA for it. Thanks.

                  • 6. Re: Re: Re: ModeShape External Resources Adding Node Performance Problem
                    mustafamizrak

                    I have 3 million files. The average size is 0.5 MB and the total size of all files is 1.5 TB. After saving those files to ModeShape, my aim is to save each newly arriving file in no more than 0.5 seconds.

                     

                    I do not want the time required to add a file to increase linearly as the number of files in the repository grows.

                     

                    Is this possible with the latest version of ModeShape? If so, what structure, configuration, or practice would you recommend for such a case?

                     

                    Regards.