10 Replies Latest reply on Oct 1, 2010 1:40 PM by craigching

    More on infinispan connector performance

    craigching

      So I've enhanced my version of modeshape with my suggestion of replacing LinkedList with ArrayList in MapNode, performance has improved, but I still see JcrSession.save() taking more and more time as I add more nodes to the repository.  My goal is to get to tens of millions of nodes and test searching, but I can't really, practically yet get to 10's of thousands of nodes.

       

      I've been trying to profile my application and I'm seeing, after about 10,000 nodes being added to my repository, that FederatedRequest.await() is taking ~72% CPU and continues to grow with each session.save().  I'm assuming something is going on on the other side of the countdown latch that is being used there where the other side is taking longer and longer before it decrements the latch, thus freeing up FederatedRequest.await().

       

      I'll continue digging, but it's getting harder and harder for me to continue without some guidance, can anyone shed some light on what might be going wrong?  Or point me in the right direction?

       

      My application is the same one that is posted here: http://community.jboss.org/message/560965#560965 except that I've changed it to create 50000 cars instead of 10.

        • 1. Re: More on infinispan connector performance
          craigching

          I'm still digging into this, I've been sidetracked earlier this week, but I wanted to get this data point on the list in case anyone feels like helping me.  I've profiled ModeShape further and I'm finding a large number of calls to MapTransaction.findNode().  For every 100 new nodes that I save in the session, the number of calls to MapTransaction.findNode() goes up by 10,400.  It's very consistent.  Here's a table that shows the data:

           

           

           

          Items saved in sessionMapTransaction.findNode() invocationsInvocations this batchDifference of invocations vs. last batch
          +100 (100 total in repo)14214
          +100 (200)196945480
          +100 (300)355631586910389
          +100 (400)618432628010411
          +100 (500)985233668010400
          +100 (600)1456034708010400
          +100 (700)2030835748010400
          +100 (800)2709636788010400
          +100 (900)3492437828010400
          +100 (1000)4379238868010400
          +100 (1100)5370039908010400
          +100 (1200)64648310948010400
          +100 (1300)76637411989110411
          +100 (1400)89665413028010389
          +100 (1500)103734514069110411
          +100 (1600)118843615109110400
          +100 (1700)134992716149110400
          +100 (1800)152181817189110400
          +100 (1900)170409818228010389
          +100 (2000)189678919269110411
          +100 (2100)209988020309110400

           

           

           

          Again, here is my logic:

           

           

                          Node root = session.getRootNode();

                          for(int i = 0; i < 50000; ++i) {

                              Node entry = root.addNode("e_" + i, "entry:Entry");

                              entry.setProperty("entry:EVENTID", Integer.toString(i));

                              entry.setProperty("entry:USERNAME", users[i % users.length]);

                              entry.setProperty("entry:OBJECTNAME", objects[i % objects.length]);

                              entry.setProperty("entry:SERVERTIME", Calendar.getInstance());

                              if (i % 100 == 0) {

                                  System.out.println("SAVED!!! (" + i + ") - [" + new SimpleDateFormat("HH:mm:ss.SSSZ").format(new Date()) + "]");

                                  long begin = System.currentTimeMillis();

                                  session.save();

                                  long diff = System.currentTimeMillis() - begin;

                                  System.out.println("==> Save took " + diff + " milliseconds.");

                              }

                          }

           

                          Node root = session.getRootNode();
                          for(int i = 0; i < 50000; ++i) {
                              Node entry = root.addNode("e_" + i, "entry:Entry");
                              entry.setProperty("entry:EVENTID", Integer.toString(i));
                              entry.setProperty("entry:USERNAME", users[i % users.length]);
                              entry.setProperty("entry:OBJECTNAME", objects[i % objects.length]);
                              entry.setProperty("entry:SERVERTIME", Calendar.getInstance());
                              if (i % 100 == 0) {
                                  System.out.println("SAVED!!! (" + i + ") - [" + new SimpleDateFormat("HH:mm:ss.SSSZ").format(new Date()) + "]");
                                  long begin = System.currentTimeMillis();
                                  session.save();
                                  long diff = System.currentTimeMillis() - begin;
                                  System.out.println("==> Save took " + diff + " milliseconds.");
                                  System.out.println("Sleeping for 5 seconds");
                                  Thread.sleep(15000);
                                  System.out.println("Back awake and at it now.");
                              }
                          }

           

          Am I doing something wrong here?  It feels like maybe I'm misusing the JCR API or something.  Basically what I'm seeing is that every save of 100 new nodes invokes findNode n + 10400 where n = the previous number of findNodes invocations.

          • 2. Re: More on infinispan connector performance
            rhauch

            Great work!

             

            What version are you running against? We've made a number of improvements and optimizations to the MapTransaction and related classes. I'd be curious if that improved the results.

             

            Also, your tests add all of the nodes under the root. Have you tried this same test when adding nodes at various levels?

            • 3. Re: More on infinispan connector performance
              craigching

              I'm running against a trunk build from about 2 weeks ago, I'll do another build and test.  I can't seem to build on Windows where I'm running the test (maven craps out with an index out of bounds exception in referencing one of the assembly files, I don't see what the problem is), so I'll have to finish pulling on my mac and do a build there.

               

              As for various levels, I haven't done that.  Our use case really has us at a depth of 1 mostly for the tree ... but now that I type that, I'm thinking that's not necessarily true.  In most cases it is true, but where we're going to have a lot of objects it probably isn't.  Let me set up a test like that and see too.

              • 4. Re: More on infinispan connector performance
                rhauch

                If your project is Maven based, and you need to load the ModeShape artifacts into the repository, you don't need to do an assembly build. Instead, you could just use "mvn clean install" to build all the projects and install them into the local repository.  You could even build them without running the tests, with "mvn clean install -DskipTests=true".

                 

                However, if you instead need all of the JARs as files (and not in a Maven repository), then it's probably easier to just do an assembly build.

                • 5. Re: More on infinispan connector performance
                  craigching

                  'mvn clean install' on the ModeShape projects does build the assemblies (I'm building all from the root pom). Unless there's a profile that I can invoke that will skip the modeshape assemblies ...  Anyway, no worries, I'm building on a Linux machine now, I should be able to test here very shortly.

                  • 6. Re: More on infinispan connector performance
                    craigching

                    I've simplified my tracking of this by adding a public static AtomicLong to the MapTransaction class and incrementing the AtomicLong for every findNode() invocation.  Then, just before my sessoin.save(), I reset it and then getAndReset() it right after session.save() finishes:

                     

                     

                                            MapTransaction.invocations.getAndSet(0);
                                            session.save();
                                            curr_invocations = MapTransaction.invocations.getAndSet(0);

                                            MapTransaction.invocations.getAndSet(0);

                                            session.save();

                                            curr_invocations = MapTransaction.invocations.getAndSet(0);

                    And what I'm seeing, with the current trunk, is that each session.save() will invoke MapTransaction.findNode() n + 10300 times where n == previous number of findNode() invocations.  So, the trunk is not really better with respect to this particular profiling point.  I will say this, though, it "feels" faster (wall time) than the previous build I was using, but it's hard to know for sure if it is or not because I've been living in the profiler for the past week and haven't run a test outside of it since ;-)
                    I'm going to start working on a test that more realistically mirrors how we will actually be using modeshape.  I still think this test is worth investigating, though, as it is quite conceivable that we could have large numbers of objects in a more "flat" hierarchy like this test.  So I may come back to this one at some point as well.
                    • 7. Re: More on infinispan connector performance
                      rhauch

                      Yeah, something is definitely wrong. Can you please log a bug in JIRA?

                      • 8. Re: More on infinispan connector performance
                        craigching

                        https://jira.jboss.org/browse/MODE-923

                         

                        I will provide a complete project and a patch to include the way I counted the invocations.  It might not be until this weekend, though, so fair warning!  Thanks for helping with this Randall!

                        • 9. Re: More on infinispan connector performance
                          rhauch

                          Thanks, Craig, for helping out and being so thorough with your testing and reporting!

                          • 10. Re: More on infinispan connector performance
                            craigching

                            I was able to make some time over lunch to get a project and a diff for counting the findNode invocations done.  It's all in the JIRA issue.