      • 30. Re: Infinispan scalability consulting
        dex80526

        Thanks for the explanation of the VERIFY_SUSPECT parameters.

        • 31. Re: Infinispan scalability consulting
          scase

           I didn't see any indication of which OS you are using, but you may want to look into huge/large page support and the Java -XX:+UseLargePages option.  Depending on the OS/version, it can be a hassle to set up the OS support, but we have seen good results with regard to GC times for some applications.  I have seen some other reports of improved application stability as well (not specific to Infinispan).
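           For what it's worth, on Linux the setup would look roughly like this (a sketch only; the page count assumes 2 MB huge pages sized for a ~2 GB heap, and the exact steps, including shared-memory group permissions, vary by OS and kernel version):

               # Reserve 1024 x 2 MB huge pages (run as root; not persistent across reboots)
               sysctl -w vm.nr_hugepages=1024

               # Then opt the JVM in to large pages
               java -XX:+UseLargePages -Xms2g -Xmx2g ...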

          • 32. Re: Infinispan scalability consulting
            darrellburgan

            Scott Case wrote:

             

            I didn't see any indication of which OS you are using, but you may want to look into huge/large page support and the Java -XX:+UseLargePages option.  Depending on the OS/version, it can be a hassle to set up the OS support, but we have seen good results with regard to GC times for some applications.  I have seen some other reports of improved application stability as well (not specific to Infinispan).

             

            Thanks we'll check it out.

            • 33. Re: Infinispan scalability consulting
              darrellburgan

              Bela Ban wrote:

              Can you describe a bit more how you run your perf test, e.g.:

              - Access rate: how many reads and how many writes per sec?

              - How many clients?

              - How much of the access goes to the *same* data (write conflicts)?

              - What's your read/write ratio? Infinispan, like other caches, is designed for high read access.

              - What's the average data size in the cache, and how many elements?

               

              My performance test isn't precisely representative of production traffic patterns, which are very hard to emulate. But the test does generate a heavy load across about 20 tables, exercising many aspects of both primary and secondary caches in our custom ORM. It is not a test at the Infinispan level - it is a test at our ORM level, so it exercises the same code paths that production traffic would. Obviously this adds variables to the test, but it lets me gain confidence that a given configuration of Infinispan isn't going to melt under real-world usage.

               

              Our system typically has approximately a 50/50 ratio of database reads to writes, which translates into approximately the same ratio for Infinispan. The load test has approximately the same ratio.

               

              I typically run a four-node cluster for the test, using our production JGroups configuration (TCPGOSSIP et al.). Each node spawns about 300 threads, each of which enters a tight loop (only a 1 ms sleep per iteration) that relentlessly reads and writes data to cached tables through our custom ORM. I create enough objects that the JVM does a 1-second garbage collection every 5-6 seconds or so. Using this setup I can simulate the kind of traffic we might see if several thousand end users were hitting the system pretty hard.
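              The core of each worker thread looks roughly like this (a simplified sketch; OrmSession, Tables, randomKey(), and randomRecord() are stand-ins for our custom ORM API, not real classes):

                  import java.util.Random;

                  // Simplified sketch of one load-test worker thread; OrmSession, Tables,
                  // randomKey() and randomRecord() are stand-ins for our custom ORM API.
                  public class LoadWorker implements Runnable {

                      private final Random random = new Random();

                      @Override
                      public void run() {
                          OrmSession session = OrmSession.open(); // hypothetical ORM entry point
                          while (!Thread.currentThread().isInterrupted()) {
                              try {
                                  String table = Tables.pickRandom(random);       // one of ~20 cached tables
                                  if (random.nextBoolean()) {
                                      session.read(table, randomKey(random));     // ~50% reads
                                  } else {
                                      session.write(table, randomRecord(random)); // ~50% writes -> invalidations
                                  }
                                  Thread.sleep(1);                                // only 1 ms pause per iteration
                              } catch (InterruptedException e) {
                                  Thread.currentThread().interrupt();
                              }
                          }
                      }
                  }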

               

              I can monitor the test in these ways:

               

              • We have a monitoring page in our system that shows great detail about every cache in this system. From this I can tell the number of hits we are generating, the size of every cache, and historical information about all the statistics that our Infinispan listener gathers.
              • Our code also has a "trip wire": if a given cache put or invalidation takes more than 5 seconds, part of the monitoring page goes yellow. One of my primary goals for the load test is to ensure that trip wire never fires (a sketch of the timing wrapper follows this list).
              • I can also monitor the logs, where I can optionally emit copious amounts of transactional detail about what is going on; any exceptions of any kind are logged. My other primary goal is to ensure the load test runs without throwing any exceptions.
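              The trip wire itself is just a timing wrapper around each cache write, along these lines (a sketch with simplified names; monitor is a stand-in for our monitoring hook):

                  // Timing "trip wire" around a cache put or invalidation; the 5-second
                  // threshold matches the monitoring page described above.
                  long start = System.nanoTime();
                  cache.put(key, value);
                  long elapsedMs = (System.nanoTime() - start) / 1000000L;
                  if (elapsedMs > 5000) {
                      monitor.tripWire(cacheName, elapsedMs); // flips the monitoring page to yellow
                  }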

               

              Some limitations of the test:

               

              • It all runs on my development machine, so all TCP connections are in-memory. Thus network latency is not a factor in the test, though it probably should be.
              • I only give the JVMs 2 GB of heap, whereas in production we have more like 16 GB, so the GC behavior is quite different.
              • The transactions the load test executes are not representative of the exact mix of transactions we see in production, which is quite complex.
              • I don't have a good sense of transactions-per-second or the other stats you mentioned. I should probably enhance the load test to gather those metrics.

               

              See my subsequent post for some results I've gleaned from my latest load testing.

              • 34. Re: Infinispan scalability consulting
                darrellburgan

                I've done some further tuning and am now achieving some excellent results, with the following changes from production:

                 

                • Infinispan 5.1.5.FINAL
                • <FD> timeout greater than the longest expected GC pause
                • All caches are invalidation caches.
                • All cache puts are done using putForExternalRead(), which allows invalidation caches to behave the way our custom ORM really wants them to behave.
                • All caches use replication queues, capped at 100 entries with a 500 ms maximum latency.
                • All caches use AdvancedCache with the flags SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP, and (crucially) SKIP_LOCKING (see the sketch after this list).
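                Putting those pieces together, the per-cache setup looks roughly like this (a sketch against the Infinispan 5.1 programmatic API; the cache manager wiring and cache name are illustrative, and the JGroups transport configuration is omitted):

                    import org.infinispan.AdvancedCache;
                    import org.infinispan.Cache;
                    import org.infinispan.configuration.cache.CacheMode;
                    import org.infinispan.configuration.cache.Configuration;
                    import org.infinispan.configuration.cache.ConfigurationBuilder;
                    import org.infinispan.configuration.global.GlobalConfigurationBuilder;
                    import org.infinispan.context.Flag;
                    import org.infinispan.manager.DefaultCacheManager;

                    public class CacheSetupSketch {

                        public static void main(String[] args) {
                            // Invalidation cache with an async replication queue: flush after
                            // 100 queued invalidations or 500 ms, whichever comes first.
                            Configuration config = new ConfigurationBuilder()
                                .clustering()
                                    .cacheMode(CacheMode.INVALIDATION_ASYNC)
                                    .async()
                                        .useReplQueue(true)
                                        .replQueueMaxElements(100)
                                        .replQueueInterval(500)
                                .build();

                            DefaultCacheManager manager = new DefaultCacheManager(
                                GlobalConfigurationBuilder.defaultClusteredBuilder().build());
                            manager.defineConfiguration("someTable", config);

                            Cache<String, Object> cache = manager.getCache("someTable");
                            AdvancedCache<String, Object> tuned = cache.getAdvancedCache()
                                .withFlags(Flag.SKIP_CACHE_LOAD, Flag.SKIP_REMOTE_LOOKUP, Flag.SKIP_LOCKING);

                            // putForExternalRead() is a fast, best-effort put intended for
                            // caching data loaded from an external source (our database rows).
                            tuned.putForExternalRead("someTable:42", "row data");
                        }
                    }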

                 

                Because the custom ORM in this older version of our system does not support transactions, and because we are using invalidation caches across the board, I am hoping it is safe to use SKIP_LOCKING, since the order of cache invalidations is not important as long as they arrive. I'm also willing to accept the occasional race between a thread reading a cache entry and another invalidating it. Are there other things I should worry about with SKIP_LOCKING?

                 

                In any event, with this setup I was able to increase the heaviness of the load test by about an order of magnitude without any problems. I ran the load test for multiple hours yesterday and did not see a single cache put or invalidation take more than 5 seconds. In fact, the average perceived cache put and invalidation times on the tables in play were between 0 and 3 ms, generally so fast that our measurements are not accurate.

                 

                In short, this is an extremely promising development. It will not help us with our newer architecture, which is fully transactional and uses a JPA-based ORM, but for our older system this looks promising.

                • 35. Re: Infinispan scalability consulting
                  darrellburgan

                  Also, here is our production JGroups config file. Does anyone see anything notable that we should change?

                   

                  <config xmlns="urn:org:jgroups"
                          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                          xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.1.xsd">

                      <TCP bind_addr="${jgroups.tcp.address:xx.xx.xx.xx}"
                           bind_port="${jgroups.tcp.port:xxxxx}"
                           port_range="6"
                           receive_on_all_interfaces="true"
                           loopback="true"
                           recv_buf_size="20000000"
                           send_buf_size="640000"
                           discard_incompatible_packets="true"
                           max_bundle_size="64000"
                           max_bundle_timeout="30"
                           enable_bundling="true"
                           use_send_queues="true"
                           sock_conn_timeout="300"
                           enable_diagnostics="false"
                           timer_type="new"
                           timer.min_threads="4"
                           timer.max_threads="10"
                           timer.keep_alive_time="3000"
                           timer.queue_max_size="500"
                           thread_pool.enabled="true"
                           thread_pool.min_threads="2"
                           thread_pool.max_threads="30"
                           thread_pool.keep_alive_time="5000"
                           thread_pool.queue_enabled="false"
                           thread_pool.queue_max_size="100"
                           thread_pool.rejection_policy="discard"
                           oob_thread_pool.enabled="true"
                           oob_thread_pool.min_threads="1"
                           oob_thread_pool.max_threads="8"
                           oob_thread_pool.keep_alive_time="5000"
                           oob_thread_pool.queue_enabled="false"
                           oob_thread_pool.queue_max_size="100"
                           oob_thread_pool.rejection_policy="discard" />

                      <TCPGOSSIP timeout="3000"
                                 initial_hosts="xx.xx.xx.xx[xxxxx],xx.xx.xx.xx[xxxxx]"
                                 num_initial_members="2" />
                      <MERGE2 max_interval="30000" min_interval="10000" />
                      <FD_SOCK />
                      <FD timeout="25000" max_tries="3" />
                      <VERIFY_SUSPECT timeout="1500" />
                      <pbcast.NAKACK use_mcast_xmit="false"
                                     retransmit_timeout="300,600,1200,2400,4800"
                                     discard_delivered_msgs="false" />
                      <UNICAST timeout="300,600,1200" />
                      <pbcast.STABLE stability_delay="1000"
                                     desired_avg_gossip="50000"
                                     max_bytes="4m" />
                      <pbcast.GMS print_local_addr="false"
                                  join_timeout="7000"
                                  view_bundling="true" />
                      <UFC max_credits="2000000" min_threshold="0.10" />
                      <MFC max_credits="2000000" min_threshold="0.10" />
                      <FRAG2 frag_size="60000" />

                  </config>

                  • 36. Re: Infinispan scalability consulting
                    sannegrinovero

                    Are there other things I should worry about with SKIP_LOCKING?

                    The internal locks (which you're skipping) are also used to make sure that mutations on the same key are applied in the same order on each replica. When you enable this flag, you might end up having different values on different nodes (permanently).

                    So you're only safe to use it if you either don't care, or your custom ORM makes sure that the same key won't receive several conflicting changes in a short time.

                     

                    So I guess you want to remove this flag. The good news is that locking has become much more efficient than in older versions, where this flag could buy you a big improvement; today it shouldn't slow you down much. It would be great if you could measure it: I'd be interested to know what impact it has in your case.
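                    For instance, something along these lines would already be telling (a rough single-threaded sketch, not a rigorous benchmark; the cache wiring is omitted):

                        import java.util.concurrent.TimeUnit;

                        import org.infinispan.AdvancedCache;
                        import org.infinispan.context.Flag;

                        // Rough sketch for comparing put latency with and without SKIP_LOCKING.
                        // Single-threaded and without proper warmup, so treat results as indicative only.
                        public class SkipLockingComparison {

                            static long timePutsMs(AdvancedCache<String, Object> c, int n) {
                                long start = System.nanoTime();
                                for (int i = 0; i < n; i++) {
                                    c.put("key-" + (i % 1000), "value-" + i); // reuse keys so locking is exercised
                                }
                                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                            }

                            static void compare(AdvancedCache<String, Object> cache) {
                                AdvancedCache<String, Object> withLocking =
                                    cache.withFlags(Flag.SKIP_CACHE_LOAD, Flag.SKIP_REMOTE_LOOKUP);
                                AdvancedCache<String, Object> skipLocking =
                                    cache.withFlags(Flag.SKIP_CACHE_LOAD, Flag.SKIP_REMOTE_LOOKUP, Flag.SKIP_LOCKING);

                                System.out.println("with locking: " + timePutsMs(withLocking, 100000) + " ms");
                                System.out.println("skip locking: " + timePutsMs(skipLocking, 100000) + " ms");
                            }
                        }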

                     

                     


                    In short, this is an extremely promising development. It will not help us with our newer architecture, which is fully transactional and uses a JPA-based ORM, but for our older system this looks promising.

                    Nice to hear on the first sentence; on the second, are you considering Hibernate OGM?

                    • 37. Re: Infinispan scalability consulting
                      belaban

                      Good to see you finally got some good numbers! However, running all 4 nodes on the same box doesn't really simulate the real world: while all TCP connections are loopbacks, the 4 processes do compete for CPU and network, so I'd expect 1 process per physical box to actually run faster.

                      • 38. Re: Infinispan scalability consulting
                        darrellburgan

                        Sanne Grinovero wrote:

                         

                        The internal locks (which you're skipping) are also used to make sure that mutations on the same key are applied in the same order on each replica. When you enable this flag, you might end up having different values on different nodes (permanently).

                        So you're only safe to use it if you either don't care, or your custom ORM makes sure that the same key won't receive several conflicting changes in a short time.

                         

                        So I guess you want to remove this flag. The good news is that locking has become much more efficient than in older versions, where this flag could buy you a big improvement; today it shouldn't slow you down much. It would be great if you could measure it: I'd be interested to know what impact it has in your case.

                         

                        Given that we are using invalidation caches only, how could there be different values in different nodes? The order of application of invalidations shouldn't matter, or am I misunderstanding?

                         

                         

                        Sanne Grinovero wrote:

                         

                        Nice to hear on the first sentence; on the second, are you considering Hibernate OGM?

                         

                        Yes, our new architecture is based on JPA, so many of the tricks we are employing to wring performance out of Infinispan will not help us there. But this buys us considerable time, given that it will be a while before we start throwing serious load at the new architecture.

                        • 39. Re: Infinispan scalability consulting
                          darrellburgan

                          Bela Ban wrote:

                           

                          Good to see you finally got some good numbers! However, running all 4 nodes on the same box doesn't really simulate the real world: while all TCP connections are loopbacks, the 4 processes do compete for CPU and network, so I'd expect 1 process per physical box to actually run faster.

                           

                          The intent of my load test is not really to simulate actual production performance, but simply to give me an apples-to-apples way to measure how changes to our Infinispan and custom ORM configuration affect the scalability. So, I'd rather the load test err on the side of being too hard on the data grid. Ideally I'd like my load test to be worse than anything Infinispan will see in production, but I don't think I'm quite there yet.  :-)

                           

                          Did you have a look at the JGroups config I posted above? We adapted it from one of the examples that came in the JGroups distro. Anything wonky in there?

                           

                          BTW kudos to you and the JGroups team for a remarkable piece of software ...

                          • 40. Re: Infinispan scalability consulting
                            darrellburgan

                             

                            Sanne Grinovero wrote:

                             

                            Nice to hear on the first sentence; on the second, are you considering Hibernate OGM?

                             

                            Yes, our new architecture is based on JPA, so many of the tricks we are employing to wring performance out of Infinispan will not help us there. But this buys us considerable time, given that it will be a while before we start throwing serious load at the new architecture.

                             

                            Sorry, I just realized you were referring to this:

                             

                            http://www.hibernate.org/subprojects/ogm.html

                             

                            I'll have to read up on this and understand how it would fit into our new architecture. Also, do you have any real-world experience with it? Our database is highly normalized and joins happen a lot. Can Hibernate OGM + Infinispan compete with an RDBMS for serving up data that has lots of relationships?

                            • 41. Re: Infinispan scalability consulting
                              sannegrinovero

                              Our database is highly normalized and joins happen a lot. Can Hibernate OGM + Infinispan compete with an RDBMS for serving up data that has lots of relationships?

                               

                              All relationships from JPA 2 are supported, so "highly normalized" should not be a problem, but it doesn't support joins at query time, except in some very simple cases. We have plans for that, but it's a tough one.

                               

                              So it can compete in raw CRUD performance measured in transactions per second, but no, it can't compete with the query flexibility of a traditional RDBMS. I would suggest identifying which parts of your domain model might benefit from the alternative storage engine, and splitting the storage responsibility so as to use the best tool for each area. Of course, defining "best" is tricky in most cases, as performance benefits come with tradeoffs.
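                              For example, the split could be expressed as two JPA persistence units, one backed by the RDBMS and one by OGM on Infinispan, chosen per subsystem (a sketch; the unit names are hypothetical and would be defined in persistence.xml, the OGM one using the provider org.hibernate.ogm.jpa.HibernateOgmPersistence):

                                  import javax.persistence.EntityManagerFactory;
                                  import javax.persistence.Persistence;

                                  // Sketch: split storage responsibility across two JPA persistence units.
                                  // "orders-rdbms" and "sessions-ogm" are hypothetical names from persistence.xml.
                                  public class SplitStorageBootstrap {

                                      public static void main(String[] args) {
                                          EntityManagerFactory relational = Persistence.createEntityManagerFactory("orders-rdbms");
                                          EntityManagerFactory dataGrid = Persistence.createEntityManagerFactory("sessions-ogm");
                                          // Each subsystem then uses whichever factory matches its access pattern.
                                      }
                                  }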

                               

                              On real-world experience: no, I'm afraid not. Some bigger customers sponsored the development of some features and are certainly running POCs with serious intentions, but I'm not aware of any production-ready deployments. If you would like to try it out, this is the perfect time to provide feedback: it's growing quickly but we can still affect the design, and it's a good time to join the project too.

                              • 42. Re: Infinispan scalability consulting
                                darrellburgan

                                Okay our changes have been live for a week now, and the data grid is performing spectacularly well. We still don't have a solution for our transactional new architecture, but the old architecture is humming along just fine.
