Performance Gates for accepting performance fixes in Narayana

Version 4


    Narayana pull requests that target performance optimizations must be run using the PERF profile, which is enabled by default. This profile runs a set of microbenchmarks written using JMH: “JMH is a Java harness library for writing benchmarks on the JVM. It was developed as part of the OpenJDK project. JMH provides a very solid foundation for writing and running benchmarks whose results are not erroneous due to unwanted virtual machine optimizations.”

    When you run a performance test, many things can skew the results (GC, HotSpot compilation, swapping, dead code elimination, etc.). JMH was written with these issues in mind, which makes its benchmark results much more accurate. That said, we are always happy to hear from our community if you have strong feelings about the nature of our testing or the durations and run counts used to run the benchmarks.

    If you have comments about this article please join the discussion using the post on our dev forum.

    How do we evaluate performance improvements?

    By default we run the benchmarks on all PRs. To turn them off, add !PERF to the PR comment.
    The PERF profile runs the benchmarks twice as part of the same Jenkins job: once on master and then on the PR code. Both runs therefore execute on the same virtual machine with identical resources, so the results ought to be comparable.

    • First we run a warm-up cycle in which each benchmark runs 20 times for 1 second each.
    • Then we take 2 measurements, each running the benchmark for 180 seconds. The results are written to CSV files (which are archived as Jenkins build artifacts). The JMH parameters are taken from an environment variable called JMHARGS, which can only be changed via the Jenkins job config. At the time of writing its value is: JMHARGS="-i 2 -r 180 -wi 20 -f 1 -rf csv -rff"
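
    As a rough illustration of what these settings cost in wall-clock time, the sketch below parses a JMHARGS-style string and estimates the seconds spent on one benchmark. The class JmhArgsEstimate and its methods are hypothetical helpers, not part of Narayana or JMH, and the warm-up duration is passed in separately (1 second, per the description above) because the quoted JMHARGS does not set it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Narayana or JMH.
final class JmhArgsEstimate {

    // Splits "-i 2 -r 180 ..." into flag/value pairs. A flag with no value
    // (such as the trailing -rff in the quoted JMHARGS) maps to "".
    static Map<String, String> parse(String args) {
        Map<String, String> flags = new LinkedHashMap<>();
        String[] tokens = args.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String flag = tokens[i];
            if (!flag.startsWith("-")) {
                continue; // stray value without a flag; ignored in this sketch
            }
            String value = "";
            if (i + 1 < tokens.length && !tokens[i + 1].startsWith("-")) {
                value = tokens[++i]; // consume the value that follows the flag
            }
            flags.put(flag, value);
        }
        return flags;
    }

    // Seconds spent on one benchmark: forks * (warm-up time + measurement time).
    // warmupSeconds is an assumption supplied by the caller.
    static long secondsPerBenchmark(Map<String, String> flags, long warmupSeconds) {
        long iterations = Long.parseLong(flags.get("-i"));
        long runSeconds = Long.parseLong(flags.get("-r"));
        long warmupIterations = Long.parseLong(flags.get("-wi"));
        long forks = Long.parseLong(flags.get("-f"));
        return forks * (warmupIterations * warmupSeconds + iterations * runSeconds);
    }
}
```

    With the value quoted above and a 1 second warm-up iteration, secondsPerBenchmark gives 380 seconds per benchmark, so running master and then the PR takes roughly 760 seconds per benchmark before JVM start-up and build time are counted.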

    We then compare the results of running the branch in the PR with the results for the master branch. The outcome of the comparison is posted as a PR comment.
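
    The comparison step can be pictured with the following minimal sketch, which reads benchmark scores from the lines of two CSV reports and computes the percentage change. It is not the code the Jenkins job runs. JmhCsvCompare is a hypothetical class, and the sketch assumes each report starts with a header row containing columns named Benchmark and Score and that no cell contains a comma; check this against the archived files before relying on it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; the CSV layout is an assumption.
final class JmhCsvCompare {

    // Reads benchmark name -> score from the lines of one CSV report.
    static Map<String, Double> scores(List<String> csvLines) {
        Map<String, Double> result = new LinkedHashMap<>();
        if (csvLines.isEmpty()) {
            return result;
        }
        String[] header = split(csvLines.get(0));
        int nameCol = indexOf(header, "Benchmark");
        int scoreCol = indexOf(header, "Score");
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = split(line);
            result.put(cells[nameCol], Double.parseDouble(cells[scoreCol]));
        }
        return result;
    }

    // Change of the PR score relative to master, in percent.
    // For throughput scores a positive value means the PR is faster.
    static double percentChange(double master, double pr) {
        return (pr - master) * 100.0 / master;
    }

    // Naive split: assumes no commas inside quoted cells.
    private static String[] split(String line) {
        String[] cells = line.split(",");
        for (int i = 0; i < cells.length; i++) {
            cells[i] = cells[i].trim().replace("\"", "");
        }
        return cells;
    }

    private static int indexOf(String[] header, String name) {
        for (int i = 0; i < header.length; i++) {
            if (header[i].equals(name)) {
                return i;
            }
        }
        throw new IllegalArgumentException("missing column: " + name);
    }
}
```

    Looking the columns up by header name rather than by position keeps the sketch working if extra columns (for example benchmark parameters) are present in the report.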

    • If throughput on any benchmark degrades by 3% or more (PR branch versus master), the PR is rejected. The threshold can be changed by setting a Java system property in the Jenkins config for the job btny-pulls-narayana.
    • If the improvement is less than 10%, a comment is added to the PR saying: “If the purpose of this PR is to improve performance then this PR has failed. The threshold for passing optimization related PRs is 10% or greater” (or words to that effect). This threshold can also be configured in the PR Jenkins job by setting a system property.
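
    The two rules above can be sketched for a single benchmark as follows. This is only an illustration of the thresholds as described, not the code the Jenkins job runs: PerfGate, Verdict and judge are hypothetical names, and both thresholds are parameters because the text says they are configurable.

```java
// Hypothetical illustration of the two gates; not the actual job logic.
final class PerfGate {

    enum Verdict { REJECTED, BELOW_IMPROVEMENT_THRESHOLD, IMPROVED }

    // masterThroughput and prThroughput are scores for the same benchmark,
    // where higher is better. Thresholds are percentages, e.g. 3.0 and 10.0.
    static Verdict judge(double masterThroughput, double prThroughput,
                         double maxRegressionPct, double minImprovementPct) {
        double changePct = (prThroughput - masterThroughput) * 100.0 / masterThroughput;
        if (changePct <= -maxRegressionPct) {
            return Verdict.REJECTED; // degradation of the threshold or more
        }
        if (changePct < minImprovementPct) {
            return Verdict.BELOW_IMPROVEMENT_THRESHOLD; // PR gets the "has failed" comment
        }
        return Verdict.IMPROVED;
    }
}
```

    For example, with thresholds of 3 and 10, a benchmark that drops from 100 to 96 ops/s is judged REJECTED, one that rises from 100 to 105 is BELOW_IMPROVEMENT_THRESHOLD, and one that rises from 100 to 112 is IMPROVED. Because a regression on any benchmark rejects the PR, a caller would apply judge to every benchmark and reject on the first REJECTED verdict.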

    What tests are executed?

    We have only ported a limited number of our performance tests to JMH benchmarks:

    • testCheckedAction (begins 10 transactions, each registering a synchronization, each associating 5 further threads, and then each suspending. Then, for each transaction, resume it, remove each child thread and commit);
    • testThreadActionData (begin an AtomicAction and a sub transaction. Manipulate the thread stack and then commit both transactions);
    • onePhase commit of an AtomicAction with one dummy participant;
    • twoPhase commit of an AtomicAction with two dummy participants;
    • jtaTest (enlists 2 XA resources in a JTA transaction using a variety of store types for transaction logging).

    There are many other performance tests that are not yet integrated into this suite and are candidates for porting. These will be migrated as and when required.

    What will I see as a user?

    The comment on your pull request (pass or fail) will include the tool's output, which records the numbers for your proposed modification or for the performance regression.

    What can I do if my proposed performance improvement fails the PR check?

    1. It may be that your improvement is less than the threshold (10%) for our current tests. You should look at the suite and check whether or not it tests something that would show your improvement. If it would not, you will need to raise a PR on the test suite with a test that does exercise your changes.
    2. If you are still convinced it’s worth including please do raise a discussion at our dev space: