(lies, damned lies, and benchmarks)
Note: Please also read our Performance FAQ.
Recently, a company called Software Tree released a new version of their ORM product, JDX. This release came equipped with STORM, the Software Tree Object/Relational Mapping Benchmark. Since I am interested in ORM, and particularly since they claimed to have ported the benchmark to Hibernate, I knew, with a sinking feeling, that I would have to waste a day evaluating this benchmark. Having dealt previously with vendor-produced benchmarks that claimed to show great differences in performance between their own product and Hibernate, I was quite certain that I knew in advance what the results of this benchmark would show. A first look at the website which described the benchmark convinced me that this was certainly not a serious effort. The benchmark includes one class, Account, with four properties, inserts a few thousand of them in the database, and then runs a few queries by primary key.
Now, having spent a fair amount of time performance-profiling database access technologies, I knew for a fact that these kinds of benchmarks are usually completely misleading. With such small datasets, accessed repeatedly, the database is able to cache the results entirely in memory; the benchmark never actually involves any real disk access (watch your hard drive while STORM runs). We never get to see what happens once the dataset is too large to fit in memory, or is being updated by another transaction. We never get to see what happens when the database is under load. In fact, this benchmark involves no concurrency at all! We never get to see any joins, or any of those other things that happen in realistic use cases. Furthermore, these kinds of benchmarks are often run against a local database, which gives results that are absolutely meaningless once the database is installed on a physically separate machine. What this means is that
- any overhead added by the ORM is massively exaggerated compared to production scenarios
- we cannot observe how the system scales
The interesting part of ORM performance comes when you start to investigate caching and especially the flexibility of association fetching strategies. In a nontrivial object model with associations, it is usually association fetching that limits performance. ORM tools must provide flexible ways of choosing between
- lazy fetching plus process-level caching and
- eager fetching using outer joins
We need to avoid the complementary evils of "n+1 selects" and "fetching too much". The n+1 selects problem results in unacceptable increased latency due to many small queries to the database. On the other hand, fetching too much data reduces concurrency - especially in stricter transaction isolation modes - and requires the serialization of large result sets between processes. These are the things that really kill you once you have not only Accounts, but also Customers and Orders, Payments, etc.
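To make the trade-off concrete, here is a sketch of the SQL each strategy tends to generate. The Customer and Orders tables are my own hypothetical example (STORM, remember, has only Accounts):

```sql
-- "n+1 selects": one query for the customers, then one extra
-- round trip per customer to fetch its orders
SELECT id, name FROM Customer;
SELECT id, total FROM Orders WHERE customer_id = 1;
SELECT id, total FROM Orders WHERE customer_id = 2;
-- ... and so on, once per customer

-- eager fetching: a single outer join retrieves customers and
-- orders in one round trip, at the cost of a wider result set
-- to serialize back to the application
SELECT c.id, c.name, o.id, o.total
FROM Customer c
LEFT OUTER JOIN Orders o ON o.customer_id = c.id;
```

Neither form is right in general - which is exactly why the choice needs to be configurable per association.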
Unfortunately the STORM benchmark has only Accounts.
Any numbers you get out of STORM and other similar benchmarks are meaningless, including the ones I will show below. They are not reproducible across environments. They are easily skewed by all kinds of strange behavior in the database cache and/or the HotSpot JVM. Please, ignore them!
So, anyway, I downloaded JDX.
I spent some time reading the documentation, to get a feel for exactly what JDX was. On first impression, it seems to be nothing special. Actually, I'm not sure how you could charge money for this product. It is certainly a strict functional subset of open source solutions like Hibernate or OJB. In fact, as I verified later, it looks like JDX is not even what Christian Bauer describes as "full object mapping" in our book. It does not implement automatic dirty checking - meaning that it is up to the user to mark objects as modified, by manually calling update(). The query facility is primitive. The APIs have method names with underscores (C++, anyone?). It has a non-XML mapping document format. It does not appear to implement transactional write-behind. It doesn't seem to support query pagination. The manual does not seem to mention anything about caching. Nor does the term "outer join" appear anywhere. Hmmmm. Am I sounding a bit too negative here? Well, JDX costs money - $1400 per seat - so I think it's reasonable to expect a feature set that rivals the free solutions.
Anyway, the benchmark is what we're here for. So, after some fiddling, I got both the JDX and Hibernate versions of STORM running on the built-in HSQLDB. The JDX version completed in 12 seconds, the Hibernate version in about two minutes. So JDX is approximately 10 times faster....
I actually laughed out loud before going straight to the benchmark code. Well, the "benchmark" is actually two Java classes. The Account class, and another class with a lonely main() method.
I fixed the basic errors in the Hibernate implementation of the benchmark. The session handling was broken, of course. (Note to future benchmarkers: the Hibernate Session is usually a transaction-scoped object!) Also the programmer didn't understand the rules about flushing sessions, and was flushing the session multiple times. (Note to future benchmarkers: Transaction.commit() flushes the session.)
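For reference, the corrected idiom looks something like this - a sketch only, against the Hibernate API of the day; sessionFactory, objectsPerTransaction and the Account property are placeholders from my setup, not STORM's actual code:

```java
// One Session per transaction. Transaction.commit() flushes the
// Session exactly once, so no explicit flush() calls are needed.
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
try {
    for (int i = 0; i < objectsPerTransaction; i++) {
        Account account = new Account();
        account.setBalance(100);
        session.save(account);
    }
    tx.commit(); // the single flush happens here
} finally {
    session.close();
}
```

Open a Session, do a unit of work, commit, close. Flushing by hand on top of that just does the dirty-checking work twice.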
I ran the benchmark again. This time, Hibernate takes 21 seconds on HSQL. Still twice as slow. Now, HSQLDB is an in-memory Java database that persists data to a flat file. Another way of saying this is: it is not actually a database. I've seen plenty of misleading results from HSQL before, so I quickly tossed it in favor of Oracle. (Note to future benchmarkers: ignore results obtained with HSQL, real databases benefit from very different performance optimizations.)
Reconfiguring the benchmark for Oracle was not especially difficult. This time, JDX STORM completed in 146 seconds, and Hibernate in 128 seconds. Okay, I thought, so we are slightly faster on Oracle. I'd certainly prefer to be faster on Oracle and slower on Hypersonic! I am still not sure of the cause of the performance difference - but I discovered later (by running OptimizeIt) that JDX does not use prepared statements for the queries, and I think that might account for the difference.
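If that theory is right, the difference is the classic JDBC one between re-parsing SQL text on every execution and binding variables to a statement that is parsed once. A sketch, where conn and id are assumed to be an open java.sql.Connection and a key value:

```java
// Without a PreparedStatement, the database must parse the SQL
// text afresh for every query:
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(
        "select balance from Account where id = " + id);

// With a PreparedStatement, the statement is parsed once and the
// execution plan can be reused; only the bind variable changes:
PreparedStatement ps = conn.prepareStatement(
        "select balance from Account where id = ?");
ps.setLong(1, id);
ResultSet rs2 = ps.executeQuery();
```

Oracle in particular penalizes the first form, since every distinct SQL string has to go through the parser.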
Please consider a developer evaluating JDX, who does not have time to waste a whole day fiddling with benchmark examples. They would probably stop after reaching the conclusion that JDX is ten times faster than Hibernate. Now, I'm not accusing Software Tree of intentionally crippling the Hibernate benchmark - but certainly they did not do due diligence. It would not have been hard to drop an email to the Hibernate mailing list, describing the results that they were seeing, and asking for help to optimize the benchmark. They did not do this. Why? I guess because they got the results they were looking for, and stopped looking.
Now, STORM comes with two configuration parameters. One is the number of objects to be inserted/queried per transaction - the other is the number of iterations. The total number of objects in the benchmark is the product of these two parameters. By default, objects per transaction is 2, and iterations is 25000. I moved a zero. With 20 objects per transaction and 2500 iterations, Hibernate completed in 18 seconds, and JDX in 53 seconds. Since this seemed just as silly as the first result, I went looking for the bug in the benchmark. I've not been able to find any bug yet, but that doesn't mean it's not there. (I'm not sufficiently familiar with JDX.) A strange result, all the same.
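For what it's worth, the two configurations describe exactly the same total workload - only the number of commits changes, which is presumably what the two products react to so differently. A trivial sanity check:

```java
public class StormParams {
    public static void main(String[] args) {
        // { objects per transaction, iterations }
        int[][] configs = { { 2, 25000 },   // STORM's defaults
                            { 20, 2500 } }; // after "moving a zero"
        for (int[] c : configs) {
            int objectsPerTx = c[0];
            int iterations = c[1];
            // total objects is the product; commits equal iterations
            System.out.println(objectsPerTx + " objects/tx x " + iterations
                    + " transactions = " + (objectsPerTx * iterations)
                    + " objects, " + iterations + " commits");
        }
    }
}
```

Same 50000 objects either way; the second run simply commits one tenth as often.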
So now I was feeling pretty pleased with myself. I tried MySQL. Uh oh! With the default settings, JDX beat Hibernate again, 38 seconds to 61. Damn. I changed the parameters again. Now JDX and Hibernate were neck and neck. I played with these parameters some more and verified that, indeed, Hibernate throughput benefited from larger numbers of objects per transaction. With many objects per transaction, Hibernate wins handily over JDX (by an order of magnitude, even). For very small numbers of objects per transaction, JDX had an edge (approximately 30%). I have no explanation for this result. Nor does OptimizeIt help - it shows no difference in the profile between the two cases. I even changed the benchmark timings to exclude the overhead of starting and stopping transactions and connections, including only the actual statement execution. It made no difference. All this merely demonstrates my point that in these toy benchmarks, the database and HotSpot do funny things. Pay no attention. Different "funny things" happen in production.
We are going to see a lot more benchmarks in the future. As the Java open source community grows, we threaten existing businesses, and the jobs of the people employed by those businesses. This is just fine - open source opens up new opportunities and creates new jobs. But Software Tree have a problem. There is no way that Software Tree can realistically compete with Hibernate. As a small company, they simply don't have the resources to catch up with Hibernate in terms of features. And they cannot possibly compete on price. That leaves FUD. Intended or not, this benchmark was misleading, and Software Tree published it even though they should have had reasonable doubts about the accuracy of its results. But what other choice did they have?