I've been evaluating JBoss 4.0.1RC2 and WebLogic 8.1SP2 to host an online auction application. I'm now far enough along to have a few numbers, and I thought others might be interested in the effects of different tuning operations.
The test server setup is the same for both: a BigIP load balancer set to do simple round-robining of HTTP requests, a bank of three Sun Ultra T1 servers running Apache 2 with mod_jk2 and mod_weblogic (each on its own virtual host), a cluster of two Sun Enterprise 220R servers with 4GB of memory to run the app server software (Sun JVM 1.4.2_06), and a Sun Enterprise 420R running Oracle 8.
The test client for this particular set of numbers is a stress-test tool that opens a bunch of simultaneous connections and requests the "place a bid" JSP on the same item. As soon as each JSP returns a result, it immediately opens a new connection and places another bid. Cookies are not preserved across requests, so each one is a new HTTP session.
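For anyone who wants to reproduce this kind of load, here is a minimal sketch of a client that behaves as described above. It is not the actual tool: the class and method names are made up for illustration, it is written for a current JDK rather than 1.4.2, and it assumes the bid page is reachable with a plain GET. No cookie handler is installed, so every request arrives without a session cookie and the server creates a new HTTP session each time.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.atomic.AtomicInteger;

final class BidStressClient {
    /** One "place a bid" request; returns true on success. (Hypothetical abstraction.) */
    interface BidRequest {
        boolean execute() throws Exception;
    }

    /** Builds a request that fetches the URL on a fresh connection, sending no cookies. */
    static BidRequest httpGet(String url) {
        return () -> {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            try {
                if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                    return false;
                }
                try (InputStream in = conn.getInputStream()) {
                    byte[] buf = new byte[4096];
                    while (in.read(buf) != -1) {
                        // Drain the body so the request fully completes.
                    }
                }
                return true;
            } finally {
                conn.disconnect();
            }
        };
    }

    /** Issues totalBids requests across the given number of worker threads; returns the success count. */
    static int run(BidRequest request, int workers, int totalBids) {
        AtomicInteger claimed = new AtomicInteger(0);
        AtomicInteger succeeded = new AtomicInteger(0);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                // Each worker sends its next bid as soon as the previous one returns.
                while (claimed.getAndIncrement() < totalBids) {
                    try {
                        if (request.execute()) {
                            succeeded.incrementAndGet();
                        }
                    } catch (Exception e) {
                        // A failed request simply does not count as a success.
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return succeeded.get();
    }
}
```

A run of 500 bids would then look something like BidStressClient.run(BidStressClient.httpGet("http://example.invalid/placeBid.jsp"), 50, 500). Both the URL and the worker count of 50 are placeholders, not values from my setup.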
The application is a combination of JDBC, session beans, and CMP entity beans, all local. One quirk is that it has its own internal equivalent of the JBoss Cache -- when a clusterwide variable is set, the machines in the cluster make HTTP requests to each other to keep everything in sync. Placing a bid depends on either one or two such requests depending on whether you've landed on the cluster's master server or one of the slaves. (I.e., each request will cause cross-cluster traffic.) Once a bid is placed and the result is returned to the client, there are some additional background tasks, such as sending e-mail notifications.
But you wanted to see numbers!
With no tuning whatsoever, just the default "all" server configuration, JBoss completes a test run of 500 bids in 4:33. With optimistic locking and instance-per-transaction on all the relevant entity beans, the same test run takes 3:43. With the PreparedStatement cache configured, it's 2:44. With interval-based rather than synchronous HTTP session replication, that drops to 2:35. With a maximum heap size specified in the JVM startup arguments (-Xmx1024m), the run time drops by nearly 50%, to 1:21. That got knocked down to 1:10 when I modified our intra-cluster code to remember session cookies between requests (cutting down on session creation) along with a few other tweaks such as fiddling with thread priority to reduce starvation.
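To make the size of each step easier to compare, here is a small sketch that converts the run times above into seconds and prints the percentage by which each step cut the previous time. The class and method names are just for illustration; the times are the ones quoted in this post.

```java
final class TuningGains {
    /** Converts a "m:ss" run time to seconds. */
    static int toSeconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** Percentage by which the run time fell going from before to after. */
    static double percentReduction(int before, int after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // JBoss run times for 500 bids, in the order the tuning steps were applied.
        String[][] steps = {
            {"default \"all\" configuration", "4:33"},
            {"optimistic locking, instance-per-transaction", "3:43"},
            {"PreparedStatement cache", "2:44"},
            {"interval-based session replication", "2:35"},
            {"-Xmx1024m", "1:21"},
            {"session cookie reuse and other code tweaks", "1:10"},
        };
        for (int i = 1; i < steps.length; i++) {
            int before = toSeconds(steps[i - 1][1]);
            int after = toSeconds(steps[i][1]);
            System.out.printf("%-45s %s -> %s  (%.1f%% less time)%n",
                    steps[i][0], steps[i - 1][1], steps[i][1],
                    percentReduction(before, after));
        }
    }
}
```

Worked by hand, the heap setting takes the run from 155 to 81 seconds, a cut of about 48%, while the three steps before it together take it from 273 to 155 seconds, a cut of about 43%.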
The speed increase from the JVM memory setting really surprised me, especially since I haven't seen it mentioned in any JBoss tuning manuals. Maybe it's just considered too obvious to be worth mentioning; I don't know. But in relative terms it made about as big a difference as all the other tuning steps combined.
Now to WebLogic. It started off at 2:40 with no particular tuning effort. With the code tweaks I made in the course of testing JBoss, that dropped down to 1:40, but I don't know which changes in particular accounted for which parts of that speed increase. With the JVM memory settings recommended by BEA's tuning guide, the run time dropped to 0:57. With the JDBC connection pool set to a fixed size rather than dynamically growing and shrinking, the time dropped to a blazing fast 0:37. (Fiddling with the pool size had no measurable impact on JBoss, though I did try it.)
So as things stand right now, for this particular test suite on this particular application, with all the tuning tweaks I know how to do, WebLogic is a little under twice as fast as JBoss.
One thing I've noticed from watching the tests run is that JBoss seems to be a lot burstier in its responses. It will crank along for 30 or 40 requests at about the same rate as WebLogic, maybe even a tad faster. Then it will pause for a few seconds, respond to another 10 or 15 requests, pause for a second, and so on. It looks like garbage collection, as there is often no activity in my debug logs during the pauses. WebLogic's results come back at a steady pace with no hiccups. The application code is exactly the same in both cases, so the app servers must be doing something very different with their memory management (assuming I'm correct about it being garbage collection).
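One way to test the garbage collection guess would be to start the JVM with -verbose:gc and see whether the collections it logs line up with the stalls. To pin down the stalls on the client side, something like the following sketch could be used. It assumes the test client records the completion time of each response in milliseconds; the class, its fields, and the threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class PauseFinder {
    /** A gap between two consecutive responses that exceeded the threshold. */
    static final class Pause {
        final int afterIndex; // index of the last response before the gap
        final long millis;    // length of the gap in milliseconds

        Pause(int afterIndex, long millis) {
            this.afterIndex = afterIndex;
            this.millis = millis;
        }
    }

    /**
     * Scans response completion times (ascending, in milliseconds) and returns
     * every gap between consecutive responses longer than thresholdMillis.
     */
    static List<Pause> findPauses(long[] completionMillis, long thresholdMillis) {
        List<Pause> pauses = new ArrayList<>();
        for (int i = 1; i < completionMillis.length; i++) {
            long gap = completionMillis[i] - completionMillis[i - 1];
            if (gap > thresholdMillis) {
                pauses.add(new Pause(i - 1, gap));
            }
        }
        return pauses;
    }
}
```

If the gaps found this way match the timestamps of full collections in the -verbose:gc output, that would support the garbage collection theory. If they don't, the pauses are coming from somewhere else.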
I plan to run other benchmarks and try out any other tweaks that I can find. If I get any interesting results I'll be sure to report them here. And of course I welcome any suggestions people have based on my admittedly vague description of my setup.