9 Replies Latest reply on Aug 11, 2008 5:55 AM by adinn

    WS-AT -- Poor Performance

      Hello,

      I've been load-testing the demo application (WS-AT, WS-BA) and the test results seemed worse than expected. Applying a load of 400 threads per minute (ramp up time = 60 seconds) is enough to get the transaction manager into an inconsistent state (aborted transactions, time outs, heuristic attemtps...etc.), after which the entire JBoss server becomes unstable and stops responding.

      A load of 400 hits/min corresponds to about 7 transactional requests per second. It's hard to imagine that the server could crash if just 7 users try to send requests at the same time. CPU and memory utilization where nowhere near capacity (CPU was at 20%).

      As a sanity check, I load-tested the JMX console servlet, which handled 5000 hits/min without blinking. The CPU was fully utilized, as it should.

      Is this performance level expected? Could this be improved via optimization or fine tuning? I understand that WS-AT is not intended for high volume, but wasn't sure what to expect.

      Please help!
      George

        • 1. Re: WS-AT -- Poor Performance
          jhalliday

          > the test results seemed worse than expected

          What were your expectations based on?

          > a load of 400 threads per minute

          huh? Threads per minute is a rather odd metric. Do you mean transactions per minute?

          > transaction manager into an inconsistent state (aborted transactions, time outs, heuristic attemtps...etc.)

          Only the last of those, assuming you mean heuristic outcomes, may indicate inconsistency. The others may be undesirable but are not inconsistent. Most likely the load on the server is causing communication timeouts.

          > A load of 400 hits/min corresponds to about 7 transactional requests per second.

          Sorry, nope. There is a big difference between a steady 7 tx/sec and and 400 tx/sec for one second followed by 59 seconds with no new tx. Also what is a 'hit' in this context? The term is well defined for web servers - a single http request e.g. GET/POST/etc - but not for web services. Do you mean new transactions, each of which may contain multiple 'hits' i.e. business logic calls, not to mention the hidden 'hits' for transaction control messages.

          > CPU and memory utilization where nowhere near capacity (CPU was at 20%).

          Again, what was your expectation and why? A transactional web service with durable participants will never saturate CPU as it has to log to disk for recovery purposes. And what do you mean by memory utilization? Memory bandwidth? Memory consumption?

          > Is this performance level expected?

          You have said nothing about your hardware, nor accurately described the software configuration used to run the load test. Not even which versions you are using. So I can't really comment. Not that I could anyhow - we don't currently do performance benchmarking or tuning for XTS. We are focused on getting the 1.1 spec implementation feature complete first.

          > Could this be improved via optimization or fine tuning?

          Of course not, that's unthinkable! It's already perfect! :-)

          > I understand that WS-AT is not intended for high volume,

          Oh really? How did you come by that understanding?

          > but wasn't sure what to expect.

          "test results seemed worse than expected" ? Ummm....

          Over the next few months the focus of the work on XTS will shift from feature development to development of a comprehensive test framework. First up is interop and crash recovery testing, but the time will come when we focus on performance. Meanwhile if you describe in detail the test scenario you are using we can at least start to identify potential issues.

          • 2. Re: WS-AT -- Poor Performance

            Hi,

            I ran a large number of tests on various hardware and OS (Fedora, MAC OS X), with RAM ranging from 2-4 GB, and dual to quad CPUs. The RAM and CPUs were never fully utilized. The tests were driven from JMeter. I'm using JBoss 4.2.2 (also tried 4.2.3) with JBossTS 4.3, on Java 1.5 and 1.6. I also tried JBoss with native WS engine as well as Metro.

            I'm working on a project where we plan to use WS-AT on a large scale, and the purpose of the testing is to find out whether this is possible. So far, it seems that JBoss would handle only a fraction of the load it is normally capable of, if that fraction uses WS-AT. I'd like to find out if someone else performed load testing with better results, or whether there's a case study with JBoss running transactional web services in production environment.

            Some of the tests were run with a ramp-up period set to one minute, that is, JMeter will send (x) requests uniformly over the entire ramp-up period of 60 seconds. The server starts misbehaving at about 400 requests over 60 seconds, or about 7 requests/sec. This type of test guarantees that an (x) number of requests will be generated over the ramp-up period, even if earlier requests have not yet received responses.

            The test application that I used is the JBoss XTS demo (restaurant, theatre, and taxi services). I disabled the UI updates (which are built in) to let the app focus on doing transactions. A test request, in this case, is simply to the following URL: <http://localhost:8080/xts-demo/basicclient>, along with the relevant arguments to each of the services. The demo app seemed like a reasonable test, as it is far too bare-boned to be inefficient.

            Another set of tests specified an (x) number of requests per second, repeated (y) number of times. I tried values of 5-30 requests/sec repeated 60 times. The server started to crash at about 20/sec. But the throughput degraded awfully much earlier (at under 10/sec), reaching something like 1 response/sec. In these tests it appeared that JMeter doesn't necessarily start each cycle of requests immediately after the previous set of requests were sent -- i.e, there is a delay between cycles, which may have contributed for the low throughput, and to the fact that the server didn't crash until a higher demand was reached. By "crash" I mean that the server completely stops to respond for several minutes.

            In any case, whether the server reaches a crashing point or not, the throughput drops dead at a dozen or two concurrent users, and the transaction outcomes become unreliable. These tests make it look like WS-AT is not yet usable in a production environment.

            Am I doing something glaringly wrong here?

            Thanks,
            George




            • 3. Re: WS-AT -- Poor Performance
              jhalliday

              Hmm, ok, so you have a single JBossAS instance hosting the transaction manager, client and 3 services? Using WS-AT 1.0 I assume.

              Each transaction involves, based on my rather ropey mental arithmetic, around 25 web services calls, not counting the communication with jMeter. A stable load of 20 tx/sec is thus handling a web service call every 2ms, which is not outrageous for an untuned implementation but clearly leaves a lot of room for improvement. Just out of curiosity, what's the average tx duration and CPU utilization when you run one tx at a time?

              I'm surprised the CPU is not going over 20% in this configuration. Since it's not waiting on the network the only other bottleneck possibility on the hardware side is the disk. Persumably you have log4j turned down so that's not getting in the way? Internally it may be bottlenecked on a Thread pool or some over conservative synchronizaiton. Does sticking a profiler on the server during these tests throw up anything interesting?

              • 4. Re: WS-AT -- Poor Performance

                >single JBossAS instance hosting the transaction manager, client and 3 services?

                Yes.

                >Each transaction involves, based on my rather ropey mental arithmetic, around 25 web services calls

                I didn't realize that, but it's interesting information. I though it was more like 6 roundtrips (one for the prepare and one for the commit/rollback). Could you explain how you came up with that number (25)?

                >what's the average tx duration and CPU utilization when you run one tx at a time?

                One tx at a time does OK, with CPU showing some 50% and throughput of 15/sec, or about 66 ms/tx.

                Otherwise, I turned off logging completely, and it made only a slight difference. Haven't used a profiler.

                • 5. Re: WS-AT -- Poor Performance
                  adinn

                   


                  Internally it may be bottlenecked on a Thread pool or some over conservative synchronizaiton.


                  Specifically, the WS-COOR/AT/BA services use a TaskWorker thread pool to perform asynchronous processing of incoming messages/posting of response messages. The thread pool size has a default maximum of 10 worker threads. You will need to increase this to avoid comms latencies delaying processing of bursts of incoming messages.

                  In older releases (ones which do not include JBossTS 4.3.0 or later -- i.e. AS 4.2.X AS 4.3.X EAP 4.2 EAP 4.3) this maximum is configured from the value specified in the web.xml in ws-c.war. The listener property TaskManager.maxWorkerCount limits the number of worker threads that the thread pool will create. You can change this either by

                  i) editing web.xml in ws-c.war in xts-demo.ear

                  ii) editing web.xml in the war in /XTS/xts-install/lib/ws-c.war and rebuilding the demo

                  iii) editing web.xml in /XTS/WS-C/dev/dd/ws-c_web-app.xml and rebuilding XTS then rebulding the demo

                  In later releases where XTS is deployed as a sar (ones which do include JBossTS 4.3.0 or later -- i.e. AS 5.0.0Beta4 AS 5.0.0.CR1) this value is currently hard-wired in XTSService.java in /XTS/sar./src/org/jboss/jbossts/.You need ot edit this fiel and rebuild XTS and the demo. It will eventually be possible to set it using a property in jbossjta.properties n.b. this setting configures both the XTS 1.0 and XTS 1.1 impelementations.


                  • 6. Re: WS-AT -- Poor Performance
                    adinn

                     


                    Each transaction involves, based on my rather ropey mental arithmetic, around 25 web services calls, not counting the communication with jMeter.


                    Ok, so the sequence of messages is this:

                    Web Client:
                    Activation Service::CreateCoordinationContext (RPC style = 2)

                    Web Client:
                    Web service invocation (RPC style = 2 x number of service calls = 6)

                    Web Service:
                    Registration Service::RegisterParticipant (RPC style = 2 x number of service calls = 6)

                    Web Client:
                    CompletionCoordinator Service::Commit (1 way = 1)

                    Coordinator Service:
                    Participant Service::Prepare (1 way x number of participants = 3)

                    Participant Service
                    Coordinator Service::Prepared (1 way x number of participants = 3)

                    Coordinator Service:
                    Participant Service::Commit (1 way x number of participants = 3)

                    Participant Service:
                    Coordinator Service::Committed (1 way x number of participants = 3)

                    CompletionCoordinator:
                    CompletionInitiator Service::Committed (1 way = 1)

                    I make that 28 messages in total. Admittedly 7 of them are RPC style responses sent on the HTTP back channel so there are only 21 HHTP connections established. Also, the figures here are assuming there are no delays in seeing responses. These can cause resends of Prepare, Prepared, Commit and Committed messages.

                    • 7. Re: WS-AT -- Poor Performance
                      adinn

                      Oh, I forgot to add an extra RPC style call from the web client. It needs to register for the completion protocol in order to be able to invoke the CompletionCoordinator::Commit operation.

                      Web Client:
                      Registration Service::RegisterCompletionParticipant (RPC style = 2)

                      So, make that 30 messages and 22 HTTP connections.

                      • 8. Re: WS-AT -- Poor Performance

                        Thanks for the config tips. I set the maxWorkerCount to 500, and re-ran the tests with 99 threads over 1 second. This would have killed the server in the original setup, but now the server doesn't crash and continues to respond. However, the transactions are unsuccessful, it appears that almost every one of them throws exceptions.

                        After the 99 concurrent threads completed, and the dust settled, I ran a single thread, which completed a successful transaction.

                        Below are samples of relevant sections of the log (these are repeated dozens of times in the log). After all the polite queuing and waiting of threads, some "UnknownTransactionException" is thrown, and everything collapses. Although, at the end, a "[STDOUT] done" line is written to the log.

                        Any thoughts?

                        Thanks again for the patient explanations.
                        George

                        ***********************

                        2008-08-08 10:13:07,892 INFO [STDOUT] CLIENT: obtaining userTransaction...
                        2008-08-08 10:13:07,892 INFO [STDOUT] CLIENT: obtaining userTransaction...
                        2008-08-08 10:13:07,892 INFO [STDOUT] CLIENT: obtaining userTransaction...
                        2008-08-08 10:13:07,892 INFO [STDOUT] CLIENT: obtaining userTransaction...
                        2008-08-08 10:13:07,892 INFO [STDOUT] CLIENT: obtaining userTransaction...
                        2008-08-08 10:13:07,893 INFO [STDOUT] CLIENT: starting the transaction...


                        2008-08-08 10:13:08,218 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_3] - queueTask: creating worker
                        2008-08-08 10:13:08,218 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_3] - queueTask: creating worker
                        2008-08-08 10:13:08,219 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_3] - queueTask: creating worker


                        2008-08-08 10:13:08,456 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:08,456 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:08,458 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task

                        2008-08-08 10:13:08,699 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_4] - queueTask: queueing task for execution
                        2008-08-08 10:13:08,700 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_2] - queueTask: notifying waiting workers (10)


                        2008-08-08 10:13:08,702 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:08,702 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:08,702 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_4] - getTask: waiting for task
                        2008-08-08 10:13:08,702 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_4] - getTask: waiting for task


                        2008-08-08 10:13:09,076 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:09,076 INFO [STDOUT] CLIENT: problem:
                        2008-08-08 10:13:09,076 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_2] - queueTask: notifying waiting workers (64)
                        2008-08-08 10:13:09,076 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_4] - queueTask: queueing task for execution
                        2008-08-08 10:13:09,069 INFO [STDOUT] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at java.lang.Thread.run(Thread.java:613)
                        2008-08-08 10:13:09,076 INFO [STDOUT] CLIENT: problem:
                        2008-08-08 10:13:09,076 INFO [STDOUT] com.arjuna.wst.SystemException: com.arjuna.wst.UnknownTransactionException
                        2008-08-08 10:13:09,076 INFO [STDOUT] at com.arjuna.mwlabs.wst.at.remote.UserTransactionImple.begin(UserTransactionImple.java:122)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at com.arjuna.mwlabs.wst.at.remote.UserTransactionImple.begin(UserTransactionImple.java:96)
                        2008-08-08 10:13:09,076 INFO [STDOUT] at com.jboss.jbosstm.xts.demo.client.BasicClient.testAtomicTransaction(BasicClient.java:453)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at com.jboss.jbosstm.xts.demo.client.BasicClient.doGet(BasicClient.java:408)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at javax.servlet.http.HttpServlet.service(HttpServlet.java:690)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:179)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446)
                        2008-08-08 10:13:09,077 INFO [STDOUT] at java.lang.Thread.run(Thread.java:613)
                        2008-08-08 10:13:09,078 INFO [STDOUT] com.arjuna.wst.SystemException: com.arjuna.wst.UnknownTransactionException
                        2008-08-08 10:13:09,078 INFO [STDOUT] at com.arjuna.mwlabs.wst.at.remote.UserTransactionImple.begin(UserTransactionImple.java:122)


                        2008-08-08 10:13:09,991 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.queueTask_4] - queueTask: queueing task for execution
                        2008-08-08 10:13:09,991 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_3] - getTask: returning task
                        2008-08-08 10:13:09,993 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_4] - getTask: waiting for task
                        2008-08-08 10:13:09,993 DEBUG [com.arjuna.wsc.logging.WSCLoggerI18N] [com.arjuna.services.framework.task.TaskManager.getTask_4] - getTask: waiting for task
                        2008-08-08 10:13:09,993 INFO [STDOUT] done.






                        • 9. Re: WS-AT -- Poor Performance
                          adinn

                          Thanks very much for retrying the test and for the feedback. I'm not yet sure what is happening but the error message reveals that something is going wrong under UserTransaction.begin() after the RPC invocation of ActivationCoordinator::createContext() and either before or during the RPC invocation of RegistrationCoordinator::enlistForCompletion().

                          The first of these RPCs obtains a context from the WS-AT Activation Service. This context provides a unique identify for the TX. A context is only returned to the client when i) on the server side of the RPC the TX identified by the context has been registered in the list of active TXs (i.e. where the Registration Service will find it) and ii) on the client side after return from the RPC the context has been stashed away in a ThreadLocal private to the client thread.

                          The second RPC asks the Registration Service to register the client with the Completion Service so that later on the client can ask the Completion Service to perform a commit or rollback operation. The second call can generate an UnknownTransaction exception before attempting the RPC if it does not find a current context stashed in the client's ThreadLocal. Clearly, this ought not to happen. It can also generate an UnknownTransaction exception if the RPC returns a soap fault with an UNKNOWN_TRANSACTION faultcode. This happens if the service fails to locate the TX in the list of active TXs using the context id supplied in the RPC . Once again this ought not to happen.

                          The TX might be removed from the list of active TXs if the transaction timed out. Since you are not providing a timeout in the begin call (I can tell this from the stack trace) then, assuming you are using the default configuration, the server side will only time out theTX after 2 minutes. Your error messages appear to indicate that errors are happening within 2 seconds of the start calls being made so that makes timeouts look unlikely as a cause

                          It would help us identify the problem if you could:

                          i) switch debug trace back on to see if there are any timeout messages from the transaction reaper (unfortunately, the version you are using only prints debug messages when it times out TXs). Look for occurences of the string Reaper in trhe log. Alternatively, if you can run the tests under a debugger you can place a break on line 259 in class com.arjuna.ats.arjuna.coordinator.TransactionReaper. If it passes this line then TXs are being cancelled.

                          ii) run the tests under a debugger and break the places where the Registration Service generates an UnknownTransaction exception. On the client side you should place a break point on line 168 in com.arjuna.mwlabs.wst.at.remote.TransactionManagerImple and line 118 in in class com.arjuna.wsc.RegistrationCoordinator. On the server side you should place a break point on line 111 in com.arjuna.wsc.messaging.RegistrationCoordinatorProcessorImpl and lines 156 and 164 of com.arjuna.mwlabs.wst.at.RegistrarImple. The first breakpoint indicates the context info is missing before the RPC. The second indicates that the server sent a response indicating the TX was nto found. The remaining break points identify different cases on the server side where the RPC failed to find the TX. Let me know if any of these break points are hit.

                          Also, if you are able to provide me with a copy of your test code I will be happy to try debugging it myself.