If you have 100,000 clients accessing your app in 1 minute, that means roughly 2,000 concurrent connections (assuming each request takes about a second). Are they web or fat Java clients?
Personally, I wouldn't be very afraid about JBoss but more about the DB load (if any) and the OS tuning (to handle the connections). As for clustering, I wouldn't use HTTP session clustering; I'd simply put a load-balancer in front of a set of JBoss instances.
> if you have 100,000 clients accessing your app in 1
> minute, it means that you will have about 2,000
> concurrent connections. Are they web or fat java
Java (MIDlet) clients. They open an HttpConnection and fetch a tiny XML file from the server - if there is new data at all... usually, most requests will come back with "null results".
One option we are considering is writing the data to the filesystem and serving it from an optimized web server. Possibly, the logic for getting "non-standard" data could even be implemented in the client (first try the optimized file-serving HTTP server; if that fails, fall back to the app server).
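On a desktop JVM, that client-side fallback could look roughly like the sketch below (the real client would use MIDP's HttpConnection; the class name, URLs and timeouts are placeholders of my own):

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FallbackFetcher {
    // Try the fast static-file server first; if it can't answer
    // (connection error or non-200 status), fall back to the app server.
    static byte[] fetch(String staticUrl, String appUrl) throws Exception {
        try {
            byte[] data = get(staticUrl);
            if (data != null) return data;
        } catch (Exception ignored) {
            // static server unreachable - fall through to the app server
        }
        return get(appUrl);
    }

    private static byte[] get(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(2000);   // don't hang forever on a dead server
        conn.setReadTimeout(2000);
        if (conn.getResponseCode() != 200) return null;
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
            return out.toByteArray();
        } finally {
            conn.disconnect();
        }
    }
}
```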
> Personaly, I wouldn't be very afraid about JBoss but
> more about the DB load (if any) and the OS tuning (to
> handle connections).
As the responses only change every few minutes for the same request, the DB load should be minimal. I think handling the connections will be quite a problem, though. I've done a little benchmark of Apache vs. Tomcat: Apache serving a tiny file, Tomcat running a minimal servlet (write the string "TEST", kept as a static final byte array, to the response stream - done).
Apache made up to 6,000 connections/minute (possibly the limit of my DSL connection); Tomcat only got up to 3,000 (starting at 600). The load generator was a little Java app that simply started 100 threads, each opening a connection, reading the content length and closing the connection again, for one minute.
One problem may be that Tomcat opens a new thread for every connection - after the test, the whole server took 5 minutes to recover, and even SSH was terribly slow... (it only has 128MB of RAM, though, which may be the explanation - still, 3,000 connections/minute is FAR less than what we *may* have to deal with). An implementation with non-blocking IO might solve this, however...
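To illustrate the non-blocking idea (a minimal sketch, not how Tomcat or Jetty actually implement it): with java.nio, a single thread can multiplex all connections through a Selector instead of spawning one thread per client. The class name, port and fixed response are my own choices:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class NioTinyServer {
    // Serves a fixed 4-byte body to every connection from a single thread,
    // so no per-connection thread is ever created.
    static void serve(int port) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer response = StandardCharsets.US_ASCII.encode(
                "HTTP/1.0 200 OK\r\nContent-Length: 4\r\n\r\nTEST");
        while (true) {
            selector.select();
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client == null) continue;
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    client.read(ByteBuffer.allocate(1024)); // drain the request
                    key.interestOps(SelectionKey.OP_WRITE);
                } else if (key.isWritable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    client.write(response.duplicate());     // fresh position each time
                    key.cancel();
                    client.close();
                }
            }
            selector.selectedKeys().clear();
        }
    }

    public static void main(String[] args) throws IOException {
        serve(8080);
    }
}
```

A real server would also have to handle partial reads/writes and client errors, but even this sketch shows why the per-connection memory cost is so much lower than one thread per client.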
As these are TCP connections, does that mean the connection must be kept open for as long as a (slow) client fetches data? From my understanding, the major problem will be concurrently open connections. We can easily optimize response time at the server, but if the data goes over a slow network, that won't help much. The amount of data to transfer is quite small, but it may still take almost a second per request due to the slow GPRS networks...
> As for clustering, I wouldn't
> use http session clustering and simply put a
> load-balancer in front of a set of JBoss instances.
Session clustering would definitely be overkill...
Could you please try with Jetty (aka JBossWeb) as well? It is part of the default JBoss install. Furthermore, if the goal is to serve static files, configure Jetty to use a ResourceHandler instead of a servlet handler; it will be faster.
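For reference, a rough sketch of what such a static-file setup might look like in Jetty 4's XML configuration language. The port, resource path and exact element layout are my assumptions - check the Jetty docs for the version you actually run:

```xml
<!-- Hypothetical Jetty 4 config: one listener, one context
     serving files straight from disk via a ResourceHandler. -->
<Configure class="org.mortbay.http.HttpServer">
  <Call name="addListener">
    <Arg>
      <New class="org.mortbay.http.SocketListener">
        <Set name="Port">8080</Set>
      </New>
    </Arg>
  </Call>
  <Call name="addContext">
    <Arg>/</Arg>
    <Set name="ResourceBase">/var/www/static/</Set>
    <Call name="addHandler">
      <Arg>
        <New class="org.mortbay.http.handler.ResourceHandler"/>
      </Arg>
    </Call>
  </Call>
</Configure>
```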
Join the "Benchmarking" thread.
Jetty 4.2 will be in the next JBoss-3.2 release.
If it can handle 2000 req/sec (servlet), I make that 120,000 a minute....
If connections are the bottleneck, then you may be looking at a cluster - I don't know how Apache would compare here, but a friendly Apache developer reads this list too.
> Join the "Benchmarking" thread.
Thanks for the pointer! That looks very interesting!
> Jetty 4.2 will be in the next JBoss-3.2 release.
When can that release be expected?
> If it can handle 2000 req/sec (servlet), I make that
> 120,000 a minute....
That sounds amazing...
> If Connections are the bottleneck, then you may be
> looking at a cluster - I don't know how Apache would
> compare here, but a friendly Apache developer reads
> this list too.
Apache is out of the race - if we serve static pages / static content, we will use thttpd, as it is probably the fastest non-commercial solution currently available (it uses non-blocking IO). We'll avoid any "Apache+Tomcat+JBoss" kind of setup. If we need both static and dynamic content, we'll use separate machines right away...
AFAICS, the only advantage of Apache+Tomcat would be that splitting static and dynamic content is handled smoothly - correct me if I'm wrong. But since we control the client and expect very heavy loads, I think it will be better to go directly with Jetty or Tomcat and let them do only what they do best: dynamic content.
For the static stuff (probably only media, e.g. images) we may be able to get by with one "light" machine, as requests for these should be comparatively rare... or maybe we'll keep it all in RAM and have a servlet handle it ;-)
If you pull down Jetty 4.2 from SourceForge (give it a few hours; there is a fix going in), you will find that it ships two org.mortbay.jetty.jar files - one for JVM 1.3 and one (the default) for 1.4.
The 1.4 jar should contain a non-blocking IO Listener.
JBoss builds with the 1.3 source as it has to run on 1.3.
Substituting the jar in .../deploy/jbossweb.sar for the one in the Jetty distro should let you play with this - you will also have to look at the configurationElement in .../deploy/jbossweb.sar/META-INF/jboss-service.xml and master the art of Jetty's configuration language - a thin XML veneer over its Java classes (see jetty.mortbay.org for docs).
> The 1.4 jar should contain a non-blocking IO
This is very cool! I've played around with the most current version of Jetty (it was 4.2.0rc1... we're at 4.2.1 now ;-) ), and in a setup with two 1GHz machines with 512MB of RAM and a direct 100MBit connection (via crossover cable), I made up to 25,000 connections per minute.
My "LoadTester" opens 100 threads; each tries to open a connection, perform one stream.read(), then close the stream and start over. Very cool!
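A minimal sketch of such a load generator in plain Java (the class name, default URL and timings are my placeholders, not the actual tool):

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadTester {
    // Counts successful request/response cycles against `url`,
    // using `threads` workers hammering it for `millis` milliseconds.
    static int run(String url, int threads, long millis) throws InterruptedException {
        AtomicInteger count = new AtomicInteger();
        long deadline = System.currentTimeMillis() + millis;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                while (System.currentTimeMillis() < deadline) {
                    try {
                        HttpURLConnection conn =
                            (HttpURLConnection) new URL(url).openConnection();
                        try (InputStream in = conn.getInputStream()) {
                            in.read();          // one read, as in the original test
                        }
                        conn.disconnect();      // close the connection again
                        count.incrementAndGet();
                    } catch (Exception e) {
                        // a refused or broken connection simply doesn't count
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        return count.get();
    }

    public static void main(String[] args) throws Exception {
        // hypothetical target; point this at the server under test
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        System.out.println(run(url, 100, 60_000) + " connections in one minute");
    }
}
```

Note that such a client-side counter also measures the client's own thread and network overhead, which may explain why the numbers flatten out below the server's real limit.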
> JBoss builds with the 1.3 source as it has to run
> on 1.3.
> Substituting the jar in .../deploy/jbossweb.sar for
Thanks, I guess that'll help me play around with the whole setup. From my first impression, Jetty looks very, very cool!