As long as one JVM can handle the load, you might use option a).
But you might run into issues where the JVM periodically gets stuck in long GC pauses or does not have enough memory (32-bit Java?). In that case you can split into two instances and build a cluster.
In my experience two instances on the same machine will work if you have enough memory and CPU power.
But you need to verify whether the system can handle the load with the available resources.
You missed an option:
c) Run a second AWS instance with JBoss AS and load-balance between the two.
I would pick "c" if using any kind of virtualization; it is the option that allows for greater scalability (you can add a 3rd or 4th instance using the same technique).
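For concreteness, here is a minimal sketch of what fronting two JBoss AS instances with Apache httpd's mod_proxy_balancer could look like. The hostnames, ports, and balancer name are assumptions for illustration, not values from the question:

```apache
# Load the proxy modules (paths vary by distribution)
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so

# Hypothetical pool of two JBoss AS backends
<Proxy "balancer://jbosscluster">
    BalancerMember "http://10.0.0.1:8080"
    BalancerMember "http://10.0.0.2:8080"
    ProxySet lbmethod=byrequests
</Proxy>

# Forward all traffic to the balancer
ProxyPass        "/" "balancer://jbosscluster/"
ProxyPassReverse "/" "balancer://jbosscluster/"
```

Adding a 3rd or 4th instance is then just another `BalancerMember` line. If sessions are not replicated across the cluster, you would also need sticky sessions (e.g. via the `stickysession` parameter).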
The suggestions provided are valuable.
I am looking to exploit the unused CPU capacity (70 % [maximum affordable CPU utilization] − 40 % [observed maximum utilization during peak load] = 30 % headroom) of the application server to handle even more concurrent users.
Suppose for this purpose I run two JBoss instances (clustered) on the same box with reduced heap sizes and correspondingly adjusted JVM tuning parameters. Would that increase the number of concurrent requests the server can handle, i.e. up to the sum of the 'maxThreads' values of both instances?
Motivation for this line of thought: I have come across a rule of thumb for calculating the 'maxThreads' value in the JBoss config: [200 × number of CPUs], adjusted up or down depending on RAM and other machine specs.
Does this rule apply to each JBoss instance running on the server, or to the server as a whole, irrespective of how many JBoss instances are running on it?
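To make the heuristic above concrete, here is a tiny sketch of the [200 × CPUs] calculation. The class name, constant, and helper method are purely illustrative, not a JBoss API, and (as the answer below argues) the factor itself should be treated as a starting guess, not a guarantee:

```java
public class ThreadEstimate {
    // Rule-of-thumb factor cited in the question: ~200 request threads per CPU.
    // Only a starting point; the real limit must come from load testing.
    private static final int THREADS_PER_CPU = 200;

    static int estimateMaxThreads(int cpuCount) {
        return THREADS_PER_CPU * cpuCount;
    }

    public static void main(String[] args) {
        // Detect the CPUs visible to this JVM and print a starting value.
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("Suggested starting maxThreads: " + estimateMaxThreads(cpus));
    }
}
```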
There is no such "rule of thumb". I suspect that whatever you read was specific to that person's environment. Unless you do your own load testing, you will not be able to determine what kind of load your system can handle. And having two JBoss AS instances running on one server will not double the number of request threads that can be specified (or maybe it can, but it won't increase throughput). You have to remember that the more load you place on the system, the longer it takes to process each individual request. It is a fine balancing act between the amount of load and the response time.