Setup: Our application is hosted on WildFly 10 on a CentOS Linux box with 16 cores and 24 GB of RAM. Apache HTTP Server 2.4.7 with OpenSSL proxies requests to WildFly over the AJP port.
AJP max connection : 300
HTTP max connection : 300
io-threads : 200
task-max-threads : 500
Issue: Roughly every 3 hours the application suddenly stops responding or becomes very slow. We cannot connect to the site, not even to the index page,
and even the management console does not respond. We checked memory and CPU usage and both look normal, i.e., 60% free. We also restarted the Apache server to flush all connections, but the problem remained.
We ran a load test in the testing environment with the above configuration, generating 500 concurrent requests to a simple servlet that calls Thread.sleep for 20 seconds. While those 500 requests were in progress, we could not get any new request served, either from a browser or from the load test tool. Surprisingly, the management console page still worked for new requests from the browser. This indicates that the management console is not handled by the "default" worker thread pool.
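To make the load test observation easier to reason about, here is a minimal Java sketch of the behaviour we think we are seeing. It is not WildFly code: a fixed-size ExecutorService stands in for the "default" worker pool, and the class name WorkerPoolDemo, the method probeQueueDelayMillis, and the small numbers (4 workers, 500 ms instead of 500 threads and 20 seconds) are assumptions chosen only for illustration. Once every worker is busy in a sleeping task, a trivial new task has to wait in the queue until a worker is released.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class WorkerPoolDemo {

    // Occupies every worker with a sleeping task, then measures how long a
    // trivial "probe" task waits before any worker picks it up (in ms).
    static long probeQueueDelayMillis(int workers, long holdMillis) throws Exception {
        // Stand-in for the server's worker pool (assumption, not WildFly's implementation).
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            CountDownLatch allBusy = new CountDownLatch(workers);
            for (int i = 0; i < workers; i++) {
                Runnable slowRequest = () -> {
                    allBusy.countDown();           // this worker is now occupied
                    try {
                        Thread.sleep(holdMillis);  // like the sleeping test servlet
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                };
                pool.submit(slowRequest);
            }
            allBusy.await();                       // every worker is inside a slow request

            long submitted = System.nanoTime();
            Callable<Long> probeTask = () -> System.nanoTime();
            Future<Long> probe = pool.submit(probeTask);
            // The probe only runs after one of the slow requests has finished.
            long started = probe.get(holdMillis * 10 + 1000, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(started - submitted);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        long delay = probeQueueDelayMillis(4, 500);
        System.out.println("probe waited about " + delay + " ms for a free worker");
    }
}
```

With all workers held, the probe's wait is close to the hold time of the slow tasks. If the same mechanism applies to our server, it would also explain why a page served by a separate pool, such as the management console, still responds during the load test.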