Did you check the gc.log for any full GCs? Often this is an issue with the old generation.
You should also check the DS pool; you can use the management console or the CLI for this. There are statistics showing how many connections are in use, the maximum used, and how many are active right now.
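For the CLI route, the pool statistics can be read directly from the running server (a sketch; the datasource name "MyDS" is a placeholder, and the default management port is assumed, adjust both to your setup):

```shell
# Connect to the running AS7 instance with the management CLI
$JBOSS_HOME/bin/jboss-cli.sh --connect
# Then read the runtime pool statistics; "MyDS" is a placeholder datasource name
/subsystem=datasources/data-source=MyDS/statistics=pool:read-resource(include-runtime=true)
```

The result should include attributes such as ActiveCount, AvailableCount and MaxUsedCount, which correspond to the numbers shown in the console.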
As you mentioned, you use 7.1.1: this is quite old, and you should consider upgrading to WildFly 8+. If you need to stay with AS7, you might buy a subscription for EAP6, which is based on AS7.
The current stats of the DS pool, observed through the console, are: available count: 99, active count: 21, max used: 71.
As I mentioned earlier, the database connection pool size is min: 20, max: 120.
Is there anything I have to change in the GC policy as well?
A few lines of gc.log:
2015-10-15T14:56:23.694+0100: 1.427: [GC 1.427: [ParNew: 17024K->1667K(19136K), 0.0092082 secs] 17024K->1667K(2095040K), 0.0094255 secs] [Times: user=0.09 sys=0.00, real=0.02 secs]
2015-10-15T14:56:24.413+0100: 2.141: [GC 2.141: [ParNew: 18680K->2112K(19136K), 0.0637609 secs] 18680K->3780K(2095040K), 0.0639565 secs] [Times: user=0.22 sys=0.01, real=0.06 secs]
2015-10-15T14:56:24.757+0100: 2.484: [GC 2.484: [ParNew: 19118K->2112K(19136K), 0.0132346 secs] 20787K->5673K(2095040K), 0.0134277 secs] [Times: user=0.09 sys=0.00, real=0.02 secs]
2015-10-15T14:56:25.272+0100: 3.005: [GC 3.005: [ParNew: 19136K->2112K(19136K), 0.0183817 secs] 22697K->7117K(2095040K), 0.0185814 secs] [Times: user=0.08 sys=0.00, real=0.02 secs]
2015-10-15T14:56:25.288+0100: 3.024: [GC [1 CMS-initial-mark: 5005K(2075904K)] 7166K(2095040K), 0.0057882 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
2015-10-15T14:56:25.304+0100: 3.030: [CMS-concurrent-mark-start]
2015-10-15T14:56:26.366+0100: 4.090: [GC 4.090: [ParNew: 19136K->2112K(19136K), 0.0327648 secs] 24141K->8747K(2095040K), 0.0329950 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2015-10-15T14:56:26.694+0100: 4.430: [CMS-concurrent-mark: 0.104/1.400 secs] [Times: user=5.13 sys=0.73, real=1.39 secs]
2015-10-15T14:56:26.694+0100: 4.430: [CMS-concurrent-preclean-start]
2015-10-15T14:56:26.897+0100: 4.635: [CMS-concurrent-preclean: 0.114/0.205 secs] [Times: user=0.58 sys=0.17, real=0.20 secs]
2015-10-15T14:56:26.897+0100: 4.636: [CMS-concurrent-abortable-preclean-start]
2015-10-15T14:56:27.413+0100: 5.140: [GC 5.140: [ParNew: 19110K->2112K(19136K), 0.0288280 secs] 25745K->11623K(2095040K), 0.0290367 secs] [Times: user=0.09 sys=0.00, real=0.03 secs]
2015-10-15T14:56:27.991+0100: 5.721: [CMS-concurrent-abortable-preclean: 0.619/1.086 secs] [Times: user=3.28 sys=0.81, real=1.09 secs]
Do the DS pool connections affect performance?
As long as there are no full GC messages, GC should not be a problem.
You should check whether only the DB access is slow, or also the server processing.
And keep all of that monitoring running over the days until it goes slow; that might give you an idea of what is going wrong.
I have tried to check for full GCs through the JVisualVM console; it was normal, and when I performed a GC manually it spiked to 28% max.
Is there any other way to check for full GCs?
I have set a GC trigger interval of 30 minutes; however, when I analyze the GC logs I can see GC output every minute. Is this normal?
Does the version of the JDK affect the performance of JBoss?
Triggering GC manually (also from your code with System.gc()) is not a good idea, as you might interrupt the JVM's work and it will take a full stop to do the GC.
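If the application (or a library) calls System.gc(), two HotSpot options can take the sting out of it (a sketch; whether you want either of them depends on your setup, so test first):

```shell
# Ignore explicit System.gc() calls entirely
-XX:+DisableExplicitGC
# Or: turn explicit GC requests into a concurrent CMS cycle instead of a stop-the-world full GC
-XX:+ExplicitGCInvokesConcurrent
```

The two are alternatives; the second only makes sense together with the CMS collector.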
GC can be monitored by adding the flag "-verbose:gc" to the JVM start; look in the Java documentation for more options.
In normal behaviour it can happen that you have GC activity every second or minute, but this is not a problem as long as you only see "New generation" GCs and no "Full" GCs, which stop the application from working.
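As a quick way to tell the two apart in a log file, you can grep for the "Full GC" marker (a minimal sketch using two sample lines in the style of the log above):

```shell
# Write two sample lines in the format of the gc.log above (for illustration only)
cat > gc.log <<'EOF'
2015-10-15T14:56:23.694+0100: 1.427: [GC 1.427: [ParNew: 17024K->1667K(19136K), 0.0092082 secs]
2015-11-02T15:35:53.687+0000: 676.514: [Full GC 676.514: [CMS (concurrent mode failure)]
EOF
# New-generation collections contain "ParNew"; stop-the-world ones contain "Full GC"
grep -c "Full GC" gc.log   # prints 1 for this sample
grep "Full GC" gc.log      # shows the matching lines with their timestamps
```

Against a real gc.log you would of course skip the `cat` step and point grep at the actual file.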
Yes, if you use different versions/vendors of a JVM, or misconfigure it, it will affect JBoss!
Thanks for your valuable information. I will take care not to trigger GC manually. We already have the '-verbose:gc' property in the JVM options.
>>only "New generation" GC's and no "Full" which stops the application from working.
--How can I check the above behaviour? How do I check whether a full GC is happening?
Meanwhile, as our customers were complaining about slowness, we have captured thread dumps; below is the summary I can see:
Overall thread count: 280
Overall monitor count: 214
Threads waiting for a monitor: 0
Threads locking a monitor: 24
Threads sleeping on a monitor: 252
Deadlocks: 0
Monitors without locking threads: 33
90% of all threads are sleeping on a monitor.
What do sleeping threads / threads locking a monitor mean? Are those threads affecting performance?
Please help me analyse the above dump.
Simply look for GCs in the old generation (Tenured).
You can also check these docs: http://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/index.html
Sleeping threads are normal, as the server uses thread pools in which idle threads are not active.
Locking might cause performance issues, but it can be intentional if you use synchronized blocks.
So if there are issues, you need to dig deeper into that.
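To dig deeper, a common approach is to take a few thread dumps some seconds apart and compare which threads stay BLOCKED on the same monitor across dumps (a sketch; `<pid>` stands for the process id of the JBoss java process):

```shell
# Take two dumps a few seconds apart; threads that are BLOCKED on the same lock
# in both dumps are candidates for a real contention problem
jstack -l <pid> > dump1.txt
sleep 5
jstack -l <pid> > dump2.txt
```

A thread that merely sleeps in a pool will look different from one that is repeatedly blocked waiting to enter a synchronized section.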
I have observed a full GC in the gc.log, as below:
2015-11-02T15:35:53.483+0000: 676.310: [GC 676.310: [ParNew: 19136K->2112K(19136K), 0.0395332 secs] 1524558K->1509122K(2095040K), 0.0398816 secs] [Times: user=0.17 sys=0.00, real=0.05 secs]
2015-11-02T15:35:53.687+0000: 676.514: [Full GC 676.514: [CMS2015-11-02T15:35:54.561+0000: 677.377: [CMS-concurrent-mark: 3.997/20.218 secs] [Times: user=29.50 sys=3.66, real=20.22 secs]
(concurrent mode failure): 1507010K->1163354K(2075904K), 12.0530752 secs] 1513316K->1163354K(2095040K), [CMS Perm : 384871K->384686K(386908K)], 12.0627317 secs] [Times: user=12.83 sys=0.00, real=12.06 secs]
2015-11-02T15:36:07.905+0000: 690.723: [GC 690.723: [ParNew: 102272K->9748K(115008K), 0.0310215 secs] 1265626K->1173103K(2190912K), 0.0313788 secs] [Times: user=0.08 sys=0.00, real=0.03 secs]
2015-11-02T15:36:10.108+0000: 692.930: [GC 692.930: [ParNew: 112020K->9937K(115008K), 0.0530277 secs] 1275375K->1182159K(2190912K), 0.0533573 secs]
Do I need to configure/change anything to avoid this?
>>Locking might cause performance issues. But it can be intentional if you use synchronized blocks.
>>So if there are issues you need to dig into that deeper
--Is this something related to the application level, or is it something we can do at the middleware level?
Yes, a 20-second full stop is bad for the application.
The concurrent mode failed.
So you can set the following options:
-XX:CMSInitiatingOccupancyFraction=80 // a lower value starts the CMS collector earlier, which might be better
-XX:CMSIncrementalSafetyFactor=20 // a higher value may start it sooner, which may also help
For more information, read the GC tuning pages for your JVM version (link for Java 8 in my earlier post).
But in any case you need to test and monitor it. I recommend not changing multiple options at a time, as each can have positive AND negative effects.
Thanks for the update and suggestions!!
Do I need to configure both of the above-mentioned properties, or just one of them?
Unfortunately, that depends.
You need to measure what is the best option for your use-case.
Consider that this might change if the application or load changes; so do a lot of tests, and note that tuning the GC for the test environment might not be sufficient for the real environment.
You can do some tests to see how your application behaves, then change the properties in the real environment accordingly, step by step, and check the results.
I was buried in some other issues and forgot that the above suggested parameters are already in place.
JVM parameters: -server -XX:+DoEscapeAnalysis -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:+ExplicitGCInvokesConcurrent -XX:CMSInitiatingOccupancyFraction=80 -XX:CMSIncrementalSafetyFactor=20 -XX:+UseCMSInitiatingOccupancyOnly -XX:MaxTenuringThreshold=32 -Xms2048M -Xmx4096M -XX:MaxPermSize=2048M -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -Xloggc:D:\Logs\Online\gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
Any more suggestions/advice?
You need to check which generation runs full.
Maybe you need to adjust the amount of memory for Old/Perm generation.
Also it can be an issue that the survivor space is too small (the currently live objects cannot be copied to the survivor space because it does not fit them); in that case they are promoted to the old generation and must be collected later, which is expensive.
So setting CMSInitiatingOccupancyFraction to a lower percentage might help to start the CMS earlier.
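If promotion caused by survivor overflow turns out to be the problem, the usual sizing knobs look like this (a sketch; the values are illustrative starting points and not recommendations, measure before and after each change):

```shell
-Xmn512m                        # fix the young-generation size
-XX:SurvivorRatio=6             # eden:survivor ratio; a lower ratio means larger survivor spaces
-XX:+PrintTenuringDistribution  # log object ages per GC to verify the survivors are big enough
```

The tenuring-distribution output shows at which age objects get promoted, which tells you whether the survivor spaces are overflowing.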
But you need to measure it.
If the CMS collector does not help, you might try G1GC, but here I recommend using the latest Java 8 update.
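A minimal G1 configuration, assuming a recent Java 8 update, would replace the CMS/ParNew flags with something like (a sketch; the pause goal is an illustrative value):

```shell
-XX:+UseG1GC                 # replaces -XX:+UseConcMarkSweepGC and -XX:+UseParNewGC
-XX:MaxGCPauseMillis=200     # pause-time goal the collector tries (but is not guaranteed) to meet
```

As with the CMS options, change one thing at a time and compare the resulting gc.log against the current baseline.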