High CPU usage in cluster
cmosher01 Dec 5, 2014 3:28 PM

We are running into a problem with WildFly 8.1.0.Final, running as a standalone cluster. After several minutes of high production-level load, we notice some nodes bogging down, pegging the CPU. Numerous thread dumps consistently show the following stack trace consuming all the CPU:
"KeyAffinityService Thread Pool -- 1" prio=10 tid=0x00007f4ec413d000 nid=0x20fe runnable [0x00007f4e904c3000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:272)
at sun.security.provider.NativePRNG$RandomIO.readFully(NativePRNG.java:202)
at sun.security.provider.NativePRNG$RandomIO.ensureBufferValid(NativePRNG.java:264)
at sun.security.provider.NativePRNG$RandomIO.implNextBytes(NativePRNG.java:278)
- locked <0x000000059884afd0> (a java.lang.Object)
at sun.security.provider.NativePRNG$RandomIO.access$200(NativePRNG.java:125)
at sun.security.provider.NativePRNG.engineNextBytes(NativePRNG.java:114)
at java.security.SecureRandom.nextBytes(SecureRandom.java:455)
- locked <0x000000057b251e40> (a java.security.SecureRandom)
at io.undertow.server.session.SecureRandomSessionIdGenerator.createSessionId(SecureRandomSessionIdGenerator.java:44)
at org.wildfly.clustering.web.undertow.IdentifierFactoryAdapter.createIdentifier(IdentifierFactoryAdapter.java:42)
at org.wildfly.clustering.web.undertow.IdentifierFactoryAdapter.createIdentifier(IdentifierFactoryAdapter.java:32)
at org.wildfly.clustering.web.infinispan.AffinityIdentifierFactory.getKey(AffinityIdentifierFactory.java:55)
at org.infinispan.affinity.KeyAffinityServiceImpl$KeyGeneratorWorker.generateKeys(KeyAffinityServiceImpl.java:247)
at org.infinispan.affinity.KeyAffinityServiceImpl$KeyGeneratorWorker.run(KeyAffinityServiceImpl.java:220)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
at org.jboss.threads.JBossThread.run(JBossThread.java:122)
This is coupled with the process reading a massive amount of random data. We even introduced an external program on the host to feed entropy into /dev/random, which helped alleviate the problem, but only for a short time; after half an hour or so we saw the same behavior. This caused such a problem that we had to abandon clustering in our production environment.
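As far as we can tell (untested on our side, so treat this as an assumption), the JDK's SecureRandom seed source can be redirected to the non-blocking device with the java.security.egd system property, for example via JAVA_OPTS in standalone.conf:

```shell
# Redirect the JVM's SecureRandom seeding to the non-blocking urandom device.
# The extra "/./" is reportedly needed on some Oracle/OpenJDK builds, which
# silently ignore a plain file:/dev/urandom value.
JAVA_OPTS="$JAVA_OPTS -Djava.security.egd=file:/dev/./urandom"
```

Whether that actually stops the NativePRNG reads shown in the stack trace above is part of what we are trying to find out.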
Any idea what is happening here? Any workarounds? Why is it trying to generate so many keys, and why does that use so much CPU? Is there any way to reduce the number of keys needed? If not, is there at least some way to configure it to use an internal random number generator instead of /dev/random or /dev/urandom?
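On that last question: SHA1PRNG is a pure-Java SecureRandom implementation that, once seeded, does not block on OS entropy, so something along these lines might sidestep the bottleneck (a sketch only; whether Undertow's SecureRandomSessionIdGenerator can be pointed at a different algorithm is exactly what we are asking):

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class PrngCheck {
    public static void main(String[] args) throws NoSuchAlgorithmException {
        // On Linux the default SecureRandom is typically NativePRNG, which
        // reads the /dev/random and /dev/urandom devices -- the source of
        // the contention in the stack trace above.
        SecureRandom def = new SecureRandom();
        System.out.println("default algorithm: " + def.getAlgorithm());

        // SHA1PRNG generates its output entirely in Java; after the initial
        // seed it does not touch the entropy devices again.
        SecureRandom sha1 = SecureRandom.getInstance("SHA1PRNG");
        byte[] sessionIdBytes = new byte[16];
        sha1.nextBytes(sessionIdBytes);
        System.out.println("SHA1PRNG produced " + sessionIdBytes.length + " bytes");
    }
}
```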