I have a problem where my application on WildFly 10.1 sometimes leaks database connections.
We recently upgraded WildFly from 9 to 10.1, and since then our web application sometimes (seemingly at random) leaks database connections after the WildFly server starts up. More precisely, it looks as if some transactions are never ended, so their connections are not returned to the pool.
When we monitor the number of active connections in the connection pool for our XA datasource, we can see that soon after the WildFly server starts, the number of active connections begins to increase rapidly.
Our max pool size is 800, and we typically reach it after 10-15 minutes unless we restart the server. Under normal conditions, the number of active connections is around 30-40.
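
For anyone who wants to watch the same number, the active-connection count can also be polled over JMX from inside the server JVM. This is only a minimal sketch, not our actual monitoring code: `PoolProbe` is a hypothetical helper, and the MBean name and attribute in the comment (`jboss.as:subsystem=datasources,xa-data-source=MyXADS,statistics=pool` and `ActiveCount`, with `MyXADS` as a placeholder) are assumptions that should be verified in JConsole, with pool statistics enabled on the datasource.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical helper: reads one numeric MBean attribute from the platform MBean server. */
final class PoolProbe {

    private PoolProbe() {
    }

    /**
     * Returns the named attribute as a long.
     *
     * Assumed WildFly usage (verify both names in JConsole first):
     *   readCount("jboss.as:subsystem=datasources,xa-data-source=MyXADS,statistics=pool",
     *             "ActiveCount")
     */
    static long readCount(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            Object value = server.getAttribute(new ObjectName(objectName), attribute);
            // Fails with ClassCastException (wrapped below) if the attribute is not numeric.
            return ((Number) value).longValue();
        } catch (Exception e) {
            throw new IllegalStateException(
                    "Cannot read " + attribute + " from " + objectName, e);
        }
    }
}
```

The helper works against any numeric attribute, so it can be tried outside WildFly with a standard JVM MBean such as `java.lang:type=Threading` and its `ThreadCount` attribute.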
Typically, we let the WildFly server start up and then start Apache so customers can reach our web application.
When I enabled logging for org.jboss.jca, I could see that connections were taken out of the pool but never returned, and the call stack for every connection in the InUse pool showed that the last call
originated from a remote EJB call. I couldn't find any issues in our code at the call site of the method. Our application consists of one fairly large EAR file containing the
web application and other business logic, plus a number of other EAR files deployed in the same WildFly server. Communication between EAR files in the same server is done through remote EJB calls, where the EJB is
looked up through the InitialContext. There is considerable load on the server right from the start once Apache is started.
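
To illustrate the kind of lookup described above, here is a minimal sketch, not our actual code. It assumes the `ejb:` JNDI naming scheme of the WildFly EJB client (`ejb:<app>/<module>/<distinct>/<bean>!<remote interface>`) and requires the WildFly EJB client library on the classpath for the lookup itself. The class `RemoteEjbLookup` and every application, module, and bean name passed to it are hypothetical.

```java
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper sketching a remote EJB lookup through InitialContext. */
final class RemoteEjbLookup {

    private RemoteEjbLookup() {
    }

    /** Builds a name of the form ejb:app/module/distinct/bean!remoteInterface. */
    static String jndiName(String app, String module, String distinct,
                           String bean, String remoteInterface) {
        return "ejb:" + app + "/" + module + "/" + distinct + "/"
                + bean + "!" + remoteInterface;
    }

    /**
     * Looks up the EJB proxy. Assumption: the "ejb:" URL scheme is resolved by
     * the WildFly EJB client naming package configured below.
     */
    static <T> T lookup(String jndiName, Class<T> remoteInterface) throws NamingException {
        Properties env = new Properties();
        env.put(Context.URL_PKG_PREFIXES, "org.jboss.ejb.client.naming");
        Context ctx = new InitialContext(env);
        try {
            return remoteInterface.cast(ctx.lookup(jndiName));
        } finally {
            // Always release the naming context, even if the lookup fails.
            ctx.close();
        }
    }
}
```

With an empty distinct name, `jndiName("shop", "orders-ejb", "", "OrderServiceBean", "com.example.OrderService")` yields `ejb:shop/orders-ejb//OrderServiceBean!com.example.OrderService`.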
All our datasources are XA datasources, with idle-timeout-minutes set to 5 and no flush strategy defined (so I assume the default is used). We run WildFly 10.1.0.Final,
Java 1.8.0_102, and PostgreSQL 9.3.14, and use the Hibernate version provided with WildFly.
Currently, our only workaround is to restart the WildFly server until it behaves "normally". Once it has started without the problem, it runs fine. We suspect that in some particular scenario during startup, the transaction for a remote EJB call is sometimes not ended. Unfortunately, we cannot reproduce it, and we are running out of ideas for how to proceed with the investigation.
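
Until the root cause is found, a bad startup could at least be flagged early instead of waiting for the pool to fill up. The following is a minimal sketch under stated assumptions: the active-connection count is sampled at a fixed interval, 40 is taken as the upper end of the normal range mentioned above, and the class `LeakTrend` and the window size are hypothetical choices, not something we run today.

```java
/** Hypothetical check: flags a run of samples that keeps rising above the normal range. */
final class LeakTrend {

    private LeakTrend() {
    }

    /**
     * Returns true when the last {@code window} samples are all above
     * {@code normalMax} and each one is strictly greater than the one before it.
     */
    static boolean looksLikeLeak(int[] samples, int normalMax, int window) {
        if (window < 2 || samples.length < window) {
            return false; // not enough data to speak of a trend
        }
        int start = samples.length - window;
        for (int i = start; i < samples.length; i++) {
            if (samples[i] <= normalMax) {
                return false; // still within the normal range
            }
            if (i > start && samples[i] <= samples[i - 1]) {
                return false; // count stopped growing, so connections are being returned
            }
        }
        return true;
    }
}
```

For example, `looksLikeLeak(new int[] {35, 38, 90, 160, 240}, 40, 3)` is true, while a series that stays at or below 40 is not flagged.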
Has anyone experienced the same issue, or does anyone have a clue how to continue the investigation?