Wildfly 10 clustered timer not recovering after DB connection timeout
sreckom Nov 15, 2016 5:52 AMI have clustered timer that uses database store:
<database-data-store name="synctimer-cluster-store" datasource-jndi-name="java:/si.telekom.jdbc.syncdatasource" database="oracle" partition="syncTimerPartition" refresh-interval="60000" allow-execution="true"/>
If the connection to datasource fails, timer doesn't recover. I have to restart the servers to make it work again. It stopped today after receiving this error from jdbc driver:
2016-11-1502:10:10,263 WARN [] [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (EJB default - 3) IJ000604: Throwable while attempting to get a new connection: null: javax.resource.ResourceException: IJ031084: Unable to create connection
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:343)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:350)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createManagedConnection(LocalManagedConnectionFactory.java:285)
at org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreConcurrentLinkedDequeManagedConnectionPool.createConnectionEventListener(SemaphoreConcurrentLinkedDequeManagedConnectionPool.java:1319)
at org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreConcurrentLinkedDequeManagedConnectionPool.getConnection(SemaphoreConcurrentLinkedDequeManagedConnectionPool.java:496)
at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getSimpleConnection(AbstractPool.java:626)
at org.jboss.jca.core.connectionmanager.pool.AbstractPool.getConnection(AbstractPool.java:598)
at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:590)
at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:429)
at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:747)
at org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:138)
at org.jboss.as.connector.subsystems.datasources.WildFlyDataSource.getConnection(WildFlyDataSource.java:66)
at org.jboss.as.ejb3.timerservice.persistence.database.DatabaseTimerPersistence.persistTimer(DatabaseTimerPersistence.java:344)
at org.jboss.as.ejb3.timerservice.TimerServiceImpl.persistTimer(TimerServiceImpl.java:616)
at org.jboss.as.ejb3.timerservice.CalendarTimerTask.postTimeoutProcessing(CalendarTimerTask.java:108)
at org.jboss.as.ejb3.timerservice.TimerTask.run(TimerTask.java:170)
at org.jboss.as.ejb3.timerservice.TimerServiceImpl$Task$1.run(TimerServiceImpl.java:1221)
at org.wildfly.extension.requestcontroller.RequestController$QueuedTask$1.run(RequestController.java:497)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: java.sql.SQLException: Socket read timed out
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:412)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:531)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:221)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:503)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:319)
... 21 more
Caused by: oracle.net.ns.NetException: Socket read timed out
at oracle.net.ns.Packet.receive(Packet.java:320)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:286)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1042)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:301)
... 26 more
Is there any way to recover from this kind of error without user intervention (restart/redeploy)?