8 Replies Latest reply on Dec 26, 2016 4:54 AM by antonz

    Connection failure and lack of managed connections relation

    antonz

      Hello!

       

      I use HornetQ 2.3.21.Final running on EAP 6.3. Within a couple of days (~35) I begin to get warnings that look like the following:

       

      2016-12-14 05:33:02,593 WARN  [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from invm:0. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

      2016-12-14 05:33:02,593 WARN  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ222061: Client connection failed, clearing up resources for session 6cddfb8f-c1b4-11e6-957d-9527f5eabbff

      2016-12-14 05:33:02,593 WARN  [org.hornetq.core.server] (hornetq-failure-check-thread) HQ222107: Cleared up resources for session 6cddfb8f-c1b4-11e6-957d-9527f5eabbff

      2016-12-14 05:33:02,594 WARN  [org.hornetq.jms.server] (Thread-17566 (HornetQ-client-global-threads-353003541)) HQ122014: Notified of connection failure in xa recovery connectionFactory for provider ClientSessionFactoryImpl [serverLocator=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=c2c83d93-bd68-11e6-957d-9527f5eabbff, factory=org-hornetq-core-remoting-impl-invm-InVMConnectorFactory) ?server-id=0], discoveryGroupConfiguration=null], connectorConfig=TransportConfiguration(name=c2c83d93-bd68-11e6-957d-9527f5eabbff, factory=org-hornetq-core-remoting-impl-invm-InVMConnectorFactory) ?server-id=0, backupConfig=null] will attempt reconnect on next pass: HornetQException[errorType=NOT_CONNECTED message=HQ119006: Channel disconnected]

        at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connectionDestroyed(ClientSessionFactoryImpl.java:425) [hornetq-core-client-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]

        at org.hornetq.core.remoting.impl.invm.InVMConnector$Listener$1.run(InVMConnector.java:214) [hornetq-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]

        at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:105) [hornetq-core-client-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_112]

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_112]

        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_112]

       

      Everything keeps working for a while, but at some point the queue stops working and the server starts to generate numerous errors (hundreds of them):

       

      2016-12-14 08:44:31,343 ERROR [org.hornetq.ra] (http-/0.0.0.0:9002-297) HQ154002: Could not create session: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:/JmsXA

      at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:414) [ironjacamar-core-impl-1.0.27.Final-redhat-1.jar:1.0.27.Final-redhat-1]

        at org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:368) [ironjacamar-core-impl-1.0.27.Final-redhat-1.jar:1.0.27.Final-redhat-1]

        at org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:488) [ironjacamar-core-impl-1.0.27.Final-redhat-1.jar:1.0.27.Final-redhat-1]

        at org.hornetq.ra.HornetQRASessionFactoryImpl.allocateConnection(HornetQRASessionFactoryImpl.java:832)

        at org.hornetq.ra.HornetQRASessionFactoryImpl.createSession(HornetQRASessionFactoryImpl.java:465)

      ...

      Caused by: javax.resource.ResourceException: IJ000655: No managed connections available within configured blocking timeout (30000 [ms])

       

      I suppose this happens because all connections in the pool are in use and never released. The question is: could the first type of error be the cause of the second one? Or am I getting something wrong and these errors are not related?

      The app server works well after a restart, but I need to know what causes this to happen. The app is sometimes migrated by VMware vCenter. Could that cause HornetQ to stop working?

        • 1. Re: Connection failure and lack of managed connections relation
          andey

          Hi Anton,

           

          The default max-pool-size in HornetQ's pooled connection factory is 20. This managed connection pool can be exhausted and the IJ000453 message indicates this.

           

          ~~~

          Caused by: javax.resource.ResourceException: IJ000655: No managed connections available within configured blocking timeout (30000 [ms])

          ..

          ~~~

           

          Add a max-pool-size parameter to the "java:/JmsXA" pooled-connection-factory block in the hornetq-server section of the messaging subsystem.

           

          For example:

           

          ~~~~~~~~~~~~~~

            <hornetq-server>

            <jms-connection-factories>

            ...

            <pooled-connection-factory name="sitram">

            ...

            <max-pool-size>XX</max-pool-size>

            ...

            </pooled-connection-factory>

            ...

            </jms-connection-factories>

            </hornetq-server>

            

          ~~~~~~~~~~~~~~

           

          Set max-pool-size according to your application's requirements.

           

          Please test the above configuration in your environment and let us know if it helps.

           

          Regards,
          Anup

          • 2. Re: Connection failure and lack of managed connections relation
            antonz

            Sorry, I should have mentioned my pooled-connection-factory details...

            Here they are:

             

            <pooled-connection-factory name="hornetq-ra">

                                    <transaction mode="xa"/>

                                    <min-pool-size>1</min-pool-size>

                                    <max-pool-size>50</max-pool-size>

                                    <call-timeout>180000</call-timeout>

                                    <connectors>

                                        <connector-ref connector-name="in-vm"/>

                                    </connectors>

                                    <entries>

                                        <entry name="java:/JmsXA"/>

                                    </entries>

                                    <client-failure-check-period>2147483646</client-failure-check-period>

                                    <connection-ttl>-1</connection-ttl>

                                    <reconnect-attempts>-1</reconnect-attempts>

                                    <consumer-window-size>104857600</consumer-window-size>

                                    <compress-large-messages>true</compress-large-messages>

            </pooled-connection-factory>

             

            So, the pool size is 50. Could client-failure-check-period, connection-ttl or reconnect-attempts be the issue?

            Also, it looks like messages in the queue cannot be processed at all.

            • 3. Re: Connection failure and lack of managed connections relation
              andey

              Hi,

               

              The only precautions we can take to avoid this are to close all connections after use and to size the pool according to the load. You have to set "max-pool-size" and "thread-pool-max-size" of "hornetq-server" a bit higher than the number of concurrent producers.

               

              How long a pooled connection stays in use depends on how long your JMS clients retain their JMS connection(s). If you have a large number of JMS clients accessing java:/JmsXA, it would be good to increase the max-pool-size to an appropriate figure.

               

              If it is possible that you may have more than 50 users concurrently working in the system, you may need to increase the max-pool-size to allow for a greater number of users.

               

              If the maximum number of concurrent users is expected to be less than 50, it may be that there is some connection leak.

               

              You only need to close the JMS connection, and you should do it inside a finally block. Closing the connection also closes its sessions and producers, so make sure client code invokes connection.close() inside the finally block.
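
The close-in-finally advice above can be sketched roughly as follows. To keep the example runnable with the JDK alone it does not use the JMS API: `PooledHandle` and `CloseInFinallySketch` are hypothetical stand-ins (not HornetQ or IronJacamar classes), and a `Semaphore` plays the role of the managed connection pool. In real client code the handle would be the `javax.jms.Connection` obtained from java:/JmsXA.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a pooled JMS connection; not a HornetQ or IronJacamar class.
final class PooledHandle implements AutoCloseable {
    private final Semaphore pool;
    private boolean closed;

    PooledHandle(Semaphore pool) {
        this.pool = pool;
    }

    @Override
    public void close() {
        // Idempotent: the permit goes back to the pool only once.
        if (!closed) {
            closed = true;
            pool.release();
        }
    }
}

final class CloseInFinallySketch {
    // Borrows a handle, runs the work, and always returns the handle to the pool.
    static void sendSafely(Semaphore pool, Runnable work) {
        PooledHandle handle = null;
        try {
            pool.acquireUninterruptibly();
            handle = new PooledHandle(pool);
            work.run();
        } finally {
            // Runs on success and on failure, so the pool never loses a permit.
            if (handle != null) {
                handle.close();
            }
        }
    }

    public static void main(String[] args) {
        Semaphore pool = new Semaphore(2);
        try {
            sendSafely(pool, () -> { throw new IllegalStateException("send failed"); });
        } catch (IllegalStateException expected) {
            // The failure propagates, but the handle was still returned.
        }
        System.out.println("available after failure: " + pool.availablePermits());
    }
}
```

If the `finally` block were removed, the failing call would leave one permit permanently checked out, which is the kind of leak that eventually exhausts a pool.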

               

              Regards,

              Anup

              • 4. Re: Connection failure and lack of managed connections relation
                antonz

                Yes, I think it is a leak rather than a connection pool that is too small.

                Is there a way to check which connections are no longer valid and free them? I guess connection-ttl is responsible for that. Am I right?

                • 5. Re: Connection failure and lack of managed connections relation
                  andey

                  Hi,

                   

                  If you're using JMS, the connection TTL is defined by the ConnectionTTL attribute on a HornetQConnectionFactory instance. If you're deploying JMS connection factory instances directly into JNDI on the server side, you can specify it in the xml config, using the parameter connection-ttl.

                   

                  The default value for connection ttl is 60000ms, i.e. 1 minute. A value of -1 for ConnectionTTL means the server will never time out the connection on the server side.

                   

                  The IJ000655: No managed connections available within configured blocking timeout exception is usually seen for one of three reasons:

                  # The datasource connection pool has not been tuned (e.g. max-pool-size and blocking-timeout-millis) correctly for the maximum load on the application.

                  # The application is leaking connections because it is not closing them and returning them to the pool.

                  # Threads with connections to the database are hanging and holding on to the connections.

                  In each case the managed connection pool is exhausted, which is what the IJ000453 message indicates.

                   

                  - The IJ000655 errors indicate that you are running out of connections (all connections that the pool is allowed to create are already in use/owned by application components). This can be due to greater than expected load, or to some component(s) failing to close connections when they are no longer needed. If you are certain that the maximum number of concurrent clients/threads/jobs requiring a database connection should never exceed the max-pool-size, you might try enabling the cached connection manager debug facility as detailed in [1] to verify the leak.

                   

                  - To avoid "IJ000655: No managed connections available within configured blocking timeout (30000 [ms])", either increase blocking-timeout-millis for the connection pool (the default is 30000 milliseconds, i.e. 30 seconds) or increase/tune max-pool-size for the connection pool (the default is 20). The blocking-timeout-millis element sets the maximum time in milliseconds that a thread will block while waiting for a connection before an exception is thrown.

                   

                  Note that the "blocking-timeout-millis" setting only affects the wait for an existing connection; it does not limit the wait when creating a new connection takes an inordinately long time.

                   

                  - Verify that you close the JMS connection inside a finally block. Closing the connection also closes its sessions and producers, so make sure client code invokes *connection.close()* inside the finally block.

                   

                  - Try increasing max-pool-size.
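
The interaction between max-pool-size, blocking-timeout-millis and a leak described above can be modeled with a small sketch. This is a simplified model under stated assumptions, not the IronJacamar implementation: `BlockingTimeoutSketch`, `borrow` and `giveBack` are hypothetical names, and a `Semaphore` with a timed acquire stands in for the managed connection pool.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Simplified model of a bounded pool with a blocking timeout; not the IronJacamar code.
final class BlockingTimeoutSketch {
    private final Semaphore permits;
    private final long blockingTimeoutMillis;

    BlockingTimeoutSketch(int maxPoolSize, long blockingTimeoutMillis) {
        this.permits = new Semaphore(maxPoolSize);
        this.blockingTimeoutMillis = blockingTimeoutMillis;
    }

    // Returns true if a connection was obtained within the blocking timeout.
    boolean borrow() {
        try {
            return permits.tryAcquire(blockingTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns a connection to the pool; a leaking caller never calls this.
    void giveBack() {
        permits.release();
    }

    public static void main(String[] args) {
        BlockingTimeoutSketch pool = new BlockingTimeoutSketch(2, 50);
        pool.borrow(); // leaked: never given back
        pool.borrow(); // leaked: never given back
        // Every further request now waits for the full timeout and then fails.
        System.out.println("third borrow succeeded: " + pool.borrow());
    }
}
```

In this model a longer timeout only helps when some caller eventually gives a connection back. If connections are leaked, a larger pool or a longer timeout delays the failure but does not prevent it.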

                   

                  [1]

                  Enable the CCM for the datasource. It defaults to true if it is not explicitly specified but you may set use-ccm="true" explicitly.

                  ~~~

                      <subsystem xmlns="urn:jboss:domain:datasources:1.1">

                         <datasources>

                            <datasource ... enabled="true" use-ccm="true">

                               ...

                            </datasource>

                         </datasources>

                      </subsystem>

                  ~~~

                   

                   

                  Verify that <cached-connection-manager> exists in the jca subsystem and set debug="true".

                   

                   

                  ~~~

                         <subsystem xmlns="urn:jboss:domain:jca:1.1">

                            ...

                            <cached-connection-manager debug="true" error="false"/>

                            ...

                         </subsystem>

                  ~~~

                   

                  Regards,

                  Anup

                  • 6. Re: Connection failure and lack of managed connections relation
                    antonz

                    Thanks, Anup!

                     

                    I will try to find leaks with <cached-connection-manager>. It doesn't look like <blocking-timeout-millis> will help, since there are hundreds of retries and not a single one is successful.

                     

                    Best regards,

                    Anton

                    • 7. Re: Connection failure and lack of managed connections relation
                      andey

                      Hi Anton,

                       

                      ~~~

                      <max-pool-size>50</max-pool-size>

                      ~~~

                       

                      Have you tried increasing max-pool-size?

                       

                      If it is possible that you may have more than 50 users concurrently working in the system, you may need to increase the max-pool-size to allow for a greater number of users.

                       

                      Could you please attach the JBoss configuration file (standalone-*.xml/domain.xml) for review?

                       

                      Regards,

                      Anup

                      • 8. Re: Connection failure and lack of managed connections relation
                        antonz

                        Hello, Anup!

                         

                        I haven't tried anything yet. It may take some time, since this is a production issue and it is not very easy to reproduce.

                        Also, sorry, I'm not sure I can post the whole xml because it is not my property.

                        I'll definitely let you know when I figure this out or at least try to.

                         

                        Best regards,

                        Anton