4 Replies Latest reply on May 8, 2014 4:39 AM by sbharate

    JBoss server out of memory exception

    ronandavid

      Hello,

       

      My application is running on JBoss 4.0.3SP1/Red Hat Enterprise Linux Server release 5.3 (Tikanga)

      The JBoss server occasionally hangs (not very often). The log file shows memory exceptions (see the attached server_log.txt for more information).

       

      12/17 23:32:41,358 ERROR [org.jboss.ejb.plugins.jms.JMSContainerInvoker] JMSContainerInvoker(ReportMirrorBean) Reconnect: Could not stop JMS connection

      org.jboss.mq.SpyJMSException: Cannot disable the connection with the JMS server; - nested throwable: (java.io.IOException: Client is not connected)

      at org.jboss.mq.SpyJMSException.getAsJMSException(SpyJMSException.java:66)

      at org.jboss.mq.SpyJMSException.rethrowAsJMSException(SpyJMSException.java:51)

      I ran jmap on the JBoss pid (see jmap.txt).

       

      It shows a huge number of MQ objects: about 145,000 instances of org.jboss.mq.SpyXAResourceManager and org.jboss.mq.ConnectionToken.

      How can there be so many MQ objects?

       

      Thank you for your help.

        • 1. JBoss server out of memory exception
          wdfink

          I don't see an OutOfMemoryError in your server.log file.

          It might be that you have a garbage-collection problem.

          You should activate GC logging, use jstat to monitor the GC, and provide a server.log that contains the error.

          The exception you posted might be an aftereffect.
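
           

          For example, on a Sun JDK of that era you can add -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log to JAVA_OPTS (usually in bin/run.conf) and watch the collector with jstat -gcutil <jboss-pid> 5000.

           

          To cross-check from inside the server, a minimal JMX-based probe (Java 5+; the class name is only illustrative) could look like this:

          import java.lang.management.GarbageCollectorMXBean;
          import java.lang.management.ManagementFactory;
          import java.lang.management.MemoryMXBean;
          import java.lang.management.MemoryUsage;

          // Prints current heap usage and per-collector GC counters.
          public class HeapProbe {
              public static void main(String[] args) {
                  MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
                  MemoryUsage heap = mem.getHeapMemoryUsage();
                  System.out.println("heap used=" + heap.getUsed()
                          + " committed=" + heap.getCommitted()
                          + " max=" + heap.getMax());
                  for (GarbageCollectorMXBean gc
                          : ManagementFactory.getGarbageCollectorMXBeans()) {
                      System.out.println(gc.getName()
                              + " collections=" + gc.getCollectionCount()
                              + " timeMs=" + gc.getCollectionTime());
                  }
              }
          }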

          • 2. JBoss server out of memory exception
            ronandavid

            Hello,

             

            Thank you for your answer.

             

            In fact, I first noticed exceptions like:

             

            12/17 23:32:41,358 ERROR [org.jboss.ejb.plugins.jms.JMSContainerInvoker] JMSContainerInvoker(ReportMirrorBean) Reconnect: Could not stop JMS connection
            org.jboss.mq.SpyJMSException: Cannot disable the connection with the JMS server; - nested throwable: (java.io.IOException: Client is not connected)
            at org.jboss.mq.SpyJMSException.getAsJMSException(SpyJMSException.java:66)
            at org.jboss.mq.SpyJMSException.rethrowAsJMSException(SpyJMSException.java:51)

            and then (when the server is stuck) exceptions like:

            03/03 04:10:02,276 ERROR [org.jboss.ejb.plugins.LogInterceptor] http-8443-1: Unexpected Error in method: public abstract void com.alcatel.mcdf.userandrole.interfaces.UserAndRoleAdmin.selectBusiness(java.lang.String) throws java.rmi.RemoteException
            java.lang.OutOfMemoryError: Java heap space
            03/03 04:10:02,276 WARN  [com.alcatel.mcdf.userandrole.ejb.api.UserAndRoleAdminApi] http-8443-1: Retry because RemoteException:Unexpected Error; nested exception is:
                    java.lang.OutOfMemoryError: Java heap space
            03/03 04:10:02,277 ERROR [org.jboss.ejb.plugins.LogInterceptor] http-8443-1: Unexpected Error in method: public abstract void com.alcatel.mcdf.userandrole.interfaces.UserAndRoleAdmin.selectBusiness(java.lang.String) throws java.rmi.RemoteException
            03/03 04:10:02,386 ERROR [org.jboss.ejb.plugins.LogInterceptor] http-8443-1: Unexpected Error in method: public abstract void com.alcatel.mcdf.userandrole.interfaces.UserAndRoleAdmin.selectBusiness(java.lang.String) throws java.rmi.RemoteException
            java.lang.OutOfMemoryError: Java heap space
            03/03 04:10:02,386 ERROR [com.alcatel.mcdf.userandrole.ejb.api.UserAndRoleAdminApi] http-8443-1: RemoteException:Unexpected Error; nested exception is:
                    java.lang.OutOfMemoryError: Java heap space
            03/03 04:10:02,386 ERROR [com.alcatel.mcdf.userandrole.ejb.UserAndRoleAdminBean] http-8443-1: No businessObject created
            03/03 04:10:02,386 ERROR [com.alcatel.mcdf.userandrole.loginmodule.LoginModule] http-8443-1: user is null
            03/03 04:10:02,489 ERROR [com.alcatel.mcdf.service.timer.JobStoreTX] QuartzScheduler_PersistentQuartzScheduler-MRF7082-C21298003036141_ClusterManager: ClusterManager: Error managing cluster: Failure identifying failed instances when checking-in: Connection lost while executing statementExecuteQuery with request 'SELECT * FROM QRTZ_SCHEDULER_STATE' and automatic reconnect failed (org.continuent.sequoia.common.exceptions.driver.VirtualDatabaseUnavailableException: Virtual database common not found on any of the controllers)
            org.quartz.JobPersistenceException: Failure identifying failed instances when checking-in: Connection lost while executing statementExecuteQuery with request 'SELECT * FROM QRTZ_SCHEDULER_STATE' and automatic reconnect failed (org.continuent.sequoia.common.exceptions.driver.VirtualDatabaseUnavailableException: Virtual database common not found on any of the controllers) [See nested exception: org.continuent.sequoia.common.exceptions.driver.DriverSQLException: Connection lost while executing statementExecuteQuery with request 'SELECT * FROM QRTZ_SCHEDULER_STATE' and automatic reconnect failed (org.continuent.sequoia.common.exceptions.driver.VirtualDatabaseUnavailableException: Virtual database common not found on any of the controllers)]
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport.findFailedInstances(JobStoreSupport.java:2104)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:2121)
                    at org.quartz.impl.jdbcjobstore.JobStoreTX.doCheckin(JobStoreTX.java:1383)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:2382)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:2413)
            * Nested Exception (Underlying Cause) ---------------
            org.continuent.sequoia.common.exceptions.driver.DriverSQLException: Connection lost while executing statementExecuteQuery with request 'SELECT * FROM QRTZ_SCHEDULER_STATE' and automatic reconnect failed (org.continuent.sequoia.common.exceptions.driver.VirtualDatabaseUnavailableException: Virtual database common not found on any of the controllers)
                    at org.continuent.sequoia.driver.Connection.statementExecuteQuery(Connection.java:2892)
                    at org.continuent.sequoia.driver.Statement.executeQuery(Statement.java:528)
                    at org.continuent.sequoia.driver.PreparedStatement.executeQuery(PreparedStatement.java:169)
                    at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:211)
                    at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectSchedulerStateRecords(StdJDBCDelegate.java:3696)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport.findFailedInstances(JobStoreSupport.java:2047)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:2121)
                    at org.quartz.impl.jdbcjobstore.JobStoreTX.doCheckin(JobStoreTX.java:1383)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:2382)
                    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:2413)
            Caused by: org.continuent.sequoia.common.exceptions.driver.VirtualDatabaseUnavailableException: Virtual database common not found on any of the controllers
                    at org.continuent.sequoia.driver.Driver.getConnectionToNewController(Driver.java:425)
                    at org.continuent.sequoia.driver.Connection.reconnect(Connection.java:2629)
                    at org.continuent.sequoia.driver.Connection.statementExecuteQuery(Connection.java:2887)

            I don't know whether the JMS exceptions are the cause of the memory exhaustion, but they are raised before the issue occurs.
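
             

            Given the jmap numbers, I also wonder whether JMS connections are opened during these reconnects and never closed; each connection seems to keep one SpyXAResourceManager and one ConnectionToken alive. The safe pattern, as I understand it, would be something like this (simplified JMS 1.1 sketch; class and method names are only illustrative):

            import javax.jms.Connection;
            import javax.jms.ConnectionFactory;
            import javax.jms.JMSException;
            import javax.jms.Session;

            // Always close the connection in finally; every leaked connection
            // keeps its SpyXAResourceManager/ConnectionToken reachable on the heap.
            public class SafeJmsAccess {
                public static void doWork(ConnectionFactory factory) throws JMSException {
                    Connection connection = factory.createConnection();
                    try {
                        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                        // ...create producers/consumers and do the work...
                    } finally {
                        connection.close(); // also closes the sessions it created
                    }
                }
            }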

            • 3. JBoss server out of memory exception
              wdfink

              What are your memory settings?

              Also, as I mentioned, you should activate GC logging and monitor GC activity and memory use.
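
               

              For JBoss 4.x the heap settings are normally the -Xms/-Xmx values in JAVA_OPTS (usually set in bin/run.conf). If you are not sure what the JVM actually got, something like this prints the effective limits (the class name is only illustrative):

              // Prints the heap limits the running JVM was actually started with.
              public class HeapSettings {
                  public static void main(String[] args) {
                      Runtime rt = Runtime.getRuntime();
                      System.out.println("max=" + rt.maxMemory()
                              + " total=" + rt.totalMemory()
                              + " free=" + rt.freeMemory());
                  }
              }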

              • 4. Re: JBoss server out of memory exception
                sbharate

                Hello guys,

                 

                I am also seeing the same issue.

                 

                DefaultMessageListenerContainer.handleListenerSetupFailure(634) | Setup of JMS message listener invoker failed - trying to recover

                org.jboss.mq.SpyJMSException: Cannot receive ; - nested throwable: (java.lang.OutOfMemoryError: Java heap space)

                    at org.jboss.mq.SpyJMSException.getAsJMSException(SpyJMSException.java:78)

                    at org.jboss.mq.SpyJMSException.rethrowAsJMSException(SpyJMSException.java:63)

                    at org.jboss.mq.Connection.receive(Connection.java:873)

                    at org.jboss.mq.SpyMessageConsumer.receive(SpyMessageConsumer.java:397)

                    at org.springframework.jms.listener.DefaultMessageListenerContainer.receiveMessage(DefaultMessageListenerContainer.java:560)

                    at org.springframework.jms.listener.DefaultMessageListenerContainer.doReceiveAndExecute(DefaultMessageListenerContainer.java:505)

                    at org.springframework.jms.listener.DefaultMessageListenerContainer.receiveAndExecute(DefaultMessageListenerContainer.java:460)

                    at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:871)

                    at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:824)

                    at org.springframework.core.task.SimpleAsyncTaskExecutor$ConcurrencyThrottlingRunnable.run(SimpleAsyncTaskExecutor.java:203)

                    at java.lang.Thread.run(Thread.java:619)

                Caused by: java.lang.OutOfMemoryError: Java heap space

                2014-05-08 02:36:24,066 WARN  [org.jboss.mq.Connection] Connection failure, use javax.jms.Connection.setExceptionListener() to handle this error and reconnect

                org.jboss.mq.SpyJMSException: No pong received; - nested throwable: (java.io.IOException: ping timeout.)

                    at org.jboss.mq.Connection$PingTask.run(Connection.java:1277)

                    at EDU.oswego.cs.dl.util.concurrent.ClockDaemon$RunLoop.run(ClockDaemon.java:364)

                    at java.lang.Thread.run(Thread.java:619)

                Caused by: java.io.IOException: ping timeout.

                    ... 3 more


                 

                 

                 

                Please suggest.

                 

                 

                Inferences from the above analysis:

                 

                1. We can reduce the chunk size from 1 MB to 100 KB. Receiving multiple pongs is not a problem, but a ping/pong left waiting to go over the network can be; with a 100 KB chunk size, a pong is simulated whenever that many bytes are sent over the network, which avoids the problem.
                2. We will also add a JMS ExceptionListener that handles such exceptions and avoids the situation where the only remaining option is to restart the server (see the sketch after this list).
                3. To reduce processing time we need to optimize the database queries used in policy/non-policy processing that have the highest average duration, i.e. the queries carrying the most load. We have already optimized three of them, but the third one, optimized and deployed last Monday, has still not shown any reduction in processing time. We will revisit that query and continue optimizing the others.
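
                 

                For point 1: in JBossMQ the chunk size should be (if I remember correctly) the ChunkSize attribute of the UIL2 connector in deploy/jms/uil2-service.xml. For point 2, here is a minimal sketch of such a listener (plain JMS 1.1; the reconnect logic is application-specific and the class name is only illustrative):

                import javax.jms.Connection;
                import javax.jms.ExceptionListener;
                import javax.jms.JMSException;

                // Reacts to connection failures such as the "No pong received"
                // case instead of letting consumers hang until a server restart.
                public class ReconnectingExceptionListener implements ExceptionListener {

                    private final Connection connection;

                    public ReconnectingExceptionListener(Connection connection) {
                        this.connection = connection;
                    }

                    public void onException(JMSException e) {
                        System.err.println("JMS connection failed: " + e.getMessage());
                        try {
                            connection.close(); // release the broken connection
                        } catch (JMSException ignored) {
                            // the connection is already dead; nothing more to do
                        }
                        // ...re-create the connection, sessions and consumers here...
                    }
                }

                It would be registered right after the connection is created, e.g. connection.setExceptionListener(new ReconnectingExceptionListener(connection)). Note that Spring's DefaultMessageListenerContainer already retries listener setup on failure (as the "trying to recover" line above shows), so the listener mainly helps JMS code running outside the container.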

                 

                 

                Thanks in advance.