8 Replies Latest reply on Oct 19, 2010 4:26 AM by garytully

    KahaDB slow startup and lot of db*.log files

    ciand7

      Hi,

      We have a problem with AMQ KahaDB installed as part of FUSE ESB 4.2-fuse-01-00. Exact version is 5.3.0.fuse-01-00.

       

      After 2-3 days of uptime, disk storage (data/activemq/default) filled with Hundreds (400-1000) of log files (db-XXXX.log). There are about 200-300 files per day.

       

      Restart takes 5-10 min, all that time - is replaing logs:

      Replayed 6884400 operations from the journal in 287.731 seconds.

      This leads to timeout starting bundles depended on AMQ.

       

      Repeatable restarts don`t help - each restart takes 5-10 min to replay logs.

       

      One solution we found is to remove db.data index and some "old" db-XX.log files.

       

      We have other FUSE ESB installations without such problems.

       

      Any help please.

        • 1. Re: KahaDB slow startup and lot of db*.log files
          garytully

          Are there any exceptions or warnings in the activemq.log to indicate any symptom?

           

          What is the nature of the producers/consumers in the application mix, or is FUSE ESB the only user of activemq?

          • 2. Re: KahaDB slow startup and lot of db*.log files
            dmathew

            I have the same issue with ActiveMQ 5.4 with KahaDb. No errors but with 100K messages, startup takes around 5 minutes. With million messages, ActiveMQ never starts.

             

            See the five minutes time difference between lines:

             

            2010-08-31 12:55:19,286 | INFO  | PListStore:C:\ActiveMQ\apache-activemq-5.4.0\bin\win32\..\..\data\localhost\tmp_storage initialized | org.apache.activemq.store.kahadb.plist.PListStore | WrapperSimpleAppMain

            2010-08-31 13:03:56,983 | INFO  | Listening for connections at: nio://S90356004630988:61616 | org.apache.activemq.transport.TransportServerThreadSupport | WrapperSimpleAppMain

             

            From activemq.log

             

            2010-08-31 12:55:10,436 | INFO  | JMX consoles can connect to service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi | org.apache.activemq.broker.jmx.ManagementContext | JMX connector

            2010-08-31 12:55:11,294 | INFO  | KahaDB is version 2 | org.apache.activemq.store.kahadb.MessageDatabase | WrapperSimpleAppMain

            2010-08-31 12:55:11,435 | INFO  | Recovering from the journal ... | org.apache.activemq.store.kahadb.MessageDatabase | WrapperSimpleAppMain

            2010-08-31 12:55:11,435 | INFO  | Recovery replayed 1 operations from the journal in 0.063 seconds. | org.apache.activemq.store.kahadb.MessageDatabase | WrapperSimpleAppMain

            2010-08-31 12:55:12,090 | INFO  | ActiveMQ 5.4.0 JMS Message Broker (localhost) is starting | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain

            2010-08-31 12:55:12,090 | INFO  | For help or more information please see: http://activemq.apache.org/ | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain

            2010-08-31 12:55:12,278 | INFO  | Scheduler using directory: activemq-data\scheduler | org.apache.activemq.broker.scheduler.SchedulerBroker | WrapperSimpleAppMain

            2010-08-31 12:55:19,286 | INFO  | PListStore:C:\ActiveMQ\apache-activemq-5.4.0\bin\win32\..\..\data\localhost\tmp_storage initialized | org.apache.activemq.store.kahadb.plist.PListStore | WrapperSimpleAppMain

            2010-08-31 13:03:56,983 | INFO  | Listening for connections at: nio://S90356004630988:61616 | org.apache.activemq.transport.TransportServerThreadSupport | WrapperSimpleAppMain

            2010-08-31 13:03:56,999 | INFO  | Connector nio Started | org.apache.activemq.broker.TransportConnector | WrapperSimpleAppMain

            2010-08-31 13:03:57,045 | INFO  | ActiveMQ JMS Message Broker (localhost, ID:S90356004630988-3777-1283273712137-0:0) started | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain

            2010-08-31 13:03:57,639 | INFO  | Logging to org.slf4j.impl.JCLLoggerAdapter(org.eclipse.jetty.util.log) via org.eclipse.jetty.util.log.Slf4jLog | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:03:57,639 | INFO  | jetty-7.0.1.v20091125 | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:03:58,451 | INFO  | ActiveMQ WebConsole initialized. | org.apache.activemq.web.WebConsoleStarter | WrapperSimpleAppMain

            2010-08-31 13:03:58,857 | INFO  | Initializing Spring FrameworkServlet 'dispatcher' | /admin | WrapperSimpleAppMain

            2010-08-31 13:03:59,341 | INFO  | ActiveMQ Console at http://0.0.0.0:8161/admin | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:04:00,091 | INFO  | Initializing Spring root WebApplicationContext | /camel | WrapperSimpleAppMain

            2010-08-31 13:04:02,668 | INFO  | Connector vm://localhost Started | org.apache.activemq.broker.TransportConnector | WrapperSimpleAppMain

            2010-08-31 13:04:04,526 | INFO  | Camel Console at http://0.0.0.0:8161/camel | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:04:04,589 | INFO  | ActiveMQ Web Demos at http://0.0.0.0:8161/demo | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:04:04,651 | INFO  | RESTful file access application at http://0.0.0.0:8161/fileserver | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2010-08-31 13:04:04,698 | INFO  | Started SelectChannelConnector@0.0.0.0:8161 | org.eclipse.jetty.util.log | WrapperSimpleAppMain

            2

             

            Edited by: dmathew on Aug 31, 2010 5:23 PM

            • 3. Re: KahaDB slow startup and lot of db*.log files
              garytully

              That is odd, as that is the temp store and that should be wiped on restart. Think this looks like a bug if temp store start takes 5 minutes. What sort of producer/consumer load is on your broker before the restart and what mix of persistent non-persistent messages?

              I think you should open an issue for this so we can investigate some more and attach your activemq.xml and any other pertinent information.

              • 4. Re: KahaDB slow startup and lot of db*.log files
                dmathew

                I have opened an issue for this.

                https://issues.apache.org/activemq/browse/AMQ-2891

                 

                At the time of restart, there was no producer/consumer connected to it. I was load testing with the goal of million persistent queue messages.

                • 5. Re: KahaDB slow startup and lot of db*.log files
                  garytully

                  The file pending message cursor doing its recovery was the culprit, the default store cursor is a better match for really deep queues. Full details in the jira issue: https://issues.apache.org/activemq/browse/AMQ-2891

                  There is now a progress indicator in the form of info log messages so it is clear what is going on when recovery of a cursor is delaying restart.

                  • 6. Re: KahaDB slow startup and lot of db*.log files
                    ciand7

                    We found problem...

                    There is MessageDatabase.checkpointUpdate(Transaction, boolean) witch called periodically to cleanup data base. We added logging in this method and found that there is "very old" firstTxLocation (metadata.firstInProgressTransactionLocation). So all pagefile indexes are removed from gcCandidateSet and keep on disk. After removing old pagefile with this tx, data base logs are cleared.

                     

                    So, the questions is: What is in progress transaction in terms of kaha db? Is there any way to control one (monitoring, forcing to complete/rollback etc)?

                     

                    Thanks.

                    • 7. Re: KahaDB slow startup and lot of db*.log files
                      tranchida

                      Hello,

                       

                      I experiment the same issue with 5.4.1-fuse for same reason (pending tx).

                       

                      Any news about a fix ?

                      • 8. Re: KahaDB slow startup and lot of db*.log files
                        garytully

                        There has been one fix in this area: https://issues.apache.org/activemq/browse/AMQ-2736 but there still seems to be an issue with an outstanding transaction reference that needs some more investigation.