2 Replies Latest reply on Apr 30, 2012 5:40 PM by jakec

    Events started with Events.raiseTimedEvent(String, TimerSchedule) silently dying!


      JBoss 4.2.2.GA

      Seam 2.2.2.Final

      Windows Server 2008


      We have been using this code successfully for years, originally on Win2k and for the last several months on Windows Server 2008. Tuesday night we switched to running on a VM with 4 processors and 12 GB of RAM. It ran fine all Wednesday, but yesterday at 3 pm all our timed events stopped firing. I already had a process in place for restarting a timer if it died, because we had found that an uncaught exception killed the timer (although it still printed the exception). That process restarts the schedule if a certain file exists, but it relied on another timed file-watching service, which also died. Now three separate timed events (one every 5 seconds, one every 10 seconds, and one every 30 seconds) all stop at the same time. The only thing that has changed is that we are running on a VM now.


      I heard that using Quartz as the timer back end was supposed to be more reliable, so I created seam.quartz.properties (SimpleThreadPool, 20 threads, misfire 60000, RAMJobStore), modified my build.xml to put it in the web app's JAR file, and added the namespace and config line to configuration.xml. I didn't change any code, just relying on raiseTimedEvent() and @Observer. My log file now states that Quartz is running, but it happened again today at 10:25 am. Currently, the only way I can restart them is to restart the server, which is TERRIBLE!


      Does anyone know why the timer service would suddenly stop running? I found only one reference, and the fix there was to use Quartz, but they had actually changed code (used @Timer to try to get a handle on the event). We haven't changed any code; the only difference is that it is running on a VM now.


      Is there a way to be NOTIFIED if it stops running?
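
      For the notification part, one option that does not depend on the Seam timer at all is a heartbeat: each observer records when it last ran, and a separate plain-Java thread checks for silence. Below is a minimal sketch, not tested against Seam. The class name Heartbeat, the watch() helper and the onStall callback are all hypothetical names made up for illustration, and the alert action itself (log line, e-mail, restart hook) is left to the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: remembers when a timed observer last ran. */
final class Heartbeat {
    private final long maxSilenceMillis;
    private final AtomicLong lastBeatMillis;

    Heartbeat(long maxSilenceMillis, long nowMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
        this.lastBeatMillis = new AtomicLong(nowMillis);
    }

    /** Call this from the observer method each time it fires. */
    void beat(long nowMillis) {
        lastBeatMillis.set(nowMillis);
    }

    /** True once more than maxSilenceMillis has passed since the last beat. */
    boolean isStalled(long nowMillis) {
        return nowMillis - lastBeatMillis.get() > maxSilenceMillis;
    }

    /**
     * Starts a checker on its own thread, independent of the Seam timer.
     * onStall is whatever alert the caller chooses; it is an assumption here.
     */
    static ScheduledExecutorService watch(final Heartbeat heartbeat,
                                          long checkEveryMillis,
                                          final Runnable onStall) {
        ScheduledExecutorService checker = Executors.newSingleThreadScheduledExecutor();
        checker.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    if (heartbeat.isStalled(System.currentTimeMillis())) {
                        onStall.run();
                    }
                } catch (RuntimeException e) {
                    // Log and continue so a failing alert does not cancel the checker.
                    System.err.println("Watchdog check failed: " + e);
                }
            }
        }, checkEveryMillis, checkEveryMillis, TimeUnit.MILLISECONDS);
        return checker;
    }
}
```

      The observer would call beat(System.currentTimeMillis()) on every run. For the 5-second job, a silence window of something like 15 seconds would tolerate a late run or two. The returned executor should be shut down when the application is undeployed, otherwise its thread keeps running.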

        • 1. Re: Events started with Events.raiseTimedEvent(String, TimerSchedule) silently dying!

          I don't think scheduled events just disappear like that without errors. Most likely they are firing all right, and it is the observer that is not being called.


          One way of telling is to raise the log level for "org.jboss.seam.core.Events" to "TRACE". This will log a line "Processing event:...." for every event. It will probably make the log rather verbose, so remember to lower the log level again once the problem is diagnosed. You'll probably see that the event is being fired at the expected time. If this is the case, then either the observer component is no longer registered in the observers list, or the component was destroyed (manually or at the end of its lifecycle) and the observer method is not marked "autoCreate".


          If you don't see the "Processing event:..." line, then you can presume there's a problem in the scheduling part of the process, but that is much less likely.


          Finally, there could be an exception while executing the job and before the event is raised. This should be handled by the AsynchronousExceptionHandler class, which just logs the error (so you should notice the error in the logs).
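
          If you want to rule out exceptions escaping from the observer itself, a guard around the observer body is cheap. With a plain java.util.concurrent ScheduledExecutorService, a periodic task that throws has all its later runs suppressed. Whether the Seam dispatcher you are using behaves the same way is an assumption on my part. Below is a minimal sketch in plain Java. GuardedTask is a hypothetical name, and in a real Seam component the try/catch would simply sit inside the @Observer method.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper: runs a task and never lets a RuntimeException escape. */
final class GuardedTask implements Runnable {
    private final String name;
    private final Runnable body;
    private final AtomicInteger failures = new AtomicInteger();

    GuardedTask(String name, Runnable body) {
        this.name = name;
        this.body = body;
    }

    public void run() {
        try {
            body.run();
        } catch (RuntimeException e) {
            // Count and log the failure; the scheduler never sees the exception.
            failures.incrementAndGet();
            System.err.println("Timed task '" + name + "' failed: " + e);
        }
    }

    /** Number of runs that ended in a caught exception. */
    int failureCount() {
        return failures.get();
    }
}
```

          The failure counter is only there so that something else (a status page, a log summary) can report how often the job is failing instead of it dying quietly.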


          Hope this helps.

          • 2. Re: Events started with Events.raiseTimedEvent(String, TimerSchedule) silently dying!

            Well, this code has been running for years without this ever happening before, and three completely different services all stop firing at the same time.


            All three of them have the following annotations (different Names, of course).


            @Install(precedence = Install.BUILT_IN)


            They all have an @Create method to start the scheduled job with a line like this:


            Events.instance().raiseTimedEvent("utils.FileMonitor.checkFiles", new TimerSchedule(5000L, 5000L));


            And they all have an @Observer like this:


            public void checkFiles() {}


            They have run for months before without stopping, but now they stop firing (all at the same time) a couple of times a day.


            Now, our site HAS been getting a lot more traffic, but none of these are dependent on traffic, and we just upgraded to monster hardware a couple of months ago. As I stated in my first message, we had just switched to running on a Server 2008 VM, but it still has 4 processors and 12 GB of RAM, FAR better than the 5-year-old 32-bit platform with 3.5 GB we had previously been running on.


            I WILL turn on the logging the next time it dies. I raised the thread count to 50 (sounds like really bad bed sheets, doesn't it), so if threading was the issue, there is a chance that it won't die again.


            I have also implemented Quartz with a JobStoreTX in the database. That is running on a test machine just fine. I had to inspect the current JobDetails to make sure I don't start a job a second time, but that is working fine. However, I can't reproduce the failure on test, so I don't know if it will make a difference. If there really is an issue with the timing system failing, then Quartz may not be able to poll the database for jobs. :-/