5 Replies Latest reply on Apr 12, 2003 7:44 AM by dcartier

    Deadlock in Clustering?

    dcartier

      Hello,

      I have an issue that I have been experiencing every few weeks. It has happened ~3 times. My cluster starts to slow down and process HTTP requests slower and slower, eventually causing the new requests to pile up in Apache. If I reboot the JBoss of the affected node, it takes forever to shut down, minutes rather than seconds. When the rebooted node starts up, it hangs at the start of the clustering right after it prints the GMS address and mentions Ctrl-C. It hangs here indefinitely. The only way to get it to continue is to reboot all the remaining nodes of the cluster, and then the hung nodes continue booting up. Once the problem has occurred, any rebooted node will hang as long as at least 1 node of the cluster remains running.

      Background:

      4 nodes running
      Apache 2.0.44
      JBoss 3.0.6
      mod_jk 1.2.2
      ~400K requests per day

      The distributable sessions are using the Tomcat default (in-memory JavaGroups?). All sessions are set to time out after 10 minutes.

      Questions:

      Is there any way to look inside the clustering system to see whether stale or leftover objects are accumulating?
      Has anyone experienced anything similar to this?

      Thanks,

      Dennis





        • 1. Re: Deadlock in Clustering?
          belaban

          > My cluster
          > starts to slow down and process HTTP requests slower
          > and slower, eventually causing the new requests to
          > pile up in Apache.


          I assume you are using Apache/mod_jk/Tomcat ?


          > If I reboot the JBoss of the
          > affected node, it takes forever to shut down, minutes
          > rather than seconds.


          Can you take a stack trace next time this happens ? Also, a log would be helpful (check doc for how to turn tracing on).


          > When the rebooted node starts
          > up, it hangs at the start of the clustering right
          > after it prints the GMS address and mentions Ctrl-C.


          Again, take a stack trace to see where it's hanging.

          Can you also post cluster-service.xml ?

          Bela

          • 2. Re: Deadlock in Clustering?
            dcartier

            Hi Bela,

            > I assume you are using Apache/mod_jk/Tomcat ?

            Yes.

            > Can you take a stack trace next time this happens ? Also, a log would be helpful (check doc for how to turn tracing on).

            Ok, I will test turning tracing on now to make sure I know exactly how to do it. As for the stack trace, can you tell me what signal I can send to JBoss to prompt a stack trace? On Windows it is Ctrl-C, but I am running Linux and have JBoss running in the background. I think it might be interrupt (INT or -2)?

            I can post the cluster-service.xml if you wish, but I have not altered it from the default that ships with 3.0.6. If you still need to see it let me know.

            Thanks,

            Dennis

            • 3. Re: Deadlock in Clustering?
              belaban

              It is signal 3 (SIGQUIT) on Linux: kill -3 <pid>.
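
              For a JBoss that was started in the background, sending the signal could look roughly like the sketch below. It is only a sketch: thread_dump is a made-up helper name (not part of JBoss or the JDK), and it assumes you have already looked up the JVM's process ID, for example with ps.

```shell
# Minimal sketch: ask a running JVM for a thread dump on Linux.
# thread_dump is a hypothetical helper name, not part of JBoss or the JDK.
thread_dump() {
  pid=$1
  # kill -0 delivers no signal; it only checks that the process exists
  # and that we are allowed to signal it.
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "no such process: $pid" >&2
    return 1
  fi
  # Signal 3 is SIGQUIT. A JVM answers it by printing the stack of every
  # thread to its standard output and then keeps running; most other
  # programs are terminated by it, so double-check the PID first.
  kill -QUIT "$pid"
}
```

              Call it as thread_dump 12345, where 12345 stands in for the real process ID. The dump shows up wherever the JVM's standard output was sent when JBoss was started, so look in that file or terminal rather than in the shell that sent the signal.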

              Bela

              • 4. Re: Deadlock in Clustering?
                slaboure

                When this occurs, PLEASE generate a stack trace dump (Ctrl+Break in the console on Windows). Without it, we won't be able to help you.

                Cheers,


                sacha

                • 5. Re: Deadlock in Clustering?
                  dcartier

                  Just thought I would update this report.

                  This has not occurred again since I brought a corrupt cookie issue to the attention of the Tomcat people and they patched it. I suspect this issue was causing the deadlock, perhaps by generating an unusually high number of sessions (all of which needed to be replicated to nodes joining the cluster).

                  This patch is in Tomcat 4.1.24, so I would recommend that anyone using Tomcat with Apache who experiences this upgrade to JBoss 3.0.7 to get the fix.

                  Dennis