4 Replies Latest reply on Mar 31, 2006 3:08 AM by belaban

    FD_SIMPLE

      I've been reading many discussions around FD vs. FD_SOCK for the preferred failure detection protocol, but I've only seen FD_SIMPLE referred to in the JBoss AS documentation on JBossCache and JGroups: http://docs.jboss.org/jbossas/jboss4guide/r4/html/jbosscache.chapt.html

      Is FD_SIMPLE available in JGroups version 2.2.7 / JBoss 4.0.2? If so, is it recommended?

      The problem we're facing is false suspicions among some members of the cluster. I don't want to change JGroups versions, so I'm not ready to move to FD_SOCK due to the TCP KeepAlive issues. I was planning on increasing the timeout in the FD configuration, but FD_SIMPLE has caught my eye and deserves some investigation.

        • 1. Re: FD_SIMPLE
          belaban

          Hi Tyler,

          FD_SIMPLE (a) is somewhat nondeterministic and (b) has never really been used in production.
          The current best practices recommendation is to use *both* FD_SOCK *and* FD.
          - FD_SOCK will detect immediately whether a server has crashed or not
          - FD will act as second line of defense, handling cases like switch or host crashes

          FD_SOCK in pre-2.3 doesn't set KEEP_ALIVE by default, so I suggest you look at the 2.3 sources and add this option yourself (setKeepAlive()).
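
For readers unfamiliar with the option mentioned above, here is a minimal sketch of what setKeepAlive() does on a plain java.net.Socket. It is not JGroups code: the class name KeepAliveSketch and its helper methods are hypothetical, and where exactly FD_SOCK creates its sockets in 2.2.7 or 2.3 is not shown here, so treat this only as an illustration of the standard JDK call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper class, for illustration only; not part of JGroups.
class KeepAliveSketch {

    // Creates an unconnected socket and switches on SO_KEEPALIVE.
    static Socket newKeepAliveSocket() {
        Socket socket = new Socket();
        try {
            // Standard JDK call: asks the TCP stack to probe an idle connection.
            socket.setKeepAlive(true);
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return socket;
    }

    // Reports whether SO_KEEPALIVE is currently enabled on the socket.
    static boolean isKeepAliveOn(Socket socket) {
        try {
            return socket.getKeepAlive();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (Socket socket = newKeepAliveSocket()) {
            System.out.println("keep-alive enabled: " + isKeepAliveOn(socket));
        }
    }
}
```

Note that the interval between keep-alive probes is controlled by the operating system rather than by this call, so detection through keep-alive alone can be slow.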

          • 2. Re: FD_SIMPLE

            Thanks for the reply, Bela.

            As an aside, the wiki page on Failure Detection

            http://wiki.jboss.org/wiki/Wiki.jsp?page=FDVersusFD_SOCK
            has a broken link at the bottom, which should point to the relevant issue in JIRA. The href has a trailing ".", which breaks it.

            I think it makes more sense for us to upgrade to 2.3 than to alter the source of 2.2.7.

            • 3. Re: FD_SIMPLE

              Just to clarify, I mean upgrade our production systems. I'm still going to mess around with 2.2.7 :)

              Thanks again for the advice.

              • 4. Re: FD_SIMPLE
                belaban

                Fixed the link, thanks.