2 Replies Latest reply on Nov 19, 2012 2:26 AM by dengyong

    Any way to configure duration time to delay backup activation

    dengyong

      Description:

       

      HornetQ: 2.3 BETA

      HA mode: shared store

       

      I am using HornetQ HA with a topology of one master node and one backup node. The HA group has the master failback option enabled.

       

      The master and backup are both running, and the backup is waiting to be activated. Currently, when the master dies, the backup is activated after a short time.

      I want that delay to be configurable, so that if the master can recover within 2 minutes, the backup is not activated at all.

       

      I want this feature because it is useful in the following case:

      1. My master is powerful.

      2. I have a watchdog for the master HornetQ. When the master dies, the watchdog restarts it.

      3. I want the backup to be activated only if the master cannot recover within a given duration (for example, because of a disk error or another fatal error).

       


      So can HornetQ be enhanced to support this?

        • 1. Re: Any way to configure duration time to delay backup activation
          gaohoward

          Why don't you let the backup take over automatically before you start the live server again? In a real case, I believe that if your live server crashes, you would need some time to investigate anyway. If your script simply restarts it immediately, it could well just fail again and again.

          • 2. Re: Any way to configure duration time to delay backup activation
            dengyong

            Gao:

            The HornetQ master may crash for different reasons, such as a JVM issue, and a restart may be enough to make it work again. So if the watchdog can restart HornetQ successfully, I don't want the backup to take over in the meantime. Otherwise, the live node switches twice: first from master to backup, and then, after the master recovers, from backup back to master.

             

            In fact, I think this would not be hard to add. Here is the rough idea:

            In the org.hornetq.core.server.impl.FileLockNodeManager.awaitLiveNode() code, when the backup node gets the live lock and the lock state is live, it would wait for a configurable duration. If it can still get the live lock after that duration, the method returns and the backup activates.
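
            A minimal sketch of that delay logic, to make the idea concrete. This is not HornetQ code: the class DelayedActivation, the method shouldActivate, and the probe canGetLiveLock are hypothetical names used only for illustration. The probe is assumed to try the live lock and release it again, so that a restarted master can reacquire the lock during the wait.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of a delayed backup activation check.
// None of these names exist in HornetQ.
class DelayedActivation {

    /**
     * Returns true only if the live lock is obtainable both before and
     * after the delay, i.e. the master did not come back in time.
     *
     * @param canGetLiveLock assumed probe: tries the live lock, releases it,
     *                       and reports whether the attempt succeeded
     * @param delayMillis    how long to give the master to recover
     */
    static boolean shouldActivate(BooleanSupplier canGetLiveLock, long delayMillis) {
        if (!canGetLiveLock.getAsBoolean()) {
            return false; // the master still holds the lock, nothing to do
        }
        try {
            Thread.sleep(delayMillis); // give the watchdog time to restart the master
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // do not activate if the wait was cut short
        }
        // Activate only if the lock is still free after the delay.
        return canGetLiveLock.getAsBoolean();
    }

    public static void main(String[] args) {
        // Case 1: the master stays down for the whole delay.
        System.out.println(shouldActivate(() -> true, 100));

        // Case 2: the master is down at first but has retaken the lock
        // by the time of the second probe.
        int[] probes = {0};
        System.out.println(shouldActivate(() -> probes[0]++ == 0, 100));
    }
}
```

            In a real change the delay would presumably come from a new configuration setting, and the backup would keep waiting for the live lock after a false result instead of giving up.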

             

                 What do you think of the idea?

             

             
