19 Replies. Latest reply on Jul 12, 2007 3:44 PM by sem

    strange failover behaviour in clustered config

      Hi,
      I'm testing failover scenarios with JBoss Messaging 1.3.0.GA running on JBoss 4.2.1.GA (built from the branch, as it is not yet released but is almost ready for release) on a two-node cluster.
      I installed the clustered JBoss Messaging 1.3.0.GA server using the release-admin.xml build script, and additionally made the following changes on both nodes:
      1. replaced hsqldb-persistence-service.xml with clustered-oracle-persistence-service.xml
      2. added a clustered queue in destination-service.xml:
      <mbean code="org.jboss.jms.server.destination.QueueService"
             name="jboss.messaging.destination:service=Queue,name=mytestqueue"
             xmbean-dd="xmdesc/Queue-xmbean.xml">
         <depends optional-attribute-name="ServerPeer">jboss.messaging:service=ServerPeer</depends>
         <depends>jboss.messaging:service=PostOffice</depends>
         <attribute name="Clustered">true</attribute>
      </mbean>

      3. changed the MessagePullPolicy and ClusterRouterFactory attributes in the clustered post office configuration:
      <attribute name="MessagePullPolicy">org.jboss.messaging.core.plugin.postoffice.cluster.DefaultMessagePullPolicy</attribute>
      <attribute name="ClusterRouterFactory">org.jboss.messaging.core.plugin.postoffice.cluster.RoundRobinRouterFactory</attribute>

      4. added attributes for the ConnectionFactory and XAConnectionFactory in connection-factories-service.xml:
      <attribute name="SupportsFailover">true</attribute>
      <attribute name="SupportsLoadBalancing">true</attribute>

      In messaging-service.xml I set the ServerPeerID to 1 on one node and to 2 on the other.

      I started both nodes and pushed about 1000 JMS messages to the queue.
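      The sender was nothing special - roughly the sketch below. The JNDI names (/ConnectionFactory, /queue/mytestqueue) and the provider URL are just the defaults for my setup; adjust them to your own bindings.

        import java.util.Properties;
        import javax.jms.Connection;
        import javax.jms.ConnectionFactory;
        import javax.jms.MessageProducer;
        import javax.jms.Queue;
        import javax.jms.Session;
        import javax.naming.InitialContext;

        // Pushes ~1000 test messages to the clustered queue.
        public class QueueLoader {
            public static void main(String[] args) throws Exception {
                Properties env = new Properties();
                env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
                env.put("java.naming.provider.url", "jnp://localhost:1099");
                InitialContext ic = new InitialContext(env);

                ConnectionFactory cf = (ConnectionFactory) ic.lookup("/ConnectionFactory");
                Queue queue = (Queue) ic.lookup("/queue/mytestqueue");

                Connection conn = cf.createConnection();
                try {
                    Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                    MessageProducer producer = session.createProducer(queue);
                    for (int i = 0; i < 1000; i++) {
                        producer.send(session.createTextMessage("message " + i));
                    }
                } finally {
                    conn.close();
                }
            }
        }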
      Then I deployed an MDB which listens on this queue and does some work with the JMS messages.
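      The MDB itself is a minimal EJB3 bean along these lines (the class name and the "work" in onMessage are just placeholders for what I actually do):

        import javax.ejb.ActivationConfigProperty;
        import javax.ejb.MessageDriven;
        import javax.jms.Message;
        import javax.jms.MessageListener;
        import javax.jms.TextMessage;

        // Listens on the clustered queue configured above.
        @MessageDriven(activationConfig = {
            @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
            @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/mytestqueue")
        })
        public class MyTestMDB implements MessageListener {
            public void onMessage(Message message) {
                try {
                    if (message instanceof TextMessage) {
                        // placeholder for the real processing
                        System.out.println("Processed: " + ((TextMessage) message).getText());
                    }
                } catch (Exception e) {
                    // rethrow so the container rolls back and redelivers the message
                    throw new RuntimeException(e);
                }
            }
        }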

      So far so good.
      Messages were spread nicely across the nodes. After some time I killed one node (I'll call it node A from now on).
      I saw that its messages were correctly taken over by the remaining node (node B).

      Then I killed node B too, so no nodes were running.

      Then I started node A, but no messages arrived at my MDB. I checked the database and saw that the messages were still there in the JBM_MSG table.

      Then I started node B, and messages started to arrive at both node A and node B again.

      How can I avoid the situation where existing messages are not delivered to the MDB?
      Is this by design, or a bug?
      Thanks in advance, Ramil

        • 1. Re: strange failover behaviour in clustered config
          clebert.suconic

          When you killed nodeA, all your messages were merged into nodeB.

          When you killed nodeB, you didn't have any nodes left to accept the failover.

          Later you started nodeA... nothing was merged onto nodeA, since nodeB still owned those messages.

          When you started nodeB back, its messages were loaded into the cluster again.



          I would say you always need at least one server up in the cluster.

          For nodeA to assume the messages before nodeB is loaded, we would need to merge nodeB's messages (the way you're describing)... but I'm not sure that's a good idea, as we have no control over when nodeB will be loaded. (Imagine if nodeB is loaded just a few seconds after nodeA.)

          • 2. Re: strange failover behaviour in clustered config
            ramazanyich

            Thanks for the explanation.
            Just a real-world case for clarification:
            imagine node A crashes completely (disk failure), meaning I will not be able to start it again.
            Do I understand correctly that I would have to install the JBoss Messaging server on a new machine and assign it the same server peer ID that node A had, to be able to process the remaining messages?

            • 3. Re: strange failover behaviour in clustered config
              clebert.suconic

              If nodeA was the last node to crash and is lost forever (it will never come back again), you could update the server ID in the database (or have another node take the same ID, as you described).



              • 4. Re: strange failover behaviour in clustered config
                sem

                Sounds like a potential problem, because it would be difficult to monitor these things in a production environment.
                Couldn't you just keep a kind of heartbeat record in the database for every node, like Quartz does for example? That way it's easy to detect dead nodes.
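                Something along these lines, just to sketch the idea - the JBM_NODE_HEARTBEAT table and its columns are made up by me, nothing like it exists in JBoss Messaging:

                  import java.sql.Connection;
                  import java.sql.PreparedStatement;
                  import java.sql.ResultSet;

                  // Quartz-style heartbeat sketch: every live node periodically updates its
                  // row; any row whose LAST_CHECKIN is older than the timeout marks a dead node.
                  public class HeartbeatCheck {
                      private static final long TIMEOUT_MS = 30000;

                      // called periodically by each live node
                      public static void checkIn(Connection con, int nodeId) throws Exception {
                          PreparedStatement ps = con.prepareStatement(
                              "UPDATE JBM_NODE_HEARTBEAT SET LAST_CHECKIN = ? WHERE NODE_ID = ?");
                          ps.setLong(1, System.currentTimeMillis());
                          ps.setInt(2, nodeId);
                          ps.executeUpdate();
                          ps.close();
                      }

                      // any surviving node (or an admin tool) can run this to find dead nodes
                      public static void printDeadNodes(Connection con) throws Exception {
                          PreparedStatement ps = con.prepareStatement(
                              "SELECT NODE_ID FROM JBM_NODE_HEARTBEAT WHERE LAST_CHECKIN < ?");
                          ps.setLong(1, System.currentTimeMillis() - TIMEOUT_MS);
                          ResultSet rs = ps.executeQuery();
                          while (rs.next()) {
                              System.out.println("possibly dead node: " + rs.getInt("NODE_ID"));
                          }
                          rs.close();
                          ps.close();
                      }
                  }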


                Don't kill the messenger, just an idea.
                Thanks

                • 5. Re: strange failover behaviour in clustered config
                  timfox

                   

                  "sem" wrote:
                  Sounds like potential problem because it would be difficult to monitor this things on production environment.


                  Can you explain in more detail?

                  I didn't quite understand.

                  Thanks.

                  • 6. Re: strange failover behaviour in clustered config
                    sem

                    Very simple.
                    For example, we have 2 nodes in production and a lot of messages in the queue. Power goes down. One node does not start up. How do we detect whether there are still messages for the dead node? We have no direct access to the database server.
                    Would we somehow need to write an application to query the JMS tables?

                    In Quartz both nodes register themselves in the DB, so if one is dead the other one can detect it easily.

                    • 7. Re: strange failover behaviour in clustered config
                      timfox

                      If one node crashes and server-side failover is enabled, then the other node will take over the failed node's messages.

                      Now suppose both nodes crash at exactly the same time, e.g. power goes out to both nodes simultaneously.

                      When you turn the power back on, you start up both nodes and find that one does not start - you want the other node to take over the messages from the node that does not start?

                      The problem with this is: how can the node that does start know that the other node really can't start, rather than the sysadmin simply not having started it yet?

                      E.g. if you had 10 nodes, each with their own messages, and they were all currently down.

                      The sysadmin then starts them one by one.

                      According to what you want, the first node would start, and then say "look... the other nodes aren't alive so I'm going to take over all their messages".

                      Then the sysadmin starts the other nodes, and you'd end up with all the messages on the one node (the first node started) - which is not good.

                      In most cases, if the power went off, then it's more than likely the nodes will be startable after the power comes back on, and the sysadmin will just start them all - in this case we *don't* want nodes to take over other nodes' messages.

                      I think we should cater for the most common case (i.e. the nodes *are* startable after failure), and leave the less common case to require manual intervention (your case, where the node isn't startable after failure).

                      If you can think of a way of automatically dealing with both cases then I am open to suggestion, although I can't think of one right now.

                      Adding a flag in the database for each node won't help, since it doesn't tell you whether the node is startable.

                      • 8. Re: strange failover behaviour in clustered config
                        ramazanyich

                        But since the router policy is configured properly on all nodes (using round robin),
                        it should not be a problem if all messages are taken over by the first node to start.
                        Then, as the other nodes come up, the JMS messages will be spread correctly across them again. Or am I not correct?

                        • 9. Re: strange failover behaviour in clustered config
                          timfox

                           

                          "ramazanyich" wrote:
                          but as on all nodes routerpolicy configured properly (using roundrobin)
                          then it should not be a problem if all messages are overtaken by first started node.
                          Then other nodes will come up JMS messages will be spreaded again correctly to other nodes. Or I'm not correct ?


                          Well, yes, IF you have configured it this way, but not everyone wants redistribution.

                          Secondly, this puts a big strain on the first node: the operation to merge queues in the database is fairly heavyweight, the node has to load all these messages (memory issues), and then all these messages have to be shifted off this node again - which is very CPU and IO intensive.

                          • 10. Re: strange failover behaviour in clustered config
                            sem

                            Don't you think there should be an option to do this in a high-availability environment?
                            We have very strict SLAs. Messages need to be processed within an appropriate amount of time, no matter what. With the current JBossMQ (running as a singleton) this is not a problem. But with JBoss Messaging we would need to write a lot of code to basically redistribute things manually somehow.

                            • 11. Re: strange failover behaviour in clustered config
                              timfox

                              Well, we could add such a switch, but it doesn't tackle the problem of all messages going to one node every time you do a normal startup.

                              Also, going forward, most people want to move away from the old-style "JBoss MQ" model where you have a single shared database which all nodes use, since this turns into a performance bottleneck.

                              For better scalability, we're going to support each node having its own file-based persistent storage, or its own database. (Of course we'll support the old model too.)

                              Typically all the file based stores would be persisted on some kind of SAN or shared file system with redundancy built in.

                              If a node fails, another one takes over.

                              When starting up from a complete power failure, if a particular node doesn't start - e.g. the box is dead - then you could just start it on another node with the same server ID; this could be done with a simple script.

                              You could probably do something similar, and then you wouldn't have to worry about starting the node from the same crashed box.

                              But having all messages on one node is not really a scalable solution.

                              • 12. Re: strange failover behaviour in clustered config
                                timfox

                                You should also consider that if power goes out on the entire cluster, then when it comes back on you have to start the servers again anyway.

                                So, when you try to start the server on node A and it fails because the box is hosed, couldn't you just start the same server on a different node?

                                It's exactly the same command line you'd be running, you'd just be running it on a different node.

                                • 13. Re: strange failover behaviour in clustered config
                                  clebert.suconic

                                  Just an idea: it would be possible to expose a method to merge queues from a dead node, like they're describing, so a human who knows the server will never come back could start the merge procedure. But I feel like this is way too dangerous and error-prone!

                                  • 14. Re: strange failover behaviour in clustered config
                                    timfox

                                     

                                    "clebert.suconic@jboss.com" wrote:
                                    Just an idea: It would be possible to expose a method to mergeQueus from a dead node like they're describing, so a Human knowing the server will never come back could start the merge procedure. But I feel like this is way too dangerous and error prone!


                                     You wouldn't need to do that if you just started a new server with the same ID on a different node - why is this such an issue?
