4 Replies Latest reply on Dec 4, 2013 5:46 AM by papegaaij

    Using mod_cluster for live updating

    papegaaij

      We are investigating live updating (updating without noticeable downtime to the end-user) for our application, and would like to use mod_cluster in this context. Our application runs on a JBoss AS 7 cluster with 4 hosts. 2 hosts serve a stateless RESTful webservice. The other 2 hosts serve several Wicket web applications. mod_cluster is used to balance load within these server groups.

       

      Over the last few weeks, we've conducted some experiments, trying various migration strategies. At the moment, we are able to perform live updates for our stateless RESTful webservice in the following way: start 2 new server instances on the 2 hosts with the contexts disabled. When the servers are fully started, open the mod_cluster-manager, enable the contexts for the new servers and disable the contexts for the old servers. All new requests will now be handled by the new version. This approach works very well for the stateless webservice, but does not work for stateful web applications.
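
The swap described above can be modelled in a few lines of Python. This is a minimal sketch, not mod_cluster code: the node names and the `swap_contexts` helper are hypothetical stand-ins for clicking Enable/Disable in mod_cluster-manager. It only illustrates that enabling the new contexts before disabling the old ones leaves at least one enabled context at every step.

```python
# Toy model of the context swap; node names and helpers are hypothetical.
ENABLED, DISABLED = "ENABLED", "DISABLED"

def count_enabled(contexts):
    """Number of contexts that currently accept new requests."""
    return sum(1 for state in contexts.values() if state == ENABLED)

def swap_contexts(contexts, old_nodes, new_nodes):
    """Enable the new contexts first, then disable the old ones.

    Returns the number of enabled contexts observed after each step.
    """
    enabled_counts = []
    for node in new_nodes:
        contexts[node] = ENABLED
        enabled_counts.append(count_enabled(contexts))
    for node in old_nodes:
        contexts[node] = DISABLED
        enabled_counts.append(count_enabled(contexts))
    return enabled_counts

# Old servers are live; new servers were started with their contexts disabled.
contexts = {"old-a": ENABLED, "old-b": ENABLED, "new-a": DISABLED, "new-b": DISABLED}
history = swap_contexts(contexts, ["old-a", "old-b"], ["new-a", "new-b"])
print(history)   # enabled contexts after each step
print(contexts)  # final state of every context
```

Doing the two loops in the opposite order would pass through a step with no enabled context at all, which is the downtime we are trying to avoid.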

       

      Load balancing over our web applications uses sticky sessions. We do have session replication, but locality of the requests improves performance. One thing we cannot do is replicate sessions to a newer version of the application, because the serialized data is often not compatible. Therefore, we would like to let the old version drain as users log out, while new sessions start on the new servers. After a while, the old servers should be (almost) idle, and we can stop them. However, we cannot find a way to do this with mod_cluster. We've tried to start the servers with a capacity of 0, but this is not allowed, and you cannot change the capacity without restarting the servers. There seems to be no option to disable a context for new sessions but keep it active for existing sessions. Is there anything we've missed?

        • 1. Re: Using mod_cluster for live updating
          jfclere

          You need to use sessionDrainingStrategy (session-draining-strategy) and a stopContextTimeout (stop-context-timeout) big enough to prevent the applications from being restarted until all sessions are gone.

          You have to fail over to one server (STOP the contexts of the other node) and keep this one up (DISABLED) until the sessions are drained. Additionally, you need to configure the new node so that it won't form a cluster with the old one.

          Note that https://bugzilla.redhat.com/show_bug.cgi?id=922042 might prevent setting the session-draining-strategy, but the default value should work for that scenario.

          • 2. Re: Using mod_cluster for live updating
            papegaaij

            session-draining-strategy is indeed missing in JBoss AS7, but it's available in WildFly, which we plan to migrate to soon. If I understand you correctly, we have to stop the contexts, except the last, causing a fail-over to this last server. Then, we also stop this context, and wait for the sessions to drain.

             

            What do you mean by 'you need to configure the new node so that it won't form a cluster with the old one'? They have to be in the same LBGroup, don't they? At the moment, they do not form a cluster for session replication, because they are on different JGroups channels.

             

            The documentation (Chapter 9. Server-side Configuration Properties) states that the default sessionDrainingStrategy (DEFAULT) drains sessions before web application undeploy only if the web application is non-distributable. Does non-distributable mean 'there is no server to distribute to' (i.e. the last running server) in this context? I ask because our web application is distributable, as marked with <distributable/> in web.xml.
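
To make the question concrete, here is how the documented behaviour reads when written as a small decision function. This is a sketch of one reading of the documentation, not mod_cluster source: `should_drain` is a hypothetical name, and it assumes that "distributable" refers to the `<distributable/>` marker in web.xml, which is exactly the part we are unsure about. DEFAULT, ALWAYS and NEVER are the three values listed in the documentation.

```python
from enum import Enum

class SessionDrainingStrategy(Enum):
    DEFAULT = "DEFAULT"
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"

def should_drain(strategy, distributable):
    """Return True if sessions are drained before the web application is undeployed.

    Assumption: `distributable` means web.xml contains <distributable/>.
    """
    if strategy is SessionDrainingStrategy.ALWAYS:
        return True
    if strategy is SessionDrainingStrategy.NEVER:
        return False
    # DEFAULT: drain only when the application is non-distributable.
    return not distributable

# Our application is distributable, so under this reading DEFAULT would not drain.
print(should_drain(SessionDrainingStrategy.DEFAULT, distributable=True))
print(should_drain(SessionDrainingStrategy.ALWAYS, distributable=True))
```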

             

            Thanks for the help so far. We are really impressed by the flexibility mod_cluster offers, without complicating the configuration.

            • 3. Re: Using mod_cluster for live updating
              jfclere

              The idea is that node1 and node2 are clustered (otherwise you can't fail over the sessions).

              Take 2 nodes, node1 and node2, in a load-balancing-group group12.

              1 - node1 | STOP

              The sessions have to be replicated to node2, and new requests go to node2 too.

              To prevent sessions from going to node1, reconfigure node1 as node3, change its load-balancing-group to group32, make sure it is on a different JGroups channel, and restart it. Once it has restarted:

              2 - node2 | DISABLE

              New requests will go to node3 and requests for existing sessions will go to node2 (the sessions from node1 will be routed to node2 because they belong to group12).

              Once all the sessions are gone from node2, stop it, reconfigure it into the load-balancing-group group32 and the same JGroups channel as node3, and restart it.
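
The steps above can be sketched as a toy routing model in Python. It is only an illustration under simplifying assumptions, not mod_cluster code: the `Node` class and `route` function are hypothetical, a STOPPED context rejects everything, a DISABLED context keeps serving existing sessions but takes no new ones, and fail-over stays within the load-balancing group.

```python
# Toy model of the routing rules described above; not mod_cluster code.
ENABLED, DISABLED, STOPPED = "ENABLED", "DISABLED", "STOPPED"

class Node:
    def __init__(self, name, group, state=ENABLED):
        self.name = name
        self.group = group      # load-balancing-group
        self.state = state      # state of the application context

def route(nodes, session=None):
    """Return the name of the node that serves a request, or None.

    `session` is None for a request without a session, otherwise a
    (node_name, group) pair recorded when the session was created.
    """
    if session is not None:
        owner_name, group = session
        # Sticky: the owning node keeps its sessions unless it is STOPPED or gone.
        for node in nodes:
            if node.name == owner_name and node.state != STOPPED:
                return node.name
        # Fail over inside the same load-balancing group, DISABLED nodes included.
        for node in nodes:
            if node.group == group and node.state != STOPPED:
                return node.name
    # Assumption: everything else only lands on ENABLED contexts.
    for node in nodes:
        if node.state == ENABLED:
            return node.name
    return None

node1 = Node("node1", "group12")
node2 = Node("node2", "group12")
nodes = [node1, node2]
session_a = ("node1", "group12")    # session created on node1 before the update

# 1 - node1 | STOP: its sessions fail over to node2.
node1.state = STOPPED
after_stop = route(nodes, session_a)

# node1 is reconfigured and restarted as node3 in group32.
nodes = [node2, Node("node3", "group32")]

# 2 - node2 | DISABLE: it keeps existing sessions but takes no new ones.
node2.state = DISABLED
old_session_target = route(nodes, session_a)
new_session_target = route(nodes)

print(after_stop, old_session_target, new_session_target)
```

In this model the session that started on node1 stays on node2 for the whole procedure, while requests without a session only ever reach node3.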

               

              The session-draining-strategy needs to be changed to ALWAYS to make this move more automatic.

              • 4. Re: Using mod_cluster for live updating
                papegaaij

                Ok, I think I understand what you mean. I'll give it a try later this week. Thanks for the help!