5 Replies Latest reply on Dec 14, 2011 7:35 PM by rhusar

    Redeployment errors

    johnflores

      Hi

       

      I have a cluster of JBoss AS 7.0.2 servers that use mod_cluster with Apache front-end servers. I use mod_cluster 1.1.3.

       

      The problem I have is that when I redeploy the ROOT.war to all the servers, some of the current sessions get strange errors, mostly 500 status code error messages.

       

      It's strange; I thought this problem would be solved by disabling the relevant node through the cluster manager. I have a script that calls the HTTP listener page to disable each node in turn. It then deploys the new archive, which reactivates the context on the Apache side automatically.

       

      I saw that the plain Apache proxy module has a failonstatus option for the ProxyPass configuration. This option was not recognized by the mod_cluster implementation (it's in Apache 2.2.20).

       

      Is there a way to stage a redeploy across a cluster without getting these errors?

       

      Should I wait after disabling the context to allow the sessions to drain (put a sleep in between)? Can I force them to fail without getting errors?

       

      In my particular case my archive is not clustered, and I don't much care if a session fails since state is rebuilt automatically, but it would be nice to avoid the 500 errors, plus the front page sometimes displaying by default at redeploy (very strange). Any ideas?

       

      Regards

      John

        • 1. Re: Redeployment errors
          jfclere

          Where are the 500 errors coming from: AS7 or httpd?

           

          What you describe is weird; mod_cluster should fail over the request from one node to the other without problems.

          • 2. Re: Redeployment errors
            johnflores

            The 500 error comes from JBoss, with the standard JBoss error page formatting. Yes, I would have thought failover would catch that too.

             

            Is it because the redeploy issues a DISABLE_APP and not a STOP_APP, or at least not quickly enough? Maybe it should just call STOP_APP?

             

            As a temporary workaround, I changed my script to call DISABLE_APP on the web server instances, wait for 5 minutes (my clients typically have small sessions [mobile]), then do a STOP_APP, followed by the redeploy. That forces any leftover sessions to fail over. What is interesting is that when I then redeploy the archive the status goes back to DISABLE_APP, which technically would allow some outlier sessions to be routed through.
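The workaround described here can be sketched roughly as follows. This is only an illustration of the ordering (disable, wait, stop, redeploy), not the actual script: `staged_redeploy`, `send_command`, and `redeploy` are hypothetical names, and the two callbacks stand in for whatever mechanism actually sends the command to the Apache front ends and deploys the archive.

```python
import time
from typing import Callable, Iterable


def staged_redeploy(
    nodes: Iterable[str],
    send_command: Callable[[str, str], None],  # hypothetical: (node, command) -> None
    redeploy: Callable[[str], None],           # hypothetical: deploys the new archive on a node
    drain_seconds: float = 300.0,              # the 5 minute wait mentioned in the post
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Redeploy one node at a time: disable, wait, stop, then redeploy."""
    for node in nodes:
        send_command(node, "DISABLE_APP")  # ask the front ends to disable the context
        sleep(drain_seconds)               # give existing sessions time to finish
        send_command(node, "STOP_APP")     # leftover sessions are forced to fail over
        redeploy(node)                     # deploy the new archive on this node
```

Passing `sleep` as a parameter keeps the sequence easy to dry-run without actually waiting five minutes per node.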

            • 3. Re: Redeployment errors
              jfclere

              The redeploy should do:

              DISABLE_APP

              wait for existing sessions to move to another node

              STOP_APP

               

              Between the DISABLE_APP and the STOP_APP, only requests with a session ID will be routed to the node.
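The routing rule stated here can be written down as a small model. This is a simplified sketch of the behaviour as described in this thread, not mod_cluster code; `ContextState` and `node_accepts` are hypothetical names introduced for illustration.

```python
from enum import Enum


class ContextState(Enum):
    ENABLED = "ENABLED"    # normal operation
    DISABLED = "DISABLED"  # after DISABLE_APP
    STOPPED = "STOPPED"    # after STOP_APP


def node_accepts(state: ContextState, has_session_id: bool) -> bool:
    """Return True if a request may be routed to the node in this model."""
    if state is ContextState.ENABLED:
        return True                # all requests are eligible
    if state is ContextState.DISABLED:
        return has_session_id      # only requests that already carry a session ID
    return False                   # stopped: nothing is routed to the node
```

In this model a context that falls back to DISABLED during a redeploy would still accept requests that carry a session ID, which matches the concern about outlier sessions raised earlier in the thread.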

               

              I am not sure about the other problem:

              "when I then redeploy the archive the status goes back to DISABLE_APP"

              Do you mean that if you undeploy myapp.war and then deploy myapp.war, the status of the context is DISABLED for a while?

              • 4. Re: Redeployment errors
                johnflores

                Well, to force the session movement I issue the DISABLE_APP and STOP_APP commands to the Apache server farm via a curl script. Essentially, I mimic the mod_cluster commands that are issued on undeploy, but I wait for 5 minutes in between, and then I do the actual redeploy of the WAR. While the app is redeploying (it's a ROOT.war) the status goes back to DISABLED; I'm thinking the redeploy doesn't check whether it's already in a STOPPED state. Then, once it's redeployed, it goes back to ENABLED as expected.

                 

                In the case where I didn't have the script do the DISABLE, wait, then STOP, I believe the 500 errors occurred while the container was in the undeploy process. I would also sometimes get CDI errors when a page was being processed, as if the undeploy was occurring while a page request was in progress.

                • 5. Re: Redeployment errors
                  rhusar

                  Any update? BTW what are the CDI errors?