10 Replies Latest reply on Oct 25, 2013 3:28 AM by pathduck

    Installing 4.9: Connection refused in server installer script

    pathduck

      Hello,

      I reported this in bz since I've had problems accessing the forums:

      https://bugzilla.redhat.com/show_bug.cgi?id=1019823

       

      When running the RHQ server installer with command:

      rhqctl upgrade --from-server-dir=/opt/rhq/rhq-server.OLD

       

      I get the following in the output and rhq-storage-installer.log and the RHQ server won't start.

       

      14:08:55,695 INFO [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused
      14:08:55,696 INFO [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 12000 ms...
      

       

      This keeps coming until I abort the installation. Cassandra however is running and logging without exceptions.

      Trying to start the RHQ server I get an error about the server not being installed.

       

      [rhqadmin@d26apvl007 rhq-server]$ ./bin/rhqctl start --server

      15:03:08,027 INFO [org.jboss.modules] JBoss Modules version 1.2.0.CR1

      15:03:08,265 WARN [org.rhq.server.control.command.Start] It appears that the server is not installed. The --server option will be ignored.

       

      [rhqadmin@d26apvl007 rhq-server]$ ./bin/rhqctl status

      15:03:45,388 INFO [org.jboss.modules] JBoss Modules version 1.2.0.CR1

      RHQ Storage Node (pid 10592 ) IS running

       

      Version-Release number of selected component (if applicable):

      RHQ Server 4.9 final

      Red Hat Enterprise Linux Server release 6.4 (Santiago)

      Linux d26apvl007.test.local 2.6.32-358.14.1.el6.x86_64 #1 SMP Mon Jun 17 15:54:20 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

       

      Steps to Reproduce:

      1. Back up old installation to rhq-server.old

      2. Do needed edits in rhq-server.properties

      3. Run command rhqctl upgrade --from-server-dir=/opt/rhq/rhq-server.OLD

       

      Actual results:

      Cassandra is started but installer cannot contact it, and fails.

       

      Expected results:

      Expect that installer also installs the RHQ-server.

       

      Additional info:

      Upgrading from RHQ 4.9 SNAPSHOT, might be relevant

       

      After it failed the first time, I decided I wanted to try starting with a "clean slate" so removed the RHQ storage data directory, mapped to /opt/rhq/rhq-storage.

      This was after running the "rhq48-storage-patch".

       

      Has anyone else seen this or know what it might be caused by?

       

      thanks,

      Stian

        • 1. Re: Installing 4.9: Connection refused in server installer script
          mazz

          I don't think you did that right. You aren't supposed to copy the rhq-server installation directory into a new location and then upgrade that new location - the different file path may screw up the installer.

           

          Rather, upgrade the original server that is in your original location. If you installed RHQ in /opt/rhq/rhq-server, then upgrade that one (rhqctl upgrade --from-server-dir=/opt/rhq/rhq-server).

           

          Try that and see.

          • 2. Re: Re: Installing 4.9: Connection refused in server installer script
            pathduck

            Hey Mazz -

            but that's the way I've always done it, and it's worked. Fair enough I thought - I will try it the other way round, although I despise having directories with version numbers in them - messes with init-scripts and what have you. So hopefully I can change 'rhq-server-4.9.0' to just 'rhq-server' after install is done?

             

            Here's the log;

             

            [rhqadmin@d26apvl007 rhq-server-4.9.0]$ ./bin/rhqctl upgrade --from-server-dir=/opt/rhq/rhq-server

            13:59:25,919 INFO  [org.jboss.modules] JBoss Modules version 1.2.0.CR1

            13:59:26,128 INFO  [org.rhq.server.control.command.Upgrade] Stopping any running RHQ components...

            13:59:26,676 INFO  [org.jboss.modules] JBoss Modules version 1.2.0.CR1

            13:59:26,962 INFO  [org.rhq.server.control.command.Upgrade] The old installation components have been stopped

            Starting RHQ Storage Installer ...

            13:59:37,577 INFO  [org.jboss.modules] JBoss Modules version 1.2.0.CR1

            13:59:37,716 INFO  [org.rhq.storage.installer.StorageInstaller] Running RHQ Storage Node installer...

            13:59:37,734 INFO  [org.rhq.cassandra.Deployer] Unzipping storage node to /opt/rhq/rhq-server-4.9.0/rhq-storage

            13:59:38,098 INFO  [org.rhq.cassandra.Deployer] Applying configuration changes to /opt/rhq/rhq-server-4.9.0/rhq-storage/conf/cassandra.yaml

            13:59:38,203 INFO  [org.rhq.cassandra.Deployer] Applying configuration changes to /opt/rhq/rhq-server-4.9.0/rhq-storage/conf/log4j-server.properties

            13:59:38,211 INFO  [org.rhq.cassandra.Deployer] Applying configuration changes to /opt/rhq/rhq-server-4.9.0/rhq-storage/conf/cassandra-jvm.properties

            13:59:38,220 INFO  [org.rhq.cassandra.Deployer] Updating file permissions in /opt/rhq/rhq-server-4.9.0/rhq-storage/bin

            13:59:38,573 INFO  [org.rhq.storage.installer.StorageInstaller] Updating rhq-server.properties...

            13:59:38,596 INFO  [org.rhq.storage.installer.StorageInstaller] Starting RHQ Storage Node

            13:59:46,195 INFO  [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused

            13:59:46,196 INFO  [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 3000 ms...

            13:59:49,198 INFO  [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused

            13:59:49,199 INFO  [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 6000 ms...

            13:59:55,202 INFO  [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused

            13:59:55,202 INFO  [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 9000 ms...

            14:00:04,206 INFO  [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused

            14:00:04,207 INFO  [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 12000 ms...

            14:00:16,209 INFO  [org.rhq.storage.installer.StorageInstaller] The storage node is not up: java.net.ConnectException: Connection refused

            14:00:16,212 INFO  [org.rhq.storage.installer.StorageInstaller] Checking storage node status again in 15000 ms...

            14:00:31,212 ERROR [org.rhq.storage.installer.StorageInstaller] Could not verify that the node is up and running.

            14:00:31,213 ERROR [org.rhq.storage.installer.StorageInstaller] Check the log file at /opt/rhq/rhq-server-4.9.0/logs/rhq-storage.log for errors.

            14:00:31,213 ERROR [org.rhq.storage.installer.StorageInstaller] The storage installer will now exit

            14:00:31,231 ERROR [org.rhq.server.control.command.Upgrade] An error occurred while running the storage node upgrade: Process exited with an error: 2 (Exit value: 2)

            14:00:31,231 ERROR [org.rhq.server.control.RHQControl] An error occurred while executing the upgrade command [Cause: org.apache.commons.exec.ExecuteException: Process exited with an error: 2 (Exit value: 2)]
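A note on the log above: the wait grows by 3000 ms per attempt (3000, 6000, 9000, 12000, 15000), so the installer gives up after about 45 seconds, which matches the 13:59:46 to 14:00:31 timestamps. Below is a minimal shell sketch of that linear back-off pattern. It is not the installer's actual code; `retry_linear` and its arguments are hypothetical names made up for illustration.

```shell
# Hypothetical helper, not part of RHQ: run a check command up to
# ATTEMPTS times, sleeping STEP_MS * attempt-number after each failure.
# usage: retry_linear ATTEMPTS STEP_MS COMMAND [ARGS...]
retry_linear() (
    attempts=$1
    step_ms=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the check; stop as soon as it succeeds.
        if "$@"; then
            exit 0
        fi
        delay_ms=$((i * step_ms))
        echo "check failed; checking again in ${delay_ms} ms..."
        # sleep takes whole seconds here, so sub-second steps round down.
        sleep "$((delay_ms / 1000))"
        i=$((i + 1))
    done
    echo "could not verify that the check succeeded" >&2
    exit 1
)
```

With a netcat that supports `-z`, something like `retry_linear 5 3000 nc -z 10.51.9.38 9142` would probe the CQL port from the properties file on the same schedule. Whether the installer checks that port or another one is not shown in the log.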

             

             

             

             

            Also attaching relevant config files, cassandra and rhq.properties, and logs.

             

            EDIT:

            Here's a thing I notice - the rhq-server.properties of the new installation is edited by the installer. But little is changed, only these lines:

             

            # Note that this is actually an installer setting. Changing the value after

            # installation will have no effect.

            rhq.storage.nodes=10.51.9.38

             

            # The ports used by storage nodes to communicate with each other

            # and used by the RHQ server(s) to communicate with the cluster.

            # Both properties are required.

            #

            rhq.storage.cql-port=9142

            rhq.storage.gossip-port=7100

             

             

            These seem like "wrong" values, since the defaults seem to be 9042/7000?
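One way to check this is to compare the ports in rhq-server.properties with what the storage node is configured to listen on. The sketch below assumes that `rhq.storage.cql-port` corresponds to Cassandra's `native_transport_port` and `rhq.storage.gossip-port` to `storage_port` in cassandra.yaml. That mapping is an assumption, not something confirmed in this thread, and the function names are hypothetical.

```shell
# Print the last value of KEY from a Java-style properties file (KEY=VALUE).
# usage: prop_value KEY FILE
prop_value() {
    awk -F= -v key="$1" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v }
    ' "$2" | tail -n 1
}

# Print the last value of a top-level "key: value" line in a YAML file.
# Only handles simple scalars, which is enough for port numbers.
# usage: yaml_value KEY FILE
yaml_value() {
    awk -F: -v key="$1" '
        $1 == key {
            v = $2
            sub(/[ \t]*#.*/, "", v)            # drop a trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim whitespace
            print v
        }
    ' "$2" | tail -n 1
}

# Compare the two port pairs; returns non-zero if any pair differs.
# The key mapping below is an ASSUMPTION (see the note above).
# usage: compare_ports RHQ_SERVER_PROPERTIES CASSANDRA_YAML
compare_ports() {
    props=$1
    yaml=$2
    status=0
    for pair in rhq.storage.cql-port:native_transport_port \
                rhq.storage.gossip-port:storage_port; do
        p_key=${pair%%:*}
        y_key=${pair##*:}
        p_val=$(prop_value "$p_key" "$props")
        y_val=$(yaml_value "$y_key" "$yaml")
        if [ "$p_val" = "$y_val" ]; then
            echo "OK       $p_key=$p_val matches $y_key"
        else
            echo "MISMATCH $p_key=$p_val but $y_key=$y_val"
            status=1
        fi
    done
    return "$status"
}
```

Pass the new installation's rhq-server.properties as the first argument and the cassandra.yaml the installer wrote (per the log, /opt/rhq/rhq-server-4.9.0/rhq-storage/conf/cassandra.yaml) as the second. A MISMATCH line would support the idea that the installer and the storage node disagree on ports.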

             


             

            • 3. Re: Re: Installing 4.9: Connection refused in server installer script
              mazz

              OK, maybe I misunderstood your original post. I thought you did this:

               

              1) Unzipped into /opt/rhq/rhq-server

              2) Installed successfully and ran it from there.

              3) Shutdown everything

              4) Renamed /opt/rhq/rhq-server to /opt/rhq/rhq-server.OLD

              5) Tried to upgrade /opt/rhq/rhq-server.OLD

               

              So, my point is, I don't think it would work if you installed and ran the server from some location, then, prior to upgrading, you renamed your install location and upgraded THAT renamed directory.

               

              However, if, after you unzipped, you immediately renamed it and THEN ran the installer, then it is OK. Doesn't matter what the directory name is - feel free to rename "rhq-server-4.9.0" to whatever after you've unzipped, but do that before you run the installer. But once it's installed, I don't think we want to rename it and then upgrade that changed location. I actually don't know if that would work or not, but that would be the first thing I would question if that was done.

               

              But again, if all you did prior to the initial install was rename the directory that came out of unzip, that should be good. But once you are installed and running, leave the name alone.

               

              Hopefully, that makes sense - I don't know if it did

              • 4. Re: Re: Re: Installing 4.9: Connection refused in server installer script
                pathduck

                Ok I will put exactly what I do here

                 

                I already have the old server installed under /opt/rhq/rhq-server. I first stop everything since I will be moving stuff around.

                Also, /opt/rhq/rhq-storage is created and empty, since I want to start fresh storage.

                Then;

                 

                $ pwd
                /opt/rhq
                $ ls
                java      rhq48-storage-patch.zip              rhq-server-4.9.0.zip
                rhq-server                           rhq-storage
                $ mv rhq-server rhq-server.old
                $ unzip rhq-server-4.9.0.zip
                (..... unzipping)
                $ mv rhq-server-4.9.0 rhq-server
                $ cd rhq-server/bin
                
                

                 

                I then edit the new /opt/rhq/rhq-server/bin/rhqctl since I don't have Java on the PATH, and add at the top:

                export JAVA_HOME=/opt/rhq/java
                export PATH=$JAVA_HOME/bin:$PATH
                

                 

                Then I run the update:

                $ ./rhqctl upgrade --from-server-dir=/opt/rhq/rhq-server.old
                

                 

                Basically this is what I've always done, and it will end with an upgraded server in /opt/rhq/rhq-server.

                • 5. Re: Re: Installing 4.9: Connection refused in server installer script
                  mazz

                  I think I know what you did, but whether it's causing a problem I dunno.

                   

                  You always want the server installed in /opt/rhq/rhq-server, regardless of version. So, say you installed 4.8 - you unzip to /opt/rhq/rhq-server-4.8 and rename that to "rhq-server" without version string in the directory name.

                   

                  Then, to go to say 4.9, you rename the original location to rhq-server.OLD, thus freeing up the rhq-server location.  You can now unzip 4.9 and rename to rhq-server. So you now have /opt/rhq/rhq-server.OLD (which is the old 4.8 which used to be called /opt/rhq/rhq-server) and you have /opt/rhq/rhq-server (which is now the new 4.9, which used to contain the old 4.8 content).

                   

                  But it would not surprise me if that confuses and breaks the installer.

                   

                  Because now the place we thought had the original content is no longer there - it's rhq-server.OLD. What it thinks was the old location is now the new server content (rhq-server). So all the configuration files that might have full paths are pointing to the wrong place.

                   

                  This gets to that BZ that Larry wants us to fix - https://bugzilla.redhat.com/show_bug.cgi?id=1018213 - if we had relative paths everywhere in the config files, I think it *might* work but right now, we have in places full paths in the config files - and if you rename the install location, those config file absolute paths are now wrong.

                   

                  It's just a guess, but that's what I would look at first.
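To test the absolute-path theory, one could search the old installation's config files for the original install path. A minimal sketch follows; `find_abs_paths` is a hypothetical helper, and the list of file extensions is a guess at where such paths might live, not a list taken from RHQ.

```shell
# Hypothetical helper: list every line in config-like files under DIR
# that contains the literal string NEEDLE (for example an absolute path).
# Output format is file:line-number:line. Returns non-zero if nothing matches.
# usage: find_abs_paths NEEDLE DIR
find_abs_paths() {
    # /dev/null forces grep to print file names even for a single file.
    find "$2" -type f \
        \( -name '*.properties' -o -name '*.yaml' -o -name '*.xml' \
           -o -name '*.conf' \) \
        -exec grep -nF -- "$1" /dev/null {} +
}
```

For the layout described in this thread that would be `find_abs_paths /opt/rhq/rhq-server/ /opt/rhq/rhq-server.OLD`. The trailing slash on the search string keeps it from also matching `/opt/rhq/rhq-server.OLD` itself. Any hits are paths that now point at the new 4.9 directory instead of the renamed old one.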

                  • 6. Re: Installing 4.9: Connection refused in server installer script
                    pathduck

                    Hm ... are you sure the server install dir contains such paths that would break the upgrade? I really have used the same procedure every time since I updated from 4.4. Well, maybe something changed in 4.9 upgrade process. I seem to remember the install docs earlier said to stop everything first, and since you specify the old server directory I assumed that when everything was stopped anyway, it would just copy what it needed to the new directory config files.

                     

                    Like I said in the earlier post, I think the installer gets the storage ports mixed up and sets wrong values, leaving it unable to connect.

                    Cassandra should listen to 9142/7100 but somehow the installer thinks it should use 9042/7000.

                     

                    I have now figured I might as well do a complete reinstall of the server - most of the settings are in Oracle anyway. So I have done a completely new install and at least the installer went through without hitches. But now the server will not start fully - probably another issue (this: "RHQ Web Console not available (4.9.0)").

                     

                    So maybe I will need to rethink my upgrade procedure when 4.10 is out, and keep the version number in the directory names. Only major drawback of course is that I would need to edit the init script as well to refer to the new directory.

                     

                    -Stian

                    • 7. Re: Installing 4.9: Connection refused in server installer script
                      mazz

                      > Cassandra should listen to 9142/7100 but somehow the installer thinks it should use 9042/7000.

                       

                      That sounds familiar. I am going to try this myself and see if I can replicate.

                       

                      As for the directory names, can't you just have a symlink? "ln -s /opt/rhq/rhq-server-4.9.0 /opt/rhq/rhq-server" - from there all scripts will just work? Just a thought.

                      • 8. Re: Installing 4.9: Connection refused in server installer script
                        mazz

                        FWIW, I just installed rhq-server 4.9, then upgraded to a master build of 4.10 using this mechanism of renaming the latest server install dir to "rhq-server" and it all seemed to work. I got no errors and the server started up, I was able to log in and the agent and its resources were all up and green.

                         

                        Perhaps something is broken in 4.9 that has since been fixed?

                        • 9. Re: Installing 4.9: Connection refused in server installer script
                          genman

                          mazz wrote:

                           

                          > Cassandra should listen to 9142/7100 but somehow the installer thinks it should use 9042/7000.

                           

                          That sounds familiar. I am going to try this myself and see if I can replicate.

                           

                          As for the directory names, can't you just have a symlink? "ln -s /opt/rhq/rhq-server-4.9.0 /opt/rhq/rhq-server" - from there all scripts will just work? Just a thought.

                           

                          I wanted to mention that creating a symlink doesn't really work well, because then the server thinks there are two different JBoss servers running, when there's only one.
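With a symlink, the same directory is reachable under two different path strings, so anything that identifies an install by its path string could count it twice. Whether that is the actual mechanism here is an assumption. A small sketch for checking whether two paths resolve to the same directory (`physical_dir` and `same_dir` are hypothetical helper names):

```shell
# Print the physical, symlink-free path of a directory.
# Runs in a subshell so the caller's working directory is untouched.
physical_dir() (
    cd -P -- "$1" && pwd -P
)

# Succeed if both paths resolve to the same physical directory.
# Returns 2 if either path cannot be entered.
same_dir() {
    a=$(physical_dir "$1") || return 2
    b=$(physical_dir "$2") || return 2
    [ "$a" = "$b" ]
}
```

For the symlink suggested above, `same_dir /opt/rhq/rhq-server /opt/rhq/rhq-server-4.9.0` would succeed even though the two strings differ.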

                          • 10. Re: Installing 4.9: Connection refused in server installer script
                            pathduck

                            Hey -

                            Thanks for checking it Mazz. I actually suspect now that it was my installing of an earlier snapshot of 4.9 where things might not have been fully tested yet, and this caused problems when upgrading to the final. I did a reinstall, not an upgrade and the installer went without problems.

                             

                            Now I am just stuck with another problem, that the login page does not show at all...

                             

                            By the way, I suspect 4.10 is just "around the corner", so I might as well wait to see if things are better then. I guess 4.10 will be the template for JON 3.2 so it will be tested thoroughly.

                             

                            -Stian