8 Replies Latest reply on Aug 20, 2013 12:42 PM by nstefan

    RHQ 4.8 - Storage questions

    genman

      I installed three servers, starting with a single Cassandra (storage node) instance, then adding two more.

       

      Following the instructions, here are my notes on what I ran:

       

      $ cd rhq-server-4.8.0/bin
      $ vim rhq-server.properties # enter password etc.
      $ ./rhq-storage-installer.sh --commitlog /Users/elias_ross/cassandra/commitlog --data /Users/elias_ross/cassandra/data \
          --dir /Users/elias_ross/cassandra --saved-caches /Users/elias_ross/cassandra/saved_caches
      Starting RHQ Storage Installer ...
      
      # work around what looks like a bug: rhqctl can't find ~/cassandra, so symlink rhq-storage to it
      $ cd rhq-server-4.8.0/bin
      $ rm -rf rhq-storage/
      $ ln -s ~/cassandra rhq-storage
      

       

      I noticed a couple things:

      • Adding an additional storage node didn't update the cassandra.yaml file, nor did it add the server to the seed list in the properties file. I did see the servers listed in the storage list as "Installed" but not "Normal." Don't know why. I figured out how to manually edit these files, but it seems like the installer should know better. I noticed this when one server would show metrics, but not the other.
      • Something created duplicate 'rhq-storage' directories under rhq-server's directory. The installer set up rhq-storage where I told it to, but it doesn't seem to store that location anywhere. I created a symlink because 'rhqctl' doesn't seem able to find the '~/cassandra' dir.
      • With everything working (I think) my first storage node shows up as Normal, the others are Installed, but they appear to be storing data anyway.
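For anyone checking the same thing, here is a minimal sketch of how one might confirm whether a node's address made it into the seed list. It assumes the standard Cassandra layout, where the seeds live on a `- seeds: "a,b,c"` line under `seed_provider` in cassandra.yaml. The function name `seed_listed` and the addresses are made up for illustration, and the file path under rhq-storage is an assumption.

```shell
# Hypothetical helper, not part of RHQ or Cassandra.
# usage: seed_listed /path/to/cassandra.yaml ADDRESS
# Exits 0 if ADDRESS is one of the comma-separated seeds, non-zero otherwise.
seed_listed() {
  local yaml="$1" addr="$2"
  # Take the first "- seeds:" line, keep only its value, drop quotes and
  # whitespace, split on commas, then look for an exact match.
  grep -E '^[[:space:]]*-[[:space:]]*seeds:' "$yaml" \
    | head -n 1 \
    | sed -E 's/^[^:]*:[[:space:]]*//; s/"//g; s/[[:space:]]//g' \
    | tr ',' '\n' \
    | grep -Fxq "$addr"
}
```

Usage would look something like `seed_listed rhq-storage/conf/cassandra.yaml 10.0.0.2 && echo "listed as seed"`, run on each storage node in turn (the path is assumed, adjust to wherever the installer put the config).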

       

      Unfortunately I didn't take great notes, so some of this could be wrong, i.e. due to user error. I was also load testing with the agent at the time, which may have interfered with the agent's ability to update the storage node. And I don't get the sense that 'Installed' versus 'Normal' really means anything?

        • 1. Re: RHQ 4.8 - Storage questions
          genman

          Let me clarify a few things: Node 1 showed up as 'Normal', 2&3 as 'Installed'. And for some reason now 1&2 are 'Installed' and 3 is 'Normal'.

          • 2. Re: RHQ 4.8 - Storage questions
            pilhuhn

            I'll let John Sanda answer this, but 4.8 was single Cassandra node only. The (full) multi-node support is being developed as we speak, so it may be that what you see is what is expected for 4.8.

            See fine print in the 4.8 release notes: https://docs.jboss.org/author/display/RHQ/Release+Notes+4.8.0#ReleaseNotes4.8.0-Cassandrabackendformetricstorage

            • 3. Re: RHQ 4.8 - Storage questions
              john.sanda

              The management functionality for multi-node support was not ready in 4.8. In 4.9 you won't have to perform any special configuration to get a multi-node cluster up and running. The server will configure the node once it is discovered and merged into inventory. If you really want to run multiple storage nodes with 4.8, the best thing to do is install the storage nodes prior to installing your RHQ server. I can help with that if you want.

              • 4. Re: RHQ 4.8 - Storage questions
                genman

                I managed to get it working (I think) with multiple nodes. It involved editing configuration files as I said. I'm thinking it works, but I'm not sure the replication is working as expected.

                 

                Is there going to be any way to set up data center affinity? I'm guessing not, but will RHQ like it if I play with the configuration files to make it work?

                 

                One odd thing I've found is this:

                 

                18:16:30,294 INFO  [org.rhq.enterprise.server.storage.StorageClusterHeartBeatJob] (EJB default - 10) Moving Server[id=10003,name=...,securePort=7443] from MAINTENANCE to NORMAL

                 

                It looks like if I set a server to MAINT it will flip to NORMAL on its own. Is it possible to prevent this?

                • 5. Re: RHQ 4.8 - Storage questions
                  mazz

                  > It looks like if I set a server to MAINT it will flip to NORMAL on its own. Is it possible to prevent this?

                   

                  We just talked about this last week, and Stefan is fixing that. He can chime in with more details, but we're going to get it such that if the user manually puts the server in MAINT mode, we won't automatically switch it back to NORMAL.

                  • 6. Re: RHQ 4.8 - Storage questions
                    genman

                    Thanks for the information.

                     

                    Can you address my additional question:

                    Is there going to be any way to set up data center affinity? I'm guessing not, but will RHQ like it if I play with the configuration files to make it work?

                     

                    I also wonder how replication will be configured. I assume it is not configured at the moment.

                    • 7. Re: RHQ 4.8 - Storage questions
                      john.sanda

                      With respect to the RHQ storage node, there won't be any support for multi-data center deployments in 4.9. That does not mean it is not possible; rather, you would have to configure things yourself.

                       

                      Data replication is automatically configured. I am writing up some docs today and tomorrow. I will post the link in another thread as soon as I have something written up on replication.

                      • 8. Re: RHQ 4.8 - Storage questions
                        nstefan

                        In RHQ 4.8, it is no longer possible to manually put the RHQ server into Maintenance mode (a regression from RHQ 4.7). This bug was introduced by the storage cluster connectivity check. The RHQ server goes into Maintenance mode and stays there if and only if there is no connectivity to the storage cluster, and it leaves Maintenance mode automatically as soon as connectivity returns. So user actions are overridden each time a storage cluster connectivity check takes place.

                         

                        This has now been fixed in master and will be released in RHQ 4.9. Manually setting Maintenance mode (UI or properties file) persists until the user sets the server back to Normal mode via UI (same as RHQ 4.7). But there will be a semantic difference in RHQ 4.9 and beyond. The RHQ server will go automatically into Maintenance mode if there is no storage cluster connectivity. User actions make Maintenance mode persistent regardless of storage cluster connectivity; while storage cluster connectivity issues result in temporary Maintenance mode.
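To make the difference concrete, here is a rough model of the two behaviors as described in this post. It is only a sketch: `mode_48` and `mode_49` are hypothetical names, the yes/no arguments are invented for illustration, and none of this is actual RHQ code.

```shell
# Illustrative model only; these functions are hypothetical, not RHQ code.

# RHQ 4.8 as described: the connectivity check alone decides the mode,
# so anything the user set is overridden on the next check.
# usage: mode_48 STORAGE_REACHABLE(yes|no)
mode_48() {
  if [ "$1" = "yes" ]; then echo NORMAL; else echo MAINTENANCE; fi
}

# RHQ 4.9 as described: a user-set Maintenance mode persists regardless of
# connectivity; without it, Maintenance lasts only while storage is unreachable.
# usage: mode_49 USER_SET_MAINTENANCE(yes|no) STORAGE_REACHABLE(yes|no)
mode_49() {
  if [ "$1" = "yes" ] || [ "$2" != "yes" ]; then
    echo MAINTENANCE
  else
    echo NORMAL
  fi
}
```

The case that differs is a user-set Maintenance mode while the storage cluster is reachable: the 4.8 model reports NORMAL, the 4.9 model stays in MAINTENANCE.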

                         

                        I hope this all makes sense
