5 Replies Latest reply on Aug 8, 2013 10:44 AM by dreschler

    PostgreSQL vs. Cassandra

    dreschler

      Hi RHQ team,

       

      I have some questions regarding your further plans for using Cassandra.

       

      We are using RHQ for managing our own application server which is using MySQL as data storage.

      To avoid extra maintenance work for our customers from supporting an additional DB (for backups etc.), we have considered replacing PostgreSQL with MySQL and what would be necessary to achieve this.

       

      But as RHQ 4.8 introduced Cassandra as an additional database, this raised some questions:

       

      1. What will be the next data which will be migrated to Cassandra with the next RHQ release?
      2. Do you plan to move all data to Cassandra at some stage? If not, which data will stay in PostgreSQL and why?
      3. What was the initial reason or advantage of using PostgreSQL (compared to other DBs like MySQL)?

       

      Many thanks in advance for your answers!

       

      BR

      Gerhard

        • 1. Re: PostgreSQL vs. Cassandra
          alansantos

          Hey there -

           

          First, RHQ will introduce Cassandra, but it should be considered an implementation detail. RHQ will assume control over the configuration and maintenance of nodes, and it will also assume it is the only client of Cassandra. Users aren't expected to manage or even be aware of Cassandra's existence.

           

           

          To your specific questions:

           

          1) What will be the next data which will be migrated to Cassandra with the next RHQ release?

          Someone else will probably need to answer this in more detail or confirm it, but I expect traits and events to move into Cassandra after RHQ 4.10.

           

          2) Do you plan to move all data to Cassandra at some stage? If not, which data will stay in PostgreSQL and why?

          Not necessarily. The plan is to move all storage/storage management requirements into RHQ such that it will not require a DBA or separate license (e.g. Oracle).

          Some data will move into Cassandra, other data may be moved into Infinispan or the filesystem, and other data may be moved into something else (e.g. embedded H2). The plan is 1) to use the right type of storage for specific data types and usage patterns and 2) to eliminate the need to own and manage an external storage system of record for use with RHQ.

           

          3) What was the initial reason or advantage of using PostgreSQL (compared to other DBs like MySQL)?

          I suspect expertise or availability. Someone who was involved with the initial decision will need to answer.

          • 2. Re: PostgreSQL vs. Cassandra
            dreschler

            Hi Alan,

             

            First of all, many thanks for your answers.

             

            But regarding your first sentence: I don't think that any type of data storage can be considered an implementation detail. In a production system, customers need to know where and how persistent data is stored, especially to be able to carry out maintenance tasks such as backups. Or are there already mechanisms implemented in RHQ for doing this?

            • 3. Re: PostgreSQL vs. Cassandra
              alansantos

              Tools for backup, migration, and general maintenance will be included as part of RHQ.

              • 4. Re: PostgreSQL vs. Cassandra
                pilhuhn

                Gerhard,

                For 1) Events and calltime metrics are indeed candidates. Traits would complete that picture, but because they seldom change, they are less demanding on the relational datastore right now.

                For 2) There is data for which the relational model fits well, so this is less likely to move to Cassandra. Other data, like configuration with its properties, would probably be a much better fit for C* than what we have right now. There may come a point in time where we have so much stuff in C* that we can easily store the remainder in some in-memory database and drop Postgres/Oracle.

                For 3) At the time we started with RHQ, MySQL was just not ready for the job. There is one community member (Changchun Hu) who has ported RHQ to MySQL and is waiting for us to move to GitHub so that he can submit a pull request.

                • 5. Re: PostgreSQL vs. Cassandra
                  dreschler

                  Alan, Heiko,

                   

                  Thanks for the clarification!