17 Replies Latest reply on Sep 23, 2014 10:15 AM by Justin Bertram

    2.5.x HA policy integration into WildFly

    Jeff Mesnil Master

      Hi, I'm integrating the HornetQ's 2.5.x branch in WildFly master and have some issues with the new HA policy in this version.

       

       

      First, I don't understand what the different policies are.

      Could you explain them in a few sentences (esp. the COLOCATED ones)?

       

       

      I have some questions about their XML representation, their WildFly definitions and the HornetQ HAPolicy API.

       

       

      Let's start with the HAPolicy API

       

       

      I understand the idea is that we start from one of the defined types and are able to override the properties on top of that.

      However, it's very hard to understand which properties are affected by which type. If I define a BACKUP_SHARED_STORE, what's the use of a replication-cluster-name property?

       

       

      It'd be a good thing to move away from a flat API with dozens of unrelated properties to a well-defined API.

      E.g. all scale-down properties look related to the SCALE_DOWN strategy. This should be reflected by the API. If I use a FULL backup strategy, why are the scale-down properties even exposed?

       

       

      Could you also make the API follow a fluent builder pattern? That'd prevent breaking the integration every time a new property is added, and it'd make the code much more readable (and the public API well defined).
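      To illustrate what a fluent, type-safe builder could look like, here is a minimal sketch. All class and method names here are hypothetical (this is not real HornetQ API); the point is that the first choice (shared store vs. replication) fixes which properties even exist, so invalid combinations cannot be expressed:

```java
// Hypothetical sketch of a fluent, type-safe HAPolicy builder.
// None of these names are real HornetQ API; they only illustrate how
// choosing a mode up front can hide properties that make no sense
// for the other mode (e.g. no replication-cluster-name on shared store).
public class HAPolicyBuilderSketch {

    /** Immutable result; a real API would carry typed fields, not a string. */
    static final class HAPolicy {
        private final String description;
        HAPolicy(String description) { this.description = description; }
        @Override public String toString() { return description; }
    }

    /** Entry points: the mode is the first, mandatory choice. */
    static SharedStoreBackupBuilder sharedStoreBackup() { return new SharedStoreBackupBuilder(); }
    static ReplicatedBackupBuilder replicatedBackup() { return new ReplicatedBackupBuilder(); }

    /** Shared-store branch: no replication-cluster-name is reachable here. */
    static final class SharedStoreBackupBuilder {
        private boolean allowFailback = true;
        private long failbackDelayMs = 5000;

        SharedStoreBackupBuilder allowFailback(boolean allow) { this.allowFailback = allow; return this; }
        SharedStoreBackupBuilder failbackDelay(long ms) { this.failbackDelayMs = ms; return this; }

        HAPolicy build() {
            return new HAPolicy("shared-store backup [allowFailback=" + allowFailback
                    + ", failbackDelay=" + failbackDelayMs + "]");
        }
    }

    /** Replication branch: replication-only properties live only here. */
    static final class ReplicatedBackupBuilder {
        private String clusterName = "";
        private String backupGroupName = "";

        ReplicatedBackupBuilder clusterName(String name) { this.clusterName = name; return this; }
        ReplicatedBackupBuilder backupGroupName(String name) { this.backupGroupName = name; return this; }

        HAPolicy build() {
            return new HAPolicy("replicated backup [cluster=" + clusterName
                    + ", group=" + backupGroupName + "]");
        }
    }

    public static void main(String[] args) {
        System.out.println(sharedStoreBackup().allowFailback(false).failbackDelay(10000).build());
        System.out.println(replicatedBackup().clusterName("clusterA").backupGroupName("group1").build());
    }
}
```

      With this shape, adding a new replication property doesn't touch the shared-store branch at all, so the integration code only breaks when its own branch changes.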

       

       

      At first glance, it looks like some configurations are mutually exclusive:

       

       

      live or backup

      shared store or replication

      full or scale down strategy

      remote or colocated backups

       

       

      Having all the configuration in a single level makes for an unmaintainable API.

      I don't want to put validation code in WildFly to check whether the configuration makes sense. As much as possible, it's the API's job to provide only "valid" configurations.

       

       

      For WildFly integration, we provide 2 ways to configure HornetQ: using the management API (through the CLI) and through XML.

      Please note that the canonical one is the management API (the XML only generates management operations that use this API).

      It's a good idea to try to build a hornetq-server using the CLI to test whether the management API is usable or not.

       

       

      But let's start with XML first.

      I want to move away from a flat list of unrelated elements to a well-structured tree (that kind of mirrors the fluent API I talked about above).

       

       

      The current XSD lists all ha-policy attributes, but it's too hard to understand how they relate to each other (and it does not help that I don't understand the different policy types in the first place).

       

       

      Guys, could you define a few use cases that are covered by the ha-policy?

      Such as:

       

       

      1. no HA

      2. backup server with shared store

      3. backup server with replication

      4. colocated backup server with shared store

      5. etc.

       

       

      And what their XML looks like.

       

       

      As an example, shared-store and replication are mutually exclusive. This should be reflected by the schema. For example, I could have:

       

       

      <hornetq-server name="B">

        <ha-policy>

          <shared-store />

        </ha-policy>

      </hornetq-server>

       

       

      or

       

       

      <hornetq-server name="B">

        <ha-policy>

          <replication>

            <group-name>group1</group-name>

            <clustername>clusterA</clustername>

          </replication>

        </ha-policy>

      </hornetq-server>

       

       

      But it makes no sense to have a replication-cluster-name with a shared store.

       

       

      If the XSD stipulates that I can have either a <shared-store> or a <replication>, it'd be much simpler to write the configuration.

       

       

      But like I wrote, the XML is not the canonical way to build the configuration; it is the WildFly management API. As an exercise, you can start WildFly with --admin-only so that there are no runtime operations, just enough to build the configuration.

      As you will see, it's harder than it should be (e.g. why is the policy-type an attribute of the ha-policy if we use the policy-type resource name when we build the HAPolicy?).
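      As a concrete sketch of that exercise (the resource address follows the WildFly 8 messaging subsystem layout; the server name "test" is made up for illustration):

```shell
# Start WildFly with only the management services; no runtime operations run.
./bin/standalone.sh --admin-only

# In another terminal, drive the configuration through the CLI:
./bin/jboss-cli.sh --connect
[standalone@localhost:9990 /] /subsystem=messaging/hornetq-server=test:add
[standalone@localhost:9990 /] /subsystem=messaging/hornetq-server=test:read-resource(recursive=true)
```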

       

       

      I can deal with updating the WildFly XML and management API but for that, I need a better understanding of the different use cases covered by the HA policy stuff.

      And if you could use a fluent builder API, that'd be much simpler to integrate and maintain.

       

       

      As it stands now, I consider the ha-policy integration broken, and we need to improve it before merging it into WildFly master.

        • 1. Re: 2.5.x HA policy integration into WildFly
          Andy Taylor Master

          Could you explain them in a few sentences (esp. the COLOCATED ones)?

           

                NONE - no policy

                REPLICATED - a replicated live server

                SHARED_STORE - a shared store live server

                BACKUP_REPLICATED - a replicated backup server

                BACKUP_SHARED_STORE - a shared store backup server

                COLOCATED_REPLICATED - a live server that can also maintain replicated backups

                COLOCATED_SHARED_STORE - a live server that can also maintain shared store backups

           

          At first glance, it looks like some configurations are mutually exclusive:

           

           

          live or backup

          shared store or replication

          full or scale down strategy

          remote or colocated backups

          yes Jeff you are correct

           

          Having all the configuration in a single level makes for an unmaintainable API.

          I don't want to put validation code in WildFly to check whether the configuration makes sense. As much as possible, it's the API's job to provide only "valid" configurations.

           

           

          For WildFly integration, we provide 2 ways to configure HornetQ: using the management API (through the CLI) and through XML.

          Please note that the canonical one is the management API (the XML only generates management operations that use this API).

          It's a good idea to try to build a hornetq-server using the CLI to test whether the management API is usable or not.

           

           

          But let's start with XML first.

          I want to move away from a flat list of unrelated elements to a well-structured tree (that kind of mirrors the fluent API I talked about above).

           

           

          The current XSD lists all ha-policy attributes, but it's too hard to understand how they relate to each other (and it does not help that I don't understand the different policy types in the first place).

          Actually Jeff, I agree. TBH, lots of this stuff was just at the top level anyway, so it was just moved into its own element; the template attribute was there to make configuration easy.

           

          I think your suggested alternatives are the best way to go. TBH we didn't really spend much time thinking about it; we just grouped all the existing ones along with the new ones.

           

          Let me change this next week and we can improve things all round.

          • 2. Re: 2.5.x HA policy integration into WildFly
            Andy Taylor Master

            I've come up with a better configuration for HA policy. I've also added a few TODOs we may fix; could everyone comment, please?

             

            <ha-policy>

                     <!--only one of the following-->

                     <!--on server shutdown scale down to another live server-->

                     <scale-down>

                        <!--a grouping of servers that can be scaled down to-->

                        <group-name>boo!</group-name>

                        <!--either a discovery group-->

                        <discovery-group>wahey!</discovery-group>

                        <!--or some connectors-->

                        <connectors>

                           <connector-ref>sd-connector1</connector-ref>

                           <connector-ref>sd-connector2</connector-ref>

                        </connectors>

                     </scale-down>

                     <!--a live server that can be replicated-->

                     <replicated>

                        <!--check when starting that its replica isn't running-->

                        <check-for-live-server>true</check-for-live-server>

                        <!--whether or not the live server can fail back; if a backup is live then this server will start as a replica-->

                        <allow-failback>false</allow-failback>

                        <!-- if we need to failback then we need to wait a while after announcing as a backup so it has time to propagate

                        around the cluster-->

                        <failback-delay>10000</failback-delay>

                        <!--only replicas within this group can replicate-->

                        <!-- todo rename to replica-group maybe?-->

                        <backup-group-name>backupGroupName</backup-group-name>

                     </replicated>

                     <replica>

                        <!--only replicate from a live server in this group-->

                        <backup-group-name>backupGroupName</backup-group-name>

                        <!--use the discovery configuration from this cluster connection-->

                        <replication-clustername>replicationClustername</replication-clustername>

                        <!--max number of backups -->

                        <max-saved-replicated-journals-size>3</max-saved-replicated-journals-size>

                        <!--rather than startup scale down, same as above scaledown config-->

                        <scale-down/>

                     </replica>

                     <shared-store-master>

                        <!--whether or not this shared store backup will allow a restarting live server to become live-->

                        <allow-failback>false</allow-failback>

                        <!-- if we need to failback then we need to wait a while after announcing as a backup so it has time to propagate

                        around the cluster-->

                        <failback-delay>10000</failback-delay>

                        <!--whether or not to allow the backup to become live-->

                        <!--todo this gets used for replication to which makes no sense, fix-->

                        <failover-on-shutdown>true</failover-on-shutdown>

                     </shared-store-master>

                     <shared-store-slave>

                        <!-- if we have stopped because of a live server failing back, wait this long before restarting-->

                        <!--todo maybe rename restart-delay-->

                        <failback-delay>10000</failback-delay>

                        <!--whether or not to allow the backup to become live if we have failed over and are stopping-->

                        <failover-on-shutdown>true</failover-on-shutdown>

                        <!--rather than startup scale down, same as above scaledown config-->

                        <scale-down/>

                     </shared-store-slave>

                     <!--a live server that can automatically create backups and ask for backups-->

                     <colocated-replicated>

                        <!--should i request a backup to be started for me-->

                        <request-backup>true</request-backup>

                        <!--how many times to try and acquire a backup-->

                        <backup-request-retries>33</backup-request-retries>

                        <!--how long between retries-->

                        <backup-request-retry-interval>1234</backup-request-retry-interval>

                        <!--how many backups i can create-->

                        <max-backups>12</max-backups>

                        <backups>

                           <backup-port-offset>1002</backup-port-offset>

                           <remote-connectors>

                              <connector-ref>remote-connector1</connector-ref>

                              <connector-ref>remote-connector2</connector-ref>

                           </remote-connectors>

                           <backup-group-name>backupGroupName</backup-group-name>

                           <!--rather than creating backups that startup create them to scale down, same as above scaledown config-->

                           <!--todo, maybe we should allow scale down only, does full colocated make sense?-->

                           <scale-down/>

                           <!--the cluster connection discovery to use-->

                           <replication-clustername>replicationClustername</replication-clustername>

                        </backups>

                     </colocated-replicated>

                     <!--same as colocated-replicated without replication-clustername-->

                     <colocated-shared-store/>

            </ha-policy>

             

             

             

            Also, I'm not 100% sure on the 2 colocated configurations. We could merge these into 1, as they are currently quite similar, and have a flag to distinguish between them; maybe something like:

             

            <colocated>

                        <!--should i request a backup to be started for me-->

                        <request-backup>true</request-backup>

                        <!--how many times to try and acquire a backup-->

                        <backup-request-retries>33</backup-request-retries>

                        <!--how long between retries-->

                        <backup-request-retry-interval>1234</backup-request-retry-interval>

                        <!--how many backups i can create-->

                        <max-backups>12</max-backups>

                        <backups>

                           <backup-port-offset>1002</backup-port-offset>

                           <remote-connectors>

                              <connector-ref>remote-connector1</connector-ref>

                              <connector-ref>remote-connector2</connector-ref>

                           </remote-connectors>

                           <backup-group-name>backupGroupName</backup-group-name>

                           <!--rather than creating backups that startup create them to scale down, same as above scaledown config-->

                           <!--todo, maybe we should allow scale down only, does full colocated make sense?-->

                           <scale-down/>

                           <!--either-->

                           <replica/>

                           <!--or-->

                           <shared-store-slave/>

                        </backups>

                     </colocated>

            • 3. Re: Re: 2.5.x HA policy integration into WildFly
              Martyn Taylor Newbie

              Hi Andy,

               

              It seems that there are 3 different types of policy defined above:

              • Replicated
              • Shared Store
              • Co-located

              Within each of these policy types there are 2 roles that a server can take; essentially the 2 roles are "master" and "slave" (with slightly different semantics depending on the policy type).

               

              I'd suggest grouping by policy type at level 1, then group by role at level 2.  Example:

               

              <ha-policy>

                  <shared-store>

                    <!-- might be useful -->

                    <role="master">

                        <!-- master specific config here -->

                    </role>

                    <!-- Config that applies to both master/slave here -->

                  </shared-store>

              </ha-policy>

               

              or

               

              <ha-policy>

                  <shared-store>

                    <master>

                        <!-- master specific config here -->

                    </master>

                    <!-- Config that applies to both master/slave here -->

                  </shared-store>

              </ha-policy>


              I'd move the <scale-down> element away from the top level and only allow it within the specific policy configuration.  If we do want to allow scale down to be defined for servers without an HA policy configured, then I'd recommend creating a new HA policy type, None, that can take scale-down config.  For example:

               

              <none>

                <scale-down>...</scale-down>

              </none>

               

              Once you have this in place, I don't see any reason to prefix element names with the policy name, i.e.

               

              <replication-clustername>replicationClustername</replication-clustername>


              could just be:


              <ha-policy><replicated><cluster-name>...


              I can't comment on the individual parameters since I don't understand what they are used for and in what circumstances.  However, I think it should be possible to group them according to the format above.  Other than the comments above, I think this looks good.


              Cheers

              Martyn

              • 4. Re: 2.5.x HA policy integration into WildFly
                Andy Taylor Master

                Hi Andy,

                 

                It seems that there are 3 different types of policy defined above:

                • Replicated
                • Shared Store
                • Co-located

                Within each of these policy types there are 2 roles that a server can take; essentially the 2 roles are "master" and "slave" (with slightly different semantics depending on the policy type).

                 

                I'd suggest grouping by policy type at level 1, then group by role at level 2.  Example:

                 

                <ha-policy>

                    <shared-store>

                      <!-- might be useful -->

                      <role="master">

                          <!-- master specific config here -->

                      </role>

                      <!-- Config that applies to both master/slave here -->

                    </shared-store>

                </ha-policy>

                 

                or

                 

                <ha-policy>

                    <shared-store>

                      <master>

                          <!-- master specific config here -->

                      </master>

                      <!-- Config that applies to both master/slave here -->

                    </shared-store>

                </ha-policy>


                You couldn't do exactly that, because this needs defining in a schema and the schema differs depending on the role; this is the flattened config we are trying to move away from. What you could do is this, though:

                 

                <master>

                    <!-- master specific config here -->

                </master>


                I'm not sure how this would work in colocated, though, as it is actually both master and slave.


                I'd move the <scale-down> element away from the top level and only allow it within the specific policy configuration.  If we do want to allow scale down to be defined for servers without an HA policy configured, then I'd recommend creating a new HA policy type, None, that can take scale-down config.  For example:

                 

                <none>

                  <scale-down>...</scale-down>

                </none>

                Yeah, I thought of that but couldn't decide; I'll go with whatever the majority wants.

                Once you have this in place, I don't see any reason to prefix element names with the policy name, i.e.

                 

                <replication-clustername>replicationClustername</replication-clustername>


                could just be:


                <ha-policy><replicated><cluster-name>...

                +1, I've actually changed most of them to be like that.

                • 5. Re: 2.5.x HA policy integration into WildFly
                  Martyn Taylor Newbie

                  Andy Taylor wrote:

                   

                  You couldn't do exactly that, because this needs defining in a schema and the schema differs depending on the role; this is the flattened config we are trying to move away from. What you could do is this, though:

                   

                  <master>

                      <!-- master specific config here -->

                  </master>

                   

                  I'm not sure how this would work in colocated, though, as it is actually both master and slave.

                   

                  I think you missed my second example



                  • 7. Re: 2.5.x HA policy integration into WildFly
                    Martyn Taylor Newbie

                    You couldn't do exactly that, because this needs defining in a schema and the schema differs depending on the role; this is the flattened config we are trying to move away from. What you could do is this, though:

                     

                    <master>

                        <!-- master specific config here -->

                    </master>


                    I'm not sure how this would work in colocated, though, as it is actually both master and slave.

                    Sure, I can't see any issue with having both master and slave in the same config for co-located if that is what makes sense.

                    • 8. Re: 2.5.x HA policy integration into WildFly
                      Andy Taylor Master

                      I'd move the <scale-down> element away from the top level and only allow it within the specific policy configuration.  If we do want to allow scale down to be defined for servers without an HA policy configured, then I'd recommend creating a new HA policy type, None, that can take scale-down config.  For example:

                       

                      <none>

                        <scale-down>...</scale-down>

                      </none>

                      Actually, the policy is scale down, not none, so I think I prefer how I have it.

                      • 9. Re: 2.5.x HA policy integration into WildFly
                        Martyn Taylor Newbie

                        I got the impression that scale down was a behavioural option given to other HA policies?  TBH I'm not sure that scale down makes sense as a HA policy, since it doesn't add any HA.  To me scale down seems more like cluster configuration, i.e. how to dynamically remove servers from the cluster without losing data.

                        • 10. Re: 2.5.x HA policy integration into WildFly
                          Andy Taylor Master

                          I got the impression that scale down was a behavioural option given to other HA policies?  TBH I'm not sure that scale down makes sense as a HA policy, since it doesn't add any HA.  To me scale down seems more like cluster configuration, i.e. how to dynamically remove servers from the cluster without losing data.

                          Well, it does provide a way of making a journal highly available in a non-HA cluster, but I see your point. I'm not sure it belongs in a cluster configuration either, though.

                          • 11. Re: 2.5.x HA policy integration into WildFly
                            Jeff Mesnil Master

                            +1 for the modifications, they make it easier to understand and figure out how to setup HornetQ for different use cases.

                            • 12. Re: 2.5.x HA policy integration into WildFly
                              Andy Taylor Master

                              So I am very near to completing the refactoring and want to run the final configuration past everyone to make sure all are happy. I will explain each one:

                               

                              <ha-policy>

                                    <!--only one of the following-->

                                    <!--on server shutdown scale down to another live server-->

                                    <live-only>

                                       <scale-down>

                                          <!--a grouping of servers that can be scaled down to-->

                                          <group-name>boo!</group-name>

                                          <!--either a discovery group-->

                                          <discovery-group>wahey</discovery-group>

                                       </scale-down>

                                    </live-only>

                                 </ha-policy>

                              This is a live-only policy: no HA, but it can support scale down of a live server.

                               

                              <ha-policy>

                                    <shared-store-master>

                                       <failback-delay>3456</failback-delay>

                                       <failover-on-shutdown>false</failover-on-shutdown>

                                    </shared-store-master>

                                 </ha-policy>

                              A shared store live server

                               

                              <ha-policy>

                                    <shared-store-slave>

                                       <failback-delay>9876</failback-delay>

                                       <failover-on-shutdown>false</failover-on-shutdown>

                                       <restart-backup>false</restart-backup>

                                       <scale-down>

                                          <!--a grouping of servers that can be scaled down to-->

                                          <group-name>boo!</group-name>

                                          <!--either a discovery group-->

                                          <discovery-group>wahey</discovery-group>

                                       </scale-down>

                                    </shared-store-slave>

                                 </ha-policy>

                              A shared store backup server

                               

                              <ha-policy>

                                    <replicated>

                                       <allow-failback>true</allow-failback>

                                       <group-name>purple</group-name>

                                       <check-for-live-server>true</check-for-live-server>

                                       <failback-delay>1111</failback-delay>

                                       <clustername>abcdefg</clustername>

                                    </replicated>

                                 </ha-policy>

                              A replicated live server

                               

                              <ha-policy>

                                    <replica>

                                       <group-name>tiddles</group-name>

                                       <max-saved-replicated-journals-size>22</max-saved-replicated-journals-size>

                                       <clustername>33rrrrr</clustername>

                                       <restart-backup>false</restart-backup>

                                       <allow-failback>true</allow-failback>

                                       <failback-delay>444</failback-delay>

                                       <scale-down>

                                          <!--a grouping of servers that can be scaled down to-->

                                          <group-name>boo!</group-name>

                                          <!--either a discovery group-->

                                          <discovery-group>wahey</discovery-group>

                                       </scale-down>

                                    </replica>

                                 </ha-policy>

                              A replicated backup server

                              <ha-policy>

                                    <colocated>

                                       <backup-request-retries>44</backup-request-retries>

                                       <backup-request-retry-interval>33</backup-request-retry-interval>

                                       <max-backups>3</max-backups>

                                       <request-backup>false</request-backup>

                                       <backup-port-offset>33</backup-port-offset>

                                       <replication>

                                          <replicated>

                                             <allow-failback>true</allow-failback>

                                             <group-name>purple</group-name>

                                             <check-for-live-server>true</check-for-live-server>

                                             <failback-delay>1111</failback-delay>

                                             <clustername>abcdefg</clustername>

                                          </replicated>

                                          <replica>

                                             <group-name>tiddles</group-name>

                                             <max-saved-replicated-journals-size>22</max-saved-replicated-journals-size>

                                             <clustername>33rrrrr</clustername>

                                             <restart-backup>false</restart-backup>

                                             <scale-down>

                                                <!--a grouping of servers that can be scaled down to-->

                                                <group-name>boo!</group-name>

                                                <!--either a discovery group-->

                                                <discovery-group>wahey</discovery-group>

                                             </scale-down>

                                          </replica>

                                       </replication>

                                    </colocated>

                                 </ha-policy>

                              A colocated server that uses replication

                               

                              <ha-policy>

                                    <colocated>

                                       <backup-request-retries>44</backup-request-retries>

                                       <backup-request-retry-interval>33</backup-request-retry-interval>

                                       <max-backups>3</max-backups>

                                       <request-backup>false</request-backup>

                                       <backup-port-offset>33</backup-port-offset>

                                       <shared-store>

                                          <shared-store-master>

                                             <failback-delay>1234</failback-delay>

                                             <failover-on-shutdown>false</failover-on-shutdown>

                                          </shared-store-master>

                                          <shared-store-slave>

                                             <failback-delay>44</failback-delay>

                                             <failover-on-shutdown>false</failover-on-shutdown>

                                             <restart-backup>false</restart-backup>

                                             <scale-down/>

                                          </shared-store-slave>

                                       </shared-store>

                                    </colocated>

                                 </ha-policy>

                              A colocated server that uses shared store.

                               

                              Martyn, you had some ideas on a slightly different approach, could you post some sample config and we can decide which is best.

                              • 13. Re: 2.5.x HA policy integration into WildFly
                                Yong Hao Gao Master

                                I think maybe it's a good idea to group some common concepts together, like:

                                 

                                    <server-capabilities>

                                        <live-only ... />

                                        <scale-down ... />

                                        <slave ... />

                                        <master ... />

                                        <colocated ... />

                                    </server-capabilities>

                                 

                                    <discovery>

                                    </discovery>

                                 

                                    <data-synchronization>

                                      <shared-store ... />

                                      <replicated .../>

                                    </data-synchronization>

                                 

                                It would make the configuration easy to change and edit. For example, if you want to change a server from live-only to master, you only need to change the <server-capabilities> element and the other parts remain intact.
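                                To illustrate the kind of edit described above (a sketch only; these element names are from the proposal in this post, not from any released schema), switching a server from live-only to master would touch only the <server-capabilities> block:

```xml
<!-- Hypothetical sketch of the proposed grouping; element names are illustrative. -->

<!-- Before: a live-only server -->
<server-capabilities>
    <live-only/>
</server-capabilities>

<!-- After: the same server promoted to master. The <discovery> and
     <data-synchronization> sections are left untouched. -->
<server-capabilities>
    <master/>
</server-capabilities>

<data-synchronization>
    <replicated/> <!-- unchanged by the role switch -->
</data-synchronization>
```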

                                 

                                Just my 2 cents. I haven't thought it all the way through, however.

                                • 14. Re: 2.5.x HA policy integration into WildFly
                                  Martyn Taylor Newbie

                                  Hi Andy,

                                   

                                  The only change I'd suggest is to split the HA policy type and the server's role within that policy type out into separate elements.

                                   

                                  For example:


                                  <ha-policy>

                                    <replicated> <!-- HA Policy Type -->

                                      <replicated-master> <!-- Replicated policy server role -->

                                   

                                  I think splitting out the policy type and the server role is more declarative. It allows us to separate the configuration associated with the policy type from the configuration associated with the particular server role within the policy element. I realise there are currently only one or two places where this applies, but keeping the policy and role separate gives us more flexibility moving forward (when adding new HA policies) and also keeps the option open for potential features like cluster role negotiation. I could imagine something like this:

                                   

                                  <replicated>

                                    <min-replicas>2</min-replicas>

                                  </replicated>

                                   

                                  The servers in the cluster would then figure out how best to set up the topology using some voting system.

                                   

                                  In short, I think there are two things we are defining/configuring:

                                   

                                  1. The HA policy type: this is akin to standard, well-known policy models like replication, shared store, etc.

                                  2. The role that the server will take within that HA policy type, i.e. how each server is configured to realise the policy: replication-slave, shared-store-master, etc.
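                                  Applied to the shared-store case, this two-level split might look like the following (a sketch only; the element names mirror the replicated example above and are not taken from a released schema):

```xml
<ha-policy>
  <shared-store> <!-- 1. the HA policy type -->
    <shared-store-master> <!-- 2. the role this server takes within that policy -->
      <failover-on-shutdown>false</failover-on-shutdown>
    </shared-store-master>
  </shared-store>
</ha-policy>
```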

                                   

                                  I think that by grouping these two things into one configuration element we are adding complexity and restricting ourselves with respect to future enhancements.

                                   

                                  This is the only suggestion I have. Other than this, I think the new configuration structure is great.

                                   

                                  Thanks

                                  Martyn
