Clustering Management Abstractions
brian.stansberry Apr 19, 2006 3:02 AM
I'm opening this thread to start a discussion of the key clustering constructs that are of interest to the clustering and JBoss ON teams. My primary goal is to gain clarity around the key concepts so we can use them in discussions of the Profile Service and the MetaDataRepository. As a side benefit I'd like to derive some well-understood terms that we can use to name these constructs in docs, training materials, etc.
Basic concepts I see are:
1) Domain -- all of the servers relevant to the management tools. (I'd love to see a more precise definition of this if there is one somewhere).
2) Node -- an individual JBoss instance.
3) Cluster -- a set of nodes able to recognize and communicate with each other for the purposes of providing HA and scalability. E.g. a set of nodes whose JGroups channels share the same configuration. We also often call this a Partition.
4) ??? (call it a Cell for now) -- a set of nodes within a Cluster that are meant to communicate with each other to support HA and/or scalability for one or more services. Goes beyond the requirements of a Cluster in that members of a Cell are expected to be able to interoperate in a more complex way than mere exchange of messages initiated by the JGroups protocols. E.g. Cell members may need to replicate state, and therefore the classes needed to unmarshal the state would need to be available on all nodes in the cell.
Many Clusters will probably only have one Cell. Reasons for multiple Cells include:
a) Heterogeneous deployments. See some of the discussion in the JBossCache Buddy Replication thread (http://www.jboss.com/index.html?module=bb&op=viewtopic&t=78308) related to ReplicationGroups and marshalling.
b) Topology issues, e.g. the Cluster spans a WAN but we want to keep as much traffic local as possible, or we want state replication to be done to nodes on another power supply.
Nodes can belong to more than one Cell and more than one Cluster.
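To make the relationships concrete, here is a rough Java sketch of how Node, Cluster and Cell might relate. All class and method names are hypothetical and invented for this illustration; none of them are existing JBoss or JGroups APIs. It assumes a Cell is always scoped to exactly one Cluster and may only contain nodes that have joined that Cluster. Domain is left out because its definition is still open.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** An individual JBoss instance. Hypothetical type, for illustration only. */
final class Node {
    final String name;

    Node(String name) {
        this.name = name;
    }
}

/** Nodes able to recognize and communicate with each other (a.k.a. a Partition). */
final class Cluster {
    final String name;
    private final Set<Node> members = new LinkedHashSet<>();
    private final Set<Cell> cells = new LinkedHashSet<>();

    Cluster(String name) {
        this.name = name;
    }

    /** Membership is stored on the Cluster, so a Node may join several Clusters. */
    void join(Node node) {
        members.add(node);
    }

    boolean contains(Node node) {
        return members.contains(node);
    }

    /** Creates a Cell scoped to this Cluster (an assumption of this sketch). */
    Cell createCell(String cellName) {
        Cell cell = new Cell(cellName, this);
        cells.add(cell);
        return cell;
    }

    Set<Cell> cells() {
        return Collections.unmodifiableSet(cells);
    }
}

/** A subset of a Cluster whose members interoperate more closely, e.g. replicate state. */
final class Cell {
    final String name;
    final Cluster cluster;
    private final Set<Node> members = new LinkedHashSet<>();

    Cell(String name, Cluster cluster) {
        this.name = name;
        this.cluster = cluster;
    }

    /** Rejects nodes that have not joined the owning Cluster. */
    void add(Node node) {
        if (!cluster.contains(node)) {
            throw new IllegalArgumentException(
                node.name + " is not a member of cluster " + cluster.name);
        }
        members.add(node);
    }

    Set<Node> members() {
        return Collections.unmodifiableSet(members);
    }
}
```

Because membership lives on the Cluster and Cell objects rather than on the Node, the same Node instance can be added to any number of Clusters and Cells, which matches the statement above. Whether a Cell should be allowed to span Clusters is a separate question this sketch does not try to answer.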
Another concept is "all nodes discoverable by the management tool". Perhaps this is the "Domain", although my feeling is "Domain" is meant to be more abstract. Perhaps "Cluster" is the correct term for "all nodes discoverable by the management tool", and "Partition" should be used for the Cluster concept I defined above.