This page describes the internal logic of mod_cluster on the httpd side.
mod_cluster is in fact just a sophisticated proxy balancer provider. It uses a provider for "shared" memory handling, a post_read_request and one handler.
Structure of the httpd part:
mod_cluster is made of 3 modules and a patch:
mod_sharedmem: Shared memory and persistence handler.
mod_proxy_cluster: A mod_proxy balancer that allows dynamic creation of balancers and workers.
mod_manager: A handler that processes messages coming from the ModClusterService and fills the shared memory according to the messages.
A patch to the actual 2.2.x mod_proxy code.
The modules use providers (via ap_lookup_provider()/ap_register_provider()) to interact. For the first version only AJP will be supported.
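For illustration, the wiring could look like the following sketch (the provider group/name/version strings and the node_storage structure are hypothetical, not the actual values used by the modules):

  #include "httpd.h"
  #include "http_config.h"
  #include "ap_provider.h"

  /* In mod_manager: register the storage provider (hypothetical
     group/name/version strings). */
  static apr_status_t manager_register_provider(apr_pool_t *pconf,
                                                const void *node_storage)
  {
      return ap_register_provider(pconf, "manager", "shared", "0",
                                  node_storage);
  }

  /* In mod_proxy_cluster (e.g. in its post_config hook): look the
     provider up again. */
  static const void *cluster_find_provider(void)
  {
      return ap_lookup_provider("manager", "shared", "0");
  }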
Normal request processing (mod_proxy_cluster):
The translate_name hook will check that the URL and vhost correspond to a mappable location according to the data from the shared memory.
If it is mappable, r->handler will be set to "proxy-server" and r->filename to "proxy:cluster://URI", so that the mod_proxy logic
can process the request and give it to our balancer.
The proxy_handler will process the request (because of the value of r->handler).
In our find_best_bytraffic the following is done:
Update the workers according to the table in shared memory (create or remove workers).
While not OK and retries remain (only on valid workers):
start proxying the request.
ap_proxy_pre_request will call our balancer, which will return the worker to use.
proxy_run_scheme_handler will forward the request to our balancer logic (via the canon_handler hook), which will find the cluster node and use the scheme handler again to send the request and read the response.
Done
When a request is received, mod_cluster will first try to map it to a host of the virtual hosts table; if that fails the request is DECLINED so the rest of httpd may process it. If the request corresponds to a virtual host, the URL of the request is used to see if it corresponds to a context in the Contexts table; if that fails DECLINED is returned (or do we want 404?). Otherwise the request is processed according to the status of the context in the table:
If the status is ENABLED, the request is marked to be forwarded to the cluster.
If the status is DISABLED and the request contains a sessionid, the request is forwarded to the cluster; if it doesn't contain a sessionid, 500 is returned.
If the status is STOPPED, 500 is returned.
That is done in a handler to prevent useless processing of requests that can't be processed by the cluster.
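As a minimal, self-contained sketch of that decision logic (illustrative C, where -1 stands for DECLINED):

  #include <stdio.h>

  typedef enum { ENABLED, DISABLED, STOPPED } context_status;

  /* Hypothetical helper mirroring the mapping logic described above:
     -1 = DECLINED, 0 = forward to the cluster, 500 = error response. */
  static int map_request(int host_found, int context_found,
                         context_status status, int has_sessionid)
  {
      if (!host_found)
          return -1;                       /* not one of our virtual hosts */
      if (!context_found)
          return -1;                       /* URL matches no context */
      switch (status) {
      case ENABLED:
          return 0;                        /* forward to the cluster */
      case DISABLED:
          return has_sessionid ? 0 : 500;  /* existing sessions only */
      case STOPPED:
      default:
          return 500;
      }
  }

  int main(void)
  {
      printf("%d\n", map_request(1, 1, DISABLED, 1)); /* 0: forwarded */
      printf("%d\n", map_request(1, 1, STOPPED, 1));  /* 500 */
      printf("%d\n", map_request(1, 0, ENABLED, 0));  /* -1: DECLINED */
      return 0;
  }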
When choosing a node (worker) in cluster_bytraffic, the host and corresponding context information are checked to prevent sending a request to a node that can't process it (wrong host, application not deployed, etc.).
Asynchronous requests processing (mod_manager):
The asynchronous requests are processed by the mod_manager part of mod_cluster. As the protocol uses "special" method names, the
translate_name hook will set r->handler to "mod-cluster" when detecting those methods.
The asynchronous requests are used to fill the shared area according to the cluster information. The STATUS messages are handled in a special way: they validate the information against the result of an "asynchronous ping/pong". Once the information is validated it is stored in the shared memory.
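The detection could look like the sketch below (the method list is an assumption based on the messages mentioned on this page):

  #include <string.h>
  #include "httpd.h"
  #include "http_config.h"

  /* Methods of the protocol; hypothetical list based on the messages
     this page describes. */
  static const char *const manager_methods[] = {
      "CONFIG", "STATUS", "ENABLE-APP", "DISABLE-APP", "STOP-APP",
      "REMOVE-APP", NULL
  };

  static int manager_trans(request_rec *r)
  {
      int i;
      for (i = 0; manager_methods[i]; i++) {
          if (strcmp(r->method, manager_methods[i]) == 0) {
              r->handler = "mod-cluster";  /* handled by mod_manager */
              return OK;
          }
      }
      return DECLINED;                     /* a normal request */
  }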
Tables in the shared memory:
The CONFIG messages allow filling the following tables of the shared memory:
nodes
That is a part of the CONFIG messages. An id is added to this description to identify the corresponding virtual hosts.
JvmRoute: <JvmRoute> | Domain: <Domain> | Host: <Node IP> | Port: <Connector Port> | Type: <Type of the connector> | (<node conf>) | nodeid | balancer name |
virtual hosts:
That is a part of the CONFIG messages. A node id and vhost id are added to this description to identify the context and the node. For each virtual host listed in the Alias: field, an entry is created in the virtual hosts table.
host | nodeid | vhostid |
Contexts:
That is a part of the CONFIG messages. A host id and status are added to this description. The status can have the following values: DISABLED/ENABLED/STOPPED. The contexts are created in the state STOPPED. For each context listed in the Context: field, an entry is created in the Contexts table.
context | vhostid | status |
Balancers:
That is a part of the CONFIG messages or of the ManagerBalancerName directive.
balancer | balancerid | (<balancer conf>) |
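Expressed as C structures, the records could look roughly as follows (field names and sizes are illustrative assumptions, not the actual layout):

  /* Illustrative record layouts for the shared memory tables. */
  typedef struct {
      char JvmRoute[64];
      char Domain[20];
      char Host[64];       /* node IP */
      char Port[7];        /* connector port */
      char Type[16];       /* type of the connector, e.g. "ajp" */
      /* (<node conf>): flags and timeouts would go here */
      int  nodeid;
      char balancer[40];   /* balancer name */
  } nodeinfo;

  typedef struct {
      char host[64];       /* one entry per Alias of the virtual host */
      int  nodeid;
      int  vhostid;
  } hostinfo;

  typedef struct {
      char context[40];
      int  vhostid;
      int  status;         /* ENABLED, DISABLED or STOPPED */
  } contextinfo;

  typedef struct {
      char balancer[40];
      int  balancerid;
      /* (<balancer conf>) would go here */
  } balancerinfo;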
Directives of mod_cluster:
MemManagerFile filename. Base name used to access the shared memory.
ManagerBalancerName name NoMapping. Name of the balancer to use with the ProxyPass directive. Without NoMapping the balancer will be created automatically and mod_cluster will automatically map the contexts deployed on the nodes.
Maxcontext number. Max number of contexts mod_cluster can handle.
Maxnode number. Max number of nodes mod_cluster can handle.
Maxhost number. Max number of aliases (virtual hosts) mod_cluster can handle.
Maxbalancer number. Max number of balancers mod_cluster can handle.
Examples:
  LoadModule proxy_module modules/mod_proxy.so
  LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
  LoadModule sharedmem_module modules/mod_sharedmem.so
  LoadModule proxy_cluster_module modules/mod_proxy_cluster.so
  LoadModule manager_module modules/mod_manager.so
  ProxyPass /myapp balancer://mycluster/myapp lbmethod=cluster_bytraffic
  ManagerBalancerName mycluster NoMapping
The node information of the balancer mycluster is filled by the mod_cluster logic corresponding to mycluster.
  LoadModule proxy_module modules/mod_proxy.so
  LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
  LoadModule sharedmem_module modules/mod_sharedmem.so
  LoadModule proxy_cluster_module modules/mod_proxy_cluster.so
  LoadModule manager_module modules/mod_manager.so
The balancer is created with default values and the mapping to the contexts is done dynamically according to the CONFIG messages.
Hooks of mod_sharedmem:
post_config: Register a cleanup for the pools and memory.
pre_config: Create a global pool for shared memory handling.
provides a slotmem_storage_method: do (calls a callback routine on each existing slot), create, attach and mem (returns a pointer to a slot).
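The provider interface implied by this description might look like the sketch below (exact names and signatures are assumptions):

  #include <apr_pools.h>

  typedef struct ap_slotmem ap_slotmem_t; /* opaque slot-mem handle */
  typedef apr_status_t ap_slotmem_callback_fn_t(void *mem, void *data,
                                                apr_pool_t *pool);

  struct slotmem_storage_method {
      /* do: call func on each existing slot */
      apr_status_t (*slotmem_do)(ap_slotmem_t *s,
                                 ap_slotmem_callback_fn_t *func,
                                 void *data, apr_pool_t *pool);
      /* create: create (or reuse) a named area of item_num slots */
      apr_status_t (*slotmem_create)(ap_slotmem_t **new, const char *name,
                                     apr_size_t item_size, int item_num,
                                     apr_pool_t *pool);
      /* attach: attach to an existing named area */
      apr_status_t (*slotmem_attach)(ap_slotmem_t **new, const char *name,
                                     apr_size_t *item_size, int *item_num,
                                     apr_pool_t *pool);
      /* mem: return a pointer to slot item_id */
      apr_status_t (*slotmem_mem)(ap_slotmem_t *s, int item_id, void **mem);
  };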
Hooks of mod_proxy_cluster:
post_config: Find the providers that handle access to balancers, hosts, contexts and nodes (from mod_manager).
child_init: Create the maintenance task. The maintenance task checks regularly that the child's balancers and workers correspond to
the shared memory information and creates/deletes/recreates balancers or workers if needed. Additionally, from time to time
it checks the connection to the nodes and forces the cleaning of connections whose TTL has elapsed (by cleaning the one it has used
to test the node).
translate_name: Check if the request corresponds to a URL mod_cluster can handle; if yes, set r->handler to "proxy-server" and
r->filename to "proxy:cluster://balancer_name" so that our pre_request hook will process it.
pre_request (proxy_hook_pre_request): Find the worker to use and rewrite the URL to give the request to the corresponding
scheme handler. This hook is called by proxy_handler() of mod_proxy.
canon_handler: Process the canonicalisation of the URL.
provides a proxy_cluster_isup() to check that a node is reachable (using an "asynchronous" ping/pong, for example).
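A stripped-down sketch of the hook registration (the handler names and bodies are placeholders; the pre_request and canon_handler signatures are those of mod_proxy in httpd 2.2):

  #include "httpd.h"
  #include "http_config.h"
  #include "http_request.h"
  #include "mod_proxy.h"

  static int proxy_cluster_trans(request_rec *r)
  {
      /* check URL/vhost against the shared tables; on a match set
         r->handler = "proxy-server", r->filename = "proxy:cluster://..."
         and return OK */
      return DECLINED;
  }

  static int proxy_cluster_pre_request(proxy_worker **worker,
                                       proxy_balancer **balancer,
                                       request_rec *r,
                                       proxy_server_conf *conf, char **url)
  {
      return DECLINED; /* choose the worker and rewrite *url here */
  }

  static int proxy_cluster_canon(request_rec *r, char *url)
  {
      return DECLINED; /* canonicalise the cluster:// URL here */
  }

  static void proxy_cluster_register_hooks(apr_pool_t *p)
  {
      /* post_config and child_init registrations omitted for brevity */
      ap_hook_translate_name(proxy_cluster_trans, NULL, NULL, APR_HOOK_FIRST);
      proxy_hook_pre_request(proxy_cluster_pre_request, NULL, NULL,
                             APR_HOOK_FIRST);
      proxy_hook_canon_handler(proxy_cluster_canon, NULL, NULL,
                               APR_HOOK_FIRST);
  }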
Hooks of mod_manager:
post_config: Create shared memory for balancers, hosts, contexts and nodes (using mod_sharedmem).
translate_name: Check the method of the request; if it is one defined by the protocol, set r->handler to "mod-cluster".
handler: Process the received commands and update the shared memory.
provides a storage_method for balancers, hosts, contexts and nodes. A storage_method contains read(), ids_used(), get_max_size() and remove().
read() accepts 2 parameters, an id and a key, and reads a record either by its slot number (id) in the shared memory or by the key passed as the second parameter.
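As a sketch, such a provider structure could look like the following (names and signatures are assumptions; one provider would be registered per table):

  #include <apr_errno.h>   /* apr_status_t */

  struct storage_method {
      /* read a record either by its slot number (id) or, when id is
         not set, by the key given as the second parameter */
      apr_status_t (*read)(int id, const void *key, void **record);
      /* fill ids[] with the slot ids currently in use, return the count */
      int (*ids_used)(int *ids);
      /* maximum number of slots in the table (Maxnode etc.) */
      int (*get_max_size)(void);
      /* free the slot */
      apr_status_t (*remove)(int id);
  };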
Processing REMOVE-APP :
REMOVE-APP requires a "special" handling in mod_manager: the contexts and hosts corresponding to the node will be removed from the shared memory and the node will be marked as "removed". The logic to remove the node from the shared memory will be the following (see the sketch after this list):
remove = mark removed (the entry can't be updated in future operations).
Any CONFIG corresponding to this node will now insert a new entry = create a new id (used to create the worker).
The maintenance threads will remove the information of the "marked removed" workers.
The maintenance threads will create the new worker.
After a while the "marked removed" entry will be removed by one of the maintenance threads; only at that point is the information of the node removed from the shared memory.
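The id handling can be illustrated with the self-contained sketch below (the removed flag and helper names are assumptions):

  #include <string.h>

  #define MAXNODE 10

  /* Illustrative node entry with the "mark removed" flag. */
  typedef struct {
      char JvmRoute[64];
      int  id;        /* 0 = free slot */
      int  removed;   /* frozen: ignored by updates, reclaimed later */
  } node_entry;

  static node_entry nodes[MAXNODE];

  /* REMOVE-APP: freeze the entry instead of deleting it right away. */
  static void mark_removed(node_entry *n)
  {
      n->removed = 1;
  }

  /* CONFIG: a frozen entry is never updated, so a new entry (with a
     new id) is inserted for the same JvmRoute. */
  static node_entry *insert_node(const char *route)
  {
      int i;
      for (i = 0; i < MAXNODE; i++)
          if (nodes[i].id && !nodes[i].removed
              && strcmp(nodes[i].JvmRoute, route) == 0)
              return &nodes[i];            /* live entry: update it */
      for (i = 0; i < MAXNODE; i++)
          if (nodes[i].id == 0) {          /* free slot: new id */
              strncpy(nodes[i].JvmRoute, route,
                      sizeof(nodes[i].JvmRoute) - 1);
              nodes[i].id = i + 1;
              nodes[i].removed = 0;
              return &nodes[i];
          }
      return NULL;                         /* table full */
  }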