Want to know more, as explained by the developer? See:
and read the resulting Q&A:
Q: Is the demo application available?
A: Yes, it's part of the mod-cluster download (under demo/client). The
SessionDemo itself is not available, but it's a simple demo
adding data to an HTTP session. I can make it available if
necessary...
Q: Are the slides available?
A: www.jboss.org/webinars
Q: Is this a direct competition to Terracotta's offering?
A: No; mod-cluster is about (1) dynamic discovery of workers, (2) web
applications, and (3) intelligent load balancing. Clustering is an
orthogonal aspect; as a matter of fact, mod-cluster could be used
among a number of workers which are not clustered.
Q: Is the clustering between jboss instances within a domain done @ JVM level?
A: No; we use JGroups (www.jgroups.org) and JBossCache
(jboss.org/jbosscache) to replicate sessions. In JBoss 6, we've
replaced JBossCache with Infinispan (infinispan.org) to replicate
and/or distribute sessions among a cluster.
Q: Why should the deployment topology use httpd? Can't the Tomcat (bundled in JBoss) use APR?
A: Yes, JBossWeb can use APR, and as a matter of fact does use it if
the shared APR lib is found on the library path. However, using APR
and httpd are orthogonal issues; while the mod-cluster module could
theoretically be used in JBossWeb directly, we haven't tried it
out, as many deployments still use httpd in production.
Note that JBossWeb cannot be used as a reverse proxy.
Q: What are the steps involved to migrate a setup which is on mod_jk to mod_cluster?
A: There are only a few steps involved (more details can be found on
jboss.org/mod_cluster):
- Use the httpd modules downloadable from jboss.org/mod_cluster
- Configure httpd.conf accordingly
- Drop workers.properties and uriworkermap.properties
- Configure JBoss AS to include the addresses of the httpd
daemon(s) running
- (Optional) Configure the domain for the JBoss AS instance
The steps are described in detail in
http://docs.jboss.org/mod_cluster/1.1.0/html/mod_jk.html
Q: Is there a separate logging mechanism for mod_cluster, like we used to have for mod_jk?
A: No; mod-cluster uses the normal httpd log, and this is configured in httpd.conf (similar
to mod-jk / mod-proxy). On the JBoss AS side, the normal AS logging
is used (e.g. conf/log4j.xml)
Q: Is the mod_cluster the same as mod_proxy_balancer?
A: No; mod_proxy_balancer requires manual configuration
(e.g. hosts to be balanced over). Also, web applications have to be
present on all hosts, and don't register themselves
automatically. Plus, mod_proxy_balancer doesn't have any notion of
load balance factors sent to it by the workers.
Q: I have an application that uses an HASingleton (ejbtimer). In a multi-domain architecture, my application would fail because I would have an ejbtimer in each domain. How would you get a large cluster to work in this scenario?
A: If one singleton timer per domain is not desired, then one could
place the singleton timer into a separate cluster, which spans
multiple domains. Note that an HASingleton ejb timer and
distributed cache will use separate channels by default.
Q: Is it not more efficient to avoid sticky sessions? If we avoided sticky sessions, then we could use hardware-based load balancers which do load balancing at the transport (TCP/IP) layer rather than the application layer.
A: Making sessions non-sticky means that access to sessions can be random, i.e. requests
for an HTTP session can go to any node within a domain. However, this means that we should
not use asynchronous replication, as a write to an attribute followed by an immediate read
of the same attribute but on a different node might lead to the reading of stale data.
However, using synchronous replication is slower because every write incurs a round trip to
the cluster, and the caller blocks until all responses have been received.
Our recommendation is to use sticky sessions and asynchronous replication, for the best
performance.
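The stale-read problem described above can be illustrated with a small simulation. This is only a sketch in Python, not mod-cluster or JBossCache code: the Node and Cluster classes, the "cart" attribute and the pending queue are all hypothetical stand-ins for real replication.

```python
from collections import deque


class Node:
    """One worker holding a local copy of the session attributes."""

    def __init__(self, name):
        self.name = name
        self.session = {}


class Cluster:
    """Toy replication: synchronous applies at once, asynchronous queues."""

    def __init__(self, nodes, synchronous):
        self.nodes = nodes
        self.synchronous = synchronous
        self.pending = deque()  # replication messages not yet delivered

    def write(self, node, key, value):
        node.session[key] = value
        for peer in self.nodes:
            if peer is node:
                continue
            if self.synchronous:
                # The caller blocks until every peer has applied the write.
                peer.session[key] = value
            else:
                # The caller returns immediately; delivery happens later.
                self.pending.append((peer, key, value))

    def deliver_pending(self):
        while self.pending:
            peer, key, value = self.pending.popleft()
            peer.session[key] = value


def read_after_write(synchronous):
    a, b = Node("A"), Node("B")
    cluster = Cluster([a, b], synchronous)
    cluster.write(a, "cart", 1)
    cluster.deliver_pending()      # the first write has reached both nodes
    cluster.write(a, "cart", 2)    # non-sticky: this request lands on node A
    return b.session.get("cart")   # the next request lands on node B at once


stale = read_after_write(synchronous=False)
fresh = read_after_write(synchronous=True)
```

With asynchronous replication the read on node B still sees the old value, whereas with synchronous replication it sees the new one. With sticky sessions both requests would go to node A, so asynchronous replication would be safe.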
Q: Is it possible to configure mod_cluster or mod_jk in a way that certain IPs requests go to just a particular domain
A: Not easily. One could configure virtual hosts in httpd.conf, and
workers connect to certain virtual hosts only, but there is no
enforcement of which domains are hit from the httpd side.
Q: We used appliance for loadbalancing. Can we use mod-cluster for dynamic configuration instead of using static properties?
A: No, mod-cluster requires the httpd to run. We intend to talk to load balancer vendors and
get them to implement the MCMP protocol, so that their balancers could be used with
mod-cluster enabled workers.
Q: Is mod_cluster delivered as a native module in Apache just as mod_proxy?
A: Yes, on the httpd side. On the JBoss AS side, we use a service archive (mod_cluster.sar), in /deploy
Q: A slightly more general clustering question: what about distributing JBoss servers across datacenters that belong to the same cluster?
A: This is possible, however, in most cases IP multicasting would not be available over a
WAN. Therefore, the configuration of JBoss AS should use a TCP based stack rather than a UDP
based stack.
Q: Can you suggest the pattern to cluster the Apache server for Fail over when acting as Load balancer for Jboss Cluster
A: This is very simple: just start multiple httpds and add them to JBoss AS,
e.g. mod_cluster.proxyList=host1:8000,host2:8000 etc.
Workers (JBoss instances) will then register themselves and their applications with all
httpds in the list.
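To make the shape of the proxy list concrete, here is a sketch in Python of how a comma-separated host:port list could be split up and how each worker ends up registered with every httpd. The function name, the worker names and the registration tuples are assumptions for illustration; this is not how mod-cluster itself parses the property.

```python
def parse_proxy_list(value):
    """Split 'host1:8000,host2:8000' into [(host, port), ...]."""
    proxies = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so the port is always the final field.
        host, _, port = entry.rpartition(":")
        proxies.append((host, int(port)))
    return proxies


proxies = parse_proxy_list("host1:8000,host2:8000")

# Hypothetical worker names: every worker registers with every httpd.
workers = ["jboss-a", "jboss-b"]
registrations = [(worker, proxy) for worker in workers for proxy in proxies]
```

If one httpd fails, the workers are still registered with the remaining ones, which is what makes the front end fail over.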
Q: Is mod_cluster available with JBoss AS (community) or JBoss Enterprise Application Platform from Red Hat?
A: Currently, mod-cluster 1.1.0.CR3 will ship as part of JBoss AS 6. The mod-cluster
functionality is part of EAP 5.0.1 and will also be part of JBoss EAP 5.1.
Q: Can the worker nodes be configured from JON?
A: Not yet (with respect to mod-cluster configuration). This is on the roadmap.
Q: What is the configuration for dynamically adding nodes as load increases?
A: This feature is not available. It might be available as part of our Deltacloud product. Currently, third party vendor's products, such as RightScale, could be used to do this.
Q: Which version of mod_cluster do you use ? in my version i cannot see the sessions
A: To see sessions in mod_cluster_manager, the following entry has to be added to httpd.conf:
<IfModule mod_manager.c>
MaxsessionId 50
</IfModule>
Note that sessions are by default not shown in mod_cluster_manager.
Refer to the documentation at jboss.org/mod_cluster for details.
Q: Can you quickly show the config for how mod_cluster automatically detects new hosts?
A: When a new JBoss instance is started, as soon as the mod_cluster.sar service is deployed,
the host and all of its applications will be registered with all httpds, so this happens
immediately.
Q: What do you advise in a multi-datacenter setup? Can we use mod_cluster, and won't this cause an event storm when one of the datacenters goes down?
A: When you have domains across multiple data centers, and one data center goes down, then the
other data center has to accommodate the traffic from the data center which is down. This
causes more traffic to the surviving data center, so when doing capacity planning this
should be taken into account. If the nodes in a domain run in the cloud, then one could
envisage automatically starting new virtualized instances to accommodate the handling of
this increased traffic.
Q: Is there a separate logging mechanism for mod_cluster, like we used to have for mod_jk?
A: mod-cluster is configured through the usual mechanism in httpd.conf
Q: Do we need a mod_cluster manager on all nodes [in the cluster]?
A: No; mod_cluster_manager is only available on the httpd side.
Q: Is GossipRouter highly available?
A: Yes, multiple GossipRouters can be started. Note that, if running only on EC2, then a
protocol called S3_PING can be used as an alternative. It uses an S3 bucket to store cluster
topology information.
Q: For the group of HTTP daemons in front of the clusters, I assume those can be round robin'd DNS, or any other method of load balancing them?
A: Yes, DNS round robin (or a hardware load balancer fronting the httpds) works. When using
sticky sessions, the jsessionid is sent with each request (cookie or URL rewriting) and it
is suffixed with the jvmRoute of the node which hosts a given session.
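As an illustration of why any httpd can route a sticky request, here is a sketch in Python that pulls the jvmRoute suffix out of a jsessionid of the form <id>.<jvmRoute>. The session ID values, the worker table and the function names are made up for the example; this is not the httpd module's code.

```python
def jvm_route(jsessionid):
    """Return the jvmRoute suffix of '<id>.<jvmRoute>', or None if absent."""
    _, dot, route = jsessionid.rpartition(".")
    return route if dot else None


def pick_worker(jsessionid, workers):
    """Look up the worker that hosts the session, if the route is known."""
    return workers.get(jvm_route(jsessionid))


# Hypothetical routing table: jvmRoute -> worker address.
workers = {"node1": "192.168.0.11:8009", "node2": "192.168.0.12:8009"}

target = pick_worker("A1B2C3D4.node2", workers)
no_route = pick_worker("A1B2C3D4", workers)
```

Because the route travels with the session ID, it does not matter which httpd receives the request: each one can derive the same target node.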
Q: Does JBoss support UNICAST messaging?
A: Yes; JGroups would have to be configured appropriately to do that. When using TCP, this is
done automatically. When using UDP, ip_mcast="false" would have to be set.
Q: Is there support for mount point exclusions like JkUnMount in mod-jk?
A: Yes, use <property name="excludedContexts">jmx-console,web-admin,ROOT</property> in
/deploy/mod_cluster.sar/META-INF/mod_cluster-jboss-beans.xml
Q: What are the steps involved to migrate a setup which is on mod_jk to mod_cluster?
A: See the previous answer above (http://docs.jboss.org/mod_cluster/1.1.0/html/mod_jk.html)
Q: There is, implicitly, a concept of starting connections from the JBoss "backend" to the "frontend"; this seems odd to me.
A: This is only conceptual; workers will *not* create a socket connection to
httpd. Instead httpd connects to the workers (ie. JBoss AS instances) and the workers use
the same channel to send status updates, registration of web applications etc.
Q: Can you use buddy list to replicate sessions across domains?
A: Yes, that can be done, as a domain doesn't need to have the same scope as a cluster; a
cluster can span multiple domains. However, for scalability purposes, we recommend
restricting a cluster to a domain.
Q: How does full replication in each domain compare to using buddy replication and just one cluster/domain?
A: The scalability of full replication is a function of cluster size and average data size, so
if we have many nodes and/or large data sets, then we hit a scalability ceiling.
If DATA_SIZE * NUMBER_OF_HOSTS is smaller than the memory available to each host, then full
replication is preferred, as reads are always local. If this is not the case, then we can
use multiple domains, or we can use one single cluster, but switch from full replication to
either buddy replication (JBossCache) or distribution (Infinispan).
Distribution only stores N copies of a session, therefore scales much better than full
replication.
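The rule of thumb above can be written down as a small calculation. This is a sketch in Python under one assumption: DATA_SIZE is taken to be the session data originating on a single host, so that each host stores DATA_SIZE * NUMBER_OF_HOSTS under full replication and roughly DATA_SIZE * N under distribution with N copies. The function names and the figures are hypothetical.

```python
def per_host_memory(data_size_mb, hosts, copies=None):
    """Approximate session memory each host must hold, in MB.

    copies=None models full replication (every host stores everything);
    otherwise each session is stored on `copies` hosts only.
    """
    if copies is None:
        return data_size_mb * hosts
    return data_size_mb * min(copies, hosts)


def full_replication_fits(data_size_mb, hosts, memory_mb):
    """True if DATA_SIZE * NUMBER_OF_HOSTS is below the per-host memory."""
    return per_host_memory(data_size_mb, hosts) < memory_mb


small_cluster_ok = full_replication_fits(data_size_mb=200, hosts=4, memory_mb=1024)
large_cluster_ok = full_replication_fits(data_size_mb=200, hosts=10, memory_mb=1024)
distributed_mb = per_host_memory(data_size_mb=200, hosts=10, copies=2)
```

In this example full replication fits with 4 hosts but not with 10, while distribution with 2 copies keeps the per-host footprint constant as the cluster grows.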
Q: Is there any tutorial provided?
A: There's a quick start guide available at jboss.org/mod_cluster
Q: Is it possible to limit which hosts are allowed to join the cluster easily?
A: Yes. This can be done at the JGroups level, by using a protocol called AUTH
(http://community.jboss.org/wiki/JGroupsAUTH). It provides passwords, X.509 certificates,
host lists and simple MD5 hashes as authentication, but it is pluggable, so other mechanisms
can be included. Post questions on AUTH to the JGroups mailing list (jgroups.org).
Q: To upgrade without downtime you have to have at least two domains for each application, right?
A: Yes
Q: Is there any method/workaround to avail Session Replication across Domains?
A: A cluster isn't restricted in scope to a domain, it can span multiple domains. However, that
defeats the purpose of a domain (divide-and-conquer), and makes rolling upgrade more
difficult. For instance, if a cluster spans 2 domains, then it is better to club the 2
domains together into one.
Q: I missed some of the demo - I saw the session replication/migration in the demo, but wanted to know if I have 2 apache servers in front of the jboss cluster and a network load balancer doing round robins will mod_cluster maintain the session across them?
A: Yes. The jvmRoute is appended to the jsessionid and identifies the node in a given domain
uniquely. See also the question above on DNS round robin.
Q: On the Apache side, which versions are required? 2.2, or also 2.0?
A: 2.2.8 or higher
Q: I'm using JBoss 4.2.2 GA. Should I migrate to JBoss 6?
A: JBoss 5 or higher. You *can* use mod_cluster with JBoss 4.2.2 - but
you'd need to configure it as you would for JBoss Web standalone
(or Tomcat) - and it consequently has slightly limited functionality,
e.g. no HA-mode, limited to 1 load metric.
Q: UDP broadcast?
A: The ability to send a packet to all hosts on a given subnet. IP multicasting is more
efficient because a packet is only sent to subscribed hosts. IP multicasting is more efficient
than TCP in large clusters, because the switch copies the packet to all recipients, whereas
with TCP a packet has to be sent N-1 times (where N is the cluster size)
Q: Normally, how much time does it take for a new node to be detected by mod_cluster? Is it configurable?
A: No, it is not configurable. As soon as the JBoss instance is started, it (and its webapps)
will get registered.
The time required to do this depends on how the node finds out
about the proxy. If you've configured mod_cluster with a static
proxy list, then it registers with the httpd proxy upon startup. If
you configured mod_cluster server-side to use an HASingleton (via
HAModClusterService), then it knows about the proxy upon joining
the cluster - also upon startup. Otherwise, you are relying on the
advertise mechanism - so the time required to register with the
proxy depends on the advertise interval (AdvertiseFrequency,
configured in httpd.conf) and the status interval
(Engine.backgroundProcessorDelay, configured in server.xml).
Q: How do newly added servers pick up the sessions? Are they new or existing sessions?
A: The new servers use a mechanism provided by JGroups called state transfer (see
http://www.jgroups.org/manual/html/user-channel.html#GetState), which copies the existing
sessions into a new server. This way, the new server can be failed over to should an
existing server crash.
Note that state transfer is not needed if we use distribution instead of replication (see
above).
Q: When performing rolling upgrades, how do you mitigate issues where the database schema changes? So certain domains may be using JNDI to hook into one core db - if another domain is upgraded in a roll out then hibernate will update / alter those tables
A: Schema migration is a difficult topic, outside the scope of mod-cluster. One possible way
could be to have a separate DB in the new domain, drain the old domain, and - when the old
domain is shut down - transfer the data from the old to the new DB. But, again, this is very
application dependent, and generic advice is moot.
Q: Is mod_cluster also working with JBoss 5.1 with the same power, or does it require JBoss 6?
A: mod-cluster works with 5.1, but is already integrated into AS 6 out
of the box.
The latest mod_cluster 1.1.0.CR3 release will work with JBoss 5.1
with no configuration changes - just drop in the mod_cluster.sar
into the $JBOSS_HOME/server/all/deploy directory.
Q: How do nodes identify other nodes within their cluster? In other words how do EC2 nodes only cluster with EC2 nodes etc.?
A: Nodes find other nodes through JGroups (www.jgroups.org). On EC2, we can either use a
GossipRouter, which is a separate lookup process, or S3_PING which is based on S3 buckets.
A cluster is defined via (a) the same configuration and (b) the same cluster name. All nodes
which have (a) and (b) form a cluster. Nodes which have (a) but a different cluster name form
a different cluster.
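The membership rule can be sketched in Python as a grouping by (configuration, cluster name). This only models the rule as stated above; real discovery is done by JGroups, and the node names, stack name and cluster names below are hypothetical.

```python
from collections import defaultdict


def form_clusters(nodes):
    """Group nodes by (configuration, cluster name).

    `nodes` is a list of (node_name, configuration, cluster_name) tuples.
    Nodes sharing both values end up in the same cluster.
    """
    clusters = defaultdict(list)
    for node_name, configuration, cluster_name in nodes:
        clusters[(configuration, cluster_name)].append(node_name)
    return dict(clusters)


nodes = [
    ("ec2-node1", "tcp-stack", "ec2-cluster"),
    ("ec2-node2", "tcp-stack", "ec2-cluster"),
    ("dc-node1", "tcp-stack", "datacenter-cluster"),
]
clusters = form_clusters(nodes)
```

Here the two EC2 nodes form one cluster, and the third node, which uses the same configuration but a different cluster name, forms its own.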
Q: Is it possible to shutdown and drain a single web app?
A: Yes. The steps are:
- Disable the app
- Wait until the sessions for the app have drained
- Undeploy the app
- Deploy the new app
Note that the old and new webapps need to be compatible, i.e. classes cannot change between
redeployments.
If there is an incompatible change, I recommend draining all webapps of the same type
(context) in a domain.
Note that undeploy of a web application will perform the above
operations automatically! Use the
stopContextTimeout/stopContextTimeoutUnit config properties to
control the default drain timeout. If you're using session
replication, then you don't need to wait for all sessions to drain
- just all current requests to complete, since those sessions will
be available elsewhere. The method of draining is determined by
whether or not the target web application is
distributable. Additionally, the sessionDrainingStrategy config property can
be used to always force session draining, even for distributable
web applications.
Alternatively, you can stop a single context manually in one step
via the stopContext(...) JMX operation.
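The disable / drain / undeploy sequence with a drain timeout can be sketched as follows. This is a Python simulation and not the mod-cluster API: the WebApp class, its session counts and the tick-based timeout (standing in for stopContextTimeout) are assumptions made for illustration.

```python
class WebApp:
    """Toy web application whose active session count shrinks over time."""

    def __init__(self, context, sessions_by_tick):
        self.context = context
        self.enabled = True
        self.deployed = True
        self._sessions = list(sessions_by_tick)

    def active_sessions(self):
        # Return the next observed count; repeat the last one once exhausted.
        if len(self._sessions) > 1:
            return self._sessions.pop(0)
        return self._sessions[0]


def drain_and_undeploy(app, timeout_ticks):
    """Disable the app, wait for sessions to drain or time out, undeploy."""
    app.enabled = False                 # step 1: no new sessions are routed
    waited = 0
    while app.active_sessions() > 0 and waited < timeout_ticks:
        waited += 1                     # step 2: wait for sessions to drain
    app.deployed = False                # step 3: undeploy
    return waited


draining_app = WebApp("/shop", sessions_by_tick=[3, 1, 0])
ticks_until_drained = drain_and_undeploy(draining_app, timeout_ticks=10)

stuck_app = WebApp("/chat", sessions_by_tick=[5])
ticks_until_timeout = drain_and_undeploy(stuck_app, timeout_ticks=3)
```

The first app drains before the timeout; the second never drains, so the timeout bounds how long the undeploy waits.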
Q: Is mod_cluster delivered as a native module in Apache , just as mod_proxy?
A: Yes
Q: Does the "load balancer demo app" come with mod_cluster?
A: Yes, under /demo/client
Q: Can you configure the JBoss nodes to announce themselves to the httpd servers over a local/private network, keeping that communication private and separate from the public access to the application?
A: Yes. You can - since a separate connection is used, provided these
routes exist. This private network address/port would be provided by
the advertise mechanism or via the server-side proxyList.
The private and public network could be created in httpd.conf,
using virtual hosts.
Q: When a new version of a web app is deployed, how does JBoss/mod_cluster know how to replicate between old versions and new versions.
A: The webapp needs to be compatible with existing versions. If it isn't, deploy it into a new
domain, or redeploy all existing webapps of the same type.
Q: Can mod_cluster be enabled when we configure Elastic Load Balancing?
A: Yes, but this doesn't make much sense. Compared to ELB, mod-cluster is (1) cloud independent
(ELB only exists in EC2), (2) allows for dynamic registration of workers (this is static in
ELB), (3) allows for dynamic registration/de-registration of webapps (ELB doesn't) and (4)
sends dynamic load balancer information back to httpd (ELB has some built-in LB
functionality, but it is not extensible).
Q: What about the performance when we divide one large cluster into small clusters?
A: Performance is probably better, for various reasons. For example, if we use TCP, cluster
wide calls (RPCs) have a cost of N-1. With smaller N's, these calls become less costly.
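The N-1 cost can be made concrete with a little arithmetic. This is a sketch in Python; the cluster sizes are made-up figures and the function name is hypothetical.

```python
def tcp_sends_per_call(cluster_size):
    """Unicast sends needed for one cluster-wide call over TCP (N-1)."""
    return max(cluster_size - 1, 0)


# One cluster of 12 nodes versus the same nodes split into 3 domains of 4.
one_large_cluster = tcp_sends_per_call(12)
per_small_cluster = tcp_sends_per_call(4)
```

Each cluster-wide call costs 11 sends in the single large cluster, but only 3 within one of the smaller clusters.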