mod_cluster 1.1.1 Moving to Production
joaocunhalopes Mar 28, 2011 5:59 PM
After testing mod_cluster 1.1.1 on several different environments we decided to move it to production.
On the frontend we are running one Apache HTTP 2.2.17 server, running on Windows Server 2008 R2 (64 bit).
The installed Apache is the 32 bit version. The mod_cluster modules installed are also the 32 bit version modules.
On the backend we have two app servers running JBoss 5.1.
Between the frontend and the backend we have a firewall, but there are no known rules implemented that would make this test fail. The Apache server is able to talk to the JBoss servers and the JBoss servers are able to talk to the Apache server.
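For reference, the reachability claim can be re-checked from either side with a small TCP probe. This is only a minimal sketch, assuming Python is available on the machines involved; `can_connect` and `probe_all` are hypothetical helper names, not part of mod_cluster or JBoss. It only shows that a port accepts connections at that moment, not that the firewall keeps long-lived connections open.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_all(targets, timeout=3.0):
    """Print one line per (host, port) pair and return the results as a dict."""
    results = {}
    for host, port in targets:
        ok = can_connect(host, port, timeout)
        results[(host, port)] = ok
        print(f"{host}:{port} {'open' if ok else 'unreachable'}")
    return results
```

From the Apache box one would call `probe_all([("192.168.150.43", 8009), ("192.168.150.44", 8009)])` for two of the AJP connectors, and from a JBoss box `probe_all([("192.168.150.6", 6666)])` for the mod_cluster management listener.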
Here's what our tests show when trying out the load balancing demo app:
The Apache server seems to start fine, but after 10 sessions it just stops responding.
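To pin down exactly when the frontend stops answering, a small loop like the one below could be used. It is only a sketch: `first_failure` is a hypothetical helper, the URL in the usage note is an assumption based on the /load-demo context, and it assumes each cookie-less request opens a new session on the backend. Any HTTP error status counts as a failure here.

```python
import http.client
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy so the requests hit the frontend directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def first_failure(url, attempts=20, timeout=5.0):
    """Send `attempts` cookie-less GETs one after another.

    Returns the 1-based index of the first request that fails
    (timeout, connection error or HTTP error status), or None
    if every request succeeds.
    """
    for i in range(1, attempts + 1):
        try:
            with _OPENER.open(url, timeout=timeout) as resp:
                resp.read()
        except (urllib.error.URLError, OSError, http.client.HTTPException) as exc:
            print(f"request {i} failed: {exc}")
            return i
    return None
```

For example, `first_failure("http://192.168.150.6/load-demo/", attempts=30)` would report at which request the server stops responding, which can then be compared against the "10 sessions" observed manually.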
Our Apache test configuration is:
ServerRoot "D:/Apache2.2"
# Required for Apache startup
LoadModule authz_host_module modules/mod_authz_host.so
# Module for server-status
LoadModule status_module modules/mod_status.so
ExtendedStatus On
# Modules for JBoss mod_cluster
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
LoadModule slotmem_module modules/mod_slotmem.so
LoadModule manager_module modules/mod_manager.so
LoadModule proxy_cluster_module modules/mod_proxy_cluster.so
LoadModule advertise_module modules/mod_advertise.so
Listen 192.168.150.6:6666
<VirtualHost 192.168.150.6:6666>
KeepAliveTimeout 60
MaxKeepAliveRequests 0
ManagerBalancerName ApacheHttpdBalancer
ServerAdvertise Off
</VirtualHost>
Listen 192.168.150.6:80
<VirtualHost *:80>
<Location /mcm>
SetHandler mod_cluster-manager
Order deny,allow
Deny from all
#Allow from 192.168.120. 192.168.150.
Allow from all
</Location>
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
#Allow from 192.168.120. 192.168.150.
Allow from all
</Location>
<Location /load-demo>
Order deny,allow
Deny from all
#Allow from 192.168.120. 192.168.150.
Allow from all
</Location>
</VirtualHost>
Please note that we are not using the advertise feature.
On the JBoss end we have configured the servers so they don't use advertise. The following changes were made to the file "mod_cluster-jboss-beans.xml" (the bean changed was "ModClusterConfig"):
<!--<property name="proxyList">${jboss.mod_cluster.proxyList,jboss.modcluster.proxyList:}</property>-->
<property name="proxyList">192.168.150.6:6666</property>
and
<!--<property name="advertise">${jboss.mod_cluster.advertise:true}</property>-->
<property name="advertise">false</property>
The changes above were made according to "What to do if I don't want to use Advertise (multicast)":
http://docs.jboss.org/mod_cluster/1.1.0/html/faq.html#d0e4112
I checked the Apache HTTP log files for errors (after the problem described above) and couldn't find any relevant to this problem:
httpd.exe: Could not reliably determine the server's fully qualified domain name, using 192.168.150.6 for ServerName
[Mon Mar 28 22:50:23 2011] [notice] Advertise initialized for process 2864
[Mon Mar 28 22:50:23 2011] [notice] Apache/2.2.17 (Win32) mod_cluster/1.1.x configured -- resuming normal operations
[Mon Mar 28 22:50:23 2011] [notice] Server built: Oct 18 2010 01:58:12
[Mon Mar 28 22:50:23 2011] [notice] Parent: Created child process 1868
httpd.exe: Could not reliably determine the server's fully qualified domain name, using 192.168.150.6 for ServerName
httpd.exe: Could not reliably determine the server's fully qualified domain name, using 192.168.150.6 for ServerName
[Mon Mar 28 22:50:23 2011] [notice] Child 1868: Child process is running
[Mon Mar 28 22:50:23 2011] [notice] Child 1868: Acquired the start mutex.
[Mon Mar 28 22:50:23 2011] [notice] Child 1868: Starting 64 worker threads.
[Mon Mar 28 22:50:23 2011] [notice] Child 1868: Starting thread to listen on port 80.
[Mon Mar 28 22:50:23 2011] [notice] Child 1868: Starting thread to listen on port 6666.
Also, there are no errors in the event log.
After the problem, mod_cluster seems normal:
Node: [1],Name: si_part1_node1,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.43,Port: 8109,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 96
Node: [2],Name: pc_part1_node2,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.44,Port: 8209,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 97
Node: [3],Name: pc_part1_node1,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.43,Port: 8209,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 97
Node: [4],Name: ne_part1_node2,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.44,Port: 8309,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 138,Read: 3588,Transfered: 0,Connected: 0,Load: 97
Node: [5],Name: ne_part1_node1,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.43,Port: 8309,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 130,Read: 3380,Transfered: 0,Connected: 0,Load: 97
Node: [6],Name: ph_part1_node2,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.44,Port: 8009,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 97
Node: [7],Name: ph_part1_node1,Balancer: ApacheHttpdBalancer,LBGroup: ,Host: 192.168.150.43,Port: 8009,Type: ajp,Flushpackets: Off,Flushwait: 10,Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 96
Vhost: [1:1:1], Alias: localhost
Vhost: [2:1:2], Alias: localhost
Vhost: [3:1:3], Alias: localhost
Vhost: [4:1:4], Alias: localhost
Vhost: [5:1:5], Alias: localhost
Vhost: [6:1:6], Alias: localhost
Vhost: [7:1:7], Alias: localhost
Context: [1:1:1], Context: /femss, Status: ENABLED
Context: [2:1:2], Context: /fepcwcm, Status: ENABLED
Context: [3:1:3], Context: /fepcwcm, Status: ENABLED
Context: [4:1:4], Context: /load-demo, Status: ENABLED
Context: [5:1:5], Context: /load-demo, Status: ENABLED
Context: [6:1:6], Context: /fephn, Status: ENABLED
Context: [7:1:7], Context: /fephn, Status: ENABLED
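In the dump above, only nodes 4 and 5 (the two serving /load-demo) show non-zero Elected and Read counters. To compare these counters between runs without reading the raw lines, something like the following could be used. It is a rough sketch that assumes the "Key: value" comma-separated layout shown above; `parse_node_line` and `busy_nodes` are hypothetical helper names, and the sample holds two of the lines copied from the dump.

```python
def parse_node_line(line):
    """Split one 'Node: [n],Name: ...,Load: 97' line into a dict of strings."""
    fields = {}
    for part in line.split(","):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def busy_nodes(lines):
    """Return (name, elected, read) for every node that was elected at least once."""
    rows = []
    for line in lines:
        if not line.startswith("Node:"):
            continue  # skip Vhost and Context lines
        node = parse_node_line(line)
        elected = int(node["Elected"])
        if elected > 0:
            rows.append((node["Name"], elected, int(node["Read"])))
    return rows


# Two lines copied from the status output above.
SAMPLE = [
    "Node: [1],Name: si_part1_node1,Balancer: ApacheHttpdBalancer,LBGroup: ,"
    "Host: 192.168.150.43,Port: 8109,Type: ajp,Flushpackets: Off,Flushwait: 10,"
    "Ping: 10,Smax: 65,Ttl: 60,Elected: 0,Read: 0,Transfered: 0,Connected: 0,Load: 96",
    "Node: [4],Name: ne_part1_node2,Balancer: ApacheHttpdBalancer,LBGroup: ,"
    "Host: 192.168.150.44,Port: 8309,Type: ajp,Flushpackets: Off,Flushwait: 10,"
    "Ping: 10,Smax: 65,Ttl: 60,Elected: 138,Read: 3588,Transfered: 0,Connected: 0,Load: 97",
]

for name, elected, read in busy_nodes(SAMPLE):
    print(f"{name}: elected={elected} read={read}")
```

Running the same parser on a dump taken before and after the hang would show whether the Elected counters stop increasing once Apache stops responding.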
Really puzzled about this problem, since preliminary tests went great.
Will look into this tomorrow.
Some possibilities:
The ASA firewall is cutting the traffic.
Some wrong configuration.
32bit vs 64bit.
Any pointer/suggestion on where to start and what to look for would be great.
Thank you.