1 Reply Latest reply on Mar 1, 2013 9:18 AM by michal_rakoczy

    Apache mod_jk - does not balance after node recovery

    michal_rakoczy

      Hello, I have Apache (2.4.3, x64) fronting JBoss 5.1 app servers, 4 nodes in total, through the mod_jk module (1.2.37). Each JBoss instance serves a different application under a different context, except node2 and node4, which share one application. See the configuration below:

       

      JBoss Node1 - /app1

      JBoss Node2 - /app2 /app4

      JBoss Node4 - /app4

      JBoss Node5 - /app5

       

      App4 runs on two JBoss nodes, node2 and node4. Node4 is the main node for that application: it takes 95% of the requests to app4 (worker.node4.lbfactor=19), while node2 should take only 5% of the requests to app4, or all of them if node4 fails. This will change to 50/50 under normal conditions in the future, but I'm currently testing the config, which is why 5% is used.
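The 95%/5% figures follow from the ratio of the two lbfactor values (19 and 1). A minimal Python sketch of that arithmetic is below. The function name `expected_shares` is hypothetical, and the code only models the nominal ratio implied by the configured weights; it is not mod_jk's actual scheduling algorithm.

```python
def expected_shares(lbfactors):
    """Return each worker's nominal share of requests, given its lbfactor."""
    total = sum(lbfactors.values())
    return {name: factor / total for name, factor in lbfactors.items()}


# lbfactor values taken from the configuration in this post
shares = expected_shares({"node2": 1, "node4": 19})
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Moving to the planned 50/50 split would only require equal lbfactor values on both nodes.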

       

      Almost everything works as expected: requests to app4 are distributed between the nodes properly (95%/5%) when both nodes are up (OK), and all requests are redirected to the failover node2 when node4 fails (OK). The only problem I'm facing is that when node4 recovers after a failure, all requests go to node4 and node2 receives none. It looks like the lbfactor setting is ignored after node4 recovers, and 100% of the requests go to node4. To be honest, I don't know how to force the requests to be distributed again.

       

      My worker.properties:

       

      worker.list=loadbalancer,node1,node2,node4,node5,status

       

      worker.loadbalancer.type=lb

      worker.loadbalancer.balance_workers=node2,node4

      worker.loadbalancer.sticky_session=1

      worker.loadbalancer.method=R

       

      worker.template.type=ajp13

      worker.template.socket_connect_timeout=5000

      worker.template.socket_keepalive=true

      worker.template.ping_mode=A

      worker.template.ping_timeout=10000

      worker.template.connection_pool_minsize=0

      worker.template.connection_pool_timeout=60

      worker.template.reply_timeout=300000

      worker.template.recovery_options=3

       

      worker.status.type=status

       

      worker.node1.port=8009

      worker.node1.host=127.0.0.1

      worker.node1.type=ajp13

      worker.node1.lbfactor=1

      worker.node1.connect_timeout=15000

      worker.node1.socket_timeout=60

      worker.node1.connection_pool_timeout=60

      worker.node1.socket_keepalive=True

       

      worker.node2.reference=worker.template

      worker.node2.port=8109

      worker.node2.host=127.0.0.1

      worker.node2.lbfactor=1

      worker.node2.redirect=node4

       

      worker.node4.reference=worker.template

      worker.node4.port=8309

      worker.node4.host=127.0.0.1

      worker.node4.lbfactor=19

      worker.node4.redirect=node2

       

      worker.node5.port=8409

      worker.node5.host=127.0.0.1

      worker.node5.type=ajp13

      worker.node5.lbfactor=1

      worker.node5.connect_timeout=15000

      worker.node5.socket_timeout=60

      worker.node5.connection_pool_timeout=60

      worker.node5.socket_keepalive=True
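
In the configuration above, node2 and node4 take most of their settings from worker.template through the reference attribute, while node1 and node5 spell theirs out. A minimal Python sketch of how the inherited values combine with a node's own lines is below, assuming the referenced template only supplies defaults that the node's own lines override. `parse_workers` and `effective_settings` are hypothetical helper names, not part of mod_jk, and the sample text is a shortened copy of the lines in this post.

```python
def parse_workers(text):
    """Parse 'worker.<name>.<attr>=<value>' lines into {name: {attr: value}}."""
    workers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.strip().split(".", 2)
        if len(parts) != 3 or parts[0] != "worker":
            continue  # skips global entries such as worker.list
        _, name, attr = parts
        workers.setdefault(name, {})[attr] = value.strip()
    return workers


def effective_settings(workers, name):
    """Merge a worker's own attributes over those of its referenced template."""
    own = dict(workers[name])
    ref = own.pop("reference", None)
    if ref is None:
        return own
    base_name = ref.split(".", 1)[1]  # "worker.template" -> "template"
    merged = effective_settings(workers, base_name)
    merged.update(own)  # the worker's own lines win over inherited ones
    return merged


# Shortened copy of the configuration from this post
SAMPLE = """
worker.list=loadbalancer,node1,node2,node4,node5,status
worker.template.type=ajp13
worker.template.ping_mode=A
worker.template.recovery_options=3
worker.node4.reference=worker.template
worker.node4.port=8309
worker.node4.host=127.0.0.1
worker.node4.lbfactor=19
worker.node4.redirect=node2
"""

node4 = effective_settings(parse_workers(SAMPLE), "node4")
for attr in sorted(node4):
    print(f"worker.node4.{attr}={node4[attr]}")
```

Printing the merged view for node2 and node4 side by side makes it easier to check that both balanced workers really share the same timeout and recovery settings.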