4 Replies Latest reply on Oct 2, 2005 7:19 AM by mikefinn

    Load balancing and Failover

    gunjan_iitk

      Hi all,
      I have two JBoss 4.0.2 instances running on two different machines, with IPs 10.1.1.131 and 10.1.1.69. I am using the mod_jk 1.2 load balancer with the Apache web server. I have deployed my EAR on node1 (10.1.1.131) and it is deployed successfully on node2 as well. My EJBs are clustered by setting <clustered>true</clustered> and the default partition name in jboss.xml. When I run my application separately on each machine everything works fine, and both machines recognise each other.
      But when I run both JBoss instances and send my request to the load balancer, it dispatches the request to one of the nodes, and then in the web client I get a session timeout on whichever link I click.
      If I then shut that node down, the request is sent to node2 and that works fine.
      If I now start node1 again, so that node1 and node2 are both running, everything still works fine.
      Can anyone advise why load balancing and failover do not work initially, when I first start both nodes and the load balancer sends the request to node1, but I get a session timeout on whichever link I click?
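
      For reference, the clustering entry for a bean in jboss.xml looks roughly like this (the bean name here is just an example):

      <session>
      <ejb-name>MySessionBean</ejb-name>
      <clustered>true</clustered>
      <cluster-config>
      <partition-name>DefaultPartition</partition-name>
      </cluster-config>
      </session>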

      My workers.properties file is
      # Define list of workers that will be used
      # for mapping requests
      worker.list=loadbalancer,status
      # Define Node1
      worker.node1.port=8009
      worker.node1.host=10.1.1.131
      worker.node1.type=ajp13
      worker.node1.lbfactor=1
      #worker.node1.local_worker=1 (1)
      worker.node1.cachesize=10

      # Define Node2
      worker.node2.port=8009
      worker.node2.host=10.1.1.69
      worker.node2.type=ajp13
      worker.node2.lbfactor=1
      #worker.node2.local_worker=1 (1)
      worker.node2.cachesize=10

      # Load-balancing behaviour
      worker.loadbalancer.type=lb
      worker.loadbalancer.balance_workers=node1, node2
      worker.loadbalancer.sticky_session=1
      worker.loadbalancer.local_worker_only=1

      # Status worker for managing load balancer
      worker.status.type=status
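
      On the Apache side, the application is mounted to the balancer along these lines (the paths here are only examples; mine differ):

      JkMount /myapp/* loadbalancer
      JkMount /jkstatus status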

      My server.xml is


      <!-- Use a custom version of StandardService that allows the
      connectors to be started independent of the normal lifecycle
      start to allow web apps to be deployed before starting the
      connectors.
      -->


      <!-- A HTTP/1.1 Connector on port 8080 -->


      <!-- A AJP 1.3 Connector on port 8009 -->
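
      The AJP connector element itself is the stock one, something like:

      <Connector port="8009" address="${jboss.bind.address}"
      enableLookups="false" redirectPort="8443" protocol="AJP/1.3"/>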


      <!-- SSL/TLS Connector configuration using the admin devl guide keystore

      -->



      <!-- The JAAS based authentication and authorization realm implementation
      that is compatible with the jboss 3.2.x realm implementation.
      - certificatePrincipal : the class name of the
      org.jboss.security.auth.certs.CertificatePrincipal impl
      used for mapping X509[] cert chains to a Principal.
      -->

      <!-- A subclass of JBossSecurityMgrRealm that uses the authentication
      behavior of JBossSecurityMgrRealm, but overrides the authorization
      checks to use JACC permissions with the current java.security.Policy
      to determine authorized access.

      -->



      <!-- Uncomment to enable request dumper. This Valve "logs interesting
      contents from the specified Request (before processing) and the
      corresponding Response (after processing). It is especially useful
      in debugging problems related to headers and cookies."
      -->
      <!--

      -->

      <!-- Access logger -->
      <!--

      -->

      <!-- Uncomment to enable single sign-on across web apps
      deployed to this host. Does not provide SSO across a cluster.

      If this valve is used, do not use the JBoss ClusteredSingleSignOn
      valve shown below.
      -->
      <!--

      -->

      <!-- Uncomment to enable single sign-on across web apps
      deployed to this host AND to all other hosts in the cluster
      with the same virtual hostname.

      If this valve is used, do not use the standard Tomcat SingleSignOn
      valve shown above.

      This valve uses JGroups to communicate across the cluster. The
      JGroups Channel used for this communication can be configured
      by editing the "sso-channel.xml" file found in the same folder
      as this file. If this valve is running on a machine with multiple
      IP addresses, configuring the "bind_addr" property of the JGroups
      UDP protocol may be necessary. Another possible configuration
      change would be to enable encryption of intra-cluster communications.
      See the sso-channel.xml file for more details.

      Besides the attributes supported by the standard Tomcat
      SingleSignOn valve (see the Tomcat docs), this version also supports
      the following attribute:

      partitionName the name of the cluster partition in which
      this node participates. If not set, the default
      value is "sso-partition/" + the value of the
      "name" attribute of the Host element that
      encloses this element (e.g. "sso-partition/localhost")
      -->
      <!--

      -->

      <!-- Uncomment to check for unclosed connections and transaction terminated checks
      in servlets/jsps.
      Important: You need to uncomment the dependency on the CachedConnectionManager
      in META-INF/jboss-service.xml

      -->









      My cluster-service.xml is
      <!-- ===================================================================== -->
      <!-- -->
      <!-- Sample Clustering Service Configuration -->
      <!-- -->
      <!-- ===================================================================== -->





      <!-- ==================================================================== -->
      <!-- Cluster Partition: defines cluster -->
      <!-- ==================================================================== -->



      <!-- Name of the partition being built -->
      ${jboss.partition.name:DefaultPartition}

      <!-- The address used to determine the node name -->
      ${jboss.bind.address}

      <!-- Determine if deadlock detection is enabled -->
      False

      <!-- Max time (in ms) to wait for state transfer to complete. Increase for large states -->
      30000

      <!-- The JGroups protocol configuration -->

      <!--
      The default UDP stack:
      - If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
      appropriate NIC IP address, e.g bind_addr="192.168.0.2".
      - On Windows machines, because of the media sense feature being broken with multicast
      (even after disabling media sense) set the UDP protocol's loopback attribute to true
      -->

      <UDP mcast_addr="228.1.2.3" mcast_port="45566"
      ip_ttl="8" ip_mcast="true"
      mcast_send_buf_size="800000" mcast_recv_buf_size="150000"
      ucast_send_buf_size="800000" ucast_recv_buf_size="150000"
      loopback="true" bind_addr="10.1.1.131"/>
      <PING timeout="2000" num_initial_members="3"
      up_thread="true" down_thread="true"/>
      <MERGE2 min_interval="10000" max_interval="20000"/>
      <FD shun="true" up_thread="true" down_thread="true"
      timeout="2500" max_tries="5"/>
      <VERIFY_SUSPECT timeout="3000" num_msgs="3"
      up_thread="true" down_thread="true"/>
      <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
      max_xmit_size="8192"
      up_thread="true" down_thread="true"/>
      <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
      down_thread="true"/>
      <pbcast.STABLE desired_avg_gossip="20000"
      up_thread="true" down_thread="true"/>
      <FRAG frag_size="8192"
      down_thread="true" up_thread="true"/>
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
      shun="true" print_local_addr="true"/>
      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>


      <!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
      <!--

      <TCP bind_addr="thishost" start_port="7800" loopback="true"/>
      <TCPPING initial_hosts="thishost[7800],otherhost[7800]" port_range="3" timeout="3500"
      num_initial_members="3" up_thread="true" down_thread="true"/>
      <MERGE2 min_interval="5000" max_interval="10000"/>
      <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
      <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
      <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
      retransmit_timeout="3000"/>
      <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
      print_local_addr="true" down_thread="true" up_thread="true"/>
      <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>

      -->




      <!-- ==================================================================== -->
      <!-- HA Session State Service for SFSB -->
      <!-- ==================================================================== -->


      jboss:service=${jboss.partition.name:DefaultPartition}
      <!-- Name of the partition to which the service is linked -->
      ${jboss.partition.name:DefaultPartition}
      <!-- JNDI name under which the service is bound -->
      /HASessionState/Default
      <!-- Max delay before cleaning unreclaimed state.
      Defaults to 30*60*1000 => 30 minutes -->
      0


      <!-- ==================================================================== -->
      <!-- HA JNDI -->
      <!-- ==================================================================== -->


      jboss:service=${jboss.partition.name:DefaultPartition}
      <!-- Name of the partition to which the service is linked -->
      ${jboss.partition.name:DefaultPartition}
      <!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
      ${jboss.bind.address}
      <!-- Port on which the HA-JNDI stub is made available -->
      1100
      <!-- Accept backlog of the bootstrap socket -->
      50
      <!-- The thread pool service used to control the bootstrap and
      auto discovery lookups -->
      <depends optional-attribute-name="LookupPool"
      proxy-type="attribute">jboss.system:service=ThreadPool

      <!-- A flag to disable the auto discovery via multicast -->
      false
      <!-- Set the auto-discovery bootstrap multicast bind address. If not
      specified and a BindAddress is specified, the BindAddress will be used. -->
      ${jboss.bind.address}
      <!-- Multicast Address and group port used for auto-discovery -->
      230.0.0.4
      1102
      <!-- The TTL (time-to-live) for autodiscovery IP multicast packets -->
      16

      <!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
      0
      <!-- Client socket factory to be used for client-server
      RMI invocations during JNDI queries
      custom
      -->
      <!-- Server socket factory to be used for client-server
      RMI invocations during JNDI queries
      custom
      -->



      ${jboss.bind.address}
      <!--
      0
      custom
      custom
      -->


      <!-- the JRMPInvokerHA creates a thread per request. This implementation uses a pool of threads -->

      1
      300
      300
      60000
      ${jboss.bind.address}
      4446
      ${jboss.bind.address}
      0
      false
      <depends optional-attribute-name="TransactionManagerService">jboss:service=TransactionManager


      <!-- ==================================================================== -->

      <!-- ==================================================================== -->
      <!-- Distributed cache invalidation -->
      <!-- ==================================================================== -->


      jboss:service=${jboss.partition.name:DefaultPartition}
      jboss.cache:service=InvalidationManager
      jboss.cache:service=InvalidationManager
      ${jboss.partition.name:DefaultPartition}
      DefaultJGBridge




      I will be very thankful if anyone can help me.
      Thanks and Regards,
      Gunjan

        • 1. Re: Load balancing and Failover
          davewebb

          Please post your tc5-cluster-service.xml file.

          • 2. Re: Load balancing and Failover
            gunjan_iitk

            Hi David,
            Thanks for your reply.
            My tc5-cluster-service.xml is
            <?xml version="1.0" encoding="UTF-8"?>

            <!-- ===================================================================== -->
            <!-- -->
            <!-- Customized TreeCache Service Configuration for Tomcat 5 Clustering -->
            <!-- -->
            <!-- ===================================================================== -->





            <!-- ==================================================================== -->
            <!-- Defines TreeCache configuration -->
            <!-- ==================================================================== -->



            jboss:service=Naming
            jboss:service=TransactionManager

            <!-- Configure the TransactionManager -->
            org.jboss.cache.JBossTransactionManagerLookup

            <!--
            Isolation level : SERIALIZABLE
            REPEATABLE_READ (default)
            READ_COMMITTED
            READ_UNCOMMITTED
            NONE
            -->
            REPEATABLE_READ

            <!--
            Valid modes are LOCAL, REPL_ASYNC and REPL_SYNC
            -->
            REPL_ASYNC

            <!-- Name of cluster. Needs to be the same for all clusters, in order
            to find each other
            -->
            Tomcat-Cluster

            <!-- JGroups protocol stack properties. Can also be a URL,
            e.g. file:/home/bela/default.xml

            -->


            <!--
            The default UDP stack:
            - If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
            appropriate NIC IP address, e.g bind_addr="192.168.0.2".
            - On Windows machines, because of the media sense feature being broken with multicast
            (even after disabling media sense) set the UDP protocol's loopback attribute to true
            -->

            <UDP mcast_addr="230.1.2.7" mcast_port="45577"
            ip_ttl="8" ip_mcast="true"
            mcast_send_buf_size="150000" mcast_recv_buf_size="80000"
            ucast_send_buf_size="150000" ucast_recv_buf_size="80000"
            loopback="true" bind_addr="10.1.1.131"/>
            <PING timeout="2000" num_initial_members="3"
            up_thread="false" down_thread="false"/>
            <MERGE2 min_interval="10000" max_interval="20000"/>
            <FD_SOCK/>
            <VERIFY_SUSPECT timeout="1500"
            up_thread="false" down_thread="false"/>
            <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
            max_xmit_size="8192" up_thread="false" down_thread="false"/>
            <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
            down_thread="false"/>
            <pbcast.STABLE desired_avg_gossip="20000"
            up_thread="false" down_thread="false"/>
            <FRAG frag_size="8192"
            down_thread="false" up_thread="false"/>
            <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
            shun="true" print_local_addr="true"/>
            <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>


            <!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
            <!--

            <TCP bind_addr="thishost" start_port="7810" loopback="true"/>
            <TCPPING initial_hosts="thishost[7810],otherhost[7810]" port_range="3" timeout="3500"
            num_initial_members="3" up_thread="true" down_thread="true"/>
            <MERGE2 min_interval="5000" max_interval="10000"/>
            <FD shun="true" timeout="2500" max_tries="5" up_thread="true" down_thread="true" />
            <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
            <pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100"
            retransmit_timeout="3000"/>
            <pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false" />
            <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="false"
            print_local_addr="true" down_thread="true" up_thread="true"/>
            <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>

            -->



            <!-- Max number of milliseconds to wait for a lock acquisition -->
            15000




            • 3. Re: Load balancing and Failover
              gunjan_iitk

              Hi,
              I think this problem is due to the JAAS-related configuration on node 2, because it occurs when the request goes to node2: the login page is displayed correctly, but when I log in, the request goes to node 1.
              My login-config.xml on node 2 is
              <?xml version='1.0'?>
              <!DOCTYPE policy PUBLIC
              "-//JBoss//DTD JBOSS Security Config 3.0//EN"
              "http://www.jboss.org/j2ee/dtd/security_config.dtd">

              <!-- The XML based JAAS login configuration read by the
              org.jboss.security.auth.login.XMLLoginConfig mbean. Add
              an application-policy element for each security domain.

              The outline of the application-policy is:
              <application-policy name="security-domain-name">

              <login-module code="login.module1.class.name" flag="control_flag">
              <module-option name = "option1-name">option1-value</module-option>
              <module-option name = "option2-name">option2-value</module-option>
              ...
              </login-module>

              <login-module code="login.module2.class.name" flag="control_flag">
              ...
              </login-module>
              ...

              </application-policy>

              $Revision: 1.12.2.2 $
              -->


              <!-- Used by clients within the application server VM such as
              mbeans and servlets that access EJBs.
              -->
              <application-policy name = "client-login">

              <login-module code = "org.jboss.security.ClientLoginModule"
              flag = "required">
              </login-module>

              </application-policy>

              <!-- Security domain for JBossMQ -->
              <application-policy name = "jbossmq">

              <login-module code = "org.jboss.security.auth.spi.DatabaseServerLoginModule"
              flag = "required">
              <module-option name = "unauthenticatedIdentity">guest</module-option>
              <module-option name = "dsJndiName">java:/DefaultDS</module-option>
              <module-option name = "principalsQuery">SELECT PASSWD FROM JMS_USERS WHERE USERID=?</module-option>
              <module-option name = "rolesQuery">SELECT ROLEID, 'Roles' FROM JMS_ROLES WHERE USERID=?</module-option>
              </login-module>

              </application-policy>

              <!-- Security domain for JBossMQ when using file-state-service.xml
              <application-policy name = "jbossmq">

              <login-module code = "org.jboss.mq.sm.file.DynamicLoginModule"
              flag = "required">
              <module-option name = "unauthenticatedIdentity">guest</module-option>
              <module-option name = "sm.objectname">jboss.mq:service=StateManager</module-option>
              </login-module>

              </application-policy>
              -->

              <!-- Security domains for testing new jca framework -->
              <application-policy name = "HsqlDbRealm">

              <login-module code = "org.jboss.resource.security.ConfiguredIdentityLoginModule"
              flag = "required">
              <module-option name = "principal">sa</module-option>
              <module-option name = "userName">sa</module-option>
              <module-option name = "password"></module-option>
              <module-option name = "managedConnectionFactoryName">jboss.jca:service=LocalTxCM,name=DefaultDS</module-option>
              </login-module>

              </application-policy>

              <application-policy name = "JmsXARealm">

              <login-module code = "org.jboss.resource.security.ConfiguredIdentityLoginModule"
              flag = "required">
              <module-option name = "principal">guest</module-option>
              <module-option name = "userName">guest</module-option>
              <module-option name = "password">guest</module-option>
              <module-option name = "managedConnectionFactoryName">jboss.jca:service=TxCM,name=JmsXA</module-option>
              </login-module>

              </application-policy>

              <!-- A template configuration for the jmx-console web application. This
              defaults to the UsersRolesLoginModule the same as other and should be
              changed to a stronger authentication mechanism as required.
              -->
              <application-policy name = "jmx-console">

              <login-module code="org.jboss.security.auth.spi.UsersRolesLoginModule"
              flag = "required">
              <module-option name="usersProperties">props/jmx-console-users.properties</module-option>
              <module-option name="rolesProperties">props/jmx-console-roles.properties</module-option>
              </login-module>

              </application-policy>

              <!-- A template configuration for the web-console web application. This
              defaults to the UsersRolesLoginModule the same as other and should be
              changed to a stronger authentication mechanism as required.
              -->
              <application-policy name = "web-console">

              <login-module code="org.jboss.security.auth.spi.UsersRolesLoginModule"
              flag = "required">
              <module-option name="usersProperties">web-console-users.properties</module-option>
              <module-option name="rolesProperties">web-console-roles.properties</module-option>
              </login-module>

              </application-policy>

              <!-- A template configuration for the JBossWS web application (and transport layer!).
              This defaults to the UsersRolesLoginModule the same as other and should be
              changed to a stronger authentication mechanism as required.
              -->
              <application-policy name="JBossWS">

              <login-module code="org.jboss.security.auth.spi.UsersRolesLoginModule"
              flag="required">
              <module-option name="unauthenticatedIdentity">anonymous</module-option>
              </login-module>

              </application-policy>

              <!-- The default login configuration used by any security domain that
              does not have a application-policy entry with a matching name
              -->
              <application-policy name = "other">
              <!-- A simple server login module, which can be used when the number
              of users is relatively small. It uses two properties files:
              users.properties, which holds users (key) and their password (value).
              roles.properties, which holds users (key) and a comma-separated list of
              their roles (value).
              The unauthenticatedIdentity property defines the name of the principal
              that will be used when a null username and password are presented as is
              the case for an unauthenticated web client or MDB. If you want to
              allow such users to be authenticated add the property, e.g.,
              unauthenticatedIdentity="nobody"
              -->

              <login-module code = "org.jboss.security.auth.spi.UsersRolesLoginModule"
              flag = "required" />

              </application-policy>



              <application-policy name = "xellerate">

              <login-module code="com.thortech.xl.security.jboss.XLClientLoginModule" flag="required">
              </login-module>
               <login-module code="com.thortech.xl.security.jboss.UsernamePasswordLoginModule"
               flag = "required">
               <module-option name = "unauthenticatedIdentity">Unknown</module-option>
               <module-option name = "data-source">java:/jdbc/xlDS</module-option>
               </login-module>

              </application-policy>



              Can anyone advise what changes I need to make on node2 for the login configuration?

              • 4. Re: Load balancing and Failover
                mikefinn

                 Gunjan - did you ever resolve this? I am seeing something similar happening with 4.0.2. At least in my case, the jvmRoute is not being set (I think on the Tomcat side). I have 3.2.2 servers defined in my LB as well, and everything works perfectly when I map to those. As soon as I map back to the 4.0.2 servers, no workie.

                 Without the jvmRoute being passed back by the browser, the LB cannot determine the proper node to send the request to, so it goes to the 'next' node, causing (in my case) a second login page. Look at the returned HTTP headers and see if the jvmRoute is being returned.
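
                 With a jvmRoute of, say, node1, the session cookie should come back with the route appended, something like this (the id and path here are made up):

                 Set-Cookie: JSESSIONID=0A1B2C3D4E5F67890A1B2C3D.node1; Path=/myapp

                 If the .node1 suffix is missing, sticky sessions can't work.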

                 Also, I don't see the domain attribute being set in your workers.properties; it sets the jvmRoute name that is associated with an LB node.
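
                 Something along these lines (just a sketch; the worker and route names are examples):

                 # workers.properties - give each balancer member a route/domain name
                 worker.node1.domain=node1
                 worker.node2.domain=node2

                 and then each node's server.xml carries the matching route on its Engine, e.g. on the first node:

                 <Engine name="jboss.web" defaultHost="localhost" jvmRoute="node1">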

                 I saw a reference in the 4.0.3 RC release notes about a fix to the jvmRoute handling, so I am going to try 4.0.3RC2.

                Mike