1 Reply Latest reply on Feb 5, 2007 10:37 AM by davewebb

    JBoss 4.0.5 Clustering Behavior

    davewebb

      I have 2 physical servers running the same configuration:

      JBoss 4.0.5
      J2SDK1.4.2_13
      OpenSuse 10.2
      


      I have clustered an application on the 2 physical servers. Both servers start up fine, but when reviewing the logs I see the following:

      2007-02-03 13:09:24,507 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] New cluster view for partition DefaultPartition: 8 ([192.168.1.73:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099] delta: 1)
      2007-02-03 13:09:24,507 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] I am (192.168.1.74:1099) received membershipChanged event:
      2007-02-03 13:09:24,507 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] Dead members: 0 ([])
      2007-02-03 13:09:24,507 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] New Members : 0 ([])
      2007-02-03 13:09:24,507 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] All Members : 9 ([192.168.1.73:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099])
      2007-02-03 13:09:31,799 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: detected 1 failed or restarted instances.
      2007-02-03 13:09:31,800 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: Scanning for instance "browning1170525665233"'s failed in-progress jobs.
      2007-02-03 13:09:39,307 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: detected 1 failed or restarted instances.
      2007-02-03 13:09:39,307 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: Scanning for instance "browning1170525665233"'s failed in-progress jobs.
      2007-02-03 13:09:46,811 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: detected 1 failed or restarted instances.
      2007-02-03 13:09:46,811 INFO [org.quartz.impl.jdbcjobstore.JobStoreTX] ClusterManager: Scanning for instance "browning1170525665233"'s failed in-progress jobs.
      2007-02-03 13:09:52,046 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|1])
      2007-02-03 13:09:52,048 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|2])
      2007-02-03 13:09:52,049 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|3])
      2007-02-03 13:09:52,049 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|4])
      2007-02-03 13:09:52,051 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|5])
      2007-02-03 13:09:52,051 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|6])
      2007-02-03 13:09:52,052 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|7])
      2007-02-03 13:09:52,053 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32848 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|8], new vid: [browning:32852 (additional data: 17 bytes)|8])
      


      Followed by

      2007-02-03 13:12:33,063 INFO [org.jboss.ha.framework.interfaces.HAPartition.DefaultPartition] New cluster view for partition DefaultPartition: 11 ([192.168.1.73:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099] delta: 1)
      2007-02-03 13:12:33,063 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] I am (192.168.1.74:1099) received membershipChanged event:
      2007-02-03 13:12:33,063 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] Dead members: 0 ([])
      2007-02-03 13:12:33,063 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] New Members : 0 ([])
      2007-02-03 13:12:33,063 INFO [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] All Members : 12 ([192.168.1.73:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099, 192.168.1.74:1099])
      2007-02-03 13:12:37,008 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|1])
      2007-02-03 13:12:37,009 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|2])
      2007-02-03 13:12:37,011 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|3])
      2007-02-03 13:12:37,011 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|4])
      2007-02-03 13:12:37,012 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|5])
      2007-02-03 13:12:37,012 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|6])
      2007-02-03 13:12:37,014 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|7])
      2007-02-03 13:12:37,014 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|8])
      2007-02-03 13:12:37,017 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|9])
      2007-02-03 13:12:37,018 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|10])
      2007-02-03 13:12:37,018 ERROR [org.jgroups.protocols.pbcast.GMS] [mossberg:32863 (additional data: 17 bytes)] received view <= current view; discarding it (current vid: [browning:32852 (additional data: 17 bytes)|11], new vid: [browning:32852 (additional data: 17 bytes)|11])
      


      Although there are only 2 physical servers, with only 1 JVM running on each, the servers keep adding members to the cluster with the same IP/port combination.
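      In case it matters: to rule out a multicast problem between the two hosts (which can make a node look like a stream of brand-new members), here is a quick standalone sanity check I can run on both machines. This is only a sketch using java.net.MulticastSocket with the same group/port as my UDP config; the class name McastCheck is just illustrative, not part of JBoss.

      ```java
      import java.net.DatagramPacket;
      import java.net.InetAddress;
      import java.net.MulticastSocket;

      // Minimal multicast sanity check (illustrative class name, not JBoss code).
      // Run simultaneously on both hosts; each side should print the other's
      // "ping" if multicast traffic actually flows between them.
      public class McastCheck {
          public static void main(String[] args) throws Exception {
              // Same group/port as the UDP element in cluster-service.xml
              InetAddress group = InetAddress.getByName("228.1.69.1");
              MulticastSocket sock = new MulticastSocket(45566);
              sock.joinGroup(group);

              byte[] out = ("ping from " + InetAddress.getLocalHost()).getBytes();
              sock.send(new DatagramPacket(out, out.length, group, 45566));

              // Listen for ~10 seconds and print whatever arrives on the group
              sock.setSoTimeout(10000);
              byte[] buf = new byte[256];
              try {
                  while (true) {
                      DatagramPacket p = new DatagramPacket(buf, buf.length);
                      sock.receive(p);
                      System.out.println(p.getAddress() + " -> "
                              + new String(p.getData(), 0, p.getLength()));
                  }
              } catch (java.net.SocketTimeoutException done) {
                  // no more traffic within the window
              }
              sock.leaveGroup(group);
              sock.close();
          }
      }
      ```

      If each host only ever sees its own ping, the problem is in the network (switch IGMP snooping, firewall, routing), not in JBoss.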

      Here is my cluster-service.xml:

      <?xml version="1.0" encoding="UTF-8"?>
      
      <!-- ===================================================================== -->
      <!-- -->
      <!-- Sample Clustering Service Configuration -->
      <!-- -->
      <!-- ===================================================================== -->
      
      <server>
      
       <!-- ==================================================================== -->
       <!-- Cluster Partition: defines cluster -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.framework.server.ClusterPartition"
       name="jboss:service=${jboss.partition.name:DefaultPartition}">
      
       <!-- Name of the partition being built -->
       <attribute name="PartitionName">${jboss.partition.name:DefaultPartition}</attribute>
      
       <!-- The address used to determine the node name -->
       <attribute name="NodeAddress">${jboss.bind.address}</attribute>
      
       <!-- Determine if deadlock detection is enabled -->
       <attribute name="DeadlockDetection">False</attribute>
      
       <!-- Max time (in ms) to wait for state transfer to complete. Increase for large states -->
       <attribute name="StateTransferTimeout">30000</attribute>
      
       <!-- The JGroups protocol configuration -->
       <attribute name="PartitionConfig">
       <!--
       The default UDP stack:
       - If you have a multihomed machine, set the UDP protocol's bind_addr attribute to the
       appropriate NIC IP address, e.g bind_addr="192.168.0.2".
       - On Windows machines, because of the media sense feature being broken with multicast
       (even after disabling media sense) set the UDP protocol's loopback attribute to true
       -->
       <Config>
       <UDP mcast_addr="${jboss.partition.udpGroup:228.1.69.1}" mcast_port="45566"
       ip_ttl="${jgroups.mcast.ip_ttl:8}" ip_mcast="true"
       mcast_recv_buf_size="2000000" mcast_send_buf_size="640000"
       ucast_recv_buf_size="2000000" ucast_send_buf_size="640000"
       loopback="false"/>
       <PING timeout="2000" num_initial_members="3"
       up_thread="true" down_thread="true"/>
       <MERGE2 min_interval="10000" max_interval="20000"/>
       <FD_SOCK down_thread="false" up_thread="false"/>
       <FD shun="true" up_thread="true" down_thread="true"
       timeout="10000" max_tries="5"/>
       <VERIFY_SUSPECT timeout="3000" num_msgs="3"
       up_thread="true" down_thread="true"/>
       <pbcast.NAKACK gc_lag="50" retransmit_timeout="300,600,1200,2400,4800"
       max_xmit_size="8192"
       up_thread="true" down_thread="true"/>
       <UNICAST timeout="300,600,1200,2400,4800" window_size="100" min_threshold="10"
       down_thread="true"/>
       <pbcast.STABLE desired_avg_gossip="20000" max_bytes="400000"
       up_thread="true" down_thread="true"/>
       <FRAG frag_size="8192"
       down_thread="true" up_thread="true"/>
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
       shun="true" print_local_addr="true"/>
       <pbcast.STATE_TRANSFER up_thread="true" down_thread="true"/>
       </Config>
      
       <!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
       <!--
       <Config>
       <TCP bind_addr="thishost" start_port="7800" loopback="true"
       recv_buf_size="2000000" send_buf_size="640000"
       tcp_nodelay="true" up_thread="false" down_thread="false"/>
       <TCPPING initial_hosts="thishost[7800],otherhost[7800]" port_range="3" timeout="3500"
       num_initial_members="3" up_thread="false" down_thread="false"/>
       <MERGE2 min_interval="5000" max_interval="10000"
       up_thread="false" down_thread="false"/>
       <FD_SOCK down_thread="false" up_thread="false"/>
       <FD shun="true" up_thread="false" down_thread="false"
       timeout="10000" max_tries="5"/>
       <VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false" />
       <pbcast.NAKACK up_thread="false" down_thread="false" gc_lag="100"
       retransmit_timeout="300,600,1200,2400,4800"/>
       <pbcast.STABLE desired_avg_gossip="20000" max_bytes="400000"
       down_thread="false" up_thread="false" />
       <pbcast.GMS join_timeout="5000" join_retry_timeout="2000" shun="true"
       print_local_addr="true" up_thread="false" down_thread="false"/>
       <FC max_credits="2000000" down_thread="false" up_thread="false"
       min_threshold="0.10"/>
       <FRAG2 frag_size="60000" down_thread="false" up_thread="true"/>
       <pbcast.STATE_TRANSFER up_thread="false" down_thread="false"/>
       </Config>
       -->
       </attribute>
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- ==================================================================== -->
       <!-- HA Session State Service for SFSB -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.hasessionstate.server.HASessionStateService"
       name="jboss:service=HASessionState">
       <depends>jboss:service=Naming</depends>
       <!-- We now inject the partition into the HAJNDI service instead
       of requiring that the partition name be passed -->
       <depends optional-attribute-name="ClusterPartition"
       proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}</depends>
       <!-- JNDI name under which the service is bound -->
       <attribute name="JndiName">/HASessionState/Default</attribute>
       <!-- Max delay before cleaning unreclaimed state.
       Defaults to 30*60*1000 => 30 minutes -->
       <attribute name="BeanCleaningDelay">0</attribute>
       </mbean>
      
       <!-- ==================================================================== -->
       <!-- HA JNDI -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.ha.jndi.HANamingService"
       name="jboss:service=HAJNDI">
       <!-- We now inject the partition into the HAJNDI service instead
       of requiring that the partition name be passed -->
       <depends optional-attribute-name="ClusterPartition"
       proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}</depends>
       <!-- Bind address of bootstrap and HA-JNDI RMI endpoints -->
       <attribute name="BindAddress">${jboss.bind.address}</attribute>
       <!-- Port on which the HA-JNDI stub is made available -->
       <attribute name="Port">1100</attribute>
       <!-- RmiPort to be used by the HA-JNDI service once bound. 0 => auto. -->
       <attribute name="RmiPort">1101</attribute>
       <!-- Accept backlog of the bootstrap socket -->
       <attribute name="Backlog">50</attribute>
       <!-- The thread pool service used to control the bootstrap and
       auto discovery lookups -->
       <depends optional-attribute-name="LookupPool"
       proxy-type="attribute">jboss.system:service=ThreadPool</depends>
      
       <!-- A flag to disable the auto discovery via multicast -->
       <attribute name="DiscoveryDisabled">false</attribute>
       <!-- Set the auto-discovery bootstrap multicast bind address. If not
       specified and a BindAddress is specified, the BindAddress will be used. -->
       <attribute name="AutoDiscoveryBindAddress">${jboss.bind.address}</attribute>
       <!-- Multicast Address and group port used for auto-discovery -->
       <attribute name="AutoDiscoveryAddress">${jboss.partition.udpGroup:230.0.0.4}</attribute>
       <attribute name="AutoDiscoveryGroup">1102</attribute>
       <!-- The TTL (time-to-live) for autodiscovery IP multicast packets -->
       <attribute name="AutoDiscoveryTTL">16</attribute>
       <!-- The load balancing policy for HA-JNDI -->
       <attribute name="LoadBalancePolicy">org.jboss.ha.framework.interfaces.RoundRobin</attribute>
      
       <!-- Client socket factory to be used for client-server
       RMI invocations during JNDI queries
       <attribute name="ClientSocketFactory">custom</attribute>
       -->
       <!-- Server socket factory to be used for client-server
       RMI invocations during JNDI queries
       <attribute name="ServerSocketFactory">custom</attribute>
       -->
       </mbean>
      
       <mbean code="org.jboss.invocation.jrmp.server.JRMPInvokerHA"
       name="jboss:service=invoker,type=jrmpha">
       <attribute name="ServerAddress">${jboss.bind.address}</attribute>
       <attribute name="RMIObjectPort">4447</attribute>
       <!--
       <attribute name="RMIClientSocketFactory">custom</attribute>
       <attribute name="RMIServerSocketFactory">custom</attribute>
       -->
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- the JRMPInvokerHA creates a thread per request. This implementation uses a pool of threads -->
       <mbean code="org.jboss.invocation.pooled.server.PooledInvokerHA"
       name="jboss:service=invoker,type=pooledha">
       <attribute name="NumAcceptThreads">1</attribute>
       <attribute name="MaxPoolSize">300</attribute>
       <attribute name="ClientMaxPoolSize">300</attribute>
       <attribute name="SocketTimeout">60000</attribute>
       <attribute name="ServerBindAddress">${jboss.bind.address}</attribute>
       <attribute name="ServerBindPort">4446</attribute>
       <attribute name="ClientConnectAddress">${jboss.bind.address}</attribute>
       <attribute name="ClientConnectPort">0</attribute>
       <attribute name="EnableTcpNoDelay">false</attribute>
       <depends optional-attribute-name="TransactionManagerService">jboss:service=TransactionManager</depends>
       <depends>jboss:service=Naming</depends>
       </mbean>
      
       <!-- ==================================================================== -->
      
       <!-- ==================================================================== -->
       <!-- Distributed cache invalidation -->
       <!-- ==================================================================== -->
      
       <mbean code="org.jboss.cache.invalidation.bridges.JGCacheInvalidationBridge"
       name="jboss.cache:service=InvalidationBridge,type=JavaGroups">
       <!-- We now inject the partition into the HAJNDI service instead
       of requiring that the partition name be passed -->
       <depends optional-attribute-name="ClusterPartition"
       proxy-type="attribute">jboss:service=${jboss.partition.name:DefaultPartition}</depends>
       <depends>jboss.cache:service=InvalidationManager</depends>
       <attribute name="InvalidationManager">jboss.cache:service=InvalidationManager</attribute>
       <attribute name="BridgeName">DefaultJGBridge</attribute>
       </mbean>
      
      </server>
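
      For reference, each node is started along these lines (a sketch; exact flags depend on the run script version, and the IP changes per host):

      ```shell
      # -b sets jboss.bind.address, which the config above uses for
      # NodeAddress, the HA-JNDI binds, and the invoker addresses.
      ./run.sh -c all -b 192.168.1.73

      # If supported by this run.sh version, the partition name and
      # multicast group can also be overridden to isolate the cluster:
      # ./run.sh -c all -b 192.168.1.73 -g MyPartition -u 239.255.1.1
      ```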
      


      Any help is appreciated. Thank you!