
    configuration question: how to limit size of NAKACK structure

    bruyeron

      I am running into an issue where something goes wrong with several nodes in the cluster, and a surviving node somehow does not evict the troublesome nodes and starts accumulating messages.

      The current config looks like this:

       <property name="isolationLevel" value="REPEATABLE_READ" />
       <property name="cacheMode" value="REPL_ASYNC" />
       <property name="clusterName" value="${treeCache.clusterName}" />
       <property name="useReplQueue" value="false" />
       <property name="replQueueInterval" value="0" />
       <property name="replQueueMaxElements" value="0" />
       <property name="fetchInMemoryState" value="true" />
       <property name="initialStateRetrievalTimeout" value="20000" />
       <property name="syncReplTimeout" value="20000" />
       <property name="lockAcquisitionTimeout" value="5000" />
       <property name="useRegionBasedMarshalling" value="false" />
       <property name="clusterProperties"
       value="${treeCache.clusterProperties}" />
       <property name="serviceName">
       <bean class="javax.management.ObjectName">
       <constructor-arg value="jboss.cache:service=${treeCache.clusterName},name=${treeCache.instanceName}"/>
       </bean>
       </property>
       <property name="evictionPolicyClass" value="org.jboss.cache.eviction.LRUPolicy"/>
       <property name="maxAgeSeconds" value="${treeCache.eviction.maxAgeSeconds}"/>
       <property name="maxNodes" value="${treeCache.eviction.maxNodes}"/>
       <property name="timeToLiveSeconds" value="${treeCache.eviction.timeToLiveSeconds}"/>
      


      The JGroups stack is this:
      treeCache.clusterProperties=UDP(ip_mcast=true;ip_ttl=64;loopback=false;mcast_addr=${treeCache.mcastAddress};mcast_port=${treeCache.mcastPort};mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000;bind_addr=${treeCache.bind_addr}):\
      PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):\
      MERGE2(max_interval=20000;min_interval=10000):\
      FD_SOCK(down_thread=false;up_thread=false):\
      VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):\
      pbcast.NAKACK(down_thread=false;gc_lag=50;retransmit_timeout=600,1200,2400,4800;up_thread=false):\
      pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):\
      UNICAST(down_thread=false;timeout=600,1200,2400):\
      FRAG(down_thread=false;frag_size=8192;up_thread=false):\
      pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):\
      pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
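
      For background on why NAKACK grows: as I understand it, messages are only purged from NAKACK's retransmit table once pbcast.STABLE has established that every member of the view has received them, so as long as a dead or hung member stays in the view, stability stalls and the table grows without bound. That is why fixing failure detection is the real cure. Independently of that, a volume-based stability trigger keeps purging responsive under high message rates; a sketch (max_bytes is the JGroups 2.x STABLE attribute as I understand it, so please verify it against the version in use):

      pbcast.STABLE(desired_avg_gossip=20000;max_bytes=400000;down_thread=false;up_thread=false):\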
      


      The cluster has 12 nodes, and I hit this situation when 3 of the nodes failed, which prompted the ops team to restart 9 of them. The remaining 3 all went OOM quickly. Analysing the heap dump post-mortem, I see this:

      org.jgroups.protocols.pbcast.NAKACK retained size=245MB

      My first step is to add FD to the stack, to address the cases where failure detection does not work properly. Then I would like to limit the size of the NAKACK structure (even if this means losing consistency across the cluster): is this possible at all? What are your suggestions?
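
      Concretely, I have something like the following in mind (untested; FD's timeout/max_tries/shun and NAKACK's max_xmit_buf_size are taken from the JGroups 2.x documentation, and max_xmit_buf_size in particular may not exist in older releases, so please correct me if this is wrong):

      FD_SOCK(down_thread=false;up_thread=false):\
      FD(timeout=10000;max_tries=5;shun=true;down_thread=false;up_thread=false):\
      VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):\
      pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_buf_size=50000;retransmit_timeout=600,1200,2400,4800;up_thread=false):\

      The idea is that FD catches members that are hung but still have a live socket (which FD_SOCK alone misses), and a bounded retransmit buffer caps NAKACK's memory at the cost of possibly being unable to serve retransmit requests, hence the consistency caveat above.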