0 Replies Latest reply on Feb 27, 2009 9:25 AM by mpastorino

    Problems in WebSphere cluster environment

    mpastorino

      We are having problems with JBoss Cache 1.4.1.SP9 and JGroups 2.4.1 in a WebSphere 6.1 cluster environment.

      Intermittently, we get the following messages in the log:
      JChannel I org.jgroups.JChannel handleExit received an EXIT event, will leave the channel
      JChannel I org.jgroups.JChannel$CloserThread run closing the channel
      NAKACK W org.jgroups.protocols.pbcast.NAKACK handleMessage [node1] discarded message from non-member node1, my view is MergeView::[node1|30] [node1, node2], subgroups=[[node1|0] [node1], [node2|29] [node2]]

      We also get:
      NAKACK W org.jgroups.protocols.pbcast.NAKACK send [node1] discarded message as start() has not been called, message: [dst: , src: (1 headers), size = 146 bytes]

      The cluster has two nodes, referred to as node1 and node2 in this post for simplicity.

      After node1 has been logging those messages for a couple of seconds, we get:
      GMS W org.jgroups.stack.Protocol receiveDownEvent exception: QueueClosedException
      ProtocolStack E org.jgroups.stack.ProtocolStack down no down protocol available !
      ...
      UNICAST W org.jgroups.protocols.UNICAST setProperties window_size is deprecated and will be ignored
      UNICAST W org.jgroups.protocols.UNICAST setProperties min_threshold is deprecated and will be ignored
      FRAG2 I org.jgroups.protocols.FRAG2 setProperties frag_size=8192, overhead=200, new frag_size=7992
      ...
      UDP I org.jgroups.protocols.UDP createSockets sockets will use interface node1
      UDP I org.jgroups.protocols.UDP createSockets socket information:
      local_addr=node1, mcast_addr=228.2.2.2:45522, bind_addr=/node1, ttl=64
      sock: bound to node1, receive buffer size=131071, send buffer size=131071
      mcast_recv_sock: bound to node1, send buffer size=131071, receive buffer size=131071
      mcast_send_sock: bound to node1, send buffer size=131071, receive buffer size=131071
      SystemOut O
      -------------------------------------------------------
      GMS: address is node1
      -------------------------------------------------------
      TreeCache I org.jboss.cache.TreeCache viewAccepted viewAccepted(): [node1|0] [node1]
      JChannel I org.jgroups.JChannel$CloserThread run fetching the state (auto_getstate=true)
      JChannel I org.jgroups.JChannel$CloserThread run state transfer failed

      After that, node1 logs no further messages. On node2 we see no unusual messages related to this problem.

      The JGroups protocol stack in our JBoss Cache configuration file is:

      <UDP mcast_addr="228.2.2.2" mcast_port="45522"
           ip_ttl="64" ip_mcast="true"
           mcast_send_buf_size="150000" mcast_recv_buf_size="2000000"
           ucast_send_buf_size="150000" ucast_recv_buf_size="2000000"
           loopback="false"/>
      <PING timeout="2000" num_initial_members="2"
            up_thread="false" down_thread="false"/>
      <MERGE2 min_interval="10000" max_interval="20000"/>
      <FD timeout="10000" shun="true" up_thread="true" down_thread="true"/>
      <VERIFY_SUSPECT timeout="1500"
                      up_thread="false" down_thread="false"/>
      <pbcast.NAKACK gc_lag="50" retransmit_timeout="600,1200,2400,4800"
                     up_thread="false" down_thread="false" discard_delivered_msgs="true"/>
      <pbcast.STABLE desired_avg_gossip="20000"
                     up_thread="false" down_thread="false" max_bytes="250000"/>
      <UNICAST timeout="600,1200,2400" window_size="100" min_threshold="10"
               down_thread="false"/>
      <FC max_credits="500000" down_thread="false" min_threshold="0.25"/>
      <FRAG2 frag_size="8192"
             down_thread="false" up_thread="false"/>
      <pbcast.GMS join_timeout="5000" join_retry_timeout="2000"
                  shun="true" print_local_addr="true"/>
      <pbcast.STATE_TRANSFER up_thread="false" down_thread="false"/>
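
      One detail that may or may not be related: the socket information logged above reports send and receive buffers of 131071 bytes, while the configuration asks for 150000 and 2000000. That suggests the operating system may be capping the socket buffers. Below is a minimal Java sketch for checking what the OS actually grants on a node, using only java.net. The class and method names are made up for illustration and are not part of JGroups or JBoss Cache.

```java
import java.net.DatagramSocket;
import java.net.SocketException;

// Hypothetical helper (not part of JGroups): asks the OS for UDP buffer sizes
// and reports what was actually granted.
final class BufferSizeCheck {

    // Requests a receive buffer of the given size on a throwaway UDP socket
    // and returns the size the OS reports afterwards.
    static int grantedReceiveBuffer(int requested) throws SocketException {
        DatagramSocket sock = new DatagramSocket(); // binds to any free local port
        try {
            sock.setReceiveBufferSize(requested);
            return sock.getReceiveBufferSize();
        } finally {
            sock.close();
        }
    }

    // Same check for the send buffer.
    static int grantedSendBuffer(int requested) throws SocketException {
        DatagramSocket sock = new DatagramSocket();
        try {
            sock.setSendBufferSize(requested);
            return sock.getSendBufferSize();
        } finally {
            sock.close();
        }
    }

    public static void main(String[] args) throws SocketException {
        int requestedRecv = 2000000; // ucast_recv_buf_size / mcast_recv_buf_size in the config above
        int requestedSend = 150000;  // ucast_send_buf_size / mcast_send_buf_size in the config above

        System.out.println("receive buffer: requested=" + requestedRecv
                + ", granted=" + grantedReceiveBuffer(requestedRecv));
        System.out.println("send buffer: requested=" + requestedSend
                + ", granted=" + grantedSendBuffer(requestedSend));
    }
}
```

      If the granted values come back well below the requested ones, the OS-level UDP buffer limits are probably overriding the JGroups settings. Whether that has anything to do with node1 being excluded from the group is an open question.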


      I would appreciate any help. Thanks.