5 Replies Latest reply on Apr 13, 2006 9:06 AM by ratang2000

    JBoss Cache cluster formation problem on Linux boxes

    ratang2000

      We have two CentOS 4.0 Linux boxes running JBoss 3.2.7 with JBoss TreeCache AOP (JBoss Cache 1.2.0).
      When we start both boxes, they do not seem to be able to communicate with each other and form a cluster.
      The same setup works perfectly on Windows machines.
      Each box has two network interfaces, and we want the cluster communication to happen on the internal interface.
      We have tried opening the ports in the firewall that we believe are used for this communication.
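
      Since each box has two interfaces, one thing worth confirming outside JBoss is which addresses the JVM sees on each interface and whether that interface reports multicast support. The following is only a rough sketch: the class name InterfaceCheck is made up for illustration, and it needs Java 7 or later, so it would be run standalone rather than inside the JBoss 3.2.7 JVM.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of JBoss or JGroups.
final class InterfaceCheck {

    /** Returns one line per address on every interface that is up, noting multicast support. */
    static List<String> describeInterfaces() {
        List<String> lines = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return lines; // no interfaces visible to the JVM
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp()) {
                    continue; // skip interfaces that are down
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    lines.add(nic.getName() + " " + addr.getHostAddress()
                            + " multicast=" + nic.supportsMulticast());
                }
            }
        } catch (SocketException e) {
            throw new IllegalStateException("could not query network interfaces", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeInterfaces()) {
            System.out.println(line);
        }
    }
}
```

      On our boxes we would expect the internal addresses (192.168.1.1 and 192.168.1.3) to appear on an interface that reports multicast=true. If they do not, the bind_addr setting alone cannot make the cluster form.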

      These are the startup messages we see on the boxes.

      BOX1:
      {
      11:13:42,613 INFO [TreeCache] setting cluster properties from xml to: UDP(bind_addr=192.168.1.1;bind_port=9080;ip_mcast=true;ip_ttl=64;loopback=true;mcast_addr=228.1.2.3;mcast_port=48866;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=2;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
      11:13:42,615 WARN [TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
      11:13:42,635 INFO [TreeCache] interceptor chain is:
      class org.jboss.cache.interceptors.CallInterceptor
      class org.jboss.cache.interceptors.PessimisticLockInterceptor
      class org.jboss.cache.interceptors.UnlockInterceptor
      class org.jboss.cache.interceptors.ReplicationInterceptor
      11:13:42,636 INFO [TreeCache] cache mode is REPL_ASYNC
      11:13:42,801 INFO [STDOUT] ************ Email Task Completed *************
      11:13:42,946 INFO [UDP] unicast sockets will use interface 192.168.1.1
      11:13:42,957 INFO [UDP] socket information:
      local_addr=192.168.1.1:9080, mcast_addr=228.1.2.3:48866, bind_addr=/192.168.1.1, ttl=64
      sock: bound to 192.168.1.1:9080, receive buffer size=80000, send buffer size=131071
      mcast_recv_sock: bound to 192.168.1.1:48866, send buffer size=131071, receive buffer size=80000
      mcast_send_sock: bound to 192.168.1.1:33022, send buffer size=131071, receive buffer size=80000
      11:13:42,961 INFO [STDOUT]
      -------------------------------------------------------
      GMS: address is 192.168.1.1:9080
      -------------------------------------------------------
      11:13:44,989 INFO [TreeCache] my local address is 192.168.1.1:9080
      11:13:44,990 INFO [TreeCache] viewAccepted(): [192.168.1.1:9080|0] [192.168.1.1:9080]
      11:13:45,008 INFO [TreeCache] state could not be retrieved (must be first member in group)

      }

      BOX2:
      {
      1:21:18,544 INFO [TreeCache] setting cluster properties from xml to: UDP(bind_addr=192.168.1.3;bind_port=9081;ip_mcast=true;ip_ttl=64;loopback=true;mcast_addr=228.1.2.3;mcast_port=48866;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=2;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
      11:21:18,547 WARN [TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
      11:21:18,573 INFO [TreeCache] interceptor chain is:
      class org.jboss.cache.interceptors.CallInterceptor
      class org.jboss.cache.interceptors.PessimisticLockInterceptor
      class org.jboss.cache.interceptors.UnlockInterceptor
      class org.jboss.cache.interceptors.ReplicationInterceptor
      11:21:18,573 INFO [TreeCache] cache mode is REPL_ASYNC
      11:21:18,881 INFO [STDOUT] ************ Email Task Completed *************
      11:21:18,953 INFO [UDP] unicast sockets will use interface 192.168.1.3
      11:21:18,960 INFO [UDP] socket information:
      local_addr=192.168.1.3:9081, mcast_addr=228.1.2.3:48866, bind_addr=/192.168.1.3, ttl=64
      sock: bound to 192.168.1.3:9081, receive buffer size=80000, send buffer size=131071
      mcast_recv_sock: bound to 192.168.1.3:48866, send buffer size=131071, receive buffer size=80000
      mcast_send_sock: bound to 192.168.1.3:32836, send buffer size=131071, receive buffer size=80000
      11:21:18,962 INFO [STDOUT]
      -------------------------------------------------------
      GMS: address is 192.168.1.3:9081
      -------------------------------------------------------
      11:21:20,993 INFO [TreeCache] my local address is 192.168.1.3:9081
      11:21:20,997 INFO [TreeCache] viewAccepted(): [192.168.1.3:9081|0] [192.168.1.3:9081]
      11:21:20,998 INFO [TreeCache] state could not be retrieved (must be first
      }
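
      To compare the transport settings of the two boxes without reading the long property strings by eye, a small parser along these lines could help. It is a sketch only: StackProps and udpProperties are made-up names, the parsing assumes the PROTOCOL(key=value;key=value) layout seen in the log lines above, and the two sample strings are shortened copies of the UDP part of each log.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of JBoss Cache or JGroups.
final class StackProps {

    /** Extracts the key=value pairs of the UDP(...) element from a stack string. */
    static Map<String, String> udpProperties(String stack) {
        Map<String, String> props = new LinkedHashMap<>();
        int start = stack.indexOf("UDP(");
        if (start < 0) {
            return props; // no UDP element present
        }
        int end = stack.indexOf(')', start);
        if (end < 0) {
            return props; // malformed string, nothing to parse
        }
        for (String pair : stack.substring(start + 4, end).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                props.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Shortened copies of the UDP element from the BOX1 and BOX2 logs.
        String box1 = "UDP(bind_addr=192.168.1.1;bind_port=9080;mcast_addr=228.1.2.3;mcast_port=48866)";
        String box2 = "UDP(bind_addr=192.168.1.3;bind_port=9081;mcast_addr=228.1.2.3;mcast_port=48866)";
        Map<String, String> a = udpProperties(box1);
        Map<String, String> b = udpProperties(box2);
        for (String key : a.keySet()) {
            boolean same = a.get(key).equals(b.get(key));
            System.out.println(key + ": " + a.get(key) + " vs " + b.get(key)
                    + (same ? "" : "  <-- differs"));
        }
    }
}
```

      For the two logs above, only bind_addr and bind_port differ, while mcast_addr and mcast_port match, so both boxes are at least configured to use the same multicast group.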

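      As you can see, both boxes report "state could not be retrieved (must be first member in group)", and each viewAccepted() line lists only the box's own address, so each box forms its own single-member cluster.

      I have opened ports 9080, 9081, and 48866 on both machines.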

      My questions are:
      1) I have set mcast_addr to 228.1.2.3. Do I need to change this to something else?
      2) I noticed "mcast_send_sock: bound to 192.168.1.3:32836". I haven't opened port 32836, but I think it is randomly assigned. Do I need to open a range of ports?
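
      Both questions can be partly checked with plain java.net calls. The sketch below (PortAndGroupCheck is a made-up name, and Java 7 or later is assumed) tests whether an address literal is in the multicast range and shows what port the OS hands out when a UDP socket is bound to port 0, which is presumably how the mcast_send_sock ports 33022 and 32836 were chosen. It does not show whether multicast packets actually pass between the two boxes.

```java
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical helper, not part of JBoss Cache or JGroups.
final class PortAndGroupCheck {

    /** True if the given IP literal is a multicast address (224.0.0.0 to 239.255.255.255 for IPv4). */
    static boolean isMulticast(String literal) {
        try {
            return InetAddress.getByName(literal).isMulticastAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + literal, e);
        }
    }

    /** Binds a UDP socket to port 0 and returns the port the OS assigned to it. */
    static int ephemeralPort() {
        try (DatagramSocket sock = new DatagramSocket(0)) {
            return sock.getLocalPort();
        } catch (SocketException e) {
            throw new IllegalStateException("could not bind a UDP socket", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("228.1.2.3 multicast? " + isMulticast("228.1.2.3"));
        System.out.println("OS-assigned UDP port: " + ephemeralPort());
    }
}
```

      On Linux, the range these OS-assigned ports come from can be read from /proc/sys/net/ipv4/ip_local_port_range, which may help when deciding which range to open in the firewall.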

      Have I missed something? Please let me know as soon as possible. Thanks!