0 Replies Latest reply on Oct 30, 2006 5:06 PM by peso60

    How to debug multicast hang?

    peso60

      I am running Oracle Application Server 10.1.3.0, which apparently uses JGroups. All of a sudden, on my Red Hat machine (Red Hat Enterprise Linux AS release 3 (Taroon Update 6)), the process starts but doesn't respond to any requests.

      There were no changes to the configuration, the code, or the machine. I tried rebooting and made sure no other Java processes were running.

      The last output before it stops responding is this:

      2006-10-30 11:14:22.726 WARNING option GET_STATE_EVENTS has been deprecated (it is always true now); this option is ignored
      Oct 30, 2006 11:14:22 AM org.jgroups.protocols.pbcast.NAKACK handleConfigEvent
      INFO: max_xmit_size=64000
      Oct 30, 2006 11:14:22 AM org.jgroups.protocols.UDP createSockets
      INFO: sockets will use interface 140.87.10.48
      Oct 30, 2006 11:14:22 AM org.jgroups.protocols.UDP createSockets
      INFO: socket information:
      local_addr=140.87.10.48:1045, mcast_addr=234.5.5.5:24667, bind_addr=/140.87.10.48, ttl=32
      sock: bound to 140.87.10.48:1045, receive buffer size=524288, send buffer size=524288
      mcast_recv_sock: bound to 140.87.10.48:24667, send buffer size=524288, receive buffer size=524288
      mcast_send_sock: bound to 140.87.10.48:1046, send buffer size=524288, receive buffer size=524288


      'ps' shows that the java process that was supposed to start is in fact running, and the main HTTP listening port (8888) is shown as LISTEN by netstat and lsof; lsof output:

      lsof|grep 8888
      java 12642 opeschan 35u IPv4 9721891 TCP *:8888 (LISTEN)

      But the port is not responding. A plain 'kill' does not kill the java process, but 'kill -5' or 'kill -9' does.
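
      To separate "the port is open" from "the server is actually answering", a small probe along these lines might help. It is only a sketch: the function name `probe`, the result labels, and the timeout are illustrative assumptions, not anything from Oracle AS or JGroups. It relies on the fact that the kernel completes the TCP handshake for a listening socket even if the application never accepts the connection, so a hung process would typically show up as "no-response" rather than "refused".

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP port by how far a simple HTTP request gets."""
    try:
        # The timeout applies to the connect and to later send/recv calls.
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"        # nothing is listening on the port
    except OSError:
        return "unreachable"    # connect timed out or failed in another way

    with sock:
        try:
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            data = sock.recv(1)
        except socket.timeout:
            return "no-response"  # connection accepted by the kernel, no reply
        except OSError:
            return "reset"        # connection dropped by the peer
        return "responded" if data else "closed"
```

      For the case above, the call would be `probe("localhost", 8888)`.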

      Where do I even start looking for the cause? As mentioned, rebooting didn't help.

      THAAANKS!