3 Replies Latest reply on Jun 15, 2005 6:11 AM by sinus

    Problem with AutoDiscovery in JBoss-4.0.2 (on RH EL 3.0)

    sinus

      Hello,

      We have two separate clusters running, and a client in the first cluster needs to look up a bean through HA-JNDI in the second one.
      After an upgrade from 4.0.0 to 4.0.2, the AutoDiscovery feature no longer works.

      JBoss 4.0.0 starts up with this message:
      2005-06-06 09:51:31,359 INFO [org.jboss.ha.jndi.DetachedHANamingService$AutomaticDiscovery] Listening on /192.168.0.1:1102, group=230.0.0.4, HA-JNDI address=192.168.0.1:1100

      netstat -a -n gives:
      udp 0 0 0.0.0.0:1102 0.0.0.0:*

      and everything is fine.

      JBoss 4.0.2 starts up with this message:
      2005-06-06 09:25:23,868 INFO [org.jboss.ha.jndi.DetachedHANamingService$AutomaticDiscovery] Listening on /0.0.0.0:1102, group=230.0.0.4, HA-JNDI address=192.168.0.1:1100

      netstat -a -n gives:
      udp 0 0 192.168.0.1:1102 0.0.0.0:*

      The (new) attribute AutoDiscoveryBindAddress in cluster-service.xml is set to:

      <attribute name="AutoDiscoveryBindAddress">${jboss.bind.address}</attribute>

      Changing this attribute has no impact.
      A network trace shows the client sending a UDP packet "GET_ADDRESS:TestPartition" to 230.0.0.4:1102, but no server answers.
      Clustering is working fine. OS is RH EL 3.0.
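
      To reproduce the trace without a full client, the request can be sent by hand. The following is only a minimal sketch: the payload, group, and port come from the trace above, while the class name, the method names, and the assumption that a server replies by unicast to the sender's address and port are illustrative and not taken from the JBoss source. The reply format is not described in this thread, so the sketch simply returns the raw text.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for testing; not part of JBoss.
class DiscoveryProbe {
    // Group address and port as seen in the network trace.
    static final String GROUP = "230.0.0.4";
    static final int PORT = 1102;

    /** Builds the request payload from the trace, e.g. "GET_ADDRESS:TestPartition". */
    static byte[] discoveryPayload(String partitionName) {
        return ("GET_ADDRESS:" + partitionName).getBytes(StandardCharsets.US_ASCII);
    }

    /**
     * Sends one discovery request and waits for a reply.
     * Returns the raw reply text, or null if nothing arrives before the timeout.
     */
    static String probe(String partitionName, int timeoutMillis) throws IOException {
        byte[] payload = discoveryPayload(partitionName);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis);
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(GROUP), PORT));

            byte[] buf = new byte[256];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(reply);
            } catch (SocketTimeoutException e) {
                return null; // no server answered in time
            }
            return new String(reply.getData(), 0, reply.getLength(),
                    StandardCharsets.US_ASCII);
        }
    }
}
```

      Calling DiscoveryProbe.probe("TestPartition", 5000) from a machine in the client cluster should return null for as long as the problem described here persists.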

      Any hints are really appreciated.

      Thanks,
      Mathias.

        • 1. Re: Problem with AutoDiscovery in JBoss-4.0.2 (on RH EL 3.0)
          sinus

          This problem seems to correspond to bug report JBAS-1843:
          http://jira.jboss.com/jira/browse/JBAS-1843

          Mathias

          • 2. Re: Problem with AutoDiscovery in JBoss-4.0.2 (on RH EL 3.0)
            starksm64

            Yes. What does the multicast routing table look like for the 192.168.0.1 interface?

            • 3. Re: Problem with AutoDiscovery in JBoss-4.0.2 (on RH EL 3.0)
              sinus

              Multicasting was OK, and the server responded to a "ping 230.0.0.4".
              The problem persisted on SuSE Linux 9.1 kernel 2.6.4-52-default, but not on WinXP.

              I looked at the source of org.jboss.ha.jndi.DetachedHANamingService (inner class AutomaticDiscovery, method start()) and found two issues.
              The original code:

              stopping = false;
              // Use the jndi bind address if there is no discovery address
              if (discoveryBindAddress != null)
                 discoveryBindAddress = bindAddress;
              InetSocketAddress bindAddr = new InetSocketAddress(discoveryBindAddress, adGroupPort);
              socket = new MulticastSocket(bindAddr);
              socket.setTimeToLive(autoDiscoveryTTL);
              group = InetAddress.getByName(adGroupAddress);
              socket.joinGroup(group);
              


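              First, the null test is inverted: discoveryBindAddress should be compared with == null. As written, a configured discovery address is always overwritten with bindAddress and a missing one stays null, which would explain why changing AutoDiscoveryBindAddress has no effect.
              Second, there seems to be a problem with MulticastSocket when it is constructed with a socket address (the InetSocketAddress above) rather than a plain port. I changed the code to construct it with the port number only and then set the interface in a second call.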
              Changed code:

              stopping = false;
              // Use the jndi bind address if there is no discovery address
              if (discoveryBindAddress == null)
                 discoveryBindAddress = bindAddress;
              socket = new MulticastSocket(adGroupPort);
              socket.setInterface(discoveryBindAddress);
              socket.setTimeToLive(autoDiscoveryTTL);
              group = InetAddress.getByName(adGroupAddress);
              socket.joinGroup(group);
              


              This code works on the Linux server. I did not have time to test other platforms. Please check this out.
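
              For anyone who wants to try the two-step setup outside JBoss, here is a minimal standalone sketch. It mirrors the changed code (fall back to the JNDI bind address only when no discovery address is set, bind to the port alone, then select the interface), but the class name, method names, and parameters are made up for illustration and are not the actual JBoss code. On newer JDKs setInterface(InetAddress) and joinGroup(InetAddress) are deprecated in favour of the NetworkInterface-based variants; they are used here only to stay close to the snippet above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.MulticastSocket;

// Hypothetical standalone version of the changed start() logic; not JBoss code.
class DiscoverySocketSketch {

    /** Uses the JNDI bind address only if no discovery address was configured. */
    static InetAddress chooseDiscoveryAddress(InetAddress discoveryBindAddress,
                                              InetAddress bindAddress) {
        return (discoveryBindAddress == null) ? bindAddress : discoveryBindAddress;
    }

    /**
     * Opens the discovery socket in two steps: bind to the port on the
     * wildcard address first, then select the multicast interface.
     */
    static MulticastSocket openDiscoverySocket(InetAddress discoveryBindAddress,
                                               InetAddress bindAddress,
                                               int adGroupPort,
                                               String adGroupAddress,
                                               int autoDiscoveryTTL) throws IOException {
        InetAddress iface = chooseDiscoveryAddress(discoveryBindAddress, bindAddress);

        MulticastSocket socket = new MulticastSocket(adGroupPort);
        if (iface != null) {
            // Interface used for outgoing multicast packets.
            socket.setInterface(iface);
        }
        socket.setTimeToLive(autoDiscoveryTTL);
        socket.joinGroup(InetAddress.getByName(adGroupAddress));
        return socket;
    }
}
```

              With the values from this thread, the call would be openDiscoverySocket(null, InetAddress.getByName("192.168.0.1"), 1102, "230.0.0.4", ttl), where ttl stands for whatever AutoDiscoveryTTL is configured.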

              Mathias