7 Replies Latest reply on Nov 7, 2007 7:54 AM by akostadinov

    JBoss server mcast socket bound to network interface

    akostadinov

      Hello,

      We currently have an issue running the JBoss AS 5 test suite on RHEL 4U5 and above with one node bound to localhost. Please see http://www.jboss.com/index.html?module=bb&op=viewtopic&t=123056 for full details.

      Generally, on RHEL 4U5 and above, when a multicast listener binds to a network interface and a sender is bound to the same interface, the listener can't see any messages unless the multicast route goes through that same interface. (This is not the case with RHEL 4U4, where the listener does see the messages.)

      I think that because of this, a server bound to localhost can't see the mcast messages it sends itself. As a result, AS 5 starts very slowly.

      So what I suggest is setting IP_MULTICAST_LOOP on the JGroups mcast socket, which will hopefully fix the issue.
      In addition, I suggest not binding the mcast socket to the interface the server was told to bind to with the "-b" option. That way the listener will see mcast messages no matter what multicast route the host has.

      That is better behavior, and it also makes test suite runs much more convenient and less error-prone. For example, when a network admin reroutes the network, he will expect the running AS server to keep sending and receiving multicast messages properly without needing special parameters.
      I also don't see any disadvantages in not specifying a network interface to bind to. If there are rare cases where it matters, the user can specify a startup option to fix that. And the behavior will not be much different from that on RHEL 4U4 and below.

      Please let me know what you think about that.

        • 1. Re: JBoss server mcast socket bound to network interface
          belaban

          Note that multicast loopback is not changed by JGroups - by default it is *enabled* in the OS, so setting IP_MULTICAST_LOOP is superfluous.
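          For anyone who wants to verify the default, here is a minimal sketch using only the plain java.net API (not JGroups code). The class and method names are made up for illustration. Note that getLoopbackMode() returns true when loopback is *disabled*, and that it is deprecated in newer JDKs in favor of getOption(StandardSocketOptions.IP_MULTICAST_LOOP).

```java
import java.io.IOException;
import java.net.MulticastSocket;

// Hypothetical helper, not part of JGroups or JBoss AS.
class McastLoopbackCheck {

    // Returns true if multicast loopback is enabled on a freshly created socket,
    // i.e. nothing has touched IP_MULTICAST_LOOP yet.
    @SuppressWarnings("deprecation") // getLoopbackMode() is deprecated in newer JDKs
    static boolean loopbackEnabledByDefault() throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            // getLoopbackMode() reports whether loopback is DISABLED, hence the negation.
            return !socket.getLoopbackMode();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Multicast loopback enabled by default: "
                + loopbackEnabledByDefault());
    }
}
```

          If this reports that loopback is enabled, setting the option explicitly should change nothing.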

          I don't understand the other argument: do you want the OS to pick the NIC, or do you want to set it yourself? If you want to ignore the interface defined by JBoss through -b, then set the system property jgroups.ignore.bind_addr to true.

          • 2. Re: JBoss server mcast socket bound to network interface
            akostadinov

             

            "bela@jboss.com" wrote:
            Note that multicast loopback is not changed by JGroups - by default it is *enabled* in the OS, so setting IP_MULTICAST_LOOP is superfluous.


            Yes, I see that now. Do you have a test to check it is working so a RHEL bugzilla can be filed?

            "bela@jboss.com" wrote:
            I don't understand the other argument: do you want the OS to pick the NIC, or do you want to set it yourself? If you want to ignore the interface defined by JBoss through -b, then set the system property jgroups.ignore.bind_addr to true.


            Yes, I mean that the network interface should not be specified, as things seem to always work that way. That makes the multicast setup less error-prone.
            Do you see any drawbacks to having that as the default for JBoss AS 5?

            thanks

            • 3. Re: JBoss server mcast socket bound to network interface
              belaban

               

              "akostadinov" wrote:


              Yes, I see that now. Do you have a test to check it is working so a RHEL bugzilla can be filed?


              I don't think this is a bug, as each member does receive its own messages. However, the *peer* members don't receive messages!



              Yes, I mean that the network interface should not be specified, as things seem to always work that way. That makes the multicast setup less error-prone.
              Do you see any drawbacks to having that as the default for JBoss AS 5?



              Not picking a NIC leads to issues in the ATL labs (at least clusterxx.qa.atl.jboss.com): you *have* to use the virtual NICs assigned to you (e.g. MYTESTIP_1), otherwise the members won't see each other.

              In addition, you also have to pick the 'right' IP multicast address, in order to send your traffic to the correct switch/router (MB, GB).

              I don't know if this changed recently. If not, maybe we should ask the IT folks to assign a *reserved* NIC for Hudson's test runs and possibly a routing entry for a specific multicast address...
              Comments?

              • 4. Re: JBoss server mcast socket bound to network interface
                brian.stansberry

                Thanks for opening this thread, Alex. Your interface binding suggestion has implications for end users, so I wanted it discussed here rather than in the more narrowly-read QA forum.

                For background on issues QA is having, see http://jira.jboss.com/jira/browse/JBAS-4939
                and
                http://www.jboss.com/index.html?module=bb&op=viewtopic&t=123056 .

                On this thread I'd just like to focus on whether having the -b switch *not* set system property jgroups.bind_address makes sense from the viewpoint of AS users, not AS testers. If a solution doesn't make sense for users, it's not right to do it; the testsuite should just find workarounds. We can sort any testsuite workarounds on the QA forum thread.

                Reasons why I don't like the idea of -b not setting system property jgroups.bind_address:

                1) All other service bindings in the AS are controlled by -b. Having an exception for JGroups is confusing.

                2) If you don't tell JGroups what address to bind to, it will bind to the first non-loopback interface it finds when iterating over NetworkInterface.getNetworkInterfaces(), so it's not clear that this will be the desired interface. Even if JGroups or the AS were changed to pick the machine's default interface, it's not certain that it would be the interface that supports multicast either.

                3) This would be a significant change in behavior from previous releases, so we would have to spend significant effort educating users/SEs/consultants, altering docs, wikis, training course materials and certification exams etc.
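                To illustrate point 2, here is a rough sketch of the "first non-loopback interface" selection using the plain java.net API. It is an approximation for discussion, not the actual JGroups code, and the class and method names are made up. The isUp() filter is my own assumption, and the order of interfaces is whatever the platform returns.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;
import java.util.Optional;

// Hypothetical illustration, not taken from JGroups.
class FirstNonLoopbackNic {

    // Returns the first interface that is up and is not the loopback interface,
    // in whatever order the platform enumerates them.
    static Optional<NetworkInterface> pick() throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return Optional.empty(); // some platforms return null when there are none
        }
        while (nics.hasMoreElements()) {
            NetworkInterface nic = nics.nextElement();
            if (nic.isUp() && !nic.isLoopback()) {
                return Optional.of(nic);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws SocketException {
        Optional<NetworkInterface> nic = pick();
        if (nic.isPresent()) {
            // The first match is not guaranteed to support multicast.
            System.out.println(nic.get().getName()
                    + " supportsMulticast=" + nic.get().supportsMulticast());
        } else {
            System.out.println("No non-loopback interface found");
        }
    }
}
```

                On a multi-homed host the result depends on enumeration order, which is why I would not rely on it as a default.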

                • 5. Re: JBoss server mcast socket bound to network interface
                  akostadinov

                  Bela, is it possible to tell JGroups not to set IP_MULTICAST_IF, so that the OS chooses the default one? I'll leave it to you to decide whether that would give a better user experience.

                  Bela, about loopback: you are right, we were misled.

                  The initial issue appears to have been the host machine needing a reboot. It is odd that the McastReceiver test misled us into thinking multicast was not working properly, since a cluster actually does form if one node is bound to localhost and one to an IP address.

                  • 6. Re: JBoss server mcast socket bound to network interface
                    belaban

                     

                    "akostadinov" wrote:
                    Bela, is it possible to tell JGroups not to set IP_MULTICAST_IF, so that the OS chooses the default one? I'll leave it to you to decide whether that would give a better user experience.


                    Well, JGroups needs the bind address, as this forms the address of a member. However, if we only used the bind_addr to determine the local_addr and used no bind_addr for the datagram socket creation, the OS would *not* pick an interface, but instead use the wildcard interface 0.0.0.0, so packets would get received over any interface if the port matches.
                    This is doable, but wouldn't help us with the issue at hand.
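                    To make the difference concrete, here is a small sketch with plain java.net sockets (not JGroups internals). The class and method names are invented for illustration, and port 0 simply asks the OS for any free port.

```java
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketException;

// Hypothetical illustration of wildcard vs. explicit binding.
class WildcardBindDemo {

    // No bind address given: the socket is bound to the wildcard address
    // (0.0.0.0 or ::), so it receives on every interface for its port.
    static DatagramSocket bindWildcard() throws SocketException {
        return new DatagramSocket(new InetSocketAddress(0));
    }

    // Explicit bind address: the socket only receives packets sent to that address.
    static DatagramSocket bindTo(InetAddress addr) throws SocketException {
        return new DatagramSocket(new InetSocketAddress(addr, 0));
    }

    public static void main(String[] args) throws SocketException {
        try (DatagramSocket any = bindWildcard();
             DatagramSocket lo = bindTo(InetAddress.getLoopbackAddress())) {
            System.out.println("wildcard bind: " + any.getLocalAddress()
                    + " isAnyLocalAddress=" + any.getLocalAddress().isAnyLocalAddress());
            System.out.println("explicit bind: " + lo.getLocalAddress());
        }
    }
}
```

                    Note that the wildcard socket still has no specific interface of its own, which is the point above: the OS does not "pick" one for you.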

                    Bela

                    • 7. Re: JBoss server mcast socket bound to network interface
                      akostadinov

                      We don't have an issue any more. See my comment on JBAS-4939 for an explanation. Thanks for your feedback, which helped sort things out.

                      I think that using 0.0.0.0 by default could make more sense for some users.