10 Replies Latest reply on Jun 24, 2012 9:10 AM by belaban

    Suspicious multicast flood by one of the App servers

    dpetrov

      Good day everyone!

       

      This is my first post here and I really hope it won't violate any rules of the community.

       

      I am a newbie with JBoss, and to be honest not familiar with it at all. I am part of the network engineering team and, together with colleagues from the development department, we are facing a very serious issue. We did some research, gathered quite a lot of data using Wireshark, and found a pattern that could be the root cause of the problem. First I'll try to explain the topology:

       

      We have 2 web servers and 4 app servers. They are all connected to three network switches. JGroups is configured with UDP and multicast, and we use the group 228.1.2.13 for that purpose. What we observed at the time of the crashes is that one of the application servers floods the network with traffic destined to the JGroups group (228.1.2.13) at a rate of more than 100 Mbps (please see the attached pictures).

       

      Attachments: 1_tshark_LIVE_00001_20120614084729.cap.jpg, 2.jpg, 3.jpg, 4.jpg

      As soon as this flood appears, some of the application servers crash and all the users attached to them lose connectivity. Has anyone else experienced something like that? What could cause an app server to flood the group with so much data at such a rate?
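
One way to put numbers on a flood like this is to sum the bytes each sender puts onto the multicast group and divide by the time span. The following is a minimal sketch, assuming the capture has been exported as (timestamp, source, destination, length) tuples, for example with tshark's `-T fields -e frame.time_relative -e ip.src -e ip.dst -e frame.len`. The function name and the sample packets are hypothetical and made up for illustration; they are not taken from the attached captures.

```python
from collections import defaultdict


def rate_per_source_mbps(packets, group):
    """Return {source_ip: average Mbps} for traffic destined to `group`.

    `packets` is an iterable of
    (timestamp_seconds, src_ip, dst_ip, length_bytes) tuples.
    """
    total_bytes = defaultdict(int)
    first_seen = {}
    last_seen = {}
    for ts, src, dst, length in packets:
        if dst != group:
            continue  # ignore anything not sent to the multicast group
        total_bytes[src] += length
        first_seen[src] = min(first_seen.get(src, ts), ts)
        last_seen[src] = max(last_seen.get(src, ts), ts)

    rates = {}
    for src, nbytes in total_bytes.items():
        duration = last_seen[src] - first_seen[src]
        if duration <= 0:
            continue  # a single packet gives no meaningful rate
        rates[src] = nbytes * 8 / duration / 1_000_000
    return rates


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        (0.0, "10.21.1.57", "228.1.2.13", 1500),
        (0.5, "10.21.1.57", "228.1.2.13", 1500),
        (1.0, "10.21.1.57", "228.1.2.13", 1500),
        (0.2, "10.21.1.54", "10.21.1.55", 400),  # unicast, ignored
    ]
    print(rate_per_source_mbps(sample, "228.1.2.13"))
```

The result is only an average over each sender's first and last packet, so a short burst inside a long capture will look smaller than it was; slicing the capture into time windows first gives a better picture of peaks.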

       

      Any thoughts are more than welcome!

        • 1. Re: Suspicious multicast flood by one of the App servers
          belaban

          Which JGroups release do you have ? I hope a more recent one...

           

          You mentioned the 4 appservers are connected to 3 switches ? Did you mean that each instance has 3 NICs, each connecting to a different switch ?

           

          If that's the case, make sure you don't have multicast loops in your network, where switches forward IP multicast traffic to each other until the TTL is decremented to 0.
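
For context on the TTL mentioned above: the sender chooses the TTL of its multicast packets, and each routed hop decrements it, so a small value limits how far (and how long) a packet can circulate. The sketch below shows the idea with a plain Python UDP socket; it is not JGroups configuration, and the function name is hypothetical. JGroups' UDP protocol has a comparable `ip_ttl` setting, whose value in this deployment is not known from the thread.

```python
import socket
import struct


def make_multicast_sender(ttl=1):
    """Create a UDP socket whose outgoing multicast packets carry `ttl`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # IP_MULTICAST_TTL applies only to multicast destinations; a one-byte
    # value is accepted on the common platforms.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("B", ttl))
    return sock


if __name__ == "__main__":
    sender = make_multicast_sender(ttl=2)
    print(sender.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL))
    sender.close()
```

A TTL of 1 keeps packets on the local subnet, which is usually enough when all cluster members share one network segment.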

           

          Can you reproduce this ? If so, can you connect all 4 hosts directly to 1 switch only, to see if this happens again ?

          • 2. Re: Suspicious multicast flood by one of the App servers
            dpetrov

            Thank you Bela for this fast response!

             

            Sorry for misleading you with the information above. Actually, they are all connected to the same switch, but each of them also has an interface to the other switches (in case of a network device failure). It looks something like this:

             

                   Switch1
                 /         \
            APP1---Switch2---APP2

             

            So the operational interfaces of all app servers are located on the same switch. There is no looped connection between the switches, so it's not a loop issue (I can confirm that from physically checking the cables and from the Wireshark logs: no loops there).


            Sincerely,

            Dani

            • 3. Re: Suspicious multicast flood by one of the App servers
              belaban

              Can you answer my other questions too ? :-)

               

              This is one of 2 mcast flooding reports I've ever gotten in 10+ years, so it must be something in your system. The other report was a multicast cycle...

               

              You could try enabling tracing to see what's going on when this happens. I'd like to see who's sending what messages.

               

              What's your config btw ?

              • 4. Re: Suspicious multicast flood by one of the App servers
                dpetrov

                Hello again,

                Unfortunately, we are unable to reproduce the issue, and we don't know what is causing it. All 4 servers are currently connected to only one switch.

                What do you mean by "you could try enabling tracing"? Is this a JBoss configuration setting? The messages are sent only from the 4th app server, with IP 10.21.1.57. It's UDP traffic destined to the multicast group 228.1.2.13.

                Do you need a specific part of the configuration? I didn't really understand that question either.

                 

                Thank you so much, I really appreciate your time and effort!

                • 5. Re: Suspicious multicast flood by one of the App servers
                  belaban
                  1. Which JGroups release do you have ? Or which version of JBoss ?
                  2. Config: I need the JGroups configuration. If you run JBoss, are you using JBoss AS7 ? Then I need standalone-ha.xml (if you use that one)
                  3. Tracing: you can enable tracing for org.jgroups in log4j.xml. That's going to generate lots of log messages....
                  • 6. Re: Suspicious multicast flood by one of the App servers
                    dpetrov

                    I will try to gather this information from my colleagues, who are dealing with JBoss itself.


                    However, I just realized that the other application servers also generate such burst floods from time to time, again resulting in a crash. I am really wondering what could cause this huge amount of traffic to be sent at 130 Mbps. Bela, have you ever tried to observe the traffic passed to the multicast group? I can't actually understand what type of traffic should be distributed via multicast, when, and why. I thought that this mcast group was used only as a heartbeat, not to share any data with the rest of the servers, but it looks like I am wrong. Now I am starting to wonder whether this is the cause or the result of the crashes. We would probably find the answer in the trace logs you requested.

                     

                    BR,

                    Dani

                    • 7. Re: Suspicious multicast flood by one of the App servers
                      dpetrov

                      Alright, here are the answers:

                       

                      1. The version of JBoss is 4.2.2

                      2. Unfortunately there is no such file for this version

                      3. I am still waiting for my colleagues to tell me whether it is possible to enable the traces

                       

                      Best wishes,

                      Dani

                      • 8. Re: Suspicious multicast flood by one of the App servers
                        belaban

                        I suggest using a more recent JBoss release, e.g. JBoss AS7 or EAP6. I don't want to spend time on an issue, only to find out it was fixed in a later release...

                        • 9. Re: Suspicious multicast flood by one of the App servers
                          dpetrov

                          I see...

                           

                          Sorry to hear that. However, could you share your thoughts on how hard it is to upgrade from 4.2.2 to 6.x or 7.x? Could that cause any serious issues with the code that has already been written?

                           

                          Sincerely,

                          Dani

                          • 10. Re: Suspicious multicast flood by one of the App servers
                            belaban

                            It depends on the apps you're running. If it's a simple webapp with session replication, a migration should be easy. If there are beans involved, it might be a bit more complicated.

                             

                            Before you do that, you could try upgrading to JGroups 2.12.x, but I'm not sure it'll work with JBoss 4.x. It would only take you 5 minutes to find out, so perhaps that's the better avenue to choose at this time.