4 Replies Latest reply on Oct 17, 2019 4:23 PM by Gareth Wilson

    Possible file handle leak in WildFly 12+

    Gareth Wilson Newbie

      Hi All,

       

      I am trying to identify if behaviour that I am observing is correct or if WildFly is leaking file handle descriptors.

       

      During our standard performance testing after upgrading from WildFly 11 to 14 we ran into an issue regarding too many open files. After digging into things a bit more it looks like it is actually the number of pipes that WildFly has open that is increasing.

       

      To help reproduce the problem I have created a simple JSF 2.2 application that contains a large image (100mb to simplify testing). I can make this available if required.

       

      I am retrieving the image using the standard JSF resource URL: /contextroot/javax.faces.resource/css/images/big-image.png.xhtml

      And have also tried adding omnifaces and using the unmapped resource handler URL: /contextroot/javax.faces.resource/css/images/big-image.png

       

      Adding Omnifaces did not change the behaviour I am seeing, and I have only included it as we first thought that it might have been a contributing factor.

       

       

      Behaviour I am seeing

      • WildFly starts and jstack reports that it has two threads matching default task-* which is the default value for task-core-threads
      • If I send in 5 concurrent requests for my large image, 3 new default task-* threads are spawned to serve the requests. 3 new Linux pipes will also be created. 
      • If I stop my requests and wait for 2 minutes (the default value for task-keepalive) 3 of the threads will be removed. The pipes remain open.
      • Periodically - I believe about every 4.5 minutes some kind of clean up occurs and the pipes that were left over from the step above are removed.
      • However... If one of the original 2 worker threads is removed, e.g. task-1, task-3 and task-4 are removed, leaving task-2 and task-5, the pipe associated with task-1 is never cleaned up.

       

      Over time these pipes add up and as far as I can tell they are never removed. Is this a leak somewhere, and if so where? JSF? WildFly? Undertow?

       

       

      Things I have tried

      • I have used this tool to try and identify the file leaks: http://file-leak-detector.kohsuke.org/
      • WildFly 14, 17 and 18 - All exhibit the same behaviour
      • With and without Omnifaces (2.7 and 3.3) - Changes the stack trace from the file leak tool, but effectively the same behaviour
      • Changing the min and max threads to be the same - this prevents handles building up, but I'd rather not go down this route

       

       

      Additional Information

       

      I am running WildFly on CentOS 7.7, we have seen the same behaviour in RHEL 7.7.

      Java version is openjdk version "1.8.0_222"

       

      I have attached

      • lsof output of a freshly restarted WildFly 18 with my sample application (no omnifaces) deployed
      • lsof output after a few cycles of 5 concurrent requests pause for minutes, 5 concurrent requests. I waited 5 minutes after the test to let any clean up happen.
      • A sample stacktrace from the file-leak-detector tool
      • jstack output showing which threads are still active after the cycles of requests

       

      Thanks!

        • 1. Re: Possible file handle leak in WildFly 12+
          Jaikiran Pai Master

          Hello Gareth,

           

          Welcome to the forums.

           

          I think it's better to repost that question here. Additionally please also include the output of "lsof" command against this process to show us which file handle seem to be leaking.

          • 2. Re: Possible file handle leak in WildFly 12+
            Jaikiran Pai Master

            Also, it will be very convenient if you can test and get the results (of lsof) against WildFly 18, since that's the latest released version and makes it easier for us to debug and compare the code against.

            • 3. Re: Possible file handle leak in WildFly 12+
              Scott Marlow Master

              It sounds like your seeing some resources being used and possibly kept open for reuse (e.g. socket is kept open), which is typical + expected behavior.  If you put a constant load on the server, for a long period of time and get out of memory or out of file handle failures, that would likely be a leak.

               

              I see nothing interesting in the after.txt, it is easier to read that if you sort on the name column.  I am attaching the after.txt sorted with just the name column (because that was easy to do with "vim" + "sort").  I was looking to see if any particular jars are loaded repeatedly. I do see that most jars are mentioned twice, which doesn't seem likely to indicate a leak. 

               

              I don't see much difference between fresh.txt + after.txt.

               

              I also don't see much of interest in threads.txt or stack-trace.txt.

               

              Scott

              • 4. Re: Possible file handle leak in WildFly 12+
                Gareth Wilson Newbie

                Hi Scott,

                 

                I initially thought that it was caching of resources, and that maybe with the new threading implementation that was added in WildFly 12 we would just need to tune our ulimit (currently set to 4096 files for the user that runs WildFly).

                 

                I agree with you that the open files all look pretty normal, however I think you might have missed my pointer towards it being the number of pipes that are growing.

                 

                The before file shows 14 open pipes (2 handles per pipe)

                cat fresh.txt | grep pipe | wc -l

                28

                The after file shows 19 open pipes

                cat after.txt | grep pipe | wc -l

                38

                 

                That was after only about 15 minutes of me sending load at my sample application.

                When we run our regression suite (A selenium grid based setup with 20 runner threads) we see an increase of pipes at a rate of about 2 or 3 a minute over the one hour duration of the test.

                 

                Cheers,

                Gareth