11 Replies Latest reply on Feb 4, 2010 3:28 PM by nicokiki

    Production system crash: Too many open files

    toby.tobias.hill.gmail.com

      We just experienced a severe crash this morning in our production JBoss cluster where our Seam application is deployed.


      All our nodes went down one by one with exceptions saying "Too many open files". The application itself does not explicitly open more than a few files; it is a rather vanilla Seam application with JMS-backed Hibernate Search.


      Lately we have also seen "Too many open files" from time to time in our test suite (350 Seam tests), so we went back through our automated build system to find the build/commit at which the error first appeared. The errors appear to have started after a commit containing 208 small icon images, so we reverted that commit to see whether it would change anything. Having run the test suite several times before and after the revert, we can now say with great confidence that these files do matter: without the image files there are no "Too many open files" errors, and with them the errors return. The strange thing is that these files are not even used by the application. They are served by Apache from a static content directory; they just happen to be part of the EAR as a consequence of our build process.


      Whether and how this carries over to the "Too many open files" errors we saw in production this morning is of course hard to say.


      Similar problems are reported in this thread:
      TooManyOpenFilesJeopardizingSeamApplication
      There I suggest a ulimit patch, which we have of course applied ourselves (setting the limit to 5000). It now seems that this patch may only be buying time, and that something is fundamentally wrong with how files are handled within JBoss or possibly within Seam.
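
A raised limit only helps if it is actually in effect for the server process, since a limit set in one shell does not apply to a process started elsewhere. Below is a minimal sketch for reading the values back, assuming Linux with /proc mounted. The function name show_nofile_limits is made up for illustration.

```shell
#!/bin/bash
# Print the soft and hard open-file limits of the current shell and,
# if a PID is given, the open-file limits of that running process.
show_nofile_limits() {
    echo "shell soft limit: $(ulimit -Sn)"
    echo "shell hard limit: $(ulimit -Hn)"
    if [ -n "$1" ] && [ -r "/proc/$1/limits" ]; then
        # /proc/<pid>/limits contains a line starting with "Max open files"
        sed -n '/^Max open files/p' "/proc/$1/limits"
    fi
}

# Example: inspect the current shell itself.
show_nofile_limits "$$"
```

Passing the JBoss PID instead of "$$" shows the limit the server is really running under.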


      If anyone has insights into this matter or has experienced anything related, I'd be happy to get some feedback.


      /Tobias


      Environment:   
       Seam 2.0.2.SP1
       Hibernate Search 3.0.1
       JBoss 4.2.1.GA
       Ubuntu Linux
      

        • 1. Re: Production system crash: Too many open files
          palacete

          We have also experienced the same (or a similar) problem when deploying a Seam app to our production servers.
          We are using Seam 2.0.1.GA, WebSphere 6.1.0.9 running on RH Linux.


          java.io.FileNotFoundException: spektraBackendWebNG-3.1.2.war/img/icon_start.png (Too many open files)
                  at java.io.FileInputStream.open(Native Method)


          • 2. Re: Production system crash: Too many open files
            toby.tobias.hill.gmail.com

            Some more info: when running the test suite with the image files as part of the deployment, the first IOException we get is this:


            org.jboss.deployers.spi.DeploymentException: java.io.IOException: Error listing files: /var/backups/hudson/jobs/Trunk_embedded/workspace/trunk/build/build-test/img/gmap-icons-orange-mini
               [testng]      at org.jboss.deployment.AnnotationMetaDataDeployer.deploy(AnnotationMetaDataDeployer.java:170)
               [testng]      at org.jboss.deployment.AnnotationMetaDataDeployer.deploy(AnnotationMetaDataDeployer.java:90)
               [testng]      at org.jboss.deployers.plugins.deployers.DeployerWrapper.deploy(DeployerWrapper.java:169)
               [testng]      at org.jboss.deployers.plugins.deployers.DeployersImpl.doInstallParentFirst(DeployersImpl.java:853)
               [testng]      at org.jboss.deployers.plugins.deployers.DeployersImpl.install(DeployersImpl.java:794)
               [testng]      at org.jboss.dependency.plugins.AbstractControllerContext.install(AbstractControllerContext.java:327)
               [testng]      at org.jboss.dependency.plugins.AbstractController.install(AbstractController.java:1309)
               [testng]      at org.jboss.dependency.plugins.AbstractController.incrementState(AbstractController.java:734)
               [testng]      at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:862)
               [testng]      at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:784)
               [testng]      at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:622)
               [testng]      at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:411)
               [testng]      at org.jboss.deployers.plugins.deployers.DeployersImpl.process(DeployersImpl.java:498)
               [testng]      at org.jboss.deployers.plugins.main.MainDeployerImpl.process(MainDeployerImpl.java:506)
               [testng]      at org.jboss.embedded.DeploymentGroup.process(DeploymentGroup.java:127)
               [testng]      at org.jboss.embedded.Bootstrap.deployResourceBases(Bootstrap.java:289)
               [testng]      at org.jboss.seam.mock.EmbeddedBootstrap.startAndDeployResources(EmbeddedBootstrap.java:15)
               [testng]      at org.jboss.seam.mock.BaseSeamTest.startJbossEmbeddedIfNecessary(BaseSeamTest.java:1041)
               [testng]      at org.jboss.seam.mock.BaseSeamTest.startSeam(BaseSeamTest.java:935)
               [testng]      at org.jboss.seam.mock.BaseSeamTest.init(BaseSeamTest.java:923)
               [testng]      at org.jboss.seam.mock.SeamTest.init(SeamTest.java:42)
               [testng]      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
               [testng]      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
               [testng]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
               [testng]      at java.lang.reflect.Method.invoke(Method.java:585)
               [testng]      at org.testng.internal.MethodHelper.invokeMethod(MethodHelper.java:604)
               [testng]      at org.testng.internal.Invoker.invokeConfigurationMethod(Invoker.java:394)
               [testng]      at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:142)
               [testng]      at org.testng.internal.Invoker.invokeConfigurations(Invoker.java:79)
               [testng]      at org.testng.internal.TestMethodWorker.invokeBeforeClassMethods(TestMethodWorker.java:165)
               [testng]      at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:103)
               [testng]      at org.testng.TestRunner.runWorkers(TestRunner.java:678)
               [testng]      at org.testng.TestRunner.privateRun(TestRunner.java:624)
               [testng]      at org.testng.TestRunner.run(TestRunner.java:495)
               [testng]      at org.testng.SuiteRunner.runTest(SuiteRunner.java:300)
               [testng]      at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:295)
               [testng]      at org.testng.SuiteRunner.privateRun(SuiteRunner.java:275)
               [testng]      at org.testng.SuiteRunner.run(SuiteRunner.java:190)
               [testng]      at org.testng.TestNG.createAndRunSuiteRunners(TestNG.java:792)
               [testng]      at org.testng.TestNG.runSuitesLocally(TestNG.java:765)
               [testng]      at org.testng.TestNG.run(TestNG.java:699)
               [testng]      at org.testng.TestNG.privateMain(TestNG.java:824)
               [testng]      at org.testng.TestNG.main(TestNG.java:802)
               [testng] Caused by: java.io.IOException: Error listing files: /var/backups/hudson/jobs/Trunk_embedded/workspace/trunk/build/build-test/img/gmap-icons-orange-mini
               [testng]      at org.jboss.virtual.plugins.context.file.FileHandler.getChildren(FileHandler.java:146)
               [testng]      at org.jboss.virtual.plugins.context.AbstractVFSContext.getChildren(AbstractVFSContext.java:109)
               [testng]      at org.jboss.virtual.plugins.context.AbstractVFSContext.visit(AbstractVFSContext.java:165)
               [testng]      at org.jboss.virtual.plugins.context.AbstractVFSContext.visit(AbstractVFSContext.java:202)
               [testng]      at org.jboss.virtual.plugins.context.AbstractVFSContext.visit(AbstractVFSContext.java:202)
               [testng]      at org.jboss.virtual.plugins.context.AbstractVFSContext.visit(AbstractVFSContext.java:134)
               [testng]      at org.jboss.virtual.VFS.visit(VFS.java:313)
               [testng]      at org.jboss.virtual.VirtualFile.visit(VirtualFile.java:363)
               [testng]      at org.jboss.deployment.AnnotationMetaDataDeployer.deploy(AnnotationMetaDataDeployer.java:148)
               [testng]      ... 42 more


            • 3. Re: Production system crash: Too many open files
              dhinojosa

              I got the same result by increasing the ulimit. 


              I was wondering if you and anyone else with this issue could post your findings to the JIRA issue. It would probably be very helpful for them to know what we are up against in our production environments. ;)


               

              • 4. Re: Production system crash: Too many open files
                andreas75

                Daniel Hinojosa wrote on Jun 17, 2008 18:44:


                I got the same result by increasing the ulimit. 

                I was wondering if you and anyone else with this issue can post your findings to the JIRA Issue . It can probably be very helpful to them to know what we are up against in our production environment. ;)




                Did you mean you got this error by -increasing- the ulimit?
                That sounds weird, but it may explain a bit for me, since we recently had to increase the limit for another reason and then got the error.
                /Andreas

                • 5. Re: Production system crash: Too many open files
                  pgmjsd

                  Have you tried turning off the deployment scanner?


                  http://wiki.jboss.org/wiki/TurnDeploymentScannerDown

                  • 6. Re: Production system crash: Too many open files
                    marcioendo.marcioendo.gmail.com

                    Lately we have also seen "Too many open files" from time to time in our test suite (350 Seam tests), so we went back through our automated build system to find the build/commit at which the error first appeared. The errors appear to have started after a commit containing 208 small icon images, so we reverted that commit to see whether it would change anything. Having run the test suite several times before and after the revert, we can now say with great confidence that these files do matter: without the image files there are no "Too many open files" errors, and with them the errors return. The strange thing is that these files are not even used by the application. They are served by Apache from a static content directory; they just happen to be part of the EAR as a consequence of our build process.


                    I recently ran into this problem and found a solution for it.


                    Edit the following file in your test suite's JBoss Embedded configuration:


                    /bootstrap/deployers/ejb3-deployers-beans.xml



                    Add the following property inside the Ejb3Deployer bean tag:


                    <bean name="Ejb3Deployer" class="org.jboss.ejb3.deployers.Ejb3Deployer"> 
                       (...)
                       <property name="allowedSuffixes">
                          <set elementClass="java.lang.String">
                             <value>class</value>
                             <value>xml</value>
                          </set>
                       </property>
                       (...)
                    </bean>
                    


                    This limits the scanning to files with the configured suffixes.
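
To get a feel for how much such a suffix filter cuts down, one can compare the total number of files in an exploded deployment directory with the number that end in .class or .xml. This is a rough sketch using find. The function name count_scanned is hypothetical, and the result only approximates what the deployer would actually visit.

```shell
#!/bin/bash
# Compare the total number of regular files under a directory with the
# number whose names end in .class or .xml (the suffixes allowed above).
count_scanned() {
    local dir="$1"
    local total allowed
    total=$(find "$dir" -type f | wc -l)
    allowed=$(find "$dir" -type f \( -name '*.class' -o -name '*.xml' \) | wc -l)
    # $((...)) normalizes any padding that wc may add to its output
    echo "total=$((total)) allowed=$((allowed))"
}

# Usage (the path is a placeholder):
#   count_scanned /path/to/exploded/deployment
```

A large gap between the two numbers suggests that most of the scanned entries are static resources such as images.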

                    • 7. Re: Production system crash: Too many open files
                      alvo
                      Hi,

                      I have a different problem (not related to the number of images inside a deployment archive):

                      Environment:
                      * Linux
                      * JBoss 4.2.2.GA
                      * Seam 2.1.2

                      Every time I hot-deploy my Seam WAR archive, the number of file handles held by JBoss increases.

                      When I run lsof for the JBoss process, I see that the JAR files are not released.

                      So normally I can do only about 5 hot-deployments before I have to restart JBoss. Each hot-deployment creates roughly 180 handles.


                      Temporary solution:
                        Increase the file handle limit


                      Cheers

                      • 8. Re: Production system crash: Too many open files
                        muhviehstarr

                        Keep in mind that "Too many open files" means the process has too many file handles in use. (The limit is usually 1024, which can be raised with ulimit -n on Linux systems.)
                        A file handle is used when you open a file, but also when you open a connection. So the cause can also be too many open RMI connections, HTTP connections, or others.


                        You can use lsof on Linux to display all open files and connections of a process (lsof -n -p PIDOFJBOSS).


                        Another tool that can be used is netstat. These can help you find the heavy consumer. I think 1024 open handles is a lot, especially if you are not serving 1000 HTTP requests at the same time.
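
As a rough complement to lsof, the descriptors of a process can also be grouped by what they point to, which makes it easier to see whether sockets or files dominate. This is a minimal sketch assuming Linux with /proc mounted and permission to read the target process's fd directory (same user or root). The function name fd_summary is made up for illustration.

```shell
#!/bin/bash
# Summarize the open file descriptors of a process by target type,
# using /proc/<pid>/fd. Each entry there is a symlink to the file,
# socket, or pipe that the descriptor refers to.
fd_summary() {
    local pid="$1" fd target
    local sockets=0 pipes=0 jars=0 other=0
    for fd in "/proc/$pid/fd"/*; do
        # Skip entries that vanish or cannot be read while we iterate
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            socket:*)                   sockets=$((sockets + 1)) ;;
            pipe:*)                     pipes=$((pipes + 1)) ;;
            *.jar|*".jar (deleted)")    jars=$((jars + 1)) ;;
            *)                          other=$((other + 1)) ;;
        esac
    done
    echo "sockets=$sockets pipes=$pipes jars=$jars other=$other"
}

# Example: summarize the current shell; pass the JBoss PID instead.
fd_summary "$$"
```

A high sockets count points towards connections, while a high jars count points towards archives that are not being released.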

                        • 9. Re: Production system crash: Too many open files
                          alvo

                          Hi,
                          thanks for your reply.


                          Yes, I checked with lsof before that, and it is the JARs from previous hot-deployments that are not released.


                          • 10. Re: Production system crash: Too many open files
                            alvo

                            Damn, this post was truncated after HTML chars. Here comes the rest:


                              lsof -p 3078 | wc -l
                              5221


                              lsof -p 3078 | grep .jar | wc -l
                              5096


                            So 5221 handles are in use, 5096 of them are JAR file handles, and an estimated 4800 of those are from previous hot-deployments.


                            I have now configured ulimit to 16000, which is enough for about 80 hot-deployments.
                              ulimit -n 16000
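
The estimate of about 80 can be checked with integer arithmetic. This sketch assumes roughly 180 handles per hot-deployment (the figure reported earlier in the thread) and a baseline of 5221 - 4800 = 421 handles not tied to old deployments. Both inputs are rough figures from this thread, not measurements of any other system, and the function name deploys_until_limit is made up for illustration.

```shell
#!/bin/bash
# How many hot-deployments fit under a given open-file limit?
# Arguments: limit, baseline handles, handles leaked per deployment.
deploys_until_limit() {
    local limit="$1" baseline="$2" per_deploy="$3"
    echo $(( (limit - baseline) / per_deploy ))
}

deploys_until_limit 16000 421 180   # prints 86
```

With these inputs the result is in the same range as the estimate above, so the higher limit postpones the problem rather than removing it.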



                            Cheers
                            Owe

                            • 11. Re: Production system crash: Too many open files
                              nicokiki

                              Hi,


                              Has anyone found a solution to this problem, or has everyone just increased the open-file limit and waited for the next "Too many open files" error?


                              Thanks in advance,


                              Nico