6 Replies Latest reply on Sep 22, 2006 2:31 PM by starksm64

    Failure during deployment scan results in JBoss undeployment

    mwaters

      Recently we had a running JBoss instance undeploy itself for unknown reasons. Upon inspecting the the log file we came across

      Could not list directory '/usr/local/jboss-4.0.3/server/default/deploy', reason unknown


      I traced the code and came upon the FileURLLister, which appears to have a flaw, as well as the URLDeploymentScanner, which also appears to have a bug. The source used to investigate the issue was 4.0.3sp1 and 4.0.4rc2, both of which contain the problems.

      FileURLLister has a method listFiles that recurses a directory structure defined by an incoming base URL. To receive an array of directory contents it invokes the list() method on a File object that wraps the URL object. Sometimes the list() method returns null, and this is where the problem occurs.

      common/src/main/org/jboss/net/protocol/file/FileURLLister.java
      private void listFiles(final URL baseUrl, final URLFilter filter, boolean scanNonDottedSubDirs, ArrayList resultList)
      {
       final File baseDir = new File(baseUrl.getPath());
       String[] filenames = baseDir.list(new FilenameFilter(...));
       if (filenames == null)
       {
       // This happens only when baseDir not a directory, or some unknown
       // IOException happens internally (e.g. run out of file descriptors?)
       // Unfortunately the File API doesn't provide a way to know.
       log.warn("Could not list directory '" + baseDir + "', reason unknown");
       }
       else
       { ... }
      }


      The out variable resultList will not have any entries added to it should the first condition be true. The result for us was a listing of our deployment directory that contained no entries. As such, the URLDeploymentScanner attempted to undeploy everything currently deployed. Bad.

      As per the JDK, list() returns null when, "if this abstract pathname does not denote a directory, or if an I/O error occurs." Since sometimes null indicates a deployment directory doesn't exist rather than an exception, simply not adding to the result set seems reasonable. However, if you check the URL for denoting an existing directory prior to executing anything inside the listFiles() method, you can always know the null value actually means an IOException occured, and halt execution of the program.

      private void listFiles(final URL baseUrl, final URLFilter filter, boolean scanNonDottedSubDirs, ArrayList resultList) throws IOException
      {
       final File baseDir = new File(baseUrl.getPath());
       if (baseDir.isDirectory()) {
       String[] filenames = baseDir.list(new FilenameFilter(...));
       if (filenames == null)
       {
       throw new IOException("Failure listing directory '" + baseDir + "', reason unknown");
       }
       else
       { ... }
       }
      }


      Throwing an IOException doesn't get you all the way out of the woods though. URLDeploymentScanner doesn't correctly handle the case of an IOException, which means for other reasons when an IOException is thrown JBoss can go sideways.

      system\src\main\org\jboss\deployment\scanner\URLDeploymentScanner.java
      public synchronized void scan() throws Exception
      {
       ...
       List urlsToDeploy = new LinkedList();
       ...
       synchronized (urlList)
       {
       for (Iterator i = urlList.iterator(); i.hasNext();)
       {
       URL url = (URL) i.next();
       try
       {
       if (url.toString().endsWith("/"))
       {
       // treat URL as a collection
       URLLister lister = listerFactory.createURLLister(url);
      
       // listMembers() will throw an IOException if collection url does not exist
       urlsToDeploy.addAll(lister.listMembers(url, filter, doRecursiveSearch));
       }
       else
       {
       // treat URL as a deployable unit
      
       // throws IOException if this URL does not exist
       url.openConnection().connect();
       urlsToDeploy.add(url);
       }
       }
       catch (IOException e)
       {
       // log exception and continue
       log.debug("Scan URL, caught " + e.getClass().getName() + ": " + e.getMessage());
       }
       }
       }
       ...
      }


      As shown above, when an IOException is encountered when invoking listMembers(), scan() isn't halted, and the empty-list value of urlsToDeployed is used in the remainder of the function. Just below this in the same function we have

      LinkedList urlsToRemove = new LinkedList();
      LinkedList urlsToCheckForUpdate = new LinkedList();
      synchronized (deployedSet)
      {
       // remove previously deployed URLs no longer needed
       for (Iterator i = deployedSet.iterator(); i.hasNext();)
       {
       DeployedURL deployedURL = (DeployedURL) i.next();
       if (urlsToDeploy.contains(deployedURL.url))
       {
       urlsToCheckForUpdate.add(deployedURL);
       }
       else
       {
       urlsToRemove.add(deployedURL);
       }
       }
      }
      
      // ********
      // Undeploy
      // ********
      
      for (Iterator i = urlsToRemove.iterator(); i.hasNext();)
      {
       DeployedURL deployedURL = (DeployedURL) i.next();
       if (trace)
       {
       log.trace("Removing " + deployedURL.url);
       }
       undeploy(deployedURL);
      }


      Since urlsToDeploy still retains the empty-list default value, every element in the deployedSet container will be inserted into urlsToRemove. The obvious next step is to undeploy anything in this list, which is what happens, and what did happen yesterday. The obvious fix for this one would be to rethrow, or not even catch, the IOException in directory listing, but there may be concerns beyond what I've looked at.