Failure during deployment scan results in JBoss undeployment
mwaters Apr 13, 2006 6:28 PMRecently we had a running JBoss instance undeploy itself for unknown reasons. Upon inspecting the the log file we came across
Could not list directory '/usr/local/jboss-4.0.3/server/default/deploy', reason unknown
I traced the code and came upon the FileURLLister, which appears to have a flaw, as well as the URLDeploymentScanner, which also appears to have a bug. The source used to investigate the issue was 4.0.3sp1 and 4.0.4rc2, both of which contain the problems.
FileURLLister has a method listFiles that recurses a directory structure defined by an incoming base URL. To receive an array of directory contents it invokes the list() method on a File object that wraps the URL object. Sometimes the list() method returns null, and this is where the problem occurs.
common/src/main/org/jboss/net/protocol/file/FileURLLister.java private void listFiles(final URL baseUrl, final URLFilter filter, boolean scanNonDottedSubDirs, ArrayList resultList) { final File baseDir = new File(baseUrl.getPath()); String[] filenames = baseDir.list(new FilenameFilter(...)); if (filenames == null) { // This happens only when baseDir not a directory, or some unknown // IOException happens internally (e.g. run out of file descriptors?) // Unfortunately the File API doesn't provide a way to know. log.warn("Could not list directory '" + baseDir + "', reason unknown"); } else { ... } }
The out variable resultList will not have any entries added to it should the first condition be true. The result for us was a listing of our deployment directory that contained no entries. As such, the URLDeploymentScanner attempted to undeploy everything currently deployed. Bad.
As per the JDK, list() returns null when, "if this abstract pathname does not denote a directory, or if an I/O error occurs." Since sometimes null indicates a deployment directory doesn't exist rather than an exception, simply not adding to the result set seems reasonable. However, if you check the URL for denoting an existing directory prior to executing anything inside the listFiles() method, you can always know the null value actually means an IOException occured, and halt execution of the program.
private void listFiles(final URL baseUrl, final URLFilter filter, boolean scanNonDottedSubDirs, ArrayList resultList) throws IOException { final File baseDir = new File(baseUrl.getPath()); if (baseDir.isDirectory()) { String[] filenames = baseDir.list(new FilenameFilter(...)); if (filenames == null) { throw new IOException("Failure listing directory '" + baseDir + "', reason unknown"); } else { ... } } }
Throwing an IOException doesn't get you all the way out of the woods though. URLDeploymentScanner doesn't correctly handle the case of an IOException, which means for other reasons when an IOException is thrown JBoss can go sideways.
system\src\main\org\jboss\deployment\scanner\URLDeploymentScanner.java public synchronized void scan() throws Exception { ... List urlsToDeploy = new LinkedList(); ... synchronized (urlList) { for (Iterator i = urlList.iterator(); i.hasNext();) { URL url = (URL) i.next(); try { if (url.toString().endsWith("/")) { // treat URL as a collection URLLister lister = listerFactory.createURLLister(url); // listMembers() will throw an IOException if collection url does not exist urlsToDeploy.addAll(lister.listMembers(url, filter, doRecursiveSearch)); } else { // treat URL as a deployable unit // throws IOException if this URL does not exist url.openConnection().connect(); urlsToDeploy.add(url); } } catch (IOException e) { // log exception and continue log.debug("Scan URL, caught " + e.getClass().getName() + ": " + e.getMessage()); } } } ... }
As shown above, when an IOException is encountered when invoking listMembers(), scan() isn't halted, and the empty-list value of urlsToDeployed is used in the remainder of the function. Just below this in the same function we have
LinkedList urlsToRemove = new LinkedList(); LinkedList urlsToCheckForUpdate = new LinkedList(); synchronized (deployedSet) { // remove previously deployed URLs no longer needed for (Iterator i = deployedSet.iterator(); i.hasNext();) { DeployedURL deployedURL = (DeployedURL) i.next(); if (urlsToDeploy.contains(deployedURL.url)) { urlsToCheckForUpdate.add(deployedURL); } else { urlsToRemove.add(deployedURL); } } } // ******** // Undeploy // ******** for (Iterator i = urlsToRemove.iterator(); i.hasNext();) { DeployedURL deployedURL = (DeployedURL) i.next(); if (trace) { log.trace("Removing " + deployedURL.url); } undeploy(deployedURL); }
Since urlsToDeploy still retains the empty-list default value, every element in the deployedSet container will be inserted into urlsToRemove. The obvious next step is to undeploy anything in this list, which is what happens, and what did happen yesterday. The obvious fix for this one would be to rethrow, or not even catch, the IOException in directory listing, but there may be concerns beyond what I've looked at.