4 Replies Latest reply on Jul 4, 2013 4:59 PM by b.eckenfels

    InterruptedIO in wildfly main thread

    b.eckenfels

      Hello,

       

      in Domain mode the wildfly server will loop over the inputStream and read the connection coordinates of the master controller. When this loop is ended by a closed input stream the server will terminate.

       

      There is a catch of InterruptedIOException, which clears the interrupted state (which should not be needed as throwing that exception will already do). This catch will then continue to loop. I see this a bit problematic as you never know how many objects (and partial objects) have been read. So if you loop on interrupt you might read corrupt information or even worse read partial information and wait endless for its end.

       

      I guess, we could try to analyse where the exception have been thrown and also look at InterruptedIOException#bytesTransferred, but I think the safest thing to do here would be to actually break aka terminate.

       

      One thing which could be considered is to have the protocol beween PC and server a bit more robust. A first step would be to define a well known message length (and possibly version header) and a magic signature. Under those conditions the interrupted read could be more sanely retried. BTW: the try/catch also spans the actual reconnect() method. I dont think we want to terminate if a problem happens here?

       

      What do you think?

       

      https://github.com/wildfly/wildfly/blob/master/server/src/main/java/org/jboss/as/server/DomainServerMain.java#L140

       

      Bernd

       

      PS: The

      {document:id=48381}

      wiki page still points to this forum instead of the ML, is this intentional?

        • 1. Re: InterruptedIO in wildfly main thread
          ctomc

          I fixed the hacking on wildfly wiki, as the link to as7-dev forum should not even be used for as7 and i have moved this thread to wildfly forum.

          1 of 1 people found this helpful
          • 2. Re: InterruptedIO in wildfly main thread
            brian.stansberry

            Thanks for the post!

             

            I think the thing to do in that catch block is Thread.currentThread().interrupt() and then break, allowing termination. (The Thread.currentThread().interrupt() call is just to follow best practice on not swallowing interrupted exceptions.)  If a thread has somehow interrupted the main thread, that's an indication termination is desired and I don't see any reason to ignore it.

             

            As for the client.reconnect() part, I want to talk to Emanuel Muckenhuber, but I think the thing to do is to have HostControllerClient.reconnect simply not throw exceptions other than InterruptedIOException. It already has logic for dealing with connection failure by putting the server in restart-required state. The use of that logic could be expanded to cover exception cases.

             

            I also think in DomainServerMain the System.exit(ExitCodes.NORMAL) call once the loop terminates should be changed to take a variable, set to ExitCodes.FAILED in the catchall "catch (Exception e)" block.

            • 3. Re: InterruptedIO in wildfly main thread
              b.eckenfels

              Brian Stansberry wrote:

               

              I think the thing to do in that catch block is Thread.currentThread().interrupt() and then break, allowing termination. (The Thread.currentThread().interrupt() call is just to follow best practice on not swallowing interrupted exceptions.)  If a thread has somehow interrupted the main thread, that's an indication termination is desired and I don't see any reason to ignore it.

              I agree and can provide a patch for that. I am a bit unsure about restoring the interruption flag, as the shutdown code should not be interrupted again. But then, the rest of the function is reasonable short so we will not run into any method which barfs on it.

              As for the client.reconnect() part, I want to talk to Emanuel Muckenhuber, but I think the thing to do is to have HostControllerClient.reconnect simply not throw exceptions other than InterruptedIOException. It already has logic for dealing with connection failure by putting the server in restart-required state. The use of that logic could be expanded to cover exception cases.

              Ok, in that case I will acknowledge that with a short comment.

               

              I also think in DomainServerMain the System.exit(ExitCodes.NORMAL) call once the loop terminates should be changed to take a variable, set to ExitCodes.FAILED in the catchall "catch (Exception e)" block.

              Ok, I somewhat wonder if losing the parent is actually a "normal" condition. I guess the normal server shutdown is signaled from the HC and executed in some other place. That would mean that all code patch in the main() method would be non-NORMAL. But if you think it should differentiate, I can do that. I was thinking about setting the exitState to NORMAL after the future has returned the service. (And resetting it to error in case of (non-Interrupted and non-EOF) exceptions).

              Should I open a JIRA Bug for it?

              An related question: is it possible/safe to use the logger in this place? I really think we should log the reason for the exit code decision there.

              • 4. Re: InterruptedIO in wildfly main thread
                b.eckenfels

                Hello,

                 

                I have started a branch, which is supposed to address this point. It can be found here. Right now it is only adding a break on Interrupted State and pulled the Future dereference into the proper try block. I am still investigating on the normal control flow on reconnect/shutdown, thats why I havent added the ExitCode change yet.

                 

                https://github.com/ecki/wildfly/compare/topic-mainfix?expand=1

                 

                Greetings

                Bernd