2 Replies Latest reply on Mar 1, 2017 3:48 PM by rpierce99

    Threads lock up waiting for XNIO awaitReadable()

    rpierce99

      I'm trying to track down a problem with WildFly 10 Final in our production SaaS environment. We deploy many servers across multiple Amazon zones, and all at once a specific zone will stop responding to HTTPS requests, with every XNIO worker thread on every instance in that zone hung with the stack trace below. All of the other zones remain fine.

       

      java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.PollArrayWrapper.poll0(Native Method)
        at sun.nio.ch.PollArrayWrapper.poll(PollArrayWrapper.java:115)
        at sun.nio.ch.PollSelectorImpl.doSelect(PollSelectorImpl.java:73)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x0000000184eb2a10> (a sun.nio.ch.Util$2)
        - locked <0x0000000184eb2a00> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000184eb27c8> (a sun.nio.ch.PollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at org.xnio.nio.SelectorUtils.await(SelectorUtils.java:46)
        at org.xnio.nio.NioSocketConduit.awaitReadable(NioSocketConduit.java:345)
        at org.xnio.conduits.AbstractSourceConduit.awaitReadable(AbstractSourceConduit.java:66)
        at io.undertow.conduits.ReadDataStreamSourceConduit.awaitReadable(ReadDataStreamSourceConduit.java:101)
        at io.undertow.conduits.FixedLengthStreamSourceConduit.awaitReadable(FixedLengthStreamSourceConduit.java:285)
        at org.xnio.conduits.ConduitStreamSourceChannel.awaitReadable(ConduitStreamSourceChannel.java:151)
        at io.undertow.channels.DetachableStreamSourceChannel.awaitReadable(DetachableStreamSourceChannel.java:77)
        at io.undertow.server.HttpServerExchange$ReadDispatchChannel.awaitReadable(HttpServerExchange.java:2092)
        at io.undertow.server.handlers.form.FormEncodedDataDefinition$FormEncodedDataParser.parseBlocking(FormEncodedDataDefinition.java:253)
        at io.undertow.servlet.spec.HttpServletRequestImpl.parseFormData(HttpServletRequestImpl.java:762)
        at io.undertow.servlet.spec.HttpServletRequestImpl.getParameter(HttpServletRequestImpl.java:636)
        at com.company.web.filters.SystemResourceQuotaFilter.doFilter(SystemResourceQuotaFilter.java:45)

       

      The only way we can recover is to restart the server. This stack seems to match [UNDERTOW-115] Stuck ReadDispatchChannel#awaitReadable() - JBoss Issue Tracker, but that issue is marked resolved. Because all of the instances in a zone go down at once, I suspect the trigger is environmental, such as a loss of network connectivity, but I can't reproduce it in a test environment, and the same basic code running on JBoss 4 recovered from these network blips without issue. I appreciate any and all assistance.
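
      For anyone who wants to try reproducing this outside production, here is a rough sketch of the condition the stack suggests: a blocking read of a fixed-length request body from a peer that has gone silent. It uses plain JDK sockets, not Undertow or XNIO, and the class name, method names, byte counts, and timeout values are all made up for illustration. It also rests on an unconfirmed assumption, namely that the hang corresponds to a client that declared a body length and then stopped sending. With a read timeout set, the stalled read fails instead of waiting forever.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: plain JDK sockets, NOT Undertow/XNIO code.
final class StalledBodyDemo {

    // Blocking read of a body whose length was declared up front
    // (loosely analogous to a Content-Length request body).
    static int readBody(InputStream in, int declaredLength) throws IOException {
        byte[] buf = new byte[declaredLength];
        int total = 0;
        while (total < declaredLength) {
            int n = in.read(buf, total, declaredLength - total);
            if (n < 0) {
                break; // peer closed the connection early
            }
            total += n;
        }
        return total;
    }

    // Returns true if the server-side read gave up with a timeout.
    // The client sends bytesActuallySent bytes, then stays connected but silent.
    static boolean serverTimesOut(int declaredLength, int bytesActuallySent,
                                  int readTimeoutMillis) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(server.getInetAddress(), server.getLocalPort())) {
                    OutputStream out = s.getOutputStream();
                    out.write(new byte[bytesActuallySent]);
                    out.flush();
                    // Simulate a stalled peer: keep the socket open, send nothing more.
                    Thread.sleep(readTimeoutMillis * 4L);
                } catch (Exception ignored) {
                    // Demo only: client-side errors are not interesting here.
                }
            });
            client.setDaemon(true);
            client.start();

            try (Socket conn = server.accept()) {
                // Without this line the read below would block indefinitely
                // whenever the client sends fewer bytes than declared.
                conn.setSoTimeout(readTimeoutMillis);
                try {
                    readBody(conn.getInputStream(), declaredLength);
                    return false;
                } catch (SocketTimeoutException e) {
                    return true;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("short body timed out: " + serverTimesOut(100, 10, 500));
        System.out.println("full body timed out:  " + serverTimesOut(100, 100, 500));
    }
}
```

      Running main should report a timeout only for the short body. Removing the setSoTimeout call leaves the short-body read blocked for as long as the client stays connected, which is the kind of behavior the worker threads above appear to show. Whether that is actually what happens in the affected zone is still an open question.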