5 Replies Latest reply on Apr 28, 2006 6:18 AM by bulenterdemir

    Important bug in thread pool management

    bulenterdemir

      Hi, while testing JBoss Remoting I limited the maxPoolSize to 2 on the Connector on the server side. Also I added an InvocationHandler which simply waits for 5 seconds and returns a response. Then I created 5 client threads each making an invocation at almost the same time.

      In the above scenario, I receive 2, sometimes 3 responses while expecting all 5 requests to complete in succession in about 25 seconds.

      Debugged the JBoss Remoting code, and noticed a bug:

      SocketServerInvoker, when there's no spare thread available in the pool, waits on the clientpool. And when the clientpool is notified, fetches a ServerThread from the ThreadPool when its size() > 0 and notifies that ServerThread by calling its wakeup() method.

      ServerThread's, on the other hand, after finishing their work, notify the clientpool, add themselves to the threadpool, then notify clientpool.

      After this notification, ServerThread's start to wait on themselves in the hope to be reused.

      The bug occurs in the right time frame when SocketServerInvoker notifies the ServerThread but before when the ServerThread starts waiting on itself. So the notification is lost, therefore ServerThread keeps on waiting forever.

      In other words, ServerThread finishes work, notifies SocketServerInvoker, SocketServerInvoker wakes up, picks the newly available ServerThread, and wakes it up by notifying it (or at least it thinks it wakes the ServerThread up), before the ServerThread gets the chance to wait on itself.

      Please try the following code.

      I'd like to hear the comments before supplying a patch.

      Regards,
      Bulent Erdemir

      package org.falez.sandbox.remoting;
      
      import org.apache.log4j.BasicConfigurator;
      import org.apache.log4j.Level;
      import org.apache.log4j.Logger;
      import org.jboss.remoting.Client;
      import org.jboss.remoting.InvokerLocator;
      import org.jboss.remoting.transport.Connector;
      
      public class RemotingTest {
       static {
       BasicConfigurator.configure();
       Logger.getRootLogger().setLevel(Level.INFO);
       }
       private Logger logger = Logger.getRootLogger();
      
       public static void main(String[] args) throws Throwable {
       RemotingTest rt = new RemotingTest();
       rt.startServer();
       rt.runMultipleClients(Integer.parseInt(args[1]));
       System.in.read();
       }
      
       public void startServer() throws Exception {
       String locatorURI = "socket://localhost:54000/?maxPoolSize=2&timeout=10000";
       InvokerLocator locator = new InvokerLocator(locatorURI);
      
       Connector connector = new Connector();
      
       connector.setInvokerLocator(locator.getLocatorURI());
       connector.create();
      
       SampleInvocationHandler invocationHandler = new SampleInvocationHandler();
       connector.start();
       connector.addInvocationHandler("sample", invocationHandler);
       }
      
       public void runClient(String clientId) throws Throwable {
       String locatorURI = "socket://localhost:54000/";
       InvokerLocator locator = new InvokerLocator(locatorURI);
       Client client = new Client(locator);
       String req =clientId;
       Object resp = client.invoke(req);
       System.in.read();
       }
      
       public void runMultipleClients(int cnt) throws Throwable {
       for (int i = 0; i < cnt; i++) {
       Thread t = new Thread(new Runnable() {
       public void run() {
       try {
       Thread.sleep(1000);
       runClient(Thread.currentThread().getName());
       } catch (Throwable e) {
       logger.error(e);
       }
       }
       }, Integer.toString(i));
       t.start();
       }
       }
      
      }
      
      
      package org.falez.sandbox.remoting;
      
      import javax.management.MBeanServer;
      
      import org.apache.log4j.Logger;
      import org.jboss.remoting.InvocationRequest;
      import org.jboss.remoting.ServerInvocationHandler;
      import org.jboss.remoting.ServerInvoker;
      import org.jboss.remoting.callback.InvokerCallbackHandler;
      
      public class SampleInvocationHandler implements ServerInvocationHandler {
       private static Logger logger = Logger.getRootLogger();
      
       public Object invoke(InvocationRequest invocation) throws Throwable {
       String parm = (String) invocation.getParameter();
       Thread.sleep(5000);
       System.out.println(Thread.currentThread() + "Invoked " + parm);
       String s = "response" + parm;
       return s;
       }
      
       public void setMBeanServer(MBeanServer server) {
       }
      
       public void setInvoker(ServerInvoker invoker) {
       }
      
       public void addListener(InvokerCallbackHandler callbackHandler) {
       }
      
       public void removeListener(InvokerCallbackHandler callbackHandler) {
       }
      
      }
      
      


        • 1. Re: Important bug in thread pool management

          You are right about this bug. This looks to have been introduced within the 1.4.1 release. The problem is that there is a small window between the call to clientpool.notify() and the call to this.wait() within the run() method of ServerThread class that can allow the SocketServerInvoker to "wakeup" the released ServerThread before it actually goes into the wait state (therefore leaving the ServerThread to wait forever).

          Do you already have a patch for this?

          • 2. Re: Important bug in thread pool management
            ron_sigal

            Hi Bulent,

            It looks like I created a race condition in the process of fixing another race condition. Before I made my changes, the manipulation of clientpool and threadpool were inside the scope of synchronized(this), and I can't see any good reason for having changed it.

            I'm thinking that run() should look like the following. Does this match your own solution? -Ron

            public void run()
             {
             try
             {
             while (true)
             {
             dorun();
            
            /*
            * The following code has been changed to eliminate a race condition with SocketServerInvoker.cleanup().
            * A ServerThread can shutdown for two reasons:
            *
            * 1. the client shuts down, and
            * 2. the server shuts down.
            *
            * If both occur around the same time, a problem arises. If a ServerThread starts to shut
            * down because the client shut down, it will test shutdown, and if it gets to the test
            * before SocketServerInvoker.cleanup() calls ServerThread.stop() to set shutdown to true, it
            * will return itself to threadpool. If it moves from clientpool to threadpool at just the
            * right time, SocketServerInvoker could miss it in both places and never call stop(), leaving
            * it alive, resulting in a memory leak. The solution is to synchronize parts of
            * ServerThread.run() and SocketServerInvoker.cleanup() so that they interact atomically.
            */
             synchronized (this)
             {
             synchronized (clientpool)
             {
             synchronized (threadpool)
             {
             if (shutdown)
             {
             invoker = null;
             return; // exit thread
             }
             else
             {
             clientpool.remove(this);
             threadpool.add(this);
             Thread.interrupted(); // clear any interruption so that we can be pooled.
             clientpool.notify();
             }
             }
             }
            
             try
             {
             log.debug("begin thread wait");
             this.wait();
             log.debug("WAKEUP in SERVER THREAD");
             }
             catch (InterruptedException e)
             {
             if (shutdown)
             {
             invoker = null;
             return; // exit thread
             }
            
             throw e;
             }
             }
             }
             }
             catch (Exception ignored)
             {
             log.debug("Exiting run on exception", ignored);
             }
             }
            


            • 3. Re: Important bug in thread pool management

              Added test case - org.jboss.test.remoting.transport.socket.load.SocketLoadTestCase

              • 4. Re: Important bug in thread pool management

                Since no reply from user, have implemented code change per Ron above. Issue has been marked as resolved.

                • 5. Re: Important bug in thread pool management
                  bulenterdemir

                  Hi, very sorry about my late response. Been very busy lately.

                  I don't have a patch written and tested yet. Also, Ron's patch is not quite like the approach I would implement to resolve this issue. However, I'll submit a patch as soon as I can and we'll discuss the merits of the two approaches (Ron's and mine) in this topic.

                  I believe this is a very core piece of this code and important for the performance, therefore further discussions will be welcome, and I'll elaborate on this topic in the coming days.

                  Regards,
                  Bulent Erdemir