3 Replies Latest reply on Dec 14, 2011 9:36 AM by Randall Hauch

    Modeshape stuck - race condition in RepositoryConnectionPool

    jamat Newbie

      Hello,

       

      Setup: JBoss AS 6, latest ModeShape.

      I started 30 requests in parallel and they got stuck (nothing is happening).

      After taking a thread dump and looking at the code, it seems to me that we have a race condition in RepositoryConnectionPool.

       

      When we try to get a connection, we first do the following (only the interesting part is shown):

       

                      mainLock.lock();

       

                      // Peek to see if there is a connection available ...

                      else if (this.availableConnections.peek() != null) {

                          // There is, so take it and return it ...

                          try {

                              connection = this.availableConnections.take();

                          } catch (InterruptedException e) {

                              LOGGER.trace("Cancelled obtaining a repository connection from pool {0}", getSourceName());

                              Thread.interrupted();

                              throw new RepositorySourceException(getSourceName(), e);

                          }

                      }

       

      The race condition is between the 'peek' and the 'take'.

      The reason is that further down the same method we do:

       

                  if (connection == null) {

                      // There are not enough connections, so wait in line for the

                      // next available connection ...

                      LOGGER.trace("Waiting for a repository connection from pool {0}", getSourceName());

                      try {

                          connection = this.availableConnections.take();

                      } catch (InterruptedException e) {

                          LOGGER.trace("Cancelled obtaining a repository connection from pool {0}", getSourceName());

                          Thread.interrupted();

                          throw new RepositorySourceException(getSourceName(), e);

                      }

                      mainLock = this.mainLock;

                      mainLock.lock();

       

      So we call 'take' here without holding the mainLock.

      And this is IMO what happened.

      I have a thread that is blocked in the first 'take' (after the 'peek') and another one that is stuck while trying to get the lock after calling 'take'.

      It is this second 'take' that slipped in between the first thread's 'peek' and 'take'.

      Note that I have 9 other threads that are trying to release their connections but are also blocked while trying to get the lock (which the first thread still holds).

      This also explains why the first thread is stuck in 'take': all the connections are taken and none can be returned.
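
      For what it's worth, here is a minimal sketch of one way the 'peek' then 'take' window could be closed. This is not ModeShape's code: the class PoolSketch and its methods (tryAcquireUnderLock, acquire, release) are hypothetical names, and connections are plain strings for illustration. The idea is to use the queue's non-blocking poll() while holding the lock, so the check and the removal happen in one step and the thread never blocks while it owns mainLock. Blocking in take() only happens after the lock has been released.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a connection pool; NOT the real RepositoryConnectionPool.
class PoolSketch {
    private final ReentrantLock mainLock = new ReentrantLock();
    private final BlockingQueue<String> availableConnections = new LinkedBlockingQueue<>();

    PoolSketch(String... connections) {
        for (String c : connections) {
            availableConnections.add(c);
        }
    }

    // Non-blocking attempt made while holding mainLock.
    // poll() checks and removes atomically and returns null when the queue
    // is empty, so this thread can never block while it owns the lock.
    String tryAcquireUnderLock() {
        mainLock.lock();
        try {
            return availableConnections.poll();
        } finally {
            mainLock.unlock();
        }
    }

    // Tries the fast path first; waits on the queue only after the lock is released,
    // so threads calling release() can still obtain mainLock.
    String acquire() {
        String connection = tryAcquireUnderLock();
        if (connection == null) {
            try {
                connection = availableConnections.take();
            } catch (InterruptedException e) {
                // Restore the interrupt flag for callers instead of swallowing it.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for a connection", e);
            }
        }
        return connection;
    }

    // Returns a connection to the pool under the same lock.
    void release(String connection) {
        mainLock.lock();
        try {
            availableConnections.add(connection);
        } finally {
            mainLock.unlock();
        }
    }

    public static void main(String[] args) {
        PoolSketch pool = new PoolSketch("c1");
        String c = pool.acquire();
        System.out.println("acquired " + c);
        System.out.println("second attempt: " + pool.tryAcquireUnderLock());
        pool.release(c);
        System.out.println("after release: " + pool.acquire());
    }
}
```

      With the original pattern, a thread that sees a non-null 'peek' can still block in 'take' while holding the lock if another waiter removes the element in between. With poll() the worst case is a null result, and the caller falls through to the unlocked wait.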