4 Replies Latest reply on Mar 23, 2006 6:28 AM by timfox

    Race condition in Client invocations

    timfox

      I get the following exception intermittently under high load with multiple clients using the same invoker:

      
      org.jboss.remoting.ConnectionFailedException: org.jboss.remoting.transport.local
      .LocalClientInvoker@950689Error invoking on server because no local server to ca
      ll upon.
       at org.jboss.remoting.transport.local.LocalClientInvoker.invoke(LocalCli
      entInvoker.java:76)
       at org.jboss.remoting.Client.invoke(Client.java:497)
       at org.jboss.remoting.Client.invoke(Client.java:460)
       at org.jboss.jms.client.delegate.DelegateSupport.invoke(DelegateSupport.
      java:112)
      
      


      Looks like the invoker is being closed then the client attempts to invoke on it and fails.


        • 1. Re: Race condition in Client invocations
          timfox
          • 2. Re: Race condition in Client invocations
            timfox

            I have replicated this in a simple test case (sent to Tom).

            Actualy it is very straightforward to replicate.

            Just do the following:

            [code
            Client cl = new Client(locator);
            cl.connect();
            cl.invoke("aardvark");
            cl.disconnect();

            • 3. Re: Race condition in Client invocations
              timfox

              Sorry, too quick on the submit button.

              I have replicated this in a simple test case (sent to Tom).

              Actualy it is very straightforward to replicate.

              Just do the following:

              Client cl = new Client(locator);
              cl.connect();
              cl.invoke("aardvark");
              cl.disconnect();
              


              In multiple threads concurrently, where the locator is shared between threads.

              You'll hit the race after only a few iterations.

              The problem seems to be that the shared invoker can be closed mid way through the Client.invoke() call by another thread.

              This isn't currently synchronized in the code.

              Actually it's a little more complex too, since when calling InvokerRegistry.createClientInvoker(), between creating and doing something with the returned invoker, another thread could have closed it.

              You can't synchronize it until it is returned, hence you have an unsychronizable window when another thread can get in and disconnect it.

              I'm wondering why invoker needs to disconnected when any of the clients disconnect, why not just wait until the *last* client disconnects.

              Also I noticed that the isConnected member of the LocalClientInvoker class is not volatile/synchronized so updates made to it are not guaranteed to be visible between threads.

              In practice on single processors this probably works (although not guarateed) but on multi-processors or other implementations that cache thread storage is likely to be more of a problem.

              • 4. Re: Race condition in Client invocations
                timfox

                I have commented out the line:

                this.invoker.disconnect();
                


                In Client::disconnect() in my local copy so I can get moving.

                This means the invoker is never disconnected so the same instance is always used by all Clients with the same locator.

                Also, it means the Invoker leaks, so clearly this is temporary. However in our case there is only one invoker per server, so this is not a big problem right at this mo.

                Interestingly I'm also seeing some perf increases on creating clients, this makes sense since we're not creating new invokers after every time a client disconnects.

                This also means we're not starting unnecessary pingers, since Client.connect causes a new pinger to be started even though it's using the same client invoker. If we have one invoker and 100 clients using it, then we only need 1 pinger, not 100, since we only have one connection. Having 100 seems inefficient to me.