9 Replies Latest reply on Oct 11, 2012 8:33 AM by jaikiran

    Automatic EJB reconnection mechanism fails if Version Handshake times out

    steffenwollscheid

      Hello,

       

      In a server-to-server scenario, we observed that EJBClientContexts on the server acting as remote-ejb-client intermittently failed to reconnect after the server acting as remote-ejb-host was restarted.

       

      There seems to be a gap in the reconnect handling:

       

      Once the ReconnectHandler has established a remoting connection, it deregisters itself and registers the RemotingConnectionEJBReceiver. The receiver then attempts a version handshake with a timeout of 5 seconds (in 1.0.5.Final; configurable in the current master).

       

      In our situation the server acting as remote-ejb-host is very busy during startup, and we have multiple EJBClientContexts trying to reconnect (we also have quite a lot of exported modules), which caused the version handshake to time out.

       

      If this handshake times out, the RemotingConnectionEJBReceiver unregisters itself from the EJBClientContext without re-registering its ReconnectHandler, so no further attempt is made later, when the remote-ejb-host might be less busy and respond faster. The EJBClientContext is thus left with neither a RemotingConnectionEJBReceiver nor a reconnect handler.
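
      As a rough illustration of that state, here is a minimal toy model. It is not the jboss-ejb-client code: HandshakeTimeoutDemo, ToyClientContext, ToyReceiver and the reRegisterOnTimeout flag are all made-up names, and the handshake result is passed in as a boolean instead of being measured. It only shows that a context whose receiver removes itself after a timed-out handshake ends up with nothing that could trigger another attempt, unless the handler is put back first.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, none of it is the jboss-ejb-client API.
class HandshakeTimeoutDemo {

    // Stands in for an EJBClientContext: it tracks receivers and reconnect handlers.
    static class ToyClientContext {
        final List<String> receivers = new ArrayList<>();
        final List<Runnable> reconnectHandlers = new ArrayList<>();

        // True when nothing is left that could ever restore the connection.
        boolean isStranded() {
            return receivers.isEmpty() && reconnectHandlers.isEmpty();
        }
    }

    // Stands in for the receiver that is registered once the connection is up.
    static class ToyReceiver {
        private final String name;
        private final Runnable reconnectHandler;
        private final boolean reRegisterOnTimeout; // assumption: toggles the two policies

        ToyReceiver(String name, Runnable reconnectHandler, boolean reRegisterOnTimeout) {
            this.name = name;
            this.reconnectHandler = reconnectHandler;
            this.reRegisterOnTimeout = reRegisterOnTimeout;
        }

        // Register, then react to the (simulated) handshake outcome.
        void associate(ToyClientContext ctx, boolean handshakeSucceeded) {
            ctx.receivers.add(name);
            if (handshakeSucceeded) {
                return;
            }
            if (reRegisterOnTimeout && reconnectHandler != null) {
                // Put the handler back so a later reconnect attempt is possible.
                ctx.reconnectHandlers.add(reconnectHandler);
            }
            // Stands in for closing the receiver context: the receiver removes itself.
            ctx.receivers.remove(name);
        }
    }

    public static void main(String[] args) {
        Runnable handler = () -> { };

        // The handler already deregistered itself, so both contexts start empty.
        ToyClientContext withoutReRegister = new ToyClientContext();
        new ToyReceiver("remote-ejb-host", handler, false).associate(withoutReRegister, false);

        ToyClientContext withReRegister = new ToyClientContext();
        new ToyReceiver("remote-ejb-host", handler, true).associate(withReRegister, false);

        System.out.println("without re-register, stranded: " + withoutReRegister.isStranded());
        System.out.println("with re-register, stranded: " + withReRegister.isStranded());
    }
}
```

      Running main prints "stranded: true" for the first context and "stranded: false" for the second. Whether the real client behaves the same way depends on details the toy model leaves out, such as how and when the registered handlers are actually invoked.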

       

      Did I get this wrong, or would simply changing

       

      if (successfulHandshake) {
          final Channel compatibleChannel = versionReceiver.getCompatibleChannel();
          final ChannelAssociation channelAssociation = new ChannelAssociation(this, context, compatibleChannel, this.clientProtocolVersion, this.marshallerFactory, this.reconnectHandler);
          synchronized (this.channelAssociations) {
              this.channelAssociations.put(context, channelAssociation);
          }
          Logs.REMOTING.successfulVersionHandshake(context, compatibleChannel);
      } else {
          // no version handshake done. close the context
          Logs.REMOTING.versionHandshakeNotCompleted(context);
          context.close();
      }
      

       

      in org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver#associate(EJBReceiverContext) to

       

      if (successfulHandshake) {
          final Channel compatibleChannel = versionReceiver.getCompatibleChannel();
          final ChannelAssociation channelAssociation = new ChannelAssociation(this, context, compatibleChannel, this.clientProtocolVersion, this.marshallerFactory, this.reconnectHandler);
          synchronized (this.channelAssociations) {
              this.channelAssociations.put(context, channelAssociation);
          }
          Logs.REMOTING.successfulVersionHandshake(context, compatibleChannel);
      } else {
          // no version handshake done. re-register the reconnect handler (if any)
          // so a later attempt is possible, then close the context
          Logs.REMOTING.versionHandshakeNotCompleted(context);
          if (this.reconnectHandler != null) {
              context.getClientContext().registerReconnectHandler(this.reconnectHandler);
          }
          context.close();
      }
      

       

       

      fix the problem?

       

      Best regards,

      Steffen