Automatic EJB reconnection mechanism fails if Version Handshake times out
steffenwollscheid Oct 11, 2012 5:11 AM
Hello,
In a server-to-server scenario, we observed that EJBClientContexts on the server acting as remote-ejb-client intermittently failed to reconnect after the server acting as remote-ejb-host was restarted.
There seems to be a loophole in the mechanism of reconnect handling:
Once the ReconnectHandler has established a remoting connection, it deregisters itself and registers the RemotingConnectionEJBReceiver. The receiver then attempts a version handshake with a timeout of 5 seconds (in 1.0.5.Final; configurable in the current master).
In our situation the server acting as remote-ejb-host is very busy during startup, and we have multiple EJBClientContexts trying to reconnect at the same time (we also have quite a lot of exported modules), which caused the version handshake to time out.
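To illustrate the kind of timeout I mean, here is a minimal standalone sketch. It is not the jboss-ejb-client code: the class name HandshakeTimeoutSketch and the method awaitHandshake are made up for illustration, and the busy server is simulated with a thread that answers late. It only shows how a bounded wait reports failure when the other side is slow.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of jboss-ejb-client.
class HandshakeTimeoutSketch {

    // Waits until the (simulated) version message arrives or the timeout
    // elapses. Returns true only if the message arrived in time.
    static boolean awaitHandshake(CountDownLatch versionReceived, long timeoutMillis) {
        try {
            return versionReceived.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch versionReceived = new CountDownLatch(1);

        // Simulated busy server: sends its version message after 300 ms.
        Thread busyServer = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            versionReceived.countDown();
        });
        busyServer.start();

        // The client only waits 50 ms, so the handshake is reported as failed
        // even though the server would have answered shortly afterwards.
        boolean successfulHandshake = awaitHandshake(versionReceived, 50);
        System.out.println("successfulHandshake = " + successfulHandshake);

        busyServer.join();
    }
}
```

The point is that a false result here does not mean the server is gone, only that it was too slow this time, which is why a later retry would be worthwhile.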
If this handshake times out, the RemotingConnectionEJBReceiver unregisters itself from the EJBClientContext without re-registering its ReconnectHandler, so no later attempt is made when the remote-ejb-host is less busy and would respond faster. The EJBClientContext is thus left with neither a RemotingConnectionEJBReceiver nor a reconnect handler.
Did I get this wrong, or would simply changing
    if (successfulHandshake) {
        final Channel compatibleChannel = versionReceiver.getCompatibleChannel();
        final ChannelAssociation channelAssociation = new ChannelAssociation(this, context, compatibleChannel, this.clientProtocolVersion, this.marshallerFactory, this.reconnectHandler);
        synchronized (this.channelAssociations) {
            this.channelAssociations.put(context, channelAssociation);
        }
        Logs.REMOTING.successfulVersionHandshake(context, compatibleChannel);
    } else {
        // no version handshake done. close the context
        Logs.REMOTING.versionHandshakeNotCompleted(context);
        context.close();
    }
in org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver#associate(EJBReceiverContext) to
    if (successfulHandshake) {
        final Channel compatibleChannel = versionReceiver.getCompatibleChannel();
        final ChannelAssociation channelAssociation = new ChannelAssociation(this, context, compatibleChannel, this.clientProtocolVersion, this.marshallerFactory, this.reconnectHandler);
        synchronized (this.channelAssociations) {
            this.channelAssociations.put(context, channelAssociation);
        }
        Logs.REMOTING.successfulVersionHandshake(context, compatibleChannel);
    } else {
        // no version handshake done. close the context
        Logs.REMOTING.versionHandshakeNotCompleted(context);
        if (this.reconnectHandler != null) {
            context.getClientContext().registerReconnectHandler(this.reconnectHandler);
        }
        context.close();
    }
fix the problem?
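To make the difference between the two variants concrete, here is a toy model of the state I am describing. Everything in it (ReconnectSketch, ToyContext, the associate signature with its boolean flags) is hypothetical and heavily simplified; it does not use the real EJBClientContext or RemotingConnectionEJBReceiver classes and only tracks which receivers and reconnect handlers remain registered after a failed handshake.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; names do not correspond to real jboss-ejb-client classes.
class ReconnectSketch {

    // Stand-in for the client context: tracks registered receivers and handlers.
    static class ToyContext {
        final List<String> receivers = new ArrayList<>();
        final List<Runnable> reconnectHandlers = new ArrayList<>();
    }

    // Models the associate step. On a failed handshake the receiver is removed
    // (standing in for context.close()); the handler is re-registered first
    // only if reRegisterOnFailure is set.
    static void associate(ToyContext ctx, String receiver, Runnable handler,
                          boolean successfulHandshake, boolean reRegisterOnFailure) {
        if (successfulHandshake) {
            return; // receiver stays registered, nothing else to do
        }
        if (reRegisterOnFailure && handler != null) {
            ctx.reconnectHandlers.add(handler);
        }
        ctx.receivers.remove(receiver);
    }

    // Builds the state right after the reconnect handler has done its job
    // (handler deregistered, receiver registered), then fails the handshake.
    static ToyContext afterFailedHandshake(boolean reRegisterOnFailure) {
        ToyContext ctx = new ToyContext();
        Runnable handler = () -> { /* would try to reconnect */ };
        ctx.receivers.add("receiver");
        associate(ctx, "receiver", handler, false, reRegisterOnFailure);
        return ctx;
    }

    public static void main(String[] args) {
        ToyContext current = afterFailedHandshake(false);
        ToyContext proposed = afterFailedHandshake(true);
        System.out.println("current:  receivers=" + current.receivers.size()
                + ", handlers=" + current.reconnectHandlers.size());
        System.out.println("proposed: receivers=" + proposed.receivers.size()
                + ", handlers=" + proposed.reconnectHandlers.size());
    }
}
```

In this model the current behaviour ends with zero receivers and zero handlers, while the proposed change ends with zero receivers and one handler, so a later reconnect attempt is still possible. Whether the real registerReconnectHandler call behaves this way when made just before context.close() is exactly what I am unsure about.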
Best Regards
Steffen