JBMESSAGING-674 - Propagating changes on Cluste...| JBoss.org Content Archive (Read Only)

15. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 17, 2007 1:03 PM (in response to clebert.suconic)

Is there a problem on the guess be the random server?

That would diminish the probability of a hopping to happen.

16. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 17, 2007 1:04 PM (in response to clebert.suconic)

Yes fine, but let's not overcomplicate things :)

17. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 17, 2007 2:26 PM (in response to clebert.suconic)

"timfox" wrote:
Clebert - I am somewhat baffled about the approach you are taking here.

Can you try and describe it clearly (and slowly) so I can understand it?

Thanks.

I just realized this message on the forum... So here is what I'm doing:

I - I am modifying CallbackManager to also receive a new message object, that will contain information about the update ClusteredConnectionFactories.

II - Every time ConnectionFactoryJNDIMapper is updated, it will send the update message on all active connections. I will be using the ConnectionManager on ServerPeer to navigate on existent connections.

III - On the client side, when the connection receives the update message, it will update its ClusteredConnectionFactory.

18. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

ovidiu.feodorov Jan 19, 2007 12:05 AM (in response to clebert.suconic)

Clebert,

I've just taken a look at your check in. Here are several questions:

1. ServerConnectionFactoryEndpoint sends a view update to ALL active connections returned by the ConnectionManager instance. Why does it send it to all? Didn't you mention that this has to be sent on a "filtered" list of connections? This JIRA issue (http://jira.jboss.com/jira/browse/JBMESSAGING-759) is about this.

2. Why the code that sends the view update (currently hosted by ServerConnectionFactoryEndpoint) is not located at ConnectionManager level? Seems more appropriate to be there...

3. Why do you keep a different "activeConnections" set at ConnectionManger level? Speed reasons? You got all the information you need in the "endpoints" map already. Keep in mind that cluster view update events are very rare events, considering the time scale at which messaging operates. Why would you need to optimize that, risking cache desynchronization?

4. What's the difference between a LoadBalancingPolicy and a LoadBalancingFactory? Why do you configure a ConnectionFactory service with the latter instead of the former?

5. What is this:

// FailoverNodeID is not on the map, that means the ConnectionFactory was updated
 // by another connection in another server.. So we will have to guess the failoverID
 // by numeric order. Case we guessed the new server wrongly we will have to rely on
 // redirect from failover

(ClusteringAspect.java line 211).
Why do we need to guess the failover ID?

6. ClusterViewUpdateTest.testUpdateConnectionFactory() fails.

19. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 19, 2007 11:43 AM (in response to clebert.suconic)

1. ServerConnectionFactoryEndpoint sends a view update to ALL active connections returned by the ConnectionManager instance. Why does it send it to all? Didn't you mention that this has to be sent on a "filtered" list of connections? This JIRA issue (http://jira.jboss.com/jira/browse/JBMESSAGING-759 ) is about this.

oops... fixing it... easy fix.

We need a testcase that uses two connectionFactory though.

2. Why the code that sends the view update (currently hosted by ServerConnectionFactoryEndpoint) is not located at ConnectionManager level? Seems more appropriate to be there...

I was wondering if we should lock the factory while updating the clients or not. Since there is that possibility we would require a ReadWriteLock on the CFEndpoint. Since there is a possibility of locking the CF that was an indication for me the right place for the update would be the CFEndpoint itself.

3. Why do you keep a different "activeConnections" set at ConnectionManger level? Speed reasons? You got all the information you need in the "endpoints" map already. Keep in mind that cluster view update events are very rare events, considering the time scale at which messaging operates. Why would you need to optimize that, risking cache desynchronization?

The active connection list is on a three levels HashMap list VMId->ClientId->Connection. It seemed simpler to have a simple HashSet, managed by the very same routines. I can change that if you like.

4. What's the difference between a LoadBalancingPolicy and a LoadBalancingFactory? Why do you configure a ConnectionFactory service with the latter instead of the former?

The LoadBalancingPolicy is not static any more.. you need a Policy per Connection created. This was working before because Serialization was creating new instances for you... and besides you had a comment about not making it static any more.

5. What is this:
Code:

// FailoverNodeID is not on the map, that means the ConnectionFactory was updated
// by another connection in another server.. So we will have to guess the failoverID
// by numeric order. Case we guessed the new server wrongly we will have to rely on
// redirect from failover

(ClusteringAspect.java line 211).
Why do we need to guess the failover ID?

Take a look on the 4th message at this thread (Posted: Mon Jan 15, 2007 17:35 PM) and some others after that.

Case the Factory is updated before failover happens, there is a chance the failoverID is not on the MAP any more.
On that case we would require a random node to try a hopping. Instead of a random node I'm guessing the node... and if it fails hopping will take care of it.

6. ClusterViewUpdateTest.testUpdateConnectionFactory() fails.

I executed the whole testsuite about two times between yesterday and today.. and didn't get any failures. Maybe you have a local problem?

20. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 19, 2007 11:46 AM (in response to clebert.suconic)

6. ClusterViewUpdateTest.testUpdateConnectionFactory() fails.

I executed the whole testsuite about two times between yesterday and today.. and didn't get any failures. Maybe you have a local problem?

I thought you meant the testupdate I wrote... I will take a look

21. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 19, 2007 3:36 PM (in response to clebert.suconic)

"Ovidiu" wrote:
6. ClusterViewUpdateTest.testUpdateConnectionFactory() fails.

I have fixed this... The problem was ConnectionFactoryCallbackHandler::getState was removed.

When the Callback instance is created, state still null as its set later in the AOP stack. I had it set on the constructor but it was not actually being set. My fault was not to document this properly I guess.

I have fixed this by placing getState back to the code.

22. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 20, 2007 7:52 AM (in response to clebert.suconic)

I've finally managed to look at the new code.

A few questions:

1. The update information is sent from server to client in a ConnectionFactoryUpdateMessage instance. This needs to be wrapped in a MessagingMarshallable, otherwise we can't version it.

2. ServerConnectionFactoryEndpoint::updateclusteredClients() is currently only called when a connection factory is deployed / undeployed. It is not called when a node joins or leaves the cluster. It needs to be called in this case too.
Also you need to add tests to check that the update occurs in this situation.

3. Why have you made MessageCallbackHandler and ConnetionFactoryCallbackHandler implement the same (new) interface (CallbackHandler) ? This seems redundant since they are never called polymorphically anyway.

4. I was unsure why init() was being called on the ClientClusteredConnectionFactoryDelegate every time a failover view change occurred...

Cheers

23. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 20, 2007 1:41 PM (in response to clebert.suconic)

Clebert-

I just tried to deploy the scoped sar in JBAS and it fails, complaining that it cannot find a property editor for ConnectionFactory::setLoadBalancingFactory.

Looking at the code, the method setLoadBalancingFactory() takes a LoadBalancingFactory, but the MBean method takes a String.

I have fixed this, so I can get on with my work. You may want to review my fix since it was very quick.

This tells me one thing, you added an MBean attribute without adding a corresponding test for it.

All new functionality needs to have unit tests, so please could you add a test.

Cheers.

24. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 21, 2007 2:38 PM (in response to clebert.suconic)

"timfox" wrote:

1. The update information is sent from server to client in a ConnectionFactoryUpdateMessage instance. This needs to be wrapped in a MessagingMarshallable, otherwise we can't version it.

Ok... I will change this.

"timfox" wrote:

2. ServerConnectionFactoryEndpoint::updateclusteredClients() is currently only called when a connection factory is deployed / undeployed. It is not called when a node joins or leaves the cluster. It needs to be called in this case too.
Also you need to add tests to check that the update occurs in this situation.

Are you sure about this? I have testcases about this... a testcase that gets the ConnectionFactory updated when a new server joins / leaves the cluster.

"timfox" wrote:

3. Why have you made MessageCallbackHandler and ConnetionFactoryCallbackHandler implement the same (new) interface (CallbackHandler) ? This seems redundant since they are never called polymorphically anyway.

I didn't want MessageCallbackHandler to also take care of ConnectionFactories... I thought the dependency would be wrong.

25. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 21, 2007 2:43 PM (in response to clebert.suconic)

"clebert.suconic@jboss.com" wrote:

Are you sure about this? I have testcases about this... a testcase that gets the ConnectionFactory updated when a new server joins / leaves the cluster.

Yep.

You need to test when a node crashes (not leaves cleanly), when it crashes it will not undeploy the connection factories first, unlike a clean exit. So if you do not update the failover maps on node join leave you have a problem.

I didn't want MessageCallbackHandler to also take care of ConnectionFactories... I thought the dependency would be wrong.

I don't understand what you're saying. Could you please elucidate?

26. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 21, 2007 3:00 PM (in response to clebert.suconic)

ConnectionFactoryJNDIMapper::onReplicationChange is calling updating ServerCFEndpoing::updateClusteredClients...

onReplicateChange is also setting the new LoadBalancing when the data is being changed.

ClusterViewUpdateTest::testUpdateConnectionFactory is testing updates on CFs... it kills the server and verifies if the connectinFactory was changed.

Test was passing for me on Friday.

Quote:

I didn't want MessageCallbackHandler to also take care of ConnectionFactories... I thought the dependency would be wrong.

I don't understand what you're saying. Could you please elucidate?

I understood you were discussing the idea of the new Callback interface.

I didn't want to treat ConnectionFactoryUpdates and MessageReceiving on the same classes. that's all.

27. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 21, 2007 3:54 PM (in response to clebert.suconic)

"clebert.suconic@jboss.com" wrote:
ConnectionFactoryJNDIMapper::onReplicationChange is calling updating ServerCFEndpoing::updateClusteredClients...

onReplicateChange is also setting the new LoadBalancing when the data is being changed.

ClusterViewUpdateTest::testUpdateConnectionFactory is testing updates on CFs... it kills the server and verifies if the connectinFactory was changed.

Test was passing for me on Friday.

I don't see how this could be the case.

Think about it, if a server crashes then the connection factories will not get undeployed.

If you are only updating the view when connection factory is deployed / undeployed then how can the client update be happening?

What am I missing?

I didn't want MessageCallbackHandler to also take care of ConnectionFactories... I thought the dependency would be wrong.

This is fine, but has nothing to do with the point I am making.

I am just saying that creating a new interface is redundant since you are not calling the methods polymorphically.

28. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

timfox Jan 21, 2007 4:02 PM (in response to clebert.suconic)

If the client view is being updated, as you say, when a node crashes, then the only other possibility AFAICT is that the connection factory deploy/undeploy even is being triggered when a node crashes.

Can you explain how this happens?

29. Re: JBMESSAGING-674 - Propagating changes on ClusteredConnec

clebert.suconic Jan 22, 2007 10:07 AM (in response to clebert.suconic)

"timfox" wrote:
If the client view is being updated, as you say, when a node crashes, then the only other possibility AFAICT is that the connection factory deploy/undeploy even is being triggered when a node crashes.

Can you explain how this happens?

The update is happening on ConnectionFactoryJNDIMapper::onReplicationChange..

Look at the method's implementation, you will see a call to updateClusteredClients:

line 359: endpoint.updateClusteredClients(delArr, failoverMap);

And the LoadBalancingPolicy is being updated at

line 348: del.setDelegates(delArr);

This is the very same routine we were using to rebing the ConnectionFactory when the view changed, event fired by the ClusteredPostOffice's replicator. We don't undeploy/deploy when a node crashes... we rebing the ConnectionFActory at the place I was telling you.

I am just saying that creating a new interface is redundant since you are not calling the methods polymorphically.

Ah... I see what you mean.... I wanted to do more refactoring on it and have it totally poliymorphically as you were saying but I would need to move some stuff for that. I will take a look if would be possible to refactor it a little bit more.