Ok, before jumping in with a solution, could you explain a bit more what the problem is (the diagnosis)?
Your post really just explains your solution, not the underlying issue.
II - I can't just let the backup node call largeMessageSender.send, or the backup could run ahead of the live server (running out of credits first, for instance).
This is the main issue.
When I send a message from server to client, I verify that it is a LargeMessage and start a loop that sends the chunks one by one.
If I just call largeMessageSender.send on the backup, the backup could flood flowControl before the live node does. A lot of unpredictable events could follow from that.
- Master will send a 1G message.
- It will replicate the send to the backup.
- Backup will start sending the 1G message. It will immediately flood flow control. (There is no actual sending on the backup.)
- Master will receive credits and replicate them to the backup.
- Master will send another 1G message.
(At that point the backup may still be sending the previous message. I have seen situations where flowControl is still full on the backup, and that will reject replications from the master node.)
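To make the race concrete, here is a rough toy model, not the real code. CreditRaceSketch, Sender, and simulate are made-up names, and I am assuming one credit per chunk and a master that gets only a few chunks out per tick because it does real I/O.

```java
public class CreditRaceSketch {

    /** Hypothetical sender that spends one credit per chunk. */
    static final class Sender {
        int credits;
        int chunksSent;

        Sender(int credits) {
            this.credits = credits;
        }

        /** Sends up to maxChunks chunks and stops as soon as credits run out. */
        int send(int maxChunks) {
            int n = Math.min(maxChunks, credits);
            credits -= n;
            chunksSent += n;
            return n;
        }
    }

    /**
     * Runs one tick with the same credit window on both nodes.
     * Returns {masterCredits, backupCredits} after the tick.
     */
    static int[] simulate(int window, int totalChunks, int masterChunksPerTick) {
        Sender master = new Sender(window);
        Sender backup = new Sender(window);

        // Assumption: the master is paced by the network.
        master.send(masterChunksPerTick);

        // The backup does no actual sending, so nothing slows its loop down.
        backup.send(totalChunks);

        return new int[] { master.credits, backup.credits };
    }

    public static void main(String[] args) {
        int[] r = simulate(10, 100, 1);
        System.out.println("master credits=" + r[0] + ", backup credits=" + r[1]);
    }
}
```

With a window of 10 credits, the master still has 9 left after its first chunk while the backup is already at 0. That is the state in which the backup starts rejecting what the master replicates.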
We need a sync between master and backup to solve that. Either I implement that sync chunk by chunk, and failover would resume from that chunk, or I sync just a start and an end, and failover would reset the pointers and the file and resume from the beginning of the file again.
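The two options differ only in what the backup knows at failover time. A minimal sketch of that difference, with hypothetical names (FailoverResumeSketch, SyncMode, BackupState) that do not exist in the code base, and assuming the master reports progress in bytes:

```java
public class FailoverResumeSketch {

    enum SyncMode { PER_CHUNK, START_END }

    /** Hypothetical backup-side record of one in-flight large message. */
    static final class BackupState {
        final SyncMode mode;
        boolean inFlight;
        long confirmedOffset; // bytes the master reported as sent

        BackupState(SyncMode mode) {
            this.mode = mode;
        }

        /** Master replicated the start of a large message. */
        void onStart() {
            inFlight = true;
            confirmedOffset = 0;
        }

        /** Master replicated one sent chunk; only tracked in PER_CHUNK mode. */
        void onChunkSent(long chunkBytes) {
            if (mode == SyncMode.PER_CHUNK) {
                confirmedOffset += chunkBytes;
            }
        }

        /** Master replicated the end of the large message. */
        void onEnd() {
            inFlight = false;
        }

        /** Offset to resume from after failover, or -1 if nothing is in flight. */
        long resumeOffset() {
            if (!inFlight) {
                return -1;
            }
            return mode == SyncMode.PER_CHUNK ? confirmedOffset : 0;
        }
    }

    public static void main(String[] args) {
        BackupState perChunk = new BackupState(SyncMode.PER_CHUNK);
        BackupState startEnd = new BackupState(SyncMode.START_END);
        for (BackupState s : new BackupState[] { perChunk, startEnd }) {
            s.onStart();
            s.onChunkSent(1024);
            s.onChunkSent(1024);
        }
        System.out.println("per-chunk resumes at " + perChunk.resumeOffset());
        System.out.println("start/end resumes at " + startEnd.resumeOffset());
    }
}
```

Per-chunk sync costs one replication round per chunk but resumes where the master stopped. Start/end sync is cheaper on the wire but resends the whole file after a failover.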