Serious flaw in camel-fix
rodehav Apr 13, 2009 6:05 AM
Hi!
I've been experimenting a bit with Camel and QuickFix/J via the camel-fix component. I've written about this in other posts. I've found a couple of bugs and made some feature requests that I believe will be taken care of in a future version of the camel-fix component. Meanwhile I'm trying to get the existing camel-fix component to work, since I'm getting close to my production date. In this process I've done some reliability testing to make sure that no FIX messages are lost when exceptions are thrown, when my process is killed, and so forth. Unfortunately I've discovered a serious flaw in the camel-fix component. Maybe this is already being addressed in the work on the future version of camel-fix, but I need advice on how to make the existing version of camel-fix work until then.
The way QuickFix/J works, it is imperative that all exceptions are propagated to the "fromApp()" method in the QuickFix/J application (the CamelApplication class in the camel-fix component). If the fromApp() method returns with no exceptions, then QuickFix/J assumes that your application has successfully dealt with the message. Thus, you must guarantee that the fromApp() method does not return until you're "done" in some sense. Whether "done" means processing the message all the way or just writing it to a persistent queue for further processing is of course up to the programmer. This has the following two consequences for the "onMessage()" method in the FixEndpoint class in the camel-fix component:
1. All exceptions thrown must be propagated back to the fromApp() method. Currently, the onMessage() method catches all exceptions and doesn't rethrow them. This prevents QuickFix/J from functioning properly.
2. The processing (initiated in onMessage() via "getLoadBalancer().process(exchange)") must be executed synchronously. This is largely a consequence of the first point.
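To make the first point concrete, here is a minimal, self-contained sketch of the difference between swallowing and propagating exceptions. It does not use the real camel-fix or QuickFix/J classes: `Processor`, `UnsupportedMessageTypeStandIn`, `onMessageSwallowing`, `onMessagePropagating` and `fromAppReturnsNormally` are all hypothetical names I made up for illustration, and the real onMessage() signature may well differ.

```java
public class SyncDispatchSketch {

    /** Hypothetical stand-in for a QuickFix/J checked exception such as UnsupportedMessageType. */
    static class UnsupportedMessageTypeStandIn extends Exception {}

    /** Hypothetical stand-in for the downstream processing started by onMessage(). */
    interface Processor {
        void process(String message) throws Exception;
    }

    /** Mirrors the behaviour described above: every exception is caught and dropped. */
    static void onMessageSwallowing(Processor p, String msg) {
        try {
            p.process(msg);
        } catch (Exception e) {
            // The failure stops here, so the caller believes the message was handled.
        }
    }

    /** Lets the checked stand-in and all runtime exceptions travel back to the caller. */
    static void onMessagePropagating(Processor p, String msg) throws UnsupportedMessageTypeStandIn {
        try {
            p.process(msg);
        } catch (UnsupportedMessageTypeStandIn e) {
            throw e; // checked exception the FIX engine is expected to act on
        } catch (RuntimeException e) {
            throw e; // treated as a temporary failure by the caller
        } catch (Exception e) {
            throw new RuntimeException(e); // anything else is surfaced as a runtime failure
        }
    }

    /** Plays the role of fromApp(): true means "returned normally", i.e. message considered done. */
    static boolean fromAppReturnsNormally(boolean propagate, Processor p, String msg) {
        try {
            if (propagate) {
                onMessagePropagating(p, msg);
            } else {
                onMessageSwallowing(p, msg);
            }
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Processor failing = m -> { throw new IllegalStateException("downstream failure"); };
        System.out.println("swallowing, caller thinks done:  " + fromAppReturnsNormally(false, failing, "msg-1"));
        System.out.println("propagating, caller thinks done: " + fromAppReturnsNormally(true, failing, "msg-1"));
    }
}
```

With the swallowing variant the caller always sees a normal return, which is exactly the situation in which QuickFix/J would assume the message was dealt with.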
Examples of things that will fail with the current implementation of camel-fix:
- If you throw a checked exception (FieldNotFound, IncorrectDataFormat, IncorrectTagValue or UnsupportedMessageType), the built-in QuickFix/J handling won't be triggered if the fromApp() method doesn't receive this exception. Most of these exceptions should cause QuickFix/J to send a business message reject. Furthermore, if UnsupportedMessageType is thrown, then QuickFix/J will remember that this message type is not supported and will reject it for the remainder of the session. The point is that QuickFix/J relies on receiving these exceptions in order to function properly.
- If a runtime exception is thrown, then QuickFix/J assumes that something of a temporary nature went wrong and will ask the counterparty to resend the message. This is done on the session level and thus automatically by QuickFix/J. If the fromApp() method doesn't receive the runtime exception, then the corresponding message will never be resent and is thus potentially lost.
- If the process dies while processing a message (while in the fromApp() method), then QuickFix/J will remember that this message was not properly processed. On the next login, QuickFix/J will therefore ask the counterparty to resend all messages from that message onward. Thus, no messages will be lost. In contrast, if I perform a System.exit() in the processing initiated by the onMessage() method (to simulate a crashed process), QuickFix/J will never request a resend, because the onMessage() method has already returned (since the processing is not done synchronously) and QuickFix/J believes that the message was processed successfully even though it wasn't.
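The synchronous requirement behind these failures can be illustrated with plain java.util.concurrent, independent of Camel. This is only a sketch under my own assumptions: `handOffAsync`, `handOffSync` and `callerSeesFailure` are hypothetical names, and the thread pool simply stands in for whatever asynchronous hand-off happens inside the component.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncVersusAsyncSketch {

    /** Fire-and-forget: returns as soon as the work is queued, before it has run. */
    static void handOffAsync(ExecutorService pool, Runnable work) {
        pool.submit(work);
    }

    /** Blocking hand-off: returns only after the work has finished, and rethrows its failure. */
    static void handOffSync(ExecutorService pool, Runnable work) {
        Future<?> result = pool.submit(work);
        try {
            result.get(); // wait for completion
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            throw new RuntimeException(cause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** Plays the role of the caller (fromApp()): does it ever learn that processing failed? */
    static boolean callerSeesFailure(boolean synchronous) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable failing = () -> { throw new IllegalStateException("processing failed"); };
        try {
            if (synchronous) {
                handOffSync(pool, failing);
            } else {
                handOffAsync(pool, failing);
            }
            return false; // returned normally: the caller assumes the message is done
        } catch (RuntimeException e) {
            return true;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("async, caller sees failure: " + callerSeesFailure(false));
        System.out.println("sync, caller sees failure:  " + callerSeesFailure(true));
    }
}
```

In the asynchronous case the caller has already returned by the time the work fails (or the process dies), so nothing can tell QuickFix/J that the message still needs to be resent.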
I would like two things:
1. When you redesign the camel-fix component, you have to make absolutely certain that the processing initiated by the camel-fix component is done synchronously and that all exceptions are passed back to the fromApp() method.
2. Since I need to go to production before you are done with the new camel-fix component, I have patched the camel-fix component to get it to work in my scenario. However, I'm not a Camel expert and I don't quite understand how to modify the FixEndpoint class to make sure that the onMessage() method will initiate a synchronous request. Can you advise me on how to (temporarily) do this?
Also, do you have any indication of when I can sink my teeth into the new camel-fix component?
/Bengt