I recently began setting a TTL (time to live) on outgoing messages (I was using 20000 milliseconds) and mostly it worked fine. However, I noticed that if there was clock skew between the node running the HornetQ publisher and the node running the subscriber (specifically, if the subscriber's clock was more than 20 seconds ahead of the publisher's), messages didn't get delivered. If I synced the clocks, the messages came right through. This only happened when the subscriber's clock was ahead of the publisher/JMS clock. (Not sure if it's important, but I was sending from Linux to Windows.)
My hypothesis - I haven't looked at the code - is that the publisher is setting the "stale time" for the message as an absolute timestamp (even though the TTL value itself is relative). This seems wrong to me. Is it intentional?
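A minimal sketch of the behavior I suspect (plain Java, no JMS; the method and class names here are mine, not HornetQ's): the publisher stamps an absolute expiration of now + TTL using its own clock, and the consumer then compares that timestamp against *its* clock, so any skew larger than the TTL makes every message look dead on arrival:

```java
public class SkewDemo {
    // Publisher stamps an absolute expiration using its own clock.
    static long expiration(long publisherNowMs, long ttlMs) {
        return publisherNowMs + ttlMs;
    }

    // Consumer checks expiry against its own (possibly skewed) clock.
    static boolean isExpired(long expirationMs, long consumerNowMs) {
        return consumerNowMs > expirationMs;
    }

    public static void main(String[] args) {
        long ttl = 20_000;          // 20-second TTL, as in my setup
        long publisherNow = 0;      // publisher clock (absolute epoch irrelevant)
        long exp = expiration(publisherNow, ttl);

        // Clocks in sync: a message retrieved 2 s after publish is still live.
        System.out.println(isExpired(exp, publisherNow + 2_000));            // false

        // Subscriber clock 2 minutes ahead: the same message looks expired.
        long skew = 120_000;
        System.out.println(isExpired(exp, publisherNow + 2_000 + skew));     // true
    }
}
```

The table below walks through the same arithmetic with concrete wall-clock times.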
||Step||HQ Node Time||Msg Time To Live||Msg Stale Time||Sub Node Time||
|Message is published to HQ node|10:46:00|20 seconds (20000 ms)|10:46:20|10:48:00|
|Message available on HQ server (padded for clarity)|10:46:01|20 sec|10:46:20|10:48:01|
|Message retrieved by subscriber node|10:46:02|20 sec|10:46:20|10:48:02|
In this example, the message is never delivered to the client on the subscriber node, because that client sees the time as 10:48:02 at the moment of retrieval, and the "Msg Stale Time" of 10:46:20 has already passed.
It seems to me that getting this absolutely correct, from the moment the publish call is made to the exact moment each subscriber's receive call is made, would require a complex scheme of interrogating all the clocks in the system and accounting for the deltas. However, if HQ were to interpret TimeToLive as how long the message is allowed to reside on the HQ server (as opposed to in the pre-fetch and post-send buffers local to the sender and receiver nodes), that would be relatively easy to do reliably, and it would still accomplish the desired effect: removing old undelivered messages that were clogging up the server (instead of PAGING or BLOCKING them, or using up tons of memory). There would be some complexity for clustered deployments, but it seems there may already be a mechanism for synchronizing clocks across multiple servers?
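The alternative I'm suggesting can be sketched the same way (again plain Java with hypothetical names, not actual HornetQ code): the broker records the arrival time on its own clock and checks expiry against that same clock, so publisher/subscriber skew never enters into it:

```java
public class BrokerSideTtl {
    // Broker records when the message arrived, on the broker's own clock,
    // and derives a discard deadline from the relative TTL.
    static long discardDeadline(long brokerArrivalMs, long ttlMs) {
        return brokerArrivalMs + ttlMs;
    }

    // Expiry is checked by the broker against the same clock that stamped
    // the arrival, so no cross-node clock comparison is ever made.
    static boolean shouldDiscard(long deadlineMs, long brokerNowMs) {
        return brokerNowMs > deadlineMs;
    }

    public static void main(String[] args) {
        long ttl = 20_000;                        // 20-second TTL
        long arrived = 1_000;                     // broker clock at enqueue
        long deadline = discardDeadline(arrived, ttl);

        // 5 s after arrival: still deliverable, regardless of client clocks.
        System.out.println(shouldDiscard(deadline, arrived + 5_000));   // false

        // 25 s after arrival: the broker discards it, freeing server memory.
        System.out.println(shouldDiscard(deadline, arrived + 25_000));  // true
    }
}
```

Under this interpretation, a message sitting undelivered on the server for longer than its TTL is dropped by the server itself, which is exactly the "stop clogging up the server" behavior I was after.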