Are you using synchronous or asynchronous messaging with your Topics?
It's my understanding that persistent messages only imply that the message is written to an underlying datastore before the call to publish returns. This means that the Topic accepts a message on behalf of a client and that this message is persisted to disk.
I don't think this by itself guarantees that a message will be delivered to every registered subscriber. For instance, if you have multiple subscribers listening asynchronously on a Topic, the Topic will try to push its messages to the listening clients. If there is a network exception here, I would think that the message would never be delivered. In fact, the client may be deregistered from the server. The client is responsible for detecting this situation and sending an event to any registered exception listeners.
Of course this would be different if you were using durable subscriptions, or Queues. It may also be handled a bit differently for synchronous Topic subscriptions. In this case the client would be responsible for reconnecting to the Topic, but the server-side Topic element may not have noticed the client's disconnection and so would still have its messages queued.
Now here is another question. What sense does it make to send persistent messages to a Topic? If a Topic were to crash, any non-durable subscribers would be hosed. I believe this is because non-durable subscribers are represented internally by non-persistent subscriber IDs. Therefore, if the Topic were to crash, it would generate new IDs for the subscribers when it came back up. It would then look at the messages it had previously persisted, realize that those messages could not be delivered, and either toss them or put them in a dead letter topic (if such a thing exists).
So short story, I think jboss is working correctly and that if you have a flaky network you need to make your subscribers durable. Or accept that some message loss will happen.
I am using synchronous consumption of messages. For more background, here is my situation. My application is a distributed build system. The application client makes a build request to a central dispatcher, which then sends that request to one build machine from a build farm. The build machine publishes JMS messages about the progress of the build to a topic, and the client application that originally sent that request uses a message filter to receive just the messages with the correct BuildRequestID.
The problem I am hitting is that when there is some network instability (but not serious instability - other apps like VNC work fine), I get a socket exception and the message is not delivered. This seems to be in violation of the 'once-and-only-once' semantics required by PERSISTENT messages (see page 78 of the JMS spec).
The reason I don't want to use a durable subscriber is that I DON'T want messages to be saved if the client subscriber dies. If a user kills his client, for example, then that's fine, and those messages never need to be delivered to him. The problem is that the client sometimes appears to hang because it is waiting for a BUILD_FINISHED JMS message that is never delivered.
As a side note, I can't seem to find any evidence that the messages are being persisted at all - the jboss\db\jbossmq directory does not have any entries for the topic I created.
I'd really appreciate any help with this, because I'd like to not have to switch to another JMS provider at this stage.
Well, Topic messages are not persisted. I would say this is based on a particular interpretation of the spec. The spec simply says that persistent messages should be persisted, but also that topic messages are not required to be sent to clients that are not available at the time of publishing (i.e. there is no actual need to persist them), and that clients that require guaranteed delivery from topics must use durable subscribers.
I believe that jboss is following page 78 of the spec. The spec says on page 78:
"When all messages for a topic must be received, a durable subscriber should be used. JMS insures that messages published while a durable subscriber is inactive are retained by JMS and delivered when the subscriber subsequently becomes active. Nondurable subscribers should be used only when missed messages are tolerable."
It also shows a table where persistent messages should be delivered 'once-and-only-once'; however, right below that it says '(missed if inactive)'. This seems pretty straightforward to me. If there is a network problem (in this case a socket exception), the subscriber will be thought to be inactive and you won't get the message.
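To make the '(missed if inactive)' rule concrete, here is a toy in-memory model in plain Java. It is not JBossMQ code and does not use the JMS API. The class and method names (ToyTopic, subscribe, setActive, received) are made up for illustration, and it uses current Java syntax rather than anything JDK 1.3 would compile. It only shows the difference the spec describes: an inactive non-durable subscriber misses the message, while a durable one gets it once it becomes active again.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of topic delivery; NOT the JMS API, all names are hypothetical. */
public class ToyTopic {
    /** One subscription: its kind, whether the client is reachable, and what it received. */
    static final class Subscription {
        final boolean durable;
        boolean active = true;
        final List<String> delivered = new ArrayList<>();
        final List<String> retained = new ArrayList<>(); // only ever filled when durable
        Subscription(boolean durable) { this.durable = durable; }
    }

    private final Map<String, Subscription> subs = new LinkedHashMap<>();

    public void subscribe(String name, boolean durable) {
        subs.put(name, new Subscription(durable));
    }

    /** Marks a subscriber reachable or not; reactivating a durable one flushes its backlog. */
    public void setActive(String name, boolean active) {
        Subscription s = subs.get(name);
        s.active = active;
        if (active) {
            s.delivered.addAll(s.retained);
            s.retained.clear();
        }
    }

    public void publish(String message) {
        for (Subscription s : subs.values()) {
            if (s.active) {
                s.delivered.add(message);
            } else if (s.durable) {
                s.retained.add(message); // kept until the subscriber returns
            }
            // inactive and non-durable: the message is simply missed
        }
    }

    public List<String> received(String name) {
        return subs.get(name).delivered;
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.subscribe("nondurable", false);
        topic.subscribe("durable", true);
        topic.publish("BUILD_STARTED");
        // Simulate the socket exception: both clients look inactive to the server.
        topic.setActive("nondurable", false);
        topic.setActive("durable", false);
        topic.publish("BUILD_FINISHED");
        topic.setActive("nondurable", true);
        topic.setActive("durable", true);
        System.out.println("nondurable: " + topic.received("nondurable")); // [BUILD_STARTED]
        System.out.println("durable:    " + topic.received("durable"));    // [BUILD_STARTED, BUILD_FINISHED]
    }
}
```

Whether a real provider treats a brief socket failure as "inactive" is up to the provider; the model just assumes it does, which matches the behaviour described in this thread.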
There are some ways around this problem.
1. Fix the socket exception problems. What type of socket exceptions are you seeing? If you're getting connection refused or something, then this would seem to be a problem with the host and not the network. If the connection is timing out, that would be different, but you've said that other apps still work in this case.
2. Change your design to be more fault resistant. Non durable topics are not fault resistant. If your subscriber isn't available when a message for it arrives, the Topic won't resend it. If your subscriber arrives after the message, your Topic won't send it either.
Instead of using a Topic I'd suggest using a Queue. You can send a message to a Queue the same as you would a Topic. You can have the machines in your build cluster all receive messages from that Queue. If you like you can set up selectors the same way, or you can have each receiver busy wait on the Queue. A Queue guarantees that a message will be consumed by only one receiver, so this gives a sort of implicit round robin. You can then send your response message back to the producer. I would suggest sending the message back to a TemporaryQueue located on the producer. If you send it back to the main Queue, you'll have to use a few selectors to make sure your cluster doesn't reconsume the response as a build request.
Hey, thanks a ton for your help and advice. I guess I had a different interpretation of 'inactive', but that's fine.
Here is an example of an exception I'm getting:
(2002/03/04 12:01:30) ConnectionReceiverOILClient is connecting to: 220.127.116.11:3835
(2002/03/04 12:01:30) java.net.SocketException: Option unsupported by protocol: connect
(2002/03/04 12:01:30) Could not send messages to a receiver.
java.rmi.RemoteException: Cannot connect to the ConnectionReceiver/Server
at java.lang.Thread.run(Unknown Source)
Although I don't always get this error - sometimes I just get intermittent ConnectionExceptions.
A clarification on the design of my system. Sending out the build requests doesn't use JMS - this happens using synchronous RMI. JMS is used to notify from the build machine back to the client. Thus, the steps are:
1. Client app sets up JMS to listen to a known topic. It has a message selector "BUILD_ID = xxx", where xxx is the id of the request it is about to make.
2. Client app makes a build request to dispatcher using RMI call dispatcher.build(request);
3. Dispatcher puts requests into build queue (not a JMS queue), and build call returns to client app.
4. Dispatcher dispatches requests to build machine.
5. Build machine processes build. As it processes the build, it publishes info about the build progress to the topic with the BUILD_ID = xxx property set on the messages. From a user's perspective, it appears as if they are running a build tool like Ant or Make from their own machine, but this build tool output is actually sent back from the build machine as the JMS messages.
6. Client app receives messages, waiting until it receives a message with a body of BUILD_FINISHED.
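The hang in step 6 comes from waiting without any bound. A minimal sketch of a bounded wait follows, under these assumptions: a java.util.concurrent.BlockingQueue stands in for the JMS subscriber so the example runs without a provider, and the names BuildWaiter, awaitBuild and Result are hypothetical. In real JMS the analogous call is receive(long timeout) on the consumer, which returns null if nothing arrives in time. What to do after a timeout (for example, asking the dispatcher whether the build already finished) is left to the caller and is not something the system described above does today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch of step 6 with a bounded wait; the BlockingQueue stands in for the JMS subscriber. */
public class BuildWaiter {
    public static final String BUILD_FINISHED = "BUILD_FINISHED";

    /** Outcome of waiting: the progress lines seen, and whether BUILD_FINISHED arrived. */
    public static final class Result {
        public final List<String> lines;
        public final boolean finished;
        Result(List<String> lines, boolean finished) {
            this.lines = lines;
            this.finished = finished;
        }
    }

    /**
     * Collects messages until BUILD_FINISHED arrives, or until no message shows up
     * within idleTimeoutMillis, so a lost final message cannot hang the client forever.
     */
    public static Result awaitBuild(BlockingQueue<String> inbox, long idleTimeoutMillis) {
        List<String> lines = new ArrayList<>();
        while (true) {
            String msg;
            try {
                msg = inbox.poll(idleTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return new Result(lines, false);
            }
            if (msg == null) {
                return new Result(lines, false); // idle timeout: let the caller decide what next
            }
            if (BUILD_FINISHED.equals(msg)) {
                return new Result(lines, true);
            }
            lines.add(msg);
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        inbox.add("compiling...");
        inbox.add("linking...");
        // BUILD_FINISHED is deliberately never enqueued, simulating the lost message.
        Result r = awaitBuild(inbox, 200);
        System.out.println("lines=" + r.lines + " finished=" + r.finished);
    }
}
```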
OK, so if I decide to go the durable subscription route, I'm a little unclear about how the user information is used to set up the durable subscription. Does each client need a different user to create a separate subscription?
Thanks a ton,
I'm not sure what's causing the unsupported option error. This seems like a jboss issue that a developer would have to comment on. One thing to make sure of is that you are running the same version of jboss on both the client and server. To help the developers it would be nice to know what version of jbossmq you're using and what JVM. If you're using a CVS pull, make sure both client and server are running at the same build level.
As for your architecture...
I don't quite understand why you're using a Topic to begin with. Topics are generally used to facilitate one-to-many communication. It sounds like you only have a build machine talking to a single client, in which case a Queue would give you the same effect along with some of the additional features you're looking for.
For instance, you could use a Queue instead of a Topic. Now you have this situation...
You dispatch your build requests via RMI to your build machine and register your selector on the Queue. The build machine starts building, and sends incremental status updates to your Queue.
The Queue will store these messages until they're either asynchronously sent to the client (this would be my choice), or synchronously retrieved by the client. Either way the messages will be stored on the Queue until they are consumed. If a client disconnects due to a network or other error, then the messages will still be waiting for it when it comes back online. If the Queue machine goes down you'll lose the messages unless they're persistent, but you don't seem too worried about this case.
In this setup, if the build machine gets an error sending to the Queue, it can try again to resend the message. If the client has trouble connecting to the Queue it can try again to retrieve the message. In the end you should be able to get the build finished message every time.
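The "try again" part can be as simple as a bounded retry loop around the send or receive call. Here is a rough sketch in plain Java. The Retry class and withRetry method are hypothetical names, and the flaky action in main is simulated rather than a real JMS send. In a real client the action would wrap the provider call and would probably need to reconnect before retrying.

```java
import java.util.concurrent.Callable;

/** Sketch of "try again" around a send or receive; names are hypothetical. */
public class Retry {
    /**
     * Runs the action up to maxAttempts times, sleeping delayMillis between failures.
     * Returns the action's result, or throws IllegalStateException wrapping the last
     * failure once the attempts run out.
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // e.g. a failure caused by a dropped socket
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated flaky send: fails twice, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated socket failure");
            }
            return "sent";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // sent after 3 attempts
    }
}
```

One thing to keep in mind: a retried send can produce a duplicate if the first attempt actually reached the Queue and only the acknowledgement was lost. For a BUILD_FINISHED message that is harmless, but the client should tolerate seeing a progress line twice.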
Another comment I have: it would probably be easier in the long run if you went with only one middleware system, either JMS or RMI. With a pure JMS setup you can accomplish the same goals without having to maintain the RMI servers, skeletons, and stubs.
To answer your last question, I'm not an expert in configuring durable subscriptions. Their set up is also provider specific. I'm sure there are jboss resources which would describe how to do this, if you're interested.
Hey, thanks a ton for your advice. As it turned out, there was a separate issue causing this bug:
java.net.SocketException: Option unsupported by protocol: connect
Our server was running JDK 1.3, but the user was running his client using JDK 1.4. After he switched to 1.3, everything worked fine.
That doesn't totally solve all the network issues, but should allow us to work around the much more infrequent stability problems.
Strange that there would be a problem between JVM versions. I wonder what assumptions are being made that would cause the incompatibility. Anyway, I'm glad things are working for you now.