When A comes back up, it will also see B's messages.
Things will break.
Hmm... So should I rather use the file-based PM (persistence manager), so that A and B cannot see each other's messages?
Or how should I implement this? Basically I have two nodes in a cluster. Each node should be identically configured; however, only the master node should generate and process messages in the queue. Only if this node falls over should the other node take over this process. It is for importing flat files into the database.
We have no support for failover, other than cold failover.
I've explained it a few times in the past so you'll forgive me
if I don't repeat myself.
Thanks for your response. I found the references you made to your earlier explanations.
If I only persist the P2P messages in the queue to a directory on the local HDD, will that work? I don't mind in my implementation if messages are out of order, as long as when I bring the failed node back online, it will resume work on the queue.
In other words, both nodes A and B are configured identically with a queue Q. Only A is normally master, and a HASingleton bean on it kickstarts a process to import data from files into the queue. That will trigger the MDBs to import the data into the DB. If A fails, the data in the queue is stored in the persistence location (this time a directory), and B is marked as master. Thus the HASingleton on it gets started and it starts to process files in the spool directory. It dumps these files into the queue on B and the MDBs do their work as usual.
If A comes back online, then B will stop the processes importing data, and A's MDBs will continue importing the messages in the queue on A into the DB.
Would this work? Hope you have not answered this as well earlier (I could not find any references).
What would happen if A pushed the messages onto
the queue and then failed?
You then repeat the process on B.
When A comes back up it will reprocess the messages.
Or have I misunderstood your proposal?
Well, the idea was that I maintain some state in the DB recording which step of processing the data is in. So Node A will write something along the lines of "dumped file AAA into queue" once it has done so. After the MDB has picked up the message and written it to the DB, it will update the status to "file AAA processed".
If A fails after the message is in the queue, but before the MDB could process it, B would query the DB first and see file AAA is in A's queue. So it will skip that file and continue with the next one.
When A comes back online it can reprocess the queue.
I know there is a *slight* chance that A writes the message into the queue but crashes before it can update the status in the DB. In that case B would reprocess the same file, and when A comes back online it would reprocess the queue (as you proposed). This is fine, however, because (a) it will happen very rarely, and (b) the system should in any case be idempotent because I might get duplicate files.
Do you think this will work?
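
For illustration, the status tracking described above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the class and method names (ImportStatusTable, shouldEnqueue, markQueued, markProcessed) are hypothetical, and a HashMap stands in for the shared DB table that both nodes would query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical status table. A HashMap stands in for the shared DB table.
class ImportStatusTable {
    enum Status { QUEUED, PROCESSED }

    private final Map<String, Status> rows = new HashMap<>();

    // Timer side: true only if no node has queued or processed this file yet.
    synchronized boolean shouldEnqueue(String fileName) {
        return !rows.containsKey(fileName);
    }

    // Timer side: record "dumped file into queue" once the send has succeeded.
    // An existing row (including PROCESSED) is left untouched.
    synchronized void markQueued(String fileName) {
        rows.putIfAbsent(fileName, Status.QUEUED);
    }

    // MDB side: returns false on a duplicate delivery so the caller can skip
    // the import, which keeps reprocessing after a crash harmless.
    synchronized boolean markProcessed(String fileName) {
        if (rows.get(fileName) == Status.PROCESSED) {
            return false;
        }
        rows.put(fileName, Status.PROCESSED);
        return true;
    }
}
```

In the rare crash window (message queued, status not yet written), B would enqueue the file a second time; the false return from markProcessed is what lets the later duplicate be dropped.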
Let's start from the beginning.
You have a file that you want processing. You do this
by sending a message. The MDB processes the message.
You want some kind of failover so that the message
is eventually processed by someone.
Where is the file coming from? How do you even
make sure the message is sent to process the file
in the first place? How do you make sure the file
will be available if the machine crashes?
I'd suggest taking jbossmq out of the picture if you
want HA (high availability).
We have no tested configuration where it is HA.
Using the HASingleton to trawl the database for
records with processed=false on a timer would give
you better semantics. Although this is not portable
to other appservers.
You still need to consider how the file and db
record are written so that files aren't missed.
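
A rough sketch of the "trawl for processed=false on a timer" idea, assuming a plain java.util.Timer and an in-memory list in place of the real table. The names ImportRecord and UnprocessedPoller are hypothetical, and the HASingleton wiring is left out; only the master node would call start().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical row: one record per file, with a processed flag.
class ImportRecord {
    final String fileName;
    boolean processed;

    ImportRecord(String fileName) {
        this.fileName = fileName;
    }
}

class UnprocessedPoller {
    private final List<ImportRecord> table; // stands in for the DB table

    UnprocessedPoller(List<ImportRecord> table) {
        this.table = table;
    }

    // One sweep: handle every record with processed=false, then flag it.
    synchronized List<String> pollOnce() {
        List<String> handled = new ArrayList<>();
        for (ImportRecord record : table) {
            if (!record.processed) {
                // The real import of the file would happen here.
                record.processed = true;
                handled.add(record.fileName);
            }
        }
        return handled;
    }

    // Repeat the sweep on a daemon timer thread.
    Timer start(long periodMillis) {
        Timer timer = new Timer(true);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                pollOnce();
            }
        }, 0L, periodMillis);
        return timer;
    }
}
```

Because the flag lives with the record, a node that takes over after a crash simply runs the same sweep and picks up whatever is still unprocessed.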
PS. That's enough free consulting.
You've almost got the idea I have. The only difference is that I am not sending messages saying that a file needs to be processed; rather, I do the following:
In a spool directory on the app server an external process places an ASCII CSV flat file. In my timer task (kickstarted by the HASingleton bean) I pick this file up, parse it and convert it into XML. I then drop this converted XML file into the queue. The MDB will pick up the XML file in the queue, convert it into maybe a Value Object and write it to the DB.
Once the XML file is in the queue, the timer task moves the flat file into an archive directory on the app server, so it will not be processed again. If the node crashes, the converted file is still in the queue so it can be imported later.
The machine has RAID 1, so the assumption is that I will not lose the queue completely (I hope this is realistic, as the system is not completely redundant anyway, e.g. no redundant power supplies etc.).
So the file can be either (a) in the spool directory - unprocessed (and accessible by both nodes), (b) in the spool directory and in the queue of one node - for a very short moment, (c) only in the queue of one node or (d) in the DB and not in the queue anymore.
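
The four states (a) to (d) above can be written down as a small enum, which makes it easier to see what the standby node can and cannot reach after a failure. This is only an illustrative sketch; the enum and its constant names are hypothetical.

```java
// Hypothetical names for the four states (a) to (d).
enum FileState {
    SPOOLED,            // (a) only in the spool directory
    SPOOLED_AND_QUEUED, // (b) in the spool directory and in one node's queue
    QUEUED,             // (c) only in one node's queue
    IMPORTED;           // (d) in the DB and no longer in the queue

    static FileState classify(boolean inSpool, boolean inQueue, boolean inDb) {
        if (inDb && !inQueue) return IMPORTED;
        if (inSpool && inQueue) return SPOOLED_AND_QUEUED;
        if (inQueue) return QUEUED;
        if (inSpool) return SPOOLED;
        throw new IllegalStateException("file is in none of the expected places");
    }

    // The spool directory is shared, so the standby node sees (a) and (b).
    // A file in (c) sits only in the failed node's local queue.
    boolean visibleToStandby() {
        return this == SPOOLED || this == SPOOLED_AND_QUEUED;
    }
}
```

State (b) is the one where a failover can produce a duplicate import, and state (c) is the one that has to wait until the failed node is back.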
Does this make sense? Thanks for your help so far, I really appreciate it.
Well I would gladly have bought the docs but AFAIK none has been written on the HA of JBossMQ because as you said - it is still in development.
Therefore my only recourse was to ask one of you guys to tell me whether I have any hope of using JBoss for this, or whether I should get my client to use WebLogic or something that is more mature.
Why pass it through JMS? Why not process it directly
from the timer? If you want the option of processing it
remotely, just use a SessionBean from the timer.
If WebLogic is the right tool for the job then go ahead.
I wanted to (a) decouple the import process and (b) have a generic method of getting the data in without getting bogged down in synchronous calls.
But I am no expert in J2EE designs so you should know better :)
Btw I really have a strong preference for open source - I have converted my client to Linux/PostgreSQL/Java/JBoss/Tomcat/Apache and I want to implement the solution with that in mind, though sometimes the open source solutions lack some support for updated docs.
Thanks for your help anyway. I appreciate it. And no it is not free consulting as my impressions of JBoss and the support can ultimately lead to much more widespread use of JBoss - something I am sure will be of value to anyone in the group. This implies more purchases for the documentation and possibly (paid) consulting...
You're already outside the J2EE environment.
File imports and timers are not supported (although there
is an EJB timer in EJB 2.1).
You are more likely to get bogged down going through JMS.
file -> jms message -> server (send) -> pm (write)
mdb --- server (receive) -> pm (transactionally removed) ---
acknowledge --- server -> pm (fully removed)
Each -> step requires some form of serialization
and each --- is a local invocation.
I don't even include the final DB operation, and this
assumes local JMS.
Compare this with:
file -> timer --- session --- entity or DAO -> db
The timer is asynchronous anyway.
Why go through the extra overhead of the guaranteed
asynchronous delivery protocol?
It is simplified by the single-threaded processing.
Of course if you have multiple CPUs their power is wasted.
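
To make the "file -> timer --- session --- entity or DAO -> db" path concrete, here is a minimal plain-Java sketch with no JMS hop. Everything in it is an assumption for illustration: ImportDao, ImportSession and DirectImportTask are hypothetical names, ImportSession is an ordinary class playing the role of the session bean, and the CSV parsing is a naive split with no quoting support.

```java
import java.util.List;
import java.util.TimerTask;

// Hypothetical DAO; a real one would write the row to the DB.
interface ImportDao {
    void insert(String[] fields);
}

// Plays the role of the session bean: parse each CSV line, hand it to the DAO.
class ImportSession {
    private final ImportDao dao;

    ImportSession(ImportDao dao) {
        this.dao = dao;
    }

    // Returns the number of rows handed to the DAO; blank lines are skipped.
    int importLines(List<String> csvLines) {
        int count = 0;
        for (String line : csvLines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            dao.insert(line.split(",", -1)); // naive split, no quoted fields
            count++;
        }
        return count;
    }
}

// Timer task that calls the session directly on the timer's single thread.
class DirectImportTask extends TimerTask {
    private final ImportSession session;
    private final List<String> pending; // stands in for reading the spool file

    DirectImportTask(ImportSession session, List<String> pending) {
        this.session = session;
        this.pending = pending;
    }

    @Override
    public void run() {
        session.importLines(pending);
        pending.clear();
    }
}
```

A java.util.Timer runs its tasks one after another on a single thread, which is where both the simplification and the wasted-CPU caveat above come from.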
I assume you've heard about the DB replication
for PostgreSQL since you are interested in HA.