Is JBoss right for the job?

midraga Feb 18, 2007 8:21 AM

Hi Everyone,

My company is required to write an application that will take raw input files from various legacy systems (mainframe, MVS) and convert them into a common application format.

The amount of data to be processed/converted is enormous: up to 100GB a day, with the potential to increase in the near future.

We have already written a plain Java application that runs on a single server and does the job.

Having said that, we would need some kind of cluster that would distribute the load across multiple nodes and also allow us to scale if there is a need for it.

So I would like to know whether JBoss is the right platform to run it across a cluster, and how likely it is that we will need to rewrite the whole thing.

Cheers,

Midraga

1. Re: Is JBoss right for the job?

jwenting Feb 18, 2007 10:12 AM (in response to midraga)

You would certainly have to rewrite the whole thing (or at least greatly modify it) if you want to do something like that.

Load balancing and clustering are done on a request/response basis, with the cluster admin server determining which member of the cluster gets to handle a request.
If I understand your system correctly, it works the other way around: it monitors a remote system for the presence of data and then retrieves and processes it.
To make that run in multiple simultaneous processes you'd need some way to divide the job between those processes (for example, for a massive data conversion you could divide it based on the primary key of the main table processed).
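
To make the "divide it based on the primary key" idea concrete, here is a minimal sketch in plain Java. It assumes the keys are numeric and roughly evenly spread over a known range; KeyRangePartitioner and Range are made-up names for illustration, not part of JBoss or of your application. Each worker process would get one range and only touch rows whose key falls inside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits an inclusive key range into contiguous
// slices, one per worker process. Assumes numeric primary keys.
final class KeyRangePartitioner {

    /** Inclusive key range [from, to] assigned to one worker. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + "]";
        }
    }

    /** Splits [minKey, maxKey] into at most 'workers' near-equal ranges. */
    static List<Range> split(long minKey, long maxKey, int workers) {
        if (workers <= 0 || maxKey < minKey) {
            throw new IllegalArgumentException("need workers > 0 and minKey <= maxKey");
        }
        long total = maxKey - minKey + 1;
        long base = total / workers;   // minimum number of keys per worker
        long extra = total % workers;  // the first 'extra' workers get one more key
        List<Range> ranges = new ArrayList<>();
        long start = minKey;
        for (int i = 0; i < workers; i++) {
            long size = base + (i < extra ? 1 : 0);
            if (size == 0) {
                break; // more workers than keys: remaining workers get nothing
            }
            ranges.add(new Range(start, start + size - 1));
            start += size;
        }
        return ranges;
    }
}
```

Calling KeyRangePartitioner.split(1, 10, 3) gives three ranges covering keys 1-4, 5-7 and 8-10. If your keys are skewed, or the input is files rather than table rows, you would need a different split criterion (per file, per record offset, and so on).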

If the files are presented to your system by an external system, a clustered environment is easier to set up, as the cluster server will then select a server to process the request (for example the least busy one, depending on configuration).
But you'd still need to modify your system to work inside JBoss for that, which probably would mean rewriting it as servlets and/or EJBs.

The core system would then no longer monitor the remote system for the availability of data, but would only process it.
A new system would sit in front of your clustered servers, monitor for new data to become available, and send a request to the clustered processing servers for that data to be processed (maybe handing over just the name of the file to be processed, and getting a status message back at some point when processing is complete).
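
A rough sketch of that front-end piece, assuming the incoming files land in a local directory. FileDispatcher and ProcessingClient are names I made up; ProcessingClient stands in for whatever call you end up making into the cluster (a servlet request, a remote EJB call), which is not shown here.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical front-end: watches an input directory and hands each new
// file name to the processing tier exactly once.
final class FileDispatcher {

    /** Stand-in for the request sent to the clustered processing servers. */
    interface ProcessingClient {
        String process(String fileName); // returns a status message
    }

    private final ProcessingClient client;
    private final Set<String> alreadySent = new HashSet<>();

    FileDispatcher(ProcessingClient client) {
        this.client = client;
    }

    /** One polling pass: dispatches files not seen before, returns their statuses. */
    List<String> pollOnce(File inputDir) {
        List<String> statuses = new ArrayList<>();
        File[] files = inputDir.listFiles();
        if (files == null) {
            return statuses; // not a directory, or an I/O error occurred
        }
        Arrays.sort(files); // stable, path-ordered dispatch
        for (File f : files) {
            // Set.add returns false if the name was already dispatched
            if (f.isFile() && alreadySent.add(f.getName())) {
                statuses.add(client.process(f.getName()));
            }
        }
        return statuses;
    }
}
```

You would call pollOnce on a timer. Note that this sketch calls the processing tier synchronously and keeps the list of dispatched files in memory only; a real version would need to survive restarts and would probably send the requests asynchronously so one slow file does not hold up the rest.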