Hi all experts.
I'm about to design a "price-finder" application that reads prices from a website.
I have about 10 000 product URLs in a database right now. Some URLs need to be read every 5 minutes, others every 20 minutes and the rest every hour.
It takes up to 2 seconds to download and process each page. The downloading is the bottleneck right now, because the web servers I read the data from are slow.
I need to have around 10 "processes" working at the same time to be able to process it all.
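To show the kind of arithmetic behind that number, here is a rough sketch in Java. It assumes the worst case of 2 seconds per page and workers that run fully in parallel. The class and method names, and the 1 500-URL batch with a 5-minute window used in the example, are made up for illustration and are not measured values.

```java
// Back-of-the-envelope sizing sketch (hypothetical names and example numbers).
class WorkerEstimate {

    // Workers needed so that `urls` pages, each taking `secondsPerUrl`,
    // all finish inside a window of `windowSeconds`.
    static int workersNeeded(int urls, double secondsPerUrl, double windowSeconds) {
        return (int) Math.ceil(urls * secondsPerUrl / windowSeconds);
    }

    // Largest batch that `workers` parallel workers can finish inside the window.
    static int maxUrls(int workers, double secondsPerUrl, double windowSeconds) {
        return (int) Math.floor(workers * windowSeconds / secondsPerUrl);
    }

    public static void main(String[] args) {
        // Assumed example: 1 500 URLs, 2 s per page, 5-minute (300 s) window.
        System.out.println(workersNeeded(1500, 2.0, 300.0)); // prints 10
        System.out.println(maxUrls(10, 2.0, 300.0));         // prints 1500
    }
}
```

Plugging in the real batch size and the shortest interval for each trigger would show whether 10 workers are actually enough.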
I was thinking of having a scheduled MDB (message-driven bean) that is triggered every 5, 20 and 60 minutes.
Every time it is triggered, it will get the relevant URLs from the database (about 3500 each time), split them into 10 parts and send each part to a stateless session bean for processing. Each stateless session bean will then process around 350 URLs.
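The splitting step could look roughly like the plain-Java sketch below. It only shows how a list of URLs would be cut into 10 near-equal parts. The class name, the `partition` helper and the example.com URLs are placeholders I made up, and the actual hand-off to the stateless session beans is left out because it depends on the container.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "split into 10 parts" step (hypothetical helper, no EJB code).
class UrlPartitioner {

    // Splits items into `parts` consecutive sublists whose sizes differ by at most one.
    static <T> List<List<T>> partition(List<T> items, int parts) {
        if (parts <= 0) {
            throw new IllegalArgumentException("parts must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        int base = items.size() / parts;   // minimum size of each part
        int extra = items.size() % parts;  // the first `extra` parts get one more item
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int size = base + (i < extra ? 1 : 0);
            result.add(new ArrayList<>(items.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder URLs standing in for the rows read from the database.
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 3500; i++) {
            urls.add("http://example.com/product/" + i);
        }
        List<List<String>> batches = partition(urls, 10);
        // Each batch would be passed to one stateless session bean.
        System.out.println(batches.size() + " batches of " + batches.get(0).size());
        // prints "10 batches of 350"
    }
}
```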
What do you experts think about that design?