Clustering / "Locality" / Scalability
tallpsmith Aug 25, 2003 8:59 PM
Firstly, I'm still very new to the J2EE space, so please bear that in mind.
I was wondering how people solve what I believe to be a common clustering/caching issue. That is, having one gigantic cache (i.e. a J2EE server and its Entity beans) often means a lot of thrashing if different customers do different things at different times. For example, the 2nd customer comes along while all of the 1st customer's data is in cache, so the cache has to expire and load a lot of information to deal with the 2nd customer.
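To make the thrashing concrete, here is a minimal sketch. It assumes the cache evicts least-recently-used entries, which is my assumption and not necessarily what any given J2EE container does. The class name `SharedCacheThrash`, the key format, and the capacity of 100 are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SharedCacheThrash {
    // Small LRU cache built on LinkedHashMap's access-order mode (assumed policy).
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by most recent access
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least recently used entry once the cache is over capacity.
            return size() > capacity;
        }
    }

    // Touches `tasks` task entries for one customer; returns the number of cache misses.
    static int touchAll(LruCache<String, String> cache, String customer, int tasks) {
        int misses = 0;
        for (int i = 0; i < tasks; i++) {
            String key = customer + ":" + i; // hypothetical key format
            if (cache.get(key) == null) {
                misses++;
                cache.put(key, "task data"); // stands in for a database load
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(100);
        System.out.println(touchAll(cache, "customer1", 100)); // 100: cold cache
        System.out.println(touchAll(cache, "customer1", 100)); // 0: everything is warm
        System.out.println(touchAll(cache, "customer2", 100)); // 100: pushes customer1 out
        System.out.println(touchAll(cache, "customer1", 100)); // 100: customer1 reloads everything
    }
}
```

With a cache sized for only one customer's working set, every switch between customers costs a full reload, which is the behaviour I'm worried about.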
We have an application along typical lines: CustomerBean->many->TaskBean, etc. If the number of customers is large, and the number of Tasks far larger still, how do you design a system that will scale well and not thrash too much? I realise that adding more memory is the first thing to do to minimise the chance of one customer's data purging another's, but that doesn't seem like a real long-term view to me.
The other problem is that you really want to take advantage of in-VM calls if you can. Clustering is damn useful for availability etc., but doesn't it reduce the chance that the entity beans in question reside in the same VM as the request (i.e. doesn't it make a remote call more likely)?
What I thought might be possible is to "locate" customer-specific EJB data on specific servers and somehow "sticky" a user's session to a particular server. (So all of Customer X's Entity beans are located on Server K, and you guarantee after login that the session is stuck to Server K.)
There appear to be advantages to this (and disadvantages, I know; see below):
Advantages:
* Can use cheaper equipment to host a specific customer (i.e. only a single customer per box). No need for 1 TB of server memory to host everything.
* Sticking to a particular server benefits from in-vm calls (that's where the data is!)
Disadvantages:
* High availability is sacrificed: if Server K goes down, Customer X can't log in or do anything (though at least the outage is isolated to specific customers)
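The "locate a customer on a server" idea could be sketched roughly as below. This is only an illustration under my own assumptions: the class `CustomerAffinityRouter`, the server names, and the use of a simple hash of the customer id are all hypothetical, and a real deployment would more likely rely on whatever session-affinity feature the load balancer or application server provides.

```java
import java.util.List;

// Hypothetical sketch: pins each customer to one server so that customer's
// entity data stays warm in a single VM. All names are illustrative only.
public class CustomerAffinityRouter {
    private final List<String> servers;

    public CustomerAffinityRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = List.copyOf(servers);
    }

    // The same customer id always maps to the same server
    // for as long as the server list is unchanged.
    public String serverFor(String customerId) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(customerId.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        CustomerAffinityRouter router =
                new CustomerAffinityRouter(List.of("serverA", "serverB", "serverC"));
        // Every request for customerX lands on the same server.
        System.out.println(router.serverFor("customerX"));
        System.out.println(router.serverFor("customerX"));
    }
}
```

Note that this has exactly the disadvantage listed above: there is no fallback if the chosen server is down, and with plain modulo hashing, adding or removing a server moves most customers to a different box.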
Anyway, can anyone comment on the general nature of the problem and how it is solved? I figure you eventually run out of money or hardware capacity with a single server if you want in-VM calls. Perhaps there's another way I'm completely missing.
I would love anyone's thoughts, or links to good articles on this sort of thing.
cheers,
Paul Smith