I'm pretty confused about what you want to do and what you have tried.
In your 10,000-bean test case, was all the work in one transaction or was each bean in its own? Either extreme may be slower than something in the middle: each transaction takes at least one extra db call, and presumably a disk write by the db, whereas excessively large transactions may conceivably slow down some dbs.
I would expect that at least 95% of the time for creating an entity bean would be database access. Is this the case for your situation?
When you say you need to create 1M entity beans does this mean you want to insert 1M rows in your db or have 1M objects without identity in a pool, to be drawn on as identities are needed?
Have you allocated enough heap space so all entities can be in memory simultaneously?
If you are trying to insert 1M rows into your db, possibly a non-ejb db-specific batch loader would be more appropriate.
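As a middle ground between per-bean inserts and a fully db-specific loader, a plain JDBC batch insert avoids the per-entity EJB overhead while keeping transaction size bounded. A sketch, with the table and column names invented for illustration:

```java
import java.sql.*;
import java.util.List;

public class BulkLoader {
    // Commit every BATCH_SIZE rows: neither one giant transaction
    // nor one transaction per row.
    static final int BATCH_SIZE = 1000;

    // Number of executeBatch() round trips needed for rowCount rows.
    static int batchCalls(int rowCount) {
        return (rowCount + BATCH_SIZE - 1) / BATCH_SIZE;
    }

    // Hypothetical table "pay_element" with two columns.
    static void load(Connection con, List<Object[]> rows) throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO pay_element (emp_id, amount) VALUES (?, ?)")) {
            int n = 0;
            for (Object[] row : rows) {
                ps.setObject(1, row[0]);
                ps.setObject(2, row[1]);
                ps.addBatch();
                if (++n % BATCH_SIZE == 0) {
                    ps.executeBatch();   // one round trip per batch
                    con.commit();        // keep each transaction bounded
                }
            }
            ps.executeBatch();           // flush the final partial batch
            con.commit();
        }
    }
}
```

A true bulk loader (e.g. the db vendor's import tool) will still beat this, since it can bypass SQL parsing and logging entirely.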
I was inserting new rows into the database.
I did have transactions on everything and was hitting the transaction timeout for the session bean method that was actually creating the beans. I then removed all the transactions by setting the trans-attribute to Never for all the methods of both my entity bean and my session bean.
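For reference, the deployment-descriptor change described above looks something like this in ejb-jar.xml (the bean name is made up):

```xml
<assembly-descriptor>
  <container-transaction>
    <method>
      <ejb-name>PayElementEJB</ejb-name>
      <method-name>*</method-name>
    </method>
    <trans-attribute>Never</trans-attribute>
  </container-transaction>
</assembly-descriptor>
```

Note that Never makes the container reject calls that arrive inside an existing transaction; if you just want the methods to run without one, NotSupported is usually the safer choice.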
This is still extremely slow, but as you suggested, inserting one million+ rows will be much faster using db-specific scripts. I can then use a session bean to manage my bean pool...
Does this pattern break any J2EE rules? Or even make sense?
I have not tried this yet, as I am very new to EJB, but this is my understanding; please correct me if I am wrong.
I was also thinking that bulk processing may slow down very much with entity beans. Consider a typical tax-calculation business process with a finder method called findCurrentMonthsPayElemForAll. If I have about 5000 employees, each with at least 4 pay elements, I will have 20,000 rows in my database. So my ejbFindCurrentMonthsPayElemForAll method will return a collection of 20,000 primary keys, and then for each primary key the container will create an EJB instance and call ejbLoad on it.
Typically, in the non-EJB world, I would just issue one SQL select and get all the necessary data for the 20,000 rows.
Is there a workaround for this? Or am I worrying unnecessarily?
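One common workaround is the "fast lane reader" idiom: for read-mostly bulk access, bypass the entity beans and issue the single select from a session bean (or a plain helper) over JDBC. A sketch, with the table and column names invented for illustration:

```java
import java.sql.*;
import java.util.*;

public class PayElementReader {

    // One select replaces 20,000 ejbLoad round trips.
    // Table and column names are hypothetical.
    static String currentMonthQuery() {
        return "SELECT emp_id, elem_code, amount FROM pay_element "
             + "WHERE pay_month = ?";
    }

    // Plain value object instead of an entity bean reference.
    public static final class PayElement {
        public final int empId;
        public final String elemCode;
        public final double amount;
        PayElement(int empId, String elemCode, double amount) {
            this.empId = empId;
            this.elemCode = elemCode;
            this.amount = amount;
        }
    }

    static List<PayElement> findCurrentMonthsPayElemForAll(
            Connection con, String month) throws SQLException {
        List<PayElement> result = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(currentMonthQuery())) {
            ps.setString(1, month);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.add(new PayElement(
                        rs.getInt(1), rs.getString(2), rs.getDouble(3)));
                }
            }
        }
        return result;   // one SQL statement, however many rows
    }
}
```

The trade-off is that you give up the container's caching and concurrency control for these reads, which is usually acceptable for reporting-style bulk queries.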
If you are creating very large numbers of records you need to make sure that the cache can hold them. I am pretty sure that the 2.x series shipped with very low cache limits, so that for every bean you create you are also passivating one from memory. You essentially kill your performance by swapping your whole memory to disk.
What I recommend you do is this: set the cache size extremely large (essentially, until you get OutOfMemory errors), and that should somewhat speed up your creation, since you won't have serialization to disk going on. Also, if your database was on the same disk as the "swapping" (i.e. the tmp directory that JBoss uses to passivate), then you were also paying for moving the disk head between "db" sectors and "swap" sectors. In other words, that should really help.
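Concretely, in JBoss 2.x/3.x the cache size is set per container in jboss.xml or standardjboss.xml; something like the following, with the capacity numbers as guesses to be tuned against your heap:

```xml
<container-cache-conf>
  <cache-policy>org.jboss.ejb.plugins.LRUEnterpriseContextCachePolicy</cache-policy>
  <cache-policy-conf>
    <min-capacity>50</min-capacity>
    <max-capacity>1000000</max-capacity>
  </cache-policy-conf>
</container-cache-conf>
```

With max-capacity above your bean count, the LRU policy should never need to passivate.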
But just for the record, with EJB you will still incur the overhead of proxy creation (one per EJB), which can take on the order of a millisecond each; so 10,000 beans could cost you about 10 seconds, but that isn't really significant compared to the 25 minutes you see with serialization.
> I have not tried this yet as I am very new to EJB.
> But this is what my understanding is, please correct
> me if I am wrong.
> Even I was thinking that bulk process may slow down
> very much with Entity Bean. If we consider a typical
> tax calculation business process, I will have a
> finder method called findCurrentMonthsPayElemForAll.
> Now if I have about 5000 employees each having
> atleast 4 pay elements I will have 20000 rows on my
> database. So my ejbFindCurrentMonthsPayElemForAll
> method will return a collection of 20000 primary
> keys. Then for each primary key container will create
> EJB and call ejbLoad on it.
The finder will not call ejbLoad on an instance; it will just return a proxy.
> Typically in non-EJB world I would just issue one sql
> select and get all necessay data for 20000 rows.
The EJB will lazy-load the instances. Is that what you are worried about? It will indeed issue 20,000 SQL statements.
> The EJB will lazy-load the instances. Is that what you are worried about? It will indeed issue 20,000 SQL statements.
How do we turn this off? I have 3.5GB of memory and I don't *ever* want anything passivated. If I have 1 million records, I want 1 million beans in memory. When I do a findAll(), I want them all loaded up.
How can I accomplish this?
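Assuming JBoss, the closest knobs I know of are a cache large enough to hold every bean plus the container's commit option: commit option A tells the container it has exclusive access to the data, so a bean's loaded state stays valid between transactions and is not re-read. A jboss.xml sketch (the container name is made up; the element layout follows standardjboss.xml):

```xml
<container-configurations>
  <container-configuration>
    <container-name>No-passivation Entity Container</container-name>
    <commit-option>A</commit-option>
    <container-cache-conf>
      <cache-policy-conf>
        <max-capacity>1000000</max-capacity>
      </cache-policy-conf>
    </container-cache-conf>
  </container-configuration>
</container-configurations>
```

Even so, findAll() only returns proxies; each row is still loaded lazily on first access, just never reloaded or passivated afterwards. Commit option A is only safe if nothing else writes to those tables.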