6 Replies Latest reply on Jan 2, 2002 4:37 PM by mnewcomb

    creating large amounts of entity beans is sssllloooowww

    mnewcomb

      I've got a test case which creates 10,000 entity beans. A session bean loops 10,000 times, creating a new entity bean on each iteration. This process took over 25 minutes on my PII 266MHz machine.

      I will test again using our new dual Athlon XP 1800 machine and reply with those results.

      I am gathering some metrics to see if JBoss can handle > 1 million entity beans. I will only have to create these beans once per event, and I'm wondering if I should use a pooling mechanism for the beans themselves: spend the 2 days it would take to create a million or two entity beans using Home.create() methods, then use a session bean to set an active flag stating whether the bean is currently representing something. Then, as I need new beans, I just get some whose active flag is false. When/if I ever decide to drop some, I just set their active flags back to false.

      I would rather not do that, so what can I tweak in the caching policy (?) to speed up this process of creating entity beans?

      Thanks,
      Michael

        • 1. Re: creating large amounts of entity beans is sssllloooowww
          davidjencks

          I'm pretty confused about what you want to do and have tried.

          In your 10,000 bean test case, was all the work in one transaction, or each bean in its own? Each extreme may be slower than something in the middle: each transaction takes at least one extra db call, and presumably a disk write by the db, whereas excessively large transactions may conceivably slow down some dbs.
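
The middle ground between one huge transaction and one transaction per bean can be sketched as chunking. This is only an illustration: `BatchWork` is a hypothetical hook, and in a real container each call to it would be a method that runs in its own transaction. The batch size of 500 is an arbitrary assumption, not a recommendation from this thread.

```java
public class BatchedCreate {
    /** Hypothetical hook: in a real container each call would run in its own transaction. */
    public interface BatchWork {
        void run(int firstIndex, int count);
    }

    /**
     * Splits `total` creates into chunks of at most `batchSize` and hands each chunk to `work`.
     * Returns the number of chunks, i.e. the number of commits the database would see.
     */
    public static int createInBatches(int total, int batchSize, BatchWork work) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int first = 0; first < total; first += batchSize) {
            int count = Math.min(batchSize, total - first); // last chunk may be short
            work.run(first, count);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        int[] created = {0};
        int commits = createInBatches(10_000, 500, (first, count) -> created[0] += count);
        // prints "10000 creates in 20 transactions"
        System.out.println(created[0] + " creates in " + commits + " transactions");
    }
}
```

Timing the same run with a few different batch sizes is one way to find where the per-commit cost and the large-transaction cost balance out for a given database.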

          I would expect that at least 95% of the time for creating an entity bean would be database access. Is this the case for your situation?

          When you say you need to create 1M entity beans does this mean you want to insert 1M rows in your db or have 1M objects without identity in a pool, to be drawn on as identities are needed?

          Have you allocated enough heap space so all entities can be in memory simultaneously?

          If you are trying to insert 1M rows into your db, possibly a non-ejb db-specific batch loader would be more appropriate.
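
Bulk-load tools generally read a delimited text file, so one way to prepare for such a loader is to generate that file outside the EJB layer. The sketch below assumes a hypothetical two-column table (`id`, `active`); the real column list, delimiter, and loader command depend on the database and are not specified in this thread.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class BulkLoadFile {
    /**
     * Writes `rows` lines of "id,active" to `out`. Both columns are hypothetical;
     * a real file would mirror the entity bean's table.
     */
    public static void writeRows(PrintWriter out, int rows) {
        for (int id = 1; id <= rows; id++) {
            out.print(id);
            out.print(",0\n"); // 0 = inactive, i.e. a pre-created but unused row
        }
    }

    /** Convenience wrapper that renders the rows to a String, useful for small samples. */
    public static String render(int rows) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writeRows(out, rows);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Prints three lines: "1,0", "2,0", "3,0"
        System.out.print(render(3));
    }
}
```

For a million rows, `writeRows` would be pointed at a `PrintWriter` wrapping a file rather than an in-memory buffer.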

          • 2. Re: creating large amounts of entity beans is sssllloooowww
            mnewcomb

            I was inserting new rows into the database.

            I did have transactions on everything and was hitting the transaction timeout for the session bean method that was actually creating the beans. I then removed all the transactions by setting the trans-attribute to Never for all methods of both my entity bean and my session bean.

            This is still extremely slow, but as you suggested, inserting one million+ rows will be much faster using db-specific scripts. I can then use a session bean to manage my bean pool...
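
The active-flag pool described in the first post can be modelled in a few lines. This is an in-memory sketch only: in the pattern as described, the flag would be a persistent field on the entity bean and the session bean would locate inactive rows through a finder. The class and method names here are hypothetical.

```java
import java.util.BitSet;

public class ActiveFlagPool {
    private final BitSet active; // bit i set = slot i is currently representing something
    private final int size;

    public ActiveFlagPool(int size) {
        this.size = size;
        this.active = new BitSet(size);
    }

    /** Claims the lowest-numbered inactive slot, or returns -1 if the pool is exhausted. */
    public synchronized int acquire() {
        int slot = active.nextClearBit(0);
        if (slot >= size) {
            return -1;
        }
        active.set(slot);
        return slot;
    }

    /** Marks a slot inactive so it can be handed out again. */
    public synchronized void release(int slot) {
        active.clear(slot);
    }

    /** Number of slots currently marked active. */
    public synchronized int inUse() {
        return active.cardinality();
    }

    public static void main(String[] args) {
        ActiveFlagPool pool = new ActiveFlagPool(3);
        int a = pool.acquire();
        int b = pool.acquire();
        pool.release(a);
        int c = pool.acquire(); // reuses the slot released above
        // prints "0 1 0 inUse=2"
        System.out.println(a + " " + b + " " + c + " inUse=" + pool.inUse());
    }
}
```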

            Does this pattern break any J2EE rules? Or even make sense?

            Thanks,
            Michael

            • 3. Re: creating large amounts of entity beans is sssllloooowww
              bjavaboy

              I have not tried this yet as I am very new to EJB. But this is what my understanding is, please correct me if I am wrong.

              I was also thinking that bulk processing may slow down considerably with entity beans. Consider a typical tax calculation business process, where I will have a finder method called findCurrentMonthsPayElemForAll. If I have about 5000 employees, each having at least 4 pay elements, I will have 20000 rows in my database. So my ejbFindCurrentMonthsPayElemForAll method will return a collection of 20000 primary keys. Then for each primary key the container will create an EJB and call ejbLoad on it.

              Typically, in the non-EJB world, I would just issue one SQL select and get all the necessary data for 20000 rows.

              Is there any work around for this? Or am I worried unnecessarily?

              Thanx,
              B!

              • 4. Re: creating large amounts of entity beans is sssllloooowww
                marc.fleury

                If you are creating a very large number of records you need to make sure that the cache can hold them. I am pretty sure that the 2.x series came with very low cache levels, so that for every bean you create you are also passivating one from memory. You essentially kill your performance by swapping your whole memory to disk.

                What I recommend you do is this: set the cache size extremely large (essentially until you get out-of-memory exceptions). That should somewhat speed up your creation, since you won't have serialization to disk going on. Also, if your database was on the same disk as the "swapping" (i.e. the tmp directory that JBoss uses to passivate), then you were adding the time it takes to move the physical disk head from "db" sectors to "swap" sectors. In other words, that should really help.
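
The effect described above, one bean pushed out for every bean created once the cache is full, can be illustrated with a toy LRU cache. This is a model built on `java.util.LinkedHashMap`, not JBoss's actual cache implementation, and the capacities are made-up numbers chosen to match the 10,000-bean test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CacheEvictionModel {
    /** Counts how many entries an LRU cache of `capacity` evicts while `creates` new keys are added. */
    public static int evictionsFor(int capacity, int creates) {
        int[] evicted = {0};
        Map<Integer, Object> cache = new LinkedHashMap<Integer, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Object> eldest) {
                boolean overCapacity = size() > capacity;
                if (overCapacity) {
                    evicted[0]++; // stands in for passivating one bean to disk
                }
                return overCapacity;
            }
        };
        for (int key = 0; key < creates; key++) {
            cache.put(key, Boolean.TRUE);
        }
        return evicted[0];
    }

    public static void main(String[] args) {
        System.out.println(evictionsFor(1_000, 10_000));  // prints 9000
        System.out.println(evictionsFor(10_000, 10_000)); // prints 0
    }
}
```

With the small cache, 9,000 of the 10,000 creates each pay for an eviction; once the capacity covers the whole working set, none do.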

                But just for the record, with EJB you will still incur the overhead of the proxy creation (one per ejb) which can run in the milliseconds, so 10,000 could cost you 10 seconds but that isn't really significant compared to the 25 minutes you see in serialization.
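
The comparison above is simple arithmetic, worked here for reference. The 1 ms per proxy is an assumed round figure for "in the milliseconds"; the 25 minutes and 10,000 beans come from the original post.

```java
public class CreateCostEstimate {
    /** Total proxy-creation overhead in seconds for `beans` proxies at `millisPerProxy` each. */
    public static double proxyOverheadSeconds(int beans, double millisPerProxy) {
        return beans * millisPerProxy / 1000.0;
    }

    /** Average observed cost per bean in milliseconds, given the total wall-clock minutes. */
    public static double observedMillisPerBean(int beans, double totalMinutes) {
        return totalMinutes * 60_000.0 / beans;
    }

    public static void main(String[] args) {
        System.out.println(proxyOverheadSeconds(10_000, 1.0));   // prints 10.0
        System.out.println(observedMillisPerBean(10_000, 25.0)); // prints 150.0
    }
}
```

At roughly 150 ms observed per bean against an assumed 1 ms of proxy overhead, the proxy accounts for well under 1% of the measured time.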

                • 5. Re: creating large amounts of entity beans is sssllloooowww
                  marc.fleury

                  > I have not tried this yet as I am very new to EJB.
                  > But this is what my understanding is, please correct
                  > me if I am wrong.
                  >
                  > Even I was thinking that bulk process may slow down
                  > very much with Entity Bean. If we consider a typical
                  > tax calculation business process, I will have a
                  > finder method called findCurrentMonthsPayElemForAll.
                  > Now if I have about 5000 employees each having
                  > at least 4 pay elements I will have 20000 rows on my
                  > database. So my ejbFindCurrentMonthsPayElemForAll
                  > method will return a collection of 20000 primary
                  > keys. Then for each primary key container will create
                  > EJB and call ejbLoad on it.

                  The find will not call ejbLoad on an instance; it just returns a proxy.

                  > Typically in non-EJB world I would just issue one sql
                  > select and get all necessary data for 20000 rows.

                  The EJB will lazy-load the instances. Is that what you are worried about? It will indeed issue 20,000 SQL statements.
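
The difference in statement counts can be made concrete with a toy model. This sketch uses an in-memory map as a stand-in for the database and simply counts "statements"; all names are hypothetical, and it makes no claim about how the container actually schedules its loads. Note that the lazy path also pays for the finder query itself, hence 20,001 rather than 20,000.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LazyLoadCount {
    private final Map<Integer, String> table = new HashMap<>();
    private int statements = 0;

    public LazyLoadCount(int rows) {
        for (int pk = 0; pk < rows; pk++) {
            table.put(pk, "pay-element-" + pk);
        }
    }

    /** Models the finder: one statement that returns only primary keys. */
    public List<Integer> findAllKeys() {
        statements++;
        return new ArrayList<>(table.keySet());
    }

    /** Models a lazy load: one statement per bean touched. */
    public String loadOne(int pk) {
        statements++;
        return table.get(pk);
    }

    /** Models the non-EJB approach: one statement that returns every row. */
    public List<String> selectAll() {
        statements++;
        return new ArrayList<>(table.values());
    }

    public static int lazyStatements(int rows) {
        LazyLoadCount db = new LazyLoadCount(rows);
        for (int pk : db.findAllKeys()) {
            db.loadOne(pk);
        }
        return db.statements;
    }

    public static int bulkStatements(int rows) {
        LazyLoadCount db = new LazyLoadCount(rows);
        db.selectAll();
        return db.statements;
    }

    public static void main(String[] args) {
        System.out.println(lazyStatements(20_000)); // prints 20001
        System.out.println(bulkStatements(20_000)); // prints 1
    }
}
```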

                  • 6. Re: creating large amounts of entity beans is sssllloooowww
                    mnewcomb

                    > The EJB will lazy-load the instances. Is that what you are worried about? It will indeed issue 20,000 SQL statements.

                    How do we turn this off? I have 3.5GB of memory and I don't *ever* want anything passivated. If I have 1 million records, I want 1 million beans in memory. When I do a findAll(), I want them all loaded up.

                    How can I accomplish this?

                    Michael