I saw your post and I plan on adding caching to the collection level of every entity. Did it make a major difference? I am seeing some response times upwards of a minute.
I added the caching and saw significant improvement. It still seems slow at times for the amount of data being retrieved. I will provide feedback should I discover any additional tuning factors.
I have a performance problem:
I use this query to retrieve the list of tasks for actorIds:
"select distinct ti " + " from org.jbpm.taskmgmt.exe.PooledActor pooledActor " + " join pooledActor.taskInstances ti " + " where pooledActor.actorId in ( :actorIds ) " + " and ti.isCancelled = 0" + " and ti.actorId is null";
I noticed that the number of queries (select statements) generated by hibernate is proportional to the number of retrieved tasks i.e. for each task, hibernate has to instantiate some dependent objects (the token and so on) from db. The problem is that we have a large number of tasks in db and the performance becomes poor as the number of tasks increases.
How should I configure hibernate to improve performance?
Do I need to enable the cache as suggested in this post? If yes, could you please tell me more about caching at the collection level?
How can I retrieve all the dependent objects for all tasks in one query (and not one query per task)?
I can't imagine I am the only one who has performance problems.
Thanks for your help
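One common way to avoid one select per task is an HQL fetch join, which loads an association in the same query as the tasks. The sketch below only builds the query string. It assumes TaskInstance maps a many-to-one property named "token" (the post mentions the token being loaded per task); the class name PooledTaskQuery is made up for illustration, and other per-task associations would each need their own fetch join.

```java
// Sketch only: builds the HQL string. Executing it needs a hibernate Session.
class PooledTaskQuery {

    // Assumption: TaskInstance has a mapped many-to-one property called "token".
    // "left join fetch" asks hibernate to load it in the same select as the task.
    static String hql() {
        return "select distinct ti"
             + " from org.jbpm.taskmgmt.exe.PooledActor pooledActor"
             + " join pooledActor.taskInstances ti"
             + " left join fetch ti.token"
             + " where pooledActor.actorId in ( :actorIds )"
             + " and ti.isCancelled = 0"
             + " and ti.actorId is null";
    }

    public static void main(String[] args) {
        // With a real session this would be run roughly as:
        //   session.createQuery(hql()).setParameterList("actorIds", actorIds).list();
        System.out.println(hql());
    }
}
```

Whether this helps depends on which associations are actually being loaded per task, so it is worth checking the generated SQL before and after.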
I changed all the hibernate mappings to use lazy initialization and that improved performance hugely, but I don't know yet what implications that might have bug-wise or why the jBPM team chose not to use lazy as the default. I don't plan on using a web app as the primary user interface, so I'm not concerned about 'open session in view' or any problems like that related to lazy initialization.
I thought that the latest version of hibernate has a default of lazy instantiation whereas the prior version defaulted to non-lazy. The hibernate mappings that come with jBPM in many cases used the default.
I think that you need to be careful about where you use lazy and where you don't. For example, if you use lazy and you close the JbpmSession, any lazily instantiated referenced object that has not been loaded yet can no longer be resolved, because the hibernate session is not available to resolve the reference. Lazy references perform better until you need the reference; then a query is executed for each one, one by one, as you call its getter.
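To make the closed-session pitfall concrete, here is a small simulation in plain Java. It does not use hibernate at all: FakeSession and LazyRef are hypothetical stand-ins for a session and a lazy proxy, and the IllegalStateException plays the role of the error hibernate raises when a proxy is touched after the session is closed.

```java
class LazyRefSketch {

    // Stand-in for a hibernate session: it can only load while it is open.
    static class FakeSession {
        boolean open = true;
        int queries = 0;

        String load(String id) {
            if (!open) {
                throw new IllegalStateException("session closed, cannot load " + id);
            }
            queries++; // each load represents one select
            return "Token#" + id;
        }
    }

    // Stand-in for a lazy proxy: nothing is loaded until get() is first called.
    static class LazyRef {
        private final FakeSession session;
        private final String id;
        private String value;

        LazyRef(FakeSession session, String id) {
            this.session = session;
            this.id = id;
        }

        String get() {
            if (value == null) {
                value = session.load(id);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        FakeSession session = new FakeSession();
        LazyRef touched = new LazyRef(session, "1");
        LazyRef untouched = new LazyRef(session, "2");

        touched.get();        // loaded while the session is open
        session.open = false; // like closing the JbpmSession

        System.out.println(touched.get()); // still works, value was already loaded
        try {
            untouched.get();  // never loaded, so it needs the closed session
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The practical consequence is that anything you need after closing the session has to be touched (or fetched eagerly) before the close.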
As far as the earlier post that referred to enabling cache for the collections, here is an example.
<set name="taskInstances" cascade="all" inverse="true" lazy="false" batch-size="25">
  <cache usage="nonstrict-read-write"/>
  <key column="TASKMGMTINSTANCE_"/>
  <one-to-many class="org.jbpm.taskmgmt.exe.TaskInstance" />
</set>
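The batch-size attribute in that mapping matters as much as the cache: instead of one select per collection, hibernate can initialize up to 25 collections in one select. A rough back-of-the-envelope sketch of the effect follows. The class name BatchFetchMath is made up, and the numbers are an idealized estimate, since the batches hibernate actually issues can differ slightly.

```java
class BatchFetchMath {

    // Idealized number of selects needed to initialize `items` collections
    // when each select can initialize up to `batchSize` of them.
    static int selects(int items, int batchSize) {
        if (items < 0 || batchSize < 1) {
            throw new IllegalArgumentException("items >= 0 and batchSize >= 1 required");
        }
        return (items + batchSize - 1) / batchSize; // integer ceiling division
    }

    public static void main(String[] args) {
        int collections = 100;
        System.out.println("no batching:   " + selects(collections, 1));
        System.out.println("batch-size 25: " + selects(collections, 25));
    }
}
```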
Something else to consider: like EJB entity beans, which provide many positive features for transactional processing, a hibernate-based approach may not be the best choice for query processing. If your application shows lists of summary data from which the user selects an item to drill down into or update, you might consider bypassing hibernate to build the list, or defining a database view plus some new classes and hibernate mappings that provide a lightweight read-only list.
I found a very important setting in the connection pool: max_statements
Certain connection pools, drivers, databases, and other portions of the system may provide an additional cache system, known as a statement cache. This cache stores a partially compiled version of a statement in order to increase performance. By reusing the parsed or precompiled statement, the application is able to trade an increase in memory usage for a boost in performance.
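As an illustration of the idea (not of how c3p0 or any driver implements it), a statement cache can be pictured as a bounded least-recently-used map from SQL text to a prepared statement. In the sketch below all names are hypothetical and the "compiled" statement is just a string; the counter shows how often the expensive prepare step is really performed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StatementCacheSketch {

    // Bounded map that evicts the least recently used entry when full.
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        LruCache(int maxEntries) {
            super(16, 0.75f, true); // true = order entries by access
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    private final LruCache<String, String> cache;
    int prepares = 0; // how many times a statement really had to be prepared

    StatementCacheSketch(int maxStatements) {
        this.cache = new LruCache<>(maxStatements);
    }

    // Returns the cached statement for this SQL, preparing it only on a miss.
    String prepare(String sql) {
        String statement = cache.get(sql);
        if (statement == null) {
            prepares++;
            statement = "compiled:" + sql; // stand-in for the costly parse step
            cache.put(sql, statement);
        }
        return statement;
    }

    public static void main(String[] args) {
        StatementCacheSketch pool = new StatementCacheSketch(100);
        for (int i = 0; i < 1000; i++) {
            pool.prepare("select * from JBPM_TASKINSTANCE where ID_ = ?");
        }
        System.out.println("executions: 1000, prepares: " + pool.prepares);
    }
}
```

The trade-off is the one described above: a larger maximum keeps more statements in memory but avoids re-preparing them.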
<property name="hibernate.c3p0.min_size">2</property>
<property name="hibernate.c3p0.max_size">10</property>
<property name="hibernate.c3p0.timeout">5000</property>
<property name="hibernate.c3p0.max_statements">100</property>
<property name="hibernate.c3p0.idle_test_period">300</property>
<property name="hibernate.c3p0.acquire_increment">2</property>
My average time to retrieve data is now divided by a factor of 5!!!
Caching and connection pooling are good. But sometimes it is just a question of looking at your application and using jBPM appropriately.
For example, you must consider paging when huge recordsets are returned. In any case, you will not be able to display thousands of records to the user. So why get all of them in one stretch?
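A minimal sketch of the paging arithmetic, in plain Java so it runs without a database. The class and method names are made up for illustration. With hibernate, the same offset and page size would normally be passed to the query through setFirstResult and setMaxResults, so that only one page is read from the database rather than sliced in memory as done here.

```java
import java.util.ArrayList;
import java.util.List;

class PagingSketch {

    // Zero-based offset of the first row on a 1-based page number.
    static int firstResult(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // In-memory stand-in for query.setFirstResult(...).setMaxResults(...).list().
    static <T> List<T> page(List<T> all, int pageNumber, int pageSize) {
        int from = Math.min(firstResult(pageNumber, pageSize), all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> taskIds = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            taskIds.add(i);
        }
        // Third page of 25: only the last 10 of the 60 ids remain.
        System.out.println(page(taskIds, 3, 25));
    }
}
```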
Simple changes like these to your business process layer can make your application blazingly fast.