11 Replies Latest reply on Aug 18, 2009 4:04 PM by kukeltje

    Maintenance and Repair Tools

    michaelholtzman

      Greetings.

      Is anyone aware of any tools for analyzing, repairing, etc. the jBPM database?

      For example,

      - performing regular maintenance tasks like purging the database of process instances that ended over 1 year ago.

      - identifying and fixing a data integrity constraint error.

      Thanx.

        • 1. Re: Maintenance and Repair Tools
          kukeltje

          1: If there are, they were not mentioned here. Easiest thing to do is write a small application that uses hiberenate to find all processinstances with or beyond a certain end date and do a delete via hibernate. That will cascadingly remove all things

          2: Hmm... I never had a 'data integrity constraint violation' so this will be even less likely to exist

          • 2. Re: Maintenance and Repair Tools
            michaelholtzman

            Yes, I have already written a number of these one-trick ponies to solve specific problems. We're more interested in diagnosing database issues.

            We are pushing jBPM very hard, running multiple processing threads within a java app as well as multiple distributed java apps. (We have a client that requires 20,000+ workflow instances PER HOUR during a three hour window when they do invoice processing).

            As a result we often see generic persistence service errors (unable to sync, deadlocks, etc.) which can be very difficult to diagnose. A tool to analyze and correct or at least identify the cause of these problems would be very helpful.

            • 3. Re: Maintenance and Repair Tools
              kukeltje

              Nice hearing it is used so extensively. Not nice to hear there are persistency errors.

              Ok , first, my experience is that diagnosing deadlocks etc is indeed hard, not only with jBPM, but in general. In most cases we used a dba to do find the deadlocks in combination java/db expert that uses thread dumps from the jvm to see if there are specific threads that are stuck in relation to what is seen in the db. If it is your own code, the process after that is 'fairly' easy since you know the code. If it is jPBM, I can imagine this is not the case.

              But if you inform us where and when you see errors occuring we'll be glad to help out where we can.

              Unfortunately, it is next to impossible for the jBPM project to provide tooling for this.

              Regarding the inconsistencies in data, I'm very curious where that happens. Within jBPM?

              • 4. Re: Maintenance and Repair Tools
                michaelholtzman

                Well, to make things even more fun we support three db platforms: Sybase, Oracle, and SqlServer.

                First off, almost all of the deadlock problems were cured after indexing the JBPM tables on all the foreign key columns. (We almost had to call the paramedics for the DBA when we told him we were adding 100 indexes to the database). That tip should be featured more prominently in your documentation. For Sybase, we had to explicitly enable row locking on the tables as well. BTW, indexing also improved 'deleteInstance' speed from 8 seconds per instance to 4 instances per second.

                Now most often we see 'cannot sync state with session'. This happens when jbpmContext.save() fails for some reason ... we suspect a rollback due to resource contention, but we're not really sure.

                We occasionally see persistence service errors due to failure to serialize/deserialize an object - we believe these are home-grown variable types (for example, we have java objects to represent tables and picklists). Usually we find an error in our implementation that causes this, but more detailed reporting would be useful.

                Our system has lots of inter-related moving parts, so creating a trivial example to recreate these issues is pretty much impossible. See
                http://www.olf.com/ for an overview.

                Thanx for your help.

                • 5. Re: Maintenance and Repair Tools
                  kukeltje

                  Hmm... I wrote a long response to this, but it strangely disappeared. I'll summarize:

                  - Would be great if you could elaborate a little on the way creating the indexes impacted performance more than just the delete (thankfully it was not the coroner you had to call ;-))
                  - If you have some errors regarding the persistency, we'd be happy to help and see if there could be a reason for the error in jBPM (stacktraces e.g.)
                  - I'm not a fan of serialzation (I've had an incompatibility once due to upgrading a jdk). If needed, I marshal the objects, or use explicit hibernate mappings
                  - Creating an example is not trivial, but you expect 'us' to have/know tooling for debugging it.... ;-)

                  • 6. Re: Maintenance and Repair Tools
                    bestage

                    This is a very interesting discussion to follow.

                    @michaelholtzmann: as I see you are using jBPM3.x.x here. I wonder if locking issues also take place with jbpm4. I hope not or to a smaller extent, since the db schema design has been improved considerably.

                    How do you re-use the jBPM component in multiple applications? Are you distributing a jar and use a central jbpm db for the tables? This is what I am planning to do in one of my projects. But I was also thinking about the possible deadlock issues.

                    • 7. Re: Maintenance and Repair Tools
                      michaelholtzman

                      bestage:

                      How do you re-use the jBPM component in multiple applications? Are you distributing a jar and use a central jbpm db for the tables? This is what I am planning to do in one of my projects. But I was also thinking about the possible deadlock issues.


                      Yes, this is the tricky part. We use a central jBPM database. We have a java app written around jBPM, which can have multiple processing threads. We can run multiple copies of this java app concurrently. A central app manages jBPM requests and forwards them to one of the java apps (but does not call into jBPM directly).

                      When a start process instance request is made, the central app sends that command to one of the java apps, and saves the instance id that was started along with the id of the app that is running it. From that point on, any request targeted at that process instance will be handled by the same java app.

                      Similarly, within the java app there is a hash table that links the instance id to the thread number that started the instance, and all operations on that instance are handled by that thread.

                      As long as the jBPM tables are indexed on all foreign keys (and, if necessary, configured for row locking) the threads can operate on the database without stepping on each other.


                      • 8. Re: Maintenance and Repair Tools
                        michaelholtzman

                        kukeltje:

                        - Would be great if you could elaborate a little on the way creating the indexes impacted performance more than just the delete (thankfully it was not the coroner you had to call ;-))


                        (1) When we start up the jBPM java app, we load information about all the running instances from JBPM into our application. This is much faster after indexing the table (generally anything that gathers components of a process instance or process definition is significantly faster).

                        (2) Eliminated (almost) deadlocks on tables by enabling row locking

                        (3) No downgrade in run-time performance (process instances do not run slower due to overhead from indexing)

                        - I'm not a fan of serialzation (I've had an incompatibility once due to upgrading a jdk). If needed, I marshal the objects, or use explicit hibernate mappings


                        When we started this project no one really understood how hibernate worked, much less how to configure it. I think that any new variable classes we add will be implemented the same way as the native variables.

                        - Creating an example is not trivial, but you expect 'us' to have/know tooling for debugging it.... ;-)


                        Well, I wasn't looking for a tool for a specific problem, I was wondering if there were any general purpose DB tools for jBPM. (E.g., in a previous job we had a tool that would 'walk' the database and identify any inconsistencies that might cause future problems).


                        • 9. Re: Maintenance and Repair Tools
                          kukeltje

                          Are 1,2 and 3 valid for all DBMS you use? Would be great if you could confirm that. And maybe also what the default locking mechanisms are on specific database.

                          Surprises me that there is no slowdown btw. And I'm also curious what additional indexes you created that are not already on jBPM. Since I was under the impression that FK relations had an index.

                          Regarding the tool, finding inconsistencies is (imo) a specific problem and not a general purpose tool. Most (all?) inconsistencies (PK-FK violations, required variables not persisted etc...)should not arise if the database and transactions are configured correctly. So I'm not sure what kinds of inconsistencies you are talking about.

                          DBMS have tools to identify some low level issues like partially corrupt tables etc, but I get the impression that is not what you mean.

                          • 10. Re: Maintenance and Repair Tools
                            michaelholtzman

                             

                            Are 1,2 and 3 valid for all DBMS you use? Would be great if you could confirm that.


                            Definitely Oracle and Sybase, don't have much experience with MSSql yet.

                            And maybe also what the default locking mechanisms are on specific database.


                            Defaults:
                            Oracle - Row locking (as long as it has the indexes it needs to group the rows)
                            Sybase - Table locking (row locking must be specified on a table)
                            MSSql - Version 7+ defaults to row locking

                            Surprises me that there is no slowdown btw. And I'm also curious what additional indexes you created that are not already on jBPM. Since I was under the impression that FK relations had an index.


                            Surprised us too :)
                            I think that certain DB's automatically index the foreign key columns, but apparently Oracle, Sybase, and SqlServer do not. We added these indices:
                            /* OLF - Add indices on foreign keys */
                            create index JBPM_ACTION_EVENT_IDX on JBPM_ACTION (EVENT_);
                            create index JBPM_ACTION_EXCEPT_IDX on JBPM_ACTION (EXCEPTIONHANDLER_);
                            create index JBPM_ACTION_PROCDEFN_IDX on JBPM_ACTION (PROCESSDEFINITION_);
                            create index JBPM_ACTION_TIMERACTION_IDX on JBPM_ACTION (TIMERACTION_);
                            create index JBPM_ACTION_ACTIONDEL_IDX on JBPM_ACTION (ACTIONDELEGATION_);
                            create index JBPM_ACTION_REFACTION_IDX on JBPM_ACTION (REFERENCEDACTION_);
                            
                            create index JBPM_BYTEARRAY_FILEDEFN_IDX on JBPM_BYTEARRAY (FILEDEFINITION_);
                            create index JBPM_BYTEBLOCK_PROCFILE_IDX on JBPM_BYTEBLOCK (PROCESSFILE_);
                            
                            create index JBPM_COMMENT_TOKEN_IDX on JBPM_COMMENT (TOKEN_);
                            create index JBPM_COMMENT_TASKINST_IDX on JBPM_COMMENT (TASKINSTANCE_);
                            
                            create index JBPM_DECISIONCOND_DEC_IDX on JBPM_DECISIONCONDITIONS (DECISION_);
                            
                            create index JBPM_DELEGATION_PROCDEFN_IDX on JBPM_DELEGATION (PROCESSDEFINITION_);
                            
                            create index JBPM_EVENT_PROCDEFN_IDX on JBPM_EVENT (PROCESSDEFINITION_);
                            create index JBPM_EVENT_NODE_IDX on JBPM_EVENT (NODE_);
                            create index JBPM_EVENT_TRANSITION_IDX on JBPM_EVENT (TRANSITION_);
                            create index JBPM_EVENT_TASK_IDX on JBPM_EVENT (TASK_);
                            
                            create index JBPM_LOG_SRCENODE_IDX on JBPM_LOG (SOURCENODE_);
                            create index JBPM_LOG_TOKEN_IDX on JBPM_LOG (TOKEN_);
                            create index JBPM_LOG_OLDBYTE_IDX on JBPM_LOG (OLDBYTEARRAY_);
                            create index JBPM_LOG_NEWBYTE_IDX on JBPM_LOG (NEWBYTEARRAY_);
                            create index JBPM_LOG_CHILD_IDX on JBPM_LOG (CHILD_);
                            create index JBPM_LOG_DESTNODE_IDX on JBPM_LOG (DESTINATIONNODE_);
                            create index JBPM_LOG_TASKINST_IDX on JBPM_LOG (TASKINSTANCE_);
                            create index JBPM_LOG_SWIMLANEINSTANCE_IDX on JBPM_LOG (SWIMLANEINSTANCE_);
                            create index JBPM_LOG_PARENT_IDX on JBPM_LOG (PARENT_);
                            create index JBPM_LOG_NODE_IDX on JBPM_LOG (NODE_);
                            create index JBPM_LOG_ACTION_IDX on JBPM_LOG (ACTION_);
                            create index JBPM_LOG_VARINST_IDX on JBPM_LOG (VARIABLEINSTANCE_);
                            create index JBPM_LOG_TRANSITION_IDX on JBPM_LOG (TRANSITION_);
                            
                            create index JBPM_MODULEDEFN_STRTTASK_IDX on JBPM_MODULEDEFINITION (STARTTASK_);
                            create index JBPM_MODULEDEFN_PROCDEFN_IDX on JBPM_MODULEDEFINITION (PROCESSDEFINITION_);
                            
                            create index JBPM_MODULEINST_TASKMGMT_IDX on JBPM_MODULEINSTANCE (TASKMGMTDEFINITION_);
                            create index JBPM_MODULEINST_PROCINST_IDX on JBPM_MODULEINSTANCE (PROCESSINSTANCE_);
                            
                            create index JBPM_NODE_SUBPDEFN_IDX on JBPM_NODE (SUBPROCESSDEFINITION_);
                            create index JBPM_NODE_PROCDEFN_IDX on JBPM_NODE (PROCESSDEFINITION_);
                            create index JBPM_NODE_ACTION_IDX on JBPM_NODE (ACTION_);
                            create index JBPM_NODE_DECISIONDEL_IDX on JBPM_NODE (DECISIONDELEGATION);
                            create index JBPM_NODE_SUPERSTATE_IDX on JBPM_NODE (SUPERSTATE_);
                            
                            create index JBPM_PROCDEFN_STARTSTATE_IDX on JBPM_PROCESSDEFINITION (STARTSTATE_);
                            
                            create index JBPM_PROCINST_PROCDEFN_IDX on JBPM_PROCESSINSTANCE (PROCESSDEFINITION_);
                            create index JBPM_PROCINST_ROOTTOKEN_IDX on JBPM_PROCESSINSTANCE (ROOTTOKEN_);
                            create index JBPM_PROCINST_SUPERTOKEN_IDX on JBPM_PROCESSINSTANCE (SUPERPROCESSTOKEN_);
                            
                            create index JBPM_RTACTION_PROCINST_IDX on JBPM_RUNTIMEACTION (PROCESSINSTANCE_);
                            create index JBPM_RTACTION_ACTION_IDX on JBPM_RUNTIMEACTION (ACTION_);
                            
                            create index JBPM_TASK_TASKCTRL_IDX on JBPM_TASK (TASKCONTROLLER_);
                            create index JBPM_TASK_ASSIGNDEL_IDX on JBPM_TASK (ASSIGNMENTDELEGATION_);
                            create index JBPM_TASK_TASKNODE_IDX on JBPM_TASK (TASKNODE_);
                            create index JBPM_TASK_PROCDEFN_IDX on JBPM_TASK (PROCESSDEFINITION_);
                            create index JBPM_TASK_STARTSTATE_IDX on JBPM_TASK (STARTSTATE_);
                            create index JBPM_TASK_TASKMGMT_IDX on JBPM_TASK (TASKMGMTDEFINITION_);
                            create index JBPM_TASK_SWIMLANE_IDX on JBPM_TASK (SWIMLANE_);
                            
                            create index JBPM_TACTRPOOL_POOLACTR_IDX on JBPM_TASKACTORPOOL (POOLEDACTOR_);
                            create index JBPM_TACTRPOOL_TASKINST_IDX on JBPM_TASKACTORPOOL (TASKINSTANCE_);
                            
                            create index JBPM_TASKCTRL_TSKCNTDEL_IDX on JBPM_TASKCONTROLLER (TASKCONTROLLERDELEGATION_);
                            
                            create index JBPM_TASKINST_TASKMGMT_IDX on JBPM_TASKINSTANCE (TASKMGMTINSTANCE_);
                            create index JBPM_TASKINST_TOKEN_IDX on JBPM_TASKINSTANCE (TOKEN_);
                            create index JBPM_TASKINST_SWIMINST_IDX on JBPM_TASKINSTANCE (SWIMLANINSTANCE_);
                            create index JBPM_TASKINST_TASK_IDX on JBPM_TASKINSTANCE (TASK_);
                            
                            create index JBPM_TIMER_TOKEN_IDX on JBPM_TIMER (TOKEN_);
                            create index JBPM_TIMER_PROCINST_IDX on JBPM_TIMER (PROCESSINSTANCE_);
                            create index JBPM_TIMER_ACTION_IDX on JBPM_TIMER (ACTION_);
                            create index JBPM_TIMER_TASKINST_IDX on JBPM_TIMER (TASKINSTANCE_);
                            
                            create index JBPM_TOKEN_PARENT_IDX on JBPM_TOKEN (PARENT_);
                            create index JBPM_TOKEN_NODE_IDX on JBPM_TOKEN (NODE_);
                            create index JBPM_TOKEN_PROCINST_IDX on JBPM_TOKEN (PROCESSINSTANCE_);
                            create index JBPM_TOKEN_SUBPROCINST_IDX on JBPM_TOKEN (SUBPROCESSINSTANCE_);
                            
                            create index JBPM_TOKVARMAP_CNTXINST_IDX on JBPM_TOKENVARIABLEMAP (CONTEXTINSTANCE_);
                            create index JBPM_TOKVARMAP_TOKEN_IDX on JBPM_TOKENVARIABLEMAP (TOKEN_);
                            
                            create index JBPM_TRANSITION_NODETO_IDX on JBPM_TRANSITION (TO_);
                            create index JBPM_TRANSITION_PROCDEFN_IDX on JBPM_TRANSITION (PROCESSDEFINITION_);
                            create index JBPM_TRANSITION_NODEFROM_IDX on JBPM_TRANSITION (FROM_);
                            
                            create index JBPM_VARACC_TASKCTRL_IDX on JBPM_VARIABLEACCESS (TASKCONTROLLER_);
                            create index JBPM_VARACC_SCRIPT_IDX on JBPM_VARIABLEACCESS (SCRIPT_);
                            create index JBPM_VARACC_PROCSTATE_IDX on JBPM_VARIABLEACCESS (PROCESSSTATE_);
                            
                            create index JBPM_VARINST_TOKEN_IDX on JBPM_VARIABLEINSTANCE (TOKEN_);
                            create index JBPM_VARINST_TOKVARMAP_IDX on JBPM_VARIABLEINSTANCE (TOKENVARIABLEMAP_);
                            create index JBPM_VARINST_PROCINST_IDX on JBPM_VARIABLEINSTANCE (PROCESSINSTANCE_);
                            create index JBPM_VARINST_TASKINST_IDX on JBPM_VARIABLEINSTANCE (TASKINSTANCE_);
                            create index JBPM_VARINST_BYTEVALUE_IDX on JBPM_VARIABLEINSTANCE (BYTEARRAYVALUE_);
                            commit;
                            


                            DBMS have tools to identify some low level issues like partially corrupt tables etc, but I get the impression that is not what you mean.

                            We've enslaved -- I mean, enlisted our DBA's to do this for us, I was hoping for something that worked at a higher level.

                            • 11. Re: Maintenance and Repair Tools
                              kukeltje

                              Thanks, I'll forward this to some people so they can shed a light on this.