Version 2

    For back ground, see http://in.relation.to/Bloggers/MultitableBulkOperations

     

    In Hibernate we decided to implement bulk HQL/JPAQL operations against "multi table structures" using temporary tables.  The temporary table is used to temporarily hold the ID values that matched and is then used as the restriction to the individual update and delete statements.

     

    Anyway, we have run into a number of issues due to varied support for temporary tables amongst databases:

    1. Obviously, does the database support temporary tables?  For example HSQLDB does not.
    2. Temporary tables generally come in 2 flavors, local and global.  Local temp tables are scoped to the connection that created them; global are shared amongst all connections.  Local temp table are preferred since the data in scoped to the connection, so therefore multiple connections can be using a temp table named HT_TMP and each will see only its data.  However, many times databases implement DDL (table creation, for example) as causing an implicit commit.  That makes them worthless for our use because we need to make sure creating the temporary table does not commit any pending changes.  The current solution on these databases is to isolate the creation of the temp table.  The specifics of isolation depend on the transaction strategy; in JTA scenarios we suspend the transaction, start a new one, create the table, commit transaction and then resume the original transaction; in JDBC transaction cases we simply open an new connection and perform  the table creation there.  All is great so far, except that as I said local temp tables (the preference) are only visible to the connection which created them.  This means we cannot use local temp tables, for certain in the JDBC transaction environment and I am pretty sure some JTA environments havr the same issue.  Which means we need to instead use global temp tables or non-temp tables.  However that opens up the possibility of data clashes (connection -1 sees its data, but also the data from connection-2, etc).  This is sort of where we are today

     

    It is worth mentioning Oracle's general approach to this.  They offer a global temp table that functions like a local temp table because they maintain a hidden column that tracks the connection to which the data belongs; it handles that transparently.

     

    So we need a set of solutions to cover all the cases.  Obviously local temp tables are the preference, but they are only viable if the database supports them and supports them in such a way that the table creation does not cause a transaction commit.  Beyond that we have to choose amongst a series of less-than-desireable choices.

     

    Another question is whether we want to support cases where the connection user does not have the necessary privs to create tables.

     

    I say for the time being we continue to use temporary tables for this (managing the ids in memory is fugly).  So that really leaves us with what strategies do we want to support for handling "id tables"?  It seems to me we are best served minimizing these, though if at all possible encapsulating this behind a "strategy interface".

     

    So maybe we support:

    1. local temp table strategy where the temp table creations do not commit transactions
    2. global (temp or otherwise) tables where we manage the "segmentation"
    3. global tables where the db manages the "segmentation" (ala oracle)