Interesting insight ...
But I doubt that CMP is trying to work in harmony with the database server.
I think CMP is competing against the database server and trying to take over many important functions traditionally provided by database servers, namely:
(1) Transaction/Rollback.
CMP could do those in a standard high-level programming language and environment (Java, EJB) instead of clumsy database languages that vary from vendor to vendor.
EJB could do distributed transactions quite painlessly.
(2) Data Maintenance points.
"Traditionally", when updating a logical group of database objects (e.g. for an audit log of changes), we code the update to the underlying data table and the insert/update to the corresponding audit log tables as one transaction block in a package stored procedure, or do it in trigger functions.
It means one update to a table is transparently co-ordinated into multiple updates to other tables.
EJB does not quite allow this: the data objects cached by EJB must stay in sync with the underlying database records, but EJB does not know about the subsequent updates made by the package stored procedures or triggers.
Therefore, only one of the two (EJB or database) should be the data maintenance kernel.
In the eyes of the EJB/CMP container, chances are the database server will just be a relatively dumb I/O layer to the hard disk. Its only contribution is to optimize data retrieval for the EJB-generated SQL.
I am trying to see how, under a J2EE implementation, to move the bunch of Oracle package functions, procedures and triggers in my current application (which uses only a Servlet/JSP-JDBC framework) to EJB, and I have to choose where these functional roles should eventually go (CMP or database).
I find that basically, they do not quite mix.
It is a lot of work and quite a big move for these stored procedures and functions, and CMP/CMR is not yet mature enough to port them over to (in terms of performance as well).
It is a headache.
> I think CMP is competing against and trying to take
> over database server in many important functions
> traditionally provided by database servers, namely,
CMP, or actually JBoss Persistence, is the manifestation of these functions inside JBoss. So it is both taking them over and working in harmony. A smart persistence implementation would be able to adapt to the capabilities of the database. For example:
* file store - JBoss does everything
* MySQL - JBoss uses multiple SQL operations to work around subqueries
* SQL Server - JBoss uses SQL92 grade operations
* Oracle 9i - JBoss uses SQL99(ish) operations
* LDAP - JBoss optimizes writes more aggressively (as they are relatively more expensive)
* database with stored procedures - JBoss is aware of the effects of procs/triggers and can use/handle them
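To make the "adapt to the capabilities of the database" idea concrete, here is a rough sketch of the subquery case from the list. It is not how JBoss does it; `FinderPlanner`, the `orders`/`customers` tables and the `supportsSubqueries` flag are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical planner: picks SQL according to what the target database can do.
class FinderPlanner {
    // Returns the statements to run, in order, for "orders of customers in a region".
    static List<String> plan(boolean supportsSubqueries) {
        if (supportsSubqueries) {
            // One round trip: let the database resolve the subquery.
            return Arrays.asList(
                "SELECT o.id FROM orders o WHERE o.customer_id IN "
                    + "(SELECT c.id FROM customers c WHERE c.region = ?)");
        }
        // Work-around: two simpler statements, joined up in the persistence layer.
        return Arrays.asList(
            "SELECT c.id FROM customers c WHERE c.region = ?",
            "SELECT o.id FROM orders o WHERE o.customer_id IN (/* ids from step 1 */)");
    }
}
```

The point is only that the same finder can map to different SQL depending on a capability flag, so the application code above it does not change.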
> (1) Transaction/Rollback.
> CMP could do those in a standard high level
> programming language and environment (Java,EJB)
> instead of clumsy database languages which vary from
> database vendors to vendors.
> EJB could do distributed transaction quite painlessly
I think transaction management is separate from persistence (EJB treats it this way). They just need to co-operate for synchronization (e.g. dump state on rollback, or synch the cache on commit).
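As a minimal sketch of that co-operation (my own illustration, not the container's actual API): the cache stages writes made inside a transaction and exposes two callbacks that a transaction manager would invoke. `TxAwareCache`, `afterCommit` and `afterRollback` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity cache that co-operates with a transaction via two callbacks.
class TxAwareCache {
    private final Map<String, String> committed = new HashMap<>();
    private final Map<String, String> pending = new HashMap<>();

    // Writes made inside the transaction are staged, not yet committed state.
    void put(String key, String value) {
        pending.put(key, value);
    }

    // Reads see the transaction's own staged writes first, then committed state.
    String get(String key) {
        return pending.containsKey(key) ? pending.get(key) : committed.get(key);
    }

    // Cache synch on commit: staged state becomes the committed state.
    void afterCommit() {
        committed.putAll(pending);
        pending.clear();
    }

    // Dump state on rollback: staged state is simply discarded.
    void afterRollback() {
        pending.clear();
    }
}
```

The transaction manager never needs to know how state is stored; it only has to call one of the two callbacks when the transaction ends.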
> (2) Data Maintenance points.
> "Traditionally", when updating a logical group of
> database objects (e.g. for Audit log of changes), we
> code the updates to underlying data table and
> insert/update in corresponding audit log tables in a
> blocked transaction in a Package stored procedures,
> or do it in Trigger functions.
This is an application design choice to split processing inside and outside the database. This has consequences...
> It means one update to a table is co-ordinated into
> multiple updates to other tables transparently.
> In EJB could not quite allow this as EJB caching of
> data objects need to be persisted with underlying
> database records in which EJB does not know the
> subsequent updates by the Package stored procedures
> or Triggers.
This is one of the consequences of split processing - the cost of cache synch. The updates done in the database don't invalidate state cached in the EJB tier. The simplest workaround for that is just not to cache the data in the EJB tier; of course, the downside is the extra cost of reading the uncached data.
Pertaining to audit, this looks like a reasonable compromise - you centralize auditing in one place so you can guarantee all updates are logged and the performance impact of reading the uncached audit trail is manageable as it is not done often.
If you can guarantee all updates are initiated from the EJB tier, a hybrid solution is also possible. If, and it's a big if, you can describe to the persistence manager all the side effects of the original update performed by the triggers/procedures, then it can factor those effects into its cache management.
Taking your audit scenario, this would allow the persistence manager to cache the state of audit log entries associated with an object. It would know, somehow, that when it updates the object in the database it must also invalidate any cached log entries.
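One way the "it would know, somehow" part could look, purely as a sketch: the side effects of triggers are declared to the cache up front, and an update to one table evicts the cached rows of the dependent tables for the same key. `SideEffectAwareCache` and the `ACCOUNT`/`ACCOUNT_AUDIT` names used with it are assumptions for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical cache that is told which tables a trigger touches as a side effect.
class SideEffectAwareCache {
    // Cached state keyed by "table:ownerId".
    private final Map<String, Object> cache = new HashMap<>();
    // Declared side effects: updating a row in the key table changes the value tables.
    private final Map<String, Set<String>> sideEffects = new HashMap<>();

    void declareSideEffect(String table, String affectedTable) {
        sideEffects.computeIfAbsent(table, t -> new HashSet<>()).add(affectedTable);
    }

    void put(String table, String ownerId, Object state) {
        cache.put(table + ":" + ownerId, state);
    }

    Object get(String table, String ownerId) {
        return cache.get(table + ":" + ownerId);
    }

    // Called after the UPDATE for (table, id) has been sent to the database.
    void onUpdated(String table, String id, Object newState) {
        put(table, id, newState);
        // Evict whatever the triggers will have changed behind our back.
        for (String affected : sideEffects.getOrDefault(table, Collections.emptySet())) {
            cache.remove(affected + ":" + id);
        }
    }
}
```

For example, after `declareSideEffect("ACCOUNT", "ACCOUNT_AUDIT")`, a call to `onUpdated("ACCOUNT", "42", ...)` drops the cached audit entries for account 42, so the next read goes back to the database.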
Another alternative here arises if the Persistence Manager can use procedure calls to perform operations. Then, rather than executing a simple SQL update to store the object, it directly calls the procedure that does the update, generates the audit entries and returns them. The PM can then update its cached state with the new data. This requires co-operation between the two tiers but uses this basic form of notification to avoid cache
This might be a good fit for Sybase/SQL Server configurations that make extensive use of stored procedures.
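A rough sketch of the procedure-call alternative, with the database call abstracted away: `UpdateProcedure` is a hypothetical stand-in for whatever would actually invoke the stored procedure (over JDBC that would presumably be a `CallableStatement`), and `ProcedureBackedStore` is likewise made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a stored procedure that updates a row
// and returns the audit rows it generated.
interface UpdateProcedure {
    List<String> call(String id, String newValue);
}

class ProcedureBackedStore {
    private final UpdateProcedure procedure;
    private final Map<String, String> entityCache = new HashMap<>();
    private final Map<String, List<String>> auditCache = new HashMap<>();

    ProcedureBackedStore(UpdateProcedure procedure) {
        this.procedure = procedure;
    }

    // Store via the procedure, then fold its returned audit rows into the cache.
    void store(String id, String newValue) {
        List<String> newAuditRows = procedure.call(id, newValue);
        entityCache.put(id, newValue);
        auditCache.computeIfAbsent(id, k -> new ArrayList<>()).addAll(newAuditRows);
    }

    String cachedValue(String id) {
        return entityCache.get(id);
    }

    List<String> cachedAudit(String id) {
        return auditCache.getOrDefault(id, new ArrayList<>());
    }
}
```

Because the procedure hands back what it wrote, the cached audit trail stays current without a second read.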
If all these CMP/CMR evolutions are implemented successfully (functionality- and performance-wise), chances are developers would prefer to code in a Java environment (a structured OOP environment).
I think it could take quite some time to develop and optimize (for performance). The container's CMP/CMR/EJB-QL is basically retracing the path that SQL and database query optimizers have already travelled (outer joins, sub-query optimization). Further, if the container really wants to optimize the CMR/EJB-QL stuff, sooner or later it needs access to the statistics estimates (e.g. from analysing the tables and indexes) of the physical data volumes, at least for the key huge tables. Eventually, it is taking over the whole database server.
Just thinking ... Conceptually, EJB sounds very neat, but is it worth creating an almost full replica of the database/SQL path, which the RDBMS has already made quite mature and efficient?
One bridging approach, as you say, is to develop a functional abstraction of a cache validation interface to the database server. This would allow the split-processing issues to be addressed, and CMP/CMR could work in a more co-operative mode with the database server.
The typical and simplest scenario is updating a huge number of records. It is more efficient to send one update statement to the database. The container's cache validation interface would then let the container invalidate just those updated EJBs in the cache, instead of disabling caching of specific EJBs in the container altogether.
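Something like the following is what I have in mind for that scenario, as a sketch only. `InvalidatingCache` and its `invalidate` method are hypothetical; the idea is that whoever issues the single bulk UPDATE passes the same selection condition to the cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical container cache with a validation hook for bulk updates.
class InvalidatingCache {
    private final Map<Integer, String> entries = new HashMap<>();

    void put(int id, String state) {
        entries.put(id, state);
    }

    boolean contains(int id) {
        return entries.containsKey(id);
    }

    // Evicts every cached entry whose key matches the condition used by the
    // bulk UPDATE, and returns how many entries were dropped.
    int invalidate(Predicate<Integer> affected) {
        int before = entries.size();
        entries.keySet().removeIf(affected);
        return before - entries.size();
    }
}
```

So after one `UPDATE ... WHERE id < 1000` goes to the database, a call such as `cache.invalidate(id -> id < 1000)` throws away only the stale entries, and everything else stays cached.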