-
1. Re: JBCACHE-382, improve the performance of JDBCCacheLoader
hmesha Feb 2, 2006 10:04 PM (in response to hmesha)WRT using stored procedure:
Stored procedure is DBMS vendor specific so if we use stored proc then we have to write stored procedure for all the databases that we support. Also, it's not a "write once run anywhere" solution.
My suggestion is to make the stored procedure optional to use and pluggable through the configuration file as we currently do with the other variablies inside JDBCCacheLoader. Therefore, It'll be upto the user to write the stored procedure for the database they use.
Any other ideas? -
2. Re: JBCACHE-382, improve the performance of JDBCCacheLoader
manik Feb 3, 2006 5:52 AM (in response to hmesha)I'm not too keen on the stored proc idea - not all DBs even support stored procs (MySQL < 5.0)
Having putObject() call put(fqn,map) may help. If we are in a TX though, isn't this how it is done anyway - we call loader.prepare(modificationlist) on the cache loader when the tx is coompleting? -
3. Re: JBCACHE-382, improve the performance of JDBCCacheLoader
ben.wang Feb 3, 2006 10:11 PM (in response to hmesha)Either stored procedure or batch update is needed on the cache loader side. This is not a problem on the POJO Cache side. Any cache loader user that is doing a lot of get/set within the same transaction will trigger this kind of problem.
-
4. Re: JBCACHE-382, improve the performance of JDBCCacheLoader
hmesha Feb 17, 2006 9:57 AM (in response to hmesha)Through emails we have been discussing the use of batch facility of JDBC 2.0 with this issue. Below are the replies. Based on this discussion, I'm moving this issue to 1.4 release for further investigation.
Hany
Hany Mesha wrote:
WRT this issue, there was 2 options to improve the performance.
1) the use of Stored Proc. which is ruled out in the design forum thread http://www.jboss.com/index.html?module=bb&op=viewtopic&t=76590
2) the use of batch update, I have tried several technique to enforce batch update on put() and I didn't succeed. I came to the conclusion that the updates are spread over multiple transactions making it impossible to do batch update and maintain working state of the objects.
I don't see a better way to improve performance with the current CacheLoader design and my knowledge of JDBC programming. So I have run out of options at this point, any thoughts?
P.S. If you're interested in seeing the impl of the batch update, please let me know and I can send you the class and its test case
Bela Ban wrote:
Maybe there is no solution then.
Okay, to recapitulate, the loader gets a put(List modifications) and,
for each modification M, sends a JDBC statement to the DB.
What we want to bundle is the communication between loader and DB, but
if we cannot use stored procs, then we may have to look into batch
updates (available with JDBC 2.0):
http://www.jguru.com/faq/view.jsp?EID=5079 (google "batch JDBC")
This might be the solution. I don't mind requiring JDBC 2.0 as baseline
for JDBCCacheLoader.
Can you look into this and see what JDBC max version is supported, say,
by Oracle, MySQL, MSSQL and PostgresQL ?
Or, check whether JDBC 2.0 is supported by the driver, and ? if yes ? do
batch updates, otherwise fall back to single statements.
Ben Wang wrote:
Yeah, if we can't find a good enough solution, then let's mark it and
put it in later release. Maybe something better will come up. Or is
using as Hibernate as a persistence plugin will help here? That is, if
Hibernate can help, can we write a layter that works for both our native
JDBC and Hibernate. We don't ship Hibernate as a default. But if a suer
wants it, he will download it himself?
I have cc?ed Steve again for this thread. Steve, care for a quick
comment? :?)
Thanks,
?Ben
Steve Ebersole wrote:
JDBC2??? Welcome to the 20th century; oh wait... ;)
Most (all *real*) databases support JDBC3 (you know JDBC4 is already in
public review, right...). All the ones you listed support JDBC3. Of
course "support" varies in its definition, but the batch update api is
supported by all of them.
JDBC batch updates, by definition, cannot span transactions. They need
to be done within the same transaction. Also, you can only compose
batches made up of the same prepared statement (the different elements
in the batch are just different sets of parameters); this in particular
is a real weakness in the JDBC batch update api. Another option (if you
need to batch together different statements) would be to use the
dialect?specific statement grouping syntax (i.e., "BEGIN ... END" on
Oracle) if such a thing is supported.
Or, of course, we can talk about potential Hibernate usage here...
Bela Ban worte:
Steve just discovered what an idiot I am wrt JDBC... :?) Last time I
used JDBC was 19XX...
Oh, I see, if we can only batch together statements of the same kind
(INSERT, UPDATE), then that feature
may be useless to us.
Hany: can you check what statements we actually use in the JDBCCacheLoader ?
Hmm, actually looking at this, we're only using 2 kinds of statements:
INSERTs and UPDATEs. Probably REMOVEs too.
So, as an optimization, we could try to at least bundle statements of
the same kind,
e.g. I I I U U I U U U U R I U U
could be bundled into 3 inserts, 2 updates, 1 insert, 4 updates, 1
remove, 1 insert, 2 updates...
If that's the case, then (Hany) I suggest you add these comments to the
case, and move it to 1.4,marked as an optimization
Bela Ban worte:
Okay. Hany, I think you have enough information to get started on this.
If we can find out whether
addBatch() is supported for a given driver (maybe it'll throw an
UnsupportedOperationException if not), we could
make use of it, and simply fall back to multiple statements if this is
not supported.
Thanks for the clarification Steve !
Cheers,
Hany Mesha worte:
First, let me thank Steve for a well done explanation.
Now, Here's what I have done:
I created JDBC2CacheLoader to experiment with the batch update facility. I mainly focused on put(List modifications). After all, it's the performance bottle neck reported on this issue. In put(List modifications) I test the driver for batch update support. If true, then I mount the update statements on a list, then at the end I call executeBatch() to perfrom the batch update.
What I have found out is that the CacheLoader, by design, we do make flow control decisions between updates and for the reasons that Steve stated below that the batch of updates can only be applied to a narrow scope. If the scope is widen, the update statements collide with each other (see the attached files for more details)
Running the test cases with batch update and without shows that we're not gaining any performance using batch in that limited scope.
Therefore, as I said in the beginning of this email thread, CacheLoader design prevents us from using batch update effectively.
Finally, I'm suggesting to push this issue to 1.4 for further investigation of a better solution. Perhaps, using Hibernate as Steve suggested.
What you think?