Java 6 update 5 (both)
JBoss 4.2.3GA (Server 1) and JBoss 4.2.2GA (Server 2)
Oracle 10.1.0.2 (Server 1) and 10.1.0.5 (Server 2)
I have a very strange issue that I have no idea how to resolve. I don't even know what causes it. Basically, some queries that are run through JDBC from the web application (a war file running on JBoss 4.2.3GA; the same thing happens when we run it on 4.2.2GA) take forever to run... sometimes! Other times they are fast. When the same query is run through a regular SQL query tool like Oracle SQL Developer, it always runs fast. What has been happening is this:
The users are using the application and everything runs fast. Then it suddenly slows down to a crawl. The same functions are being used and the same queries executed whether it's running fast or slow. I log into the server and monitor the processes in Task Manager. When the application runs fast, the Oracle process (the one for the database itself) takes anywhere from 0 to 14 percent of the CPU at most. When the app slows down, the Oracle process's CPU usage jumps to 25 - 37%. The number of people logged into the app has no visible impact on the speed. You can have one or 25 people logged in and it's just as slow. You can have the same 25 people logged in and it's fast.
For the first month of using the program there wasn't a problem. Then for a few weeks we didn't use it, and when we started using it again the problem manifested. I checked with the guys responsible for the servers and they say no maintenance has been done on the hardware or software of the servers.
How we tried to fix the problem:
Reinstalled Oracle (didn't work).
Put System.out.println calls all over the code. In one case where I had a S.o.p before and after the line that queries the database, the "before" message was printed but not the "after". It was as if the thread never came back. Now it is coming back, just taking a really, really long time.
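The before/after S.o.p approach above can be made a little more informative by recording the elapsed time and the thread name. This is only a minimal sketch: `QueryStopwatch` and the `loadOrders` label are hypothetical names made up for illustration, not anything from the application or from JBoss.

```java
public class QueryStopwatch {
    private final String label;
    private final long startNanos;

    private QueryStopwatch(String label) {
        this.label = label;
        this.startNanos = System.nanoTime();
    }

    /** Prints a START line with the current thread name and starts timing. */
    public static QueryStopwatch start(String label) {
        System.out.println("[" + Thread.currentThread().getName() + "] START " + label);
        return new QueryStopwatch(label);
    }

    public String label() {
        return label;
    }

    /** Prints an END line and returns the elapsed wall-clock time in milliseconds. */
    public long stop() {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1000000L;
        System.out.println("[" + Thread.currentThread().getName() + "] END " + label
                + " after " + elapsedMillis + " ms");
        return elapsedMillis;
    }

    public static void main(String[] args) {
        QueryStopwatch sw = QueryStopwatch.start("loadOrders"); // hypothetical label
        try {
            // The JDBC call being measured would go here.
        } finally {
            sw.stop(); // runs even if the call throws, so a failed query is still timed
        }
    }
}
```

Calling `stop()` in a `finally` block means a START line with no matching END line points at a call that really has not returned yet, rather than one that threw an exception.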
Another thing we did was try to upgrade the Oracle driver that the server uses. I'm not sure which version was on there in the first place since both are named ojdbc14.jar, but the old one had a last modification date of 1/27/2005, and the one we downloaded for our version of Oracle... well, I don't know, it just gives the date and time of when I downloaded it. But here's the strange part: replacing the driver seems to have fixed the problem for a few days. Everything started running fast again... Then after a few days it all slowed down to a crawl. Just restarting JBoss and clearing the temp and work folders didn't work. So what we did was replace the new driver with the old one again, just to see what happens... The app started running fast again. This again lasted for a couple of days, then it slowed down. Then we replaced the driver with the new one again and it started running normally again, and so on. Finally, swapping the driver back and forth stopped working (what surprises me is that it worked in the first place). Another thing that seemed to fix the slowness for about 20 minutes or so is restarting the server itself.
It sounds like a memory leak, maybe, but the memory usage is low and doesn't go up when the slowness happens. Restarting JBoss and clearing the temp, work and data folders doesn't do anything either.
The difference in speed between when the app works as it's supposed to and when it's slow is drastic. When it's fast it takes about a second to three seconds to get from one screen to the next. When it's slow it can take from 30 seconds to 2 minutes. Same queries are being run.
We tried it on 2 different servers. Same problem on both. One of them is a dedicated server for the app. It also houses the database. And no other program is connected to the database except this one.
Sorry for the long post. I'm not sure whether the problem is Oracle, JDBC driver, our program or something else.
If anyone has any idea as to what might be causing this and how to fix it, please let me know.
Have you used any Oracle tools to monitor the requests within Oracle? Those tools should be able to tell you if the long processing time is happening within Oracle or outside of it.
Have you monitored garbage collection activity? Any chance that a major garbage collection is happening at the time of the slow requests?
You stated that you have S.o.p calls before/after the JDBC calls and that those are showing that the slowdown occurs in the calls, so we can probably eliminate your app. But the above suggestions should help pinpoint if the database or the JDBC driver is at fault.
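One way to see where a request is stuck while it is slow is to print the stack of the thread that is executing it. The sketch below uses only `Thread.getAllStackTraces()` from the JDK; the `ThreadSnapshot` class and the name filter are assumptions for illustration. If the stack shows the thread waiting inside the driver on a socket read, that would suggest it is waiting on the database rather than doing work in the application.

```java
import java.util.Map;

public class ThreadSnapshot {
    /**
     * Returns the name, state and stack of every live thread whose name
     * contains nameFilter (or of all threads if nameFilter is null).
     */
    public static String dump(String nameFilter) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (nameFilter != null && !t.getName().contains(nameFilter)) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Dump only the current thread here; in a server you would filter on
        // whatever naming scheme its request threads actually use.
        System.out.println(dump(Thread.currentThread().getName()));
    }
}
```

A full thread dump taken from outside the JVM gives the same information without code changes; the in-process version is just easier to trigger at the exact moment a request is slow.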
Our DBA says he's using Enterprise Manager to monitor Oracle and nothing weird seems to be happening.
I don't think it's garbage collection, because the memory usage on that machine is not going up.
At least we now know that it is not Oracle, so it is either the JDBC driver or garbage collection.
Memory usage as reported by OS-based tools tells you very little about garbage collection. Those tools will typically report the same memory usage after a garbage collection as before, because as far as the OS is concerned the JVM's heap is still in use. I would suggest trying the -verbose:gc option, using -Xloggc to redirect the GC data to a file, and seeing whether the slow periods line up with garbage collection activity.
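As a complement to the GC log, the JVM also exposes collection counts and accumulated collection time through the standard `java.lang.management` API, which the application can print next to its own timing output. This is a rough sketch; the `GcSnapshot` class name is made up for illustration, and the numbers are totals since JVM start, so you would compare two snapshots taken before and after a slow request.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {
    /** Total number of collections across all collectors since JVM start. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) { // -1 means the collector does not report a count
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds since JVM start. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) { // -1 means the collector does not report a time
                total += millis;
            }
        }
        return total;
    }

    /** One-line summary suitable for printing next to request timings. */
    public static String describe() {
        return "GC: collections=" + totalCollections() + " timeMs=" + totalCollectionMillis();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If a request takes a minute while the accumulated GC time barely moves between the two snapshots, garbage collection is unlikely to be the cause.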
I'm trying the GC logging as you suggested. The app has not slowed down yet since we last replaced the drivers, so I'm waiting for that to happen to compare the GC logging charts (tagtraum GCViewer makes a pretty chart out of the log ^_^)
One of the guys on the Sun forums suggested the following:
Something "clever" is going on - like serialized persisted data in the file system. Could be jboss, could be your own code, but when the drivers get switched the persisted data is recreated. Without the driver switch the data is maintained and loaded on start up. And it is that data (and implementation of the store) that is causing the slowdown.
Have you ever heard of JBoss doing anything of the sort?