Connection Reset.. sometimes..
woodsr Oct 10, 2017 8:23 PM
I am executing an ETL job through a Teiid JDBC driver on a Linux box, hitting an Oracle 11 DB. Some job sizes run with no errors and function exactly as designed: 1 update statement works fine, and 1,000 work fine. With 10,000, our ETL tool reports this error after 3964 records have been updated in the database:
SQLException org.teiid.net.socket.SingleInstanceCommunicationException: Connection reset
I then tried several other record counts and got other odd results. With 100,000 records the behavior is the same, but this time it fails at 4048 records.
What we can’t figure out is why that number is so consistent. A specific set of 10,000 records always fails at the same record count, while a slightly different set of 10,000 records fails at a slightly different point. All of the failure points were within +/- 100 records of 4,000.
The other twist is that sometimes it works. I can run the same 10,000-record job 5 times: two times it works, and three times it fails at exactly the same record count, which in this example is 3964.
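To pin down the failure point outside the ETL tool, a small standalone harness could count successful updates and record the elapsed time when the first exception is thrown. The following is only a minimal sketch: the class `FailurePointProbe`, the `UpdateRunner` hook, and the simulated failure in `main` are all hypothetical names for illustration. In a real test the lambda would call `executeUpdate()` on a statement opened over the Teiid JDBC connection.

```java
import java.sql.SQLException;

public class FailurePointProbe {

    /** Hypothetical hook: executes the update statement with the given index. */
    @FunctionalInterface
    public interface UpdateRunner {
        void run(int index) throws SQLException;
    }

    /** Outcome of one probe run. */
    public static final class Result {
        public final int succeeded;       // updates completed before any failure
        public final long elapsedMillis;  // wall-clock time until completion or failure
        public final SQLException error;  // null if every update succeeded

        Result(int succeeded, long elapsedMillis, SQLException error) {
            this.succeeded = succeeded;
            this.elapsedMillis = elapsedMillis;
            this.error = error;
        }
    }

    /** Runs up to total updates and stops at the first SQLException. */
    public static Result probe(int total, UpdateRunner runner) {
        long start = System.nanoTime();
        int done = 0;
        SQLException error = null;
        try {
            for (int i = 0; i < total; i++) {
                runner.run(i);
                done++;
            }
        } catch (SQLException e) {
            error = e;
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000L;
        return new Result(done, elapsed, error);
    }

    public static void main(String[] args) {
        // Stand-in runner that simulates a reset on the 3965th update.
        // Replace with a real executeUpdate() call against the JDBC connection.
        Result r = probe(10_000, i -> {
            if (i == 3964) {
                throw new SQLException("Connection reset");
            }
        });
        System.out.println("succeeded=" + r.succeeded
                + " elapsedMs=" + r.elapsedMillis
                + " error=" + (r.error == null ? "none" : r.error.getMessage()));
    }
}
```

Comparing the elapsed time across failing runs might show whether the reset tracks the record count or a time limit, which the record count alone cannot distinguish.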
After a lot of Google research and bugging different people, I decided to post this problem on a few sites to see if anyone can recommend some troubleshooting ideas, as we are officially out of them.
What we have tried and ruled out:
Nothing is unique about the data or the query that is created. We can take those queries and run them directly in Oracle and they work fine without exception.
The ETL tool has been ruled out, as it runs thousands of other jobs without this issue.
We have tried Oracle and Linux settings, such as changing SQLNET.EXPIRE_TIME to various values and using urandom vs. random in the Java settings. Those seem to change how often the job passes or fails (instead of failing every other time, it started failing every 3 or 4 times), but the record count at which it fails does not change. If it failed, it failed at the exact same record count.
Environment:
Teiid Driver 8.4 (also tried 9.1, same result)
Oracle 11.2.0.4.0
Linux 6.9 64bit