-
15. Re: performance statistics for teiid query
rareddy Feb 25, 2013 3:37 PM (in response to yjma2001)Jack,
You are right, the user command does include the waiting time. That way it can give you complete time it took for processing the query, queued or otherwise. With out adding wait time, the numbers are really skewed and favor Teiid processing, in reality it should not. Testing like what you are doing should point you towards minimizing the wait.
Ramesh..
-
16. Re: performance statistics for teiid query
yapnel Feb 25, 2013 4:03 PM (in response to rareddy)Ramesh
I have this federated query fetching data from 5 different data sources ( A,B,C,D and E) with 11360, 15673, 118900, 20890 and 2903045 records respectively. I can see all the data sources timing and connector timings. However, when all the data have been received by Teiid from the data sources, there aren't anymore logging information telling me what is happening in the engine. I know it is doing the joining of the data and something else and I can see the tmp/teiid/buffer directory growing in size. Is there a way to see what is actually happening at this stage and also when the data starts to stream back to the client? I have command and connector log enabled at debug level.
Please advise.
Thanks Nelson
-
17. Re: performance statistics for teiid query
yjma2001 Feb 25, 2013 4:08 PM (in response to rareddy)Hi Ranesh:
If teiid could adds a log entry at command_log, which could let us determining how long the user query stays at waiting queue, that will be very helpfully for us to do the performance tuning up.
Thanks
Jack
-
18. Re: performance statistics for teiid query
rareddy Feb 25, 2013 4:16 PM (in response to yapnel)Nelson,
If you see your QueryPlan (SHOWPLAN DEBUG), it should tell you what processing that Teiid is doing after the fetching of results. I do not believe Teiid has any fine grained timinings on each of those tasks. Also, I do not think there is any log that I can remember that shows start of the first batch of rows to client. It may be useful what you and Jack asking to capture at debug level.
Ramesh..
-
19. Re: performance statistics for teiid query
scottmcentee Feb 25, 2013 4:17 PM (in response to yjma2001)Ramesh,
To elaborate a little more on Jack's request. Weunderstand the importance of including waiting time and arent proposing changing any of the items you are currently logging. But we are also interested in explaining why a particular query took longer than others. For this, it is important to understand if it was a result of wait time.
We have been stress testing the system by running hundreds of simultaneous queries through it. Some of the queries stress the DB servers, some stress Teiid. We have been adjusting various Teiid settings such as maxActivePlans as well as adjusting VM resources for our Teiid server. We have been gathering what metrics we can get from Teiid to clearly identify where the time is being spent. One of the things we wanted to confirm was that Teiid and Data soruce processing time was relatively consistent throughout and that wait time was the significant factor in the queries that took a lot longer. Conceptually, we figured that to be the cases, but we wanted metrics we could use to confirm this.
We can approximate wait time by measuring the time between Start User Command and Plan User Command. We were just hoping that an additional item be logged to indicate when the command was pulled from the queue so that an accurate wait time could be calculated.
Thanks,
Scott
-
20. Re: performance statistics for teiid query
rareddy Feb 25, 2013 4:39 PM (in response to scottmcentee)Jack, Scott,
Taking a look again in the code, looks like the time difference between USER.NEW and PLAN logs, should give you the planning time + waiting time for a given query. Since planning is all Teiid internal and should take more or less equal time always for a given query, I would say the variance is wait time. In that would you still see the need for separate log event?
Ramesh..
-
21. Re: performance statistics for teiid query
yapnel Feb 25, 2013 4:46 PM (in response to rareddy)Ramesh
It will be good to add more logging to capture such timing. It is important we can get access to such metrics especially when we are performing performance tuning exercise. At the moment, we can't determine the bottleneck lies with teiid or client because we cant tell how much time is actually spend in processing the data in teiid and when the client starts receiving the data.
Nelson
-
22. Re: performance statistics for teiid query
scottmcentee Feb 25, 2013 4:59 PM (in response to rareddy)Okay, if the Teiid internal planning time is neglible, then the method of approximating wait time we are currently using is sufficient for our purposes.
-
23. Re: performance statistics for teiid query
rareddy Feb 26, 2013 9:22 AM (in response to yapnel)@Nelson - Teiid does log message every time a batch of results are sent in TRACE mode. Also the logging context is "org.teiid.PROCESSOR". So, if you change logging configuration for this context to the TRACE mode, you will see it.
Ramesh..
-
24. Re: performance statistics for teiid query
yapnel Feb 26, 2013 3:37 PM (in response to rareddy)Thanks Ramesh. That has given us some insight on what is happening and we have managed to locate the bottleneck. It is taking a long time in this step "Created intermediate sort buffer". Can you explain what is actually happening here and any advise on adjusting any teiid parameters to improve the performance. Would slow internal HDD contribute to this as well?
Please advise.
Thanks.
Nelson
-
25. Re: performance statistics for teiid query
shawkins Feb 27, 2013 10:16 AM (in response to yapnel)> Okay, if the Teiid internal planning time is neglible, then the method of approximating wait time we are currently using is sufficient for our purposes.
It typically is. At a debug level for org.teiid.PROCESSOR you can check for the logs:
requestId executing sql
...
ProcessTree for requestId ...
The difference between these two is the time spent in the planner.
> Teiid does log message every time a batch of results are sent in TRACE mode
I think an enhancement may be in order such that we can maintain a stats object with the CommandContext to track relevant timestamps in a single place that will make correlating that information much easier.
> It is taking a long time in this step "Created intermediate sort buffer". Can you explain what is actually happening here and any advise on adjusting any teiid parameters to improve the performance. Would slow internal HDD contribute to this as well?
Full sorts (not just dup removal) are a blocking operations as the sort must have all of the incoming data before producing output. Large sorts can take a lot of time to process and more so if there is a slow internal HDD since Teiid will be paging to disk as needed. You'll need to see the surrounding plan to understand how the sort is being processed. A recent issue: https://issues.jboss.org/browse/TEIID-2363 helps address latency with some sorts in join operations, but that may not be applicable depending on your scenario.
Steve
-
26. Re: performance statistics for teiid query
yapnel Feb 28, 2013 5:00 PM (in response to shawkins)Thanks Steve. That was helpful
Can you please explain what the below means? Is this the number of records that have been streamed back to the client and the size of the batch is controlled by the parameter <property name="processorBatchSize">1024</property> ?
"[org.teiid.PROCESSOR] (New I/O server worker #4-1) DQP process cursor request from 6145"
Sometimes i'm seeing 48 batch size even though the parameter is set to 512.
2013-02-28 16:38:38,312 DEBUG [org.teiid.PROCESSOR] (New I/O server worker #1-1) DQP process cursor request from 1888723
2013-02-28 16:38:38,320 DEBUG [org.teiid.PROCESSOR] (New I/O server worker #1-1) DQP process cursor request from 1888771Setting the FetchSize=2048 in the connection URL did not influence the fetch size and i can't see anything in the log either.
Please advise.
-
27. Re: performance statistics for teiid query
shawkins Feb 28, 2013 7:11 PM (in response to yapnel)> Can you please explain what the below means? Is this the number of records that have been streamed back to the client and the size of the batch is controlled by the parameter <property name="processorBatchSize">1024</property> ?
There are several relevant factors that control what the client gets in a results batch. The client fetch size is honored as an upper limit, but because the logic was implemented to return batches as quickly as possible the client can receive partial processing batches (that for example were a larger batch that has been passed through a filter) when just performing forward iteration. The processorBatchSize sets the nominal target for rows in a batch, but the engine may use a smaller or larger size depending upon the datatypes and how wide the data is. Setting the client fetch size to larger than the working batch size currently has no effect - that's a todo that hasn't been logged yet. There is also a max fetch size setting in the configuration that in theory limits the what could requested by the fetch size, but it is practially superceded by the prior limitation.
If you are seeing lots of small partial batches, then output logic should probably be tweaked to better balance latency vs. network traffic.
Steve