1 2 Previous Next 27 Replies Latest reply on Feb 28, 2013 7:11 PM by shawkins Go to original post
      • 15. Re: performance statistics for teiid query
        rareddy

        Jack,

         

        You are right, the user command does include the waiting time. That way it can give you complete time it took for processing the query, queued or otherwise. With out adding wait time, the numbers are really skewed and favor Teiid processing, in reality it should not. Testing like what you are doing should point you towards minimizing the wait.

         

        Ramesh..

        • 16. Re: performance statistics for teiid query
          yapnel

          Ramesh

           

          I have this federated query fetching data from 5 different data sources  ( A,B,C,D and E)  with 11360, 15673, 118900, 20890 and 2903045 records respectively. I can see all the data sources timing and connector timings. However, when all the data have been received by Teiid from the data sources, there aren't anymore logging information telling me what is happening in the engine. I know it is doing the joining of the data and something else and I can see the tmp/teiid/buffer directory growing in size. Is there a way to see what is actually happening at this stage and also when the data starts to stream back to the client? I have command and connector log enabled at debug level.

           

          Please advise.

           

          Thanks Nelson

          • 17. Re: performance statistics for teiid query
            yjma2001

            Hi Ranesh:

             

            If teiid could adds a log entry at command_log, which could let us determining how long the user query stays at waiting queue, that will be very helpfully for us to do the performance tuning up.

             

            Thanks

            Jack

            • 18. Re: performance statistics for teiid query
              rareddy

              Nelson,

               

              If you see your QueryPlan (SHOWPLAN DEBUG), it should tell you what processing that Teiid is doing after the fetching of results. I do not believe Teiid has any fine grained timinings on each of those tasks. Also, I do not think there is any log that I can remember that shows start of the first batch of rows to client. It may be useful what you and Jack asking to capture at debug level.

               

              Ramesh..

              • 19. Re: performance statistics for teiid query
                scottmcentee

                Ramesh,

                 

                To elaborate a little more on Jack's request.  Weunderstand the importance of including waiting time and arent proposing changing any of the items you are currently logging.  But we are also interested in explaining why a particular query took longer than others. For this, it is important to understand if it was a result of wait time.

                We have been stress testing the system by running hundreds of simultaneous queries through it. Some of the queries stress the DB servers, some stress Teiid. We have been adjusting various Teiid settings such as maxActivePlans as well as adjusting VM resources for our Teiid server. We have been gathering what metrics we can get from Teiid to clearly identify where the time is being spent. One of the things we wanted to confirm was that Teiid and Data soruce processing time was relatively consistent throughout and that wait time was the significant factor in the queries that took a lot longer. Conceptually, we figured that to be the cases, but we wanted metrics we could use to confirm this.

                 

                We can approximate wait time by measuring the time between Start User Command and Plan User Command.  We were just hoping that an additional item be logged to indicate when the command was pulled from the queue so that an accurate wait time could be calculated.

                 

                 

                Thanks,

                 

                Scott

                 

                • 20. Re: performance statistics for teiid query
                  rareddy

                  Jack, Scott,

                   

                  Taking a look again in the code, looks like the time difference between USER.NEW and PLAN logs, should give you the planning time + waiting time for a given query. Since planning is all Teiid internal and should take more or less equal time always for a given query, I would say the variance is wait time. In that would you still see the need for separate log event?

                   

                  Ramesh..

                  • 21. Re: performance statistics for teiid query
                    yapnel

                    Ramesh

                     

                    It will be good to add more logging to capture such timing. It is important we can get access to such metrics especially when we are performing performance tuning exercise. At the moment, we can't determine the bottleneck lies with teiid or client because we cant tell how much time is actually spend in processing the data in teiid and when the client starts receiving the data.

                     

                    Nelson

                    • 22. Re: performance statistics for teiid query
                      scottmcentee

                      Okay, if the Teiid internal planning time is neglible, then the method of approximating wait time we are currently using is sufficient for our purposes.

                      • 23. Re: performance statistics for teiid query
                        rareddy

                        @Nelson - Teiid does log message every time a batch of results are sent in TRACE mode. Also the logging context is "org.teiid.PROCESSOR". So, if you change logging configuration for this context to the TRACE mode, you will see it.

                         

                        Ramesh..

                        • 24. Re: performance statistics for teiid query
                          yapnel

                          Thanks Ramesh. That has given us some insight on what is happening and we have managed to locate the bottleneck. It is taking a long time in this step "Created intermediate sort buffer". Can you explain what is actually happening here and any advise on adjusting any teiid parameters to improve the performance. Would slow internal HDD contribute to this as well?

                           

                          Please advise.

                           

                          Thanks.

                           

                          Nelson

                          • 25. Re: performance statistics for teiid query
                            shawkins

                            > Okay, if the Teiid internal planning time is neglible, then the method of approximating wait time we are currently using is sufficient for our purposes.

                             

                            It typically is.  At a debug level for org.teiid.PROCESSOR you can check for the logs:

                             

                            requestId executing sql

                            ...

                            ProcessTree for requestId ...

                             

                            The difference between these two is the time spent in the planner.

                             

                            > Teiid does log message every time a batch of results are sent in TRACE mode

                             

                            I think an enhancement may be in order such that we can maintain a stats object with the CommandContext to track relevant timestamps in a single place that will make correlating that information much easier.

                             

                            > It is taking a long time in this step "Created intermediate sort buffer". Can you explain what is actually happening here and any advise on adjusting any teiid parameters to improve the performance. Would slow internal HDD contribute to this as well?

                             

                            Full sorts (not just dup removal) are a blocking operations as the sort must have all of the incoming data before producing output. Large sorts can take a lot of time to process and more so if there is a slow internal HDD since Teiid will be paging to disk as needed.  You'll need to see the surrounding plan to understand how the sort is being processed.  A recent issue: https://issues.jboss.org/browse/TEIID-2363  helps address latency with some sorts in join operations, but that may not be applicable depending on your scenario.

                             

                            Steve

                            • 26. Re: performance statistics for teiid query
                              yapnel

                              Thanks Steve. That was helpful

                               

                              Can you please explain what the below means? Is this the number of records that have been streamed back to the client and the size of the batch is controlled by the parameter <property name="processorBatchSize">1024</property> ?

                               

                              "[org.teiid.PROCESSOR] (New I/O server worker #4-1) DQP process cursor request from 6145"

                                

                              Sometimes i'm seeing 48 batch size even though the parameter is set to 512.

                               

                              2013-02-28 16:38:38,312 DEBUG [org.teiid.PROCESSOR] (New I/O server worker #1-1) DQP process cursor request from 1888723
                              2013-02-28 16:38:38,320 DEBUG [org.teiid.PROCESSOR] (New I/O server worker #1-1) DQP process cursor request from 1888771

                               

                               

                              Setting the FetchSize=2048 in the connection URL did not influence the fetch size and i can't see anything in the log either.

                               

                              Please advise.

                              • 27. Re: performance statistics for teiid query
                                shawkins

                                > Can you please explain what the below means? Is this the number of records that have been streamed back to the client and the size of the batch is controlled by the parameter <property name="processorBatchSize">1024</property> ?

                                 

                                There are several relevant factors that control what the client gets in a results batch.  The client fetch size is honored as an upper limit, but because the logic was implemented to return batches as quickly as possible the client can receive partial processing batches (that for example were a larger batch that has been passed through a filter) when just performing forward iteration.  The processorBatchSize sets the nominal target for rows in a batch, but the engine may use a smaller or larger size depending upon the datatypes and how wide the data is.  Setting the client fetch size to larger than the working batch size currently has no effect - that's a todo that hasn't been logged yet.  There is also a max fetch size setting in the configuration that in theory limits the what could requested by the fetch size, but it is practially superceded by the prior limitation.

                                 

                                If you are seeing lots of small partial batches, then output logic should probably be tweaked to better balance latency vs. network traffic.

                                 

                                Steve

                                1 2 Previous Next