3 Replies Latest reply on Jun 2, 2019 12:34 PM by Steven Hawkins

    How do you get the query engine to push down a where clause when view has UNION ALL

    Alan Study Newbie

      I have the following view

      CREATE VIEW PartMarriage (

          src VARCHAR(6), --This is an inline comment in DDL metadata

          marriageType VARCHAR(6),

          masterSerialNo VARCHAR(20),

          masterSerialNoAbbrev VARCHAR(10),

          dependentSerialNo VARCHAR(20),

          dependentSerialNoAbbrev VARCHAR(10),

          CONSTRAINT PK PRIMARY KEY (src, masterSerialNo, masterSerialNoAbbrev, dependentSerialNo, dependentSerialNoAbbrev) --Every view must define a constraint for primary key to be used with odata, MS Access can not have PK with column > 255

      )

      AS

      select

          src,

          marriageType,

          masterSerialNo,

          masterSerialNoAbbrev,

          dependentSerialNo,

          dependentSerialNoAbbrev

      from PartMarriageHamDsnVdbHam.PartMarriage

      union all

      select  

          src,

          marriageType,

          masterSerialNo,

          masterSerialNoAbbrev,

          dependentSerialNo,

          dependentSerialNoAbbrev

      from PartMarriageDataCollectionVdbAep.PartMarriage

      union all

      select  

          src,

          marriageType,

          masterSerialNo,

          masterSerialNoAbbrev,

          dependentSerialNo,

          dependentSerialNoAbbrev

      from PartMarriageGalcHistVdbHma1.PartMarriage

       

      When I use this view like

      select 

          pm1.src pm1_source

        , pm1.masterSerialNo pm1_target_sn

        , pm1.masterSerialNoAbbrev pm1_target_sn_abbrev

        , pm1.dependentSerialNo pm1_sn

        , pm1.dependentSerialNoAbbrev pm1_sn_abbrev

      from

        PartMarriage.PartMarriage pm1

      where

        pm1.masterSerialNo in

          (

          'L15BE3105823'

          , 'L15BE3105824'

          , 'L15BE3105825'

          )

        and pm1.masterSerialNoAbbrev = 'EIN'

      order by

        pm1.masterSerialNo

        , pm1.dependentSerialNoAbbrev

       

      I was expecting the where clause to get passed down to all of the source models but for some reason it is not.  This results in all records being returned.

       

      Here are some logs

      2019-05-31 14:57:03.443 DEBUG [org.teiid.COMMAND_LOG] (Worker33988_QueryProcessorQueue6890274) qcd_vdb_ham.33 XrtEXmeRmbma      SOURCE SRC COMMAND:    endTime=2019-05-31 14:57:03.443 requestID=XrtEXmeRmbma.7        sourceCommandID=10      executionID=686289      txID=null      modelName=PartMarriageDataCollectionVdbAep      translatorName=teiid    sessionID=XrtEXmeRmbma  principal=vc039058@AMU  sourceCommand=[SELECT g_0."dependentSerialNo" AS c_0, g_0."dependentSerialNoAbbrev" AS c_1, g_0."src" AS c_2, g_0."masterSerialNo" AS c_3, g_0."masterSerialNoAbbrev" AS c_4 FROM "PartMarriage"."PartMarriage" AS g_0 WHERE g_0."masterSerialNo" IN ('L15BE3105823', 'L15BE3105824', 'L15BE3105825') AND g_0."masterSerialNoAbbrev" = 'EIN' ORDER BY c_1, c_4 LIMIT 100]

       

      2019-05-31 14:57:03.469 DEBUG [org.teiid.COMMAND_LOG] (Worker33987_QueryProcessorQueue6890280) qcd_vdb_ham.33 XrtEXmeRmbma      START DATA SRC COMMAND:        startTime=2019-05-31 14:57:03.469       requestID=XrtEXmeRmbma.7        sourceCommandID=19      executionID=686294     txID=null       modelName=PartMarriageHamDsnVdbHam      translatorName=teiid    sessionID=XrtEXmeRmbma  principal=vc039058@AMUsql=SELECT g_0.masterSerialNo, g_0.masterSerialNoAbbrev, g_0.src, g_0.dependentSerialNo, g_0.dependentSerialNoAbbrev FROM PartMarriageHamDsnVdbHam.PartMarriage AS g_0

       

      2019-05-31 14:57:09.852 DEBUG [org.teiid.COMMAND_LOG] (Worker33992_QueryProcessorQueue6891804) qcd_vdb_ham.33 XrtEXmeRmbma      END SRC COMMAND:       endTime=2019-05-31 14:57:09.852 requestID=XrtEXmeRmbma.7        sourceCommandID=14      executionID=686291      txID=null      modelName=PartMarriageHamDsnVdbHam      translatorName=teiid    sessionID=XrtEXmeRmbma  principal=vc039058@AMU  finalRowCount=89612    cpuTime(ns)=129362152

       

      Is there anything that I could do to make it pass down the where clause to the select statements that are being unioned together in the view?

        • 1. Re: How do you get the query engine to push down a where clause when view has UNION ALL
          Steven Hawkins Master

          > Is there anything that I could do to make it pass down the where clause to the select statements that are being unioned together in the view?

           

          UNION all will not prevent pushdown.  More than likely it's an issue with your translator not supporting IN or even equality predicates.  What translator type are you using?

           

          You can get a full understanding of the planning from using a Teiid JDBC connection or by increasing the logging on the server to get debug log - see Diagnosing Issues · GitBook for more on reading query plans.

          1 of 1 people found this helpful
          • 2. Re: How do you get the query engine to push down a where clause when view has UNION ALL
            Alan Study Newbie

            set showplan debug

             

            select 

                pm1.src pm1_source

              , pm1.masterSerialNo pm1_target_sn

              , pm1.masterSerialNoAbbrev pm1_target_sn_abbrev

              , pm1.dependentSerialNo pm1_sn

              , pm1.dependentSerialNoAbbrev pm1_sn_abbrev

            from

              PartMarriage.PartMarriage pm1

            where

              pm1.masterSerialNo in

                (

                'L15BE3105823'

                , 'L15BE3105824'

                , 'L15BE3105825'

                )

              and pm1.masterSerialNoAbbrev = 'EIN'

            order by

              pm1.masterSerialNo

              , pm1.dependentSerialNoAbbrev

             

            show plan

             

            returned

             

            LimitNode

              + Relational Node ID:0

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 39

              1: Node Next Batch Process Time: 0

              2: Node Cumulative Next Batch Process Time: 2

              3: Node Cumulative Process Time: 123130

              4: Node Next Batch Calls: 4

              5: Node Blocks: 3

              + Cost Estimates:Estimated Node Cardinality: 100.0

              + Child 0:

              SortNode

              + Relational Node ID:1

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 39

              1: Node Next Batch Process Time: 0

              2: Node Cumulative Next Batch Process Time: 2

              3: Node Cumulative Process Time: 123130

              4: Node Next Batch Calls: 4

              5: Node Blocks: 3

              + Cost Estimates:Estimated Node Cardinality: 300.0

              + Child 0:

              UnionAllNode

              + Relational Node ID:2

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 39

              1: Node Next Batch Process Time: 0

              2: Node Cumulative Next Batch Process Time: 2

              3: Node Cumulative Process Time: 123130

              4: Node Next Batch Calls: 5

              5: Node Blocks: 2

              + Cost Estimates:Estimated Node Cardinality: 300.0

              + Child 0:

              AccessNode

              + Relational Node ID:4

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 0

              1: Node Next Batch Process Time: 1

              2: Node Cumulative Next Batch Process Time: 1

              3: Node Cumulative Process Time: 7

              4: Node Next Batch Calls: 2

              5: Node Blocks: 1

              + Cost Estimates:Estimated Node Cardinality: 100.0

              + Query:SELECT g_0.src AS c_0, g_0.masterSerialNo AS c_1, g_0.masterSerialNoAbbrev AS c_2, g_0.dependentSerialNo AS c_3, g_0.dependentSerialNoAbbrev AS c_4 FROM PartMarriageHamDsnVdbHam.PartMarriage AS g_0 WHERE (g_0.masterSerialNo IN ('L15BE3105823', 'L15BE3105824', 'L15BE3105825')) AND (g_0.masterSerialNoAbbrev = 'EIN') ORDER BY c_1, c_4 LIMIT 100

              + Model Name:PartMarriageHamDsnVdbHam

              + Child 1:

              AccessNode

              + Relational Node ID:5

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 39

              1: Node Next Batch Process Time: 0

              2: Node Cumulative Next Batch Process Time: 0

              3: Node Cumulative Process Time: 23

              4: Node Next Batch Calls: 3

              5: Node Blocks: 2

              + Cost Estimates:Estimated Node Cardinality: 100.0

              + Query:SELECT g_0.src AS c_0, g_0.masterSerialNo AS c_1, g_0.masterSerialNoAbbrev AS c_2, g_0.dependentSerialNo AS c_3, g_0.dependentSerialNoAbbrev AS c_4 FROM PartMarriageDataCollectionVdbAep.PartMarriage AS g_0 WHERE (g_0.masterSerialNo IN ('L15BE3105823', 'L15BE3105824', 'L15BE3105825')) AND (g_0.masterSerialNoAbbrev = 'EIN') ORDER BY c_1, c_4 LIMIT 100

              + Model Name:PartMarriageDataCollectionVdbAep

              + Child 2:

              AccessNode

              + Relational Node ID:6

              + Output Columns:

              0: src (string)

              1: masterSerialNo (string)

              2: masterSerialNoAbbrev (string)

              3: dependentSerialNo (string)

              4: dependentSerialNoAbbrev (string)

              + Statistics:

              0: Node Output Rows: 0

              1: Node Next Batch Process Time: 1

              2: Node Cumulative Next Batch Process Time: 1

              3: Node Cumulative Process Time: 123129

              4: Node Next Batch Calls: 4

              5: Node Blocks: 3

              + Cost Estimates:Estimated Node Cardinality: 100.0

              + Query:SELECT g_0.src AS c_0, g_0.masterSerialNo AS c_1, g_0.masterSerialNoAbbrev AS c_2, g_0.dependentSerialNo AS c_3, g_0.dependentSerialNoAbbrev AS c_4 FROM PartMarriageGalcHistVdbHma1.PartMarriage AS g_0 WHERE (g_0.masterSerialNo IN ('L15BE3105823', 'L15BE3105824', 'L15BE3105825')) AND (g_0.masterSerialNoAbbrev = 'EIN') ORDER BY c_1, c_4 LIMIT 100

              + Model Name:PartMarriageGalcHistVdbHma1

              + Sort Columns:[masterSerialNo, dependentSerialNoAbbrev]

              + Sort Mode:SORT

              + Row Offset:null

              + Row Limit:100

              + Data Bytes Sent:1823

              + Planning Time:10

             

            This actually shows that the where clause is being pushed down.

            I went back to the logs in more detail and saw that the original query was actually slightly more complex than I originally thought.

             

            select 

              pm1.src pm1_source

              , pm1.masterSerialNo pm1_target_sn

              , pm1.masterSerialNoAbbrev pm1_target_sn_abbrev

              , pm1.dependentSerialNo pm1_sn

              , pm1.dependentSerialNoAbbrev pm1_sn_abbrev

              , pm2.src pm2_source

              , pm2.dependentSerialNo pm2_sn

              , pm2.dependentSerialNoAbbrev pm2_sn_abbrev

              , pm3.src pm3_source

              , pm3.dependentSerialNo pm3_sn

              , pm3.dependentSerialNoAbbrev pm3_sn_abbrev

            from

              PartMarriage.PartMarriage pm1

              left outer join PartMarriage.PartMarriage pm2 on pm2.masterSerialNo = pm1.dependentSerialNo

              and pm2.masterSerialNoAbbrev = pm1.dependentSerialNoAbbrev

              left outer join PartMarriage.PartMarriage pm3 on pm3.masterSerialNo = pm2.dependentSerialNo

              and pm3.masterSerialNoAbbrev = pm2.dependentSerialNoAbbrev

            where

              pm1.masterSerialNo in

              (

              'L15BE3105823'

              , 'L15BE3105824'

              , 'L15BE3105825'

              )

              and pm1.masterSerialNoAbbrev = 'EIN'

            order by

              pm1.masterSerialNo

              , pm1.dependentSerialNoAbbrev

             

            In this case you can see that there is a join to itself 2 times to do some level of recursion.  The problem with this statement is that it never returns so the show plan does not return anything.

            I can see how the join to itself might result in saying it is best to return all results from each source.

            I have not tried to use a WITH statement in teiid to see if recursion is supported across multiple sources.  Maybe that would be a better solution than joining to itself if it is supported.  Or maybe for this use case we will need to write an application instead of trying to do it all in teiid.

            • 3. Re: How do you get the query engine to push down a where clause when view has UNION ALL
              Steven Hawkins Master

              > I can see how the join to itself might result in saying it is best to return all results from each source.

               

              Do you have any costing information on the source tables?  With at least the cardinality the planner would create a plan that would first retrieve PartMarriage.PartMarriage pm1, then use that result to fetch the other PartMarriage references.

               

              Also is this a situation where the join is partitioned - that is can only records in PartMarriageHamDsnVdbHam join with those in PartMarriageHamDsnVdbHam, or will the join span sources?

               

              >  I have not tried to use a WITH statement in teiid to see if recursion is supported across multiple sources.  Maybe that would be a better solution than joining to itself if it is supported.

               

              WITH clauses are eligible to be pushed, but generally if it's cross source, then it will be evaluated in Teiid.  This would ensure however the effective processing that uses the results from the previous iteration to drive the next.