7 Replies Latest reply on Sep 26, 2014 5:08 AM by sanjeev.gour

    Query optimization with multiple nulls in select on outer query to an inline view

    sanjeev.gour

      We have some views which contain a constant in their first column. Several such views are UNIONed together to form an inline view and the inline view is then queried asking for DISTINCT values like the following-

       

      SELECT DISCTINCT constant_column_in_views as c1, 'constant' as c2, null as c3, 'another const' as c4, null as c5 FROM (inline view).

      Since we are only taking c1 from the views which is a constant in the views and c2, c3, c4, and c5 are already a constant or a null in this query, the expectation is that a LIMIT 1 will be applied to the view.

       

      Query and plan with only a single null in select-

       

      SELECT DISTINCT c1, null as c2 FROM(

          SELECT c1, c2 FROM (

              SELECT 'const_col_1' as c1, ID as c2 FROM FIRST_SCHEMA.FIRST_TABLE

              UNION ALL

              SELECT 'const_col_2' as c1, ID as c2 FROM SECOND_SCHEMA.SECOND_TABLE

          ) as v

      ) as v1

      --------------------------------------------------------------------PLAN-------------------------------------------------------------------------------------------------------

      Plan: UnionAllNode

        + Output Columns:

          0: c1 (string)

          1: const1 (string)

        + Cost Estimates:Estimated Node Cardinality: 2.0

        + Child 0:

          AccessNode

            + Output Columns:

              0: c1 (string)

              1: const1 (string)

            + Cost Estimates:Estimated Node Cardinality: 1.0

            + Query:SELECT null AS c_1 FROM h2extended.TEST.FIRST_SCHEMA.FIRST_TABLE AS g_0 LIMIT 1

            + Model Name:h2extended

            + Select Columns:['const_col_1' AS c_0, null AS c_1]

        + Child 1:

          AccessNode

            + Output Columns:

              0: c1 (string)

              1: const1 (string)

            + Cost Estimates:Estimated Node Cardinality: 1.0

            + Query:SELECT null AS c_1 FROM h2extended1.TEST1.SECOND_SCHEMA.SECOND_TABLE AS g_0 LIMIT 1

            + Model Name:h2extended1

            + Select Columns:['const_col_2' AS c_0, null AS c_1]

      =====================================================================================================

      Query and plan with more than one nulls in select-

       

      SELECT DISTINCT c1, null as c2, null as c3 FROM(

          SELECT c1, c2 FROM (

              SELECT 'const_col_1' as c1, ID as c2 FROM FIRST_SCHEMA.FIRST_TABLE

              UNION ALL

              SELECT 'const_col_2' as c1, ID as c2 FROM SECOND_SCHEMA.SECOND_TABLE

          ) as v

      ) as v1

       

      ------------------------------------------------------------------PLAN------------------------------------------------------------

      Plan: SortNode

        + Output Columns:

          0: c1 (string)

          1: c2 (string)

          2: c3 (string)

        + Cost Estimates:Estimated Node Cardinality: -1.0

        + Child 0:

          ProjectNode

            + Output Columns:

              0: c1 (string)

              1: c2 (string)

              2: c3 (string)

            + Cost Estimates:Estimated Node Cardinality: -1.0

            + Child 0:

              UnionAllNode

                + Output Columns:c1 (string)

                + Cost Estimates:Estimated Node Cardinality: -1.0

                + Child 0:

                  AccessNode

                    + Output Columns:c1 (string)

                    + Cost Estimates:Estimated Node Cardinality: -1.0

                    + Query:SELECT 'const_col_1' FROM h2extended.TEST.FIRST_SCHEMA.FIRST_TABLE AS g_0 -------------------LIMIT 1 is not applied here

                    + Model Name:h2extended

                + Child 1:

                  AccessNode

                    + Output Columns:c1 (string)

                    + Cost Estimates:Estimated Node Cardinality: -1.0

                    + Query:SELECT 'const_col_2' FROM h2extended1.TEST1.SECOND_SCHEMA.SECOND_TABLE AS g_0-------------------LIMIT 1 is not applied here

                    + Model Name:h2extended1

            + Select Columns:

              0: v.c1

              1: null AS c2

              2: null AS c3

        + Sort Mode:DUP_REMOVE

       

       

       

      Thanks.

      Sanjeev.

        • 1. Re: Query optimization with multiple nulls in select on outer query to an inline view
          shawkins

          Any time there is not a simple correspondence between the upper and lower project (in this case the extra null is) we won't assume that we can push the dup remove lower, which is what then causes the limit to be added.  You can log this as an issue if needed.

          • 3. Re: Query optimization with multiple nulls in select on outer query to an inline view
            sanjeev.gour

            Hi Steve-

            This works fine on 8.9 code, however, I am trying to back port it 8.7 as we are using 8.7 in our project currently and cannot upgrade to 8.9 yet. After back porting, I am not getting it to work. I receive the following exception when I run the newly added test case testDistinctConstant3. Do you know which other change sets I should back port in order to get it working?

             

            org.teiid.core.TeiidRuntimeException: Assertion failed.

                at org.teiid.query.optimizer.TestOptimizer.getPlan(TestOptimizer.java:281)

                at org.teiid.query.optimizer.TestOptimizer.helpPlanCommand(TestOptimizer.java:216)

                at org.teiid.query.optimizer.TestOptimizer.helpPlan(TestOptimizer.java:182)

                at org.teiid.query.optimizer.TestOptimizer.helpPlan(TestOptimizer.java:161)

                at org.teiid.query.optimizer.TestOptimizer.testDistinctConstant3(TestOptimizer.java:6668)

                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

                at java.lang.reflect.Method.invoke(Method.java:606)

                at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)

                at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

                at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)

                at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)

                at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)

                at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)

                at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)

                at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)

                at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)

                at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)

                at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)

                at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)

                at org.junit.runners.ParentRunner.run(ParentRunner.java:309)

                at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)

                at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)

                at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)

                at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)

                at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)

                at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

            Caused by: java.lang.AssertionError: Assertion failed.

                at org.teiid.core.util.Assertion.failed(Assertion.java:73)

                at org.teiid.core.util.Assertion.assertTrue(Assertion.java:68)

                at org.teiid.core.util.Assertion.assertTrue(Assertion.java:60)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:148)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:161)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:161)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:161)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:149)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:161)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convertPlan(PlanToProcessConverter.java:161)

                at org.teiid.query.optimizer.relational.PlanToProcessConverter.convert(PlanToProcessConverter.java:122)

                at org.teiid.query.optimizer.relational.RelationalPlanner.optimize(RelationalPlanner.java:257)

                at org.teiid.query.optimizer.QueryOptimizer.optimizePlan(QueryOptimizer.java:159)

                at org.teiid.query.optimizer.TestOptimizer.getPlan(TestOptimizer.java:275)

                ... 27 more

             

            Thanks.

            • 4. Re: Query optimization with multiple nulls in select on outer query to an inline view
              shawkins

              Look at the history for RuleCollapseSource for changes between 8.7 and this commit.  More than likely it's one of those - for example does the low level exception go away if you back out the RuleCollapseSource change?

              • 5. Re: Query optimization with multiple nulls in select on outer query to an inline view
                sanjeev.gour

                Thank you, I could back port it. Now we are facing another problem under the same case. We are using a custom translator which does not support select expressions and this lack of support causes another problem in the optimization. The query is like this-

                 

                SELECT DISTINCT c1, null as c2, null as c3 FROM(

                    SELECT c1, c2 FROM (

                        SELECT 'const_col_1' as c1, ID as c2 FROM FIRST_SCHEMA.FIRST_TABLE

                        UNION ALL

                        SELECT 'const_col_2' as c1, ID as c2 FROM SECOND_SCHEMA.SECOND_TABLE

                    ) as v

                ) as v1

                 

                When the transltor supports select expressions, the plan is generated as expected, however, when the translator does not support them, the plan shows that a query to the data source without a limit, this query is possibly generated in the plan to figure out if whether there are any records in the database before it decides to return a single row for the constant. Under these circumstance, adding a limit 1 would be desirable. The translator does not support inline views also but in my tests that doesn't seem to affect the behavior. Here the query plan with select expression support-

                 

                Plan: SortNode

                  + Output Columns:

                    0: c1 (string)

                    1: c2 (string)

                    2: c3 (string)

                  + Cost Estimates:Estimated Node Cardinality: -1.0

                  + Child 0:

                    ProjectNode

                      + Output Columns:

                        0: c1 (string)

                        1: c2 (string)

                        2: c3 (string)

                      + Cost Estimates:Estimated Node Cardinality: -1.0

                      + Child 0:

                        UnionAllNode

                          + Output Columns:const_col (string)

                          + Cost Estimates:Estimated Node Cardinality: -1.0

                          + Child 0:

                            ProjectNode

                              + Output Columns:const_col (string)

                              + Cost Estimates:Estimated Node Cardinality: -1.0

                              + Child 0:

                                AccessNode

                                  + Output Columns

                                  + Cost Estimates:Estimated Node Cardinality: -1.0

                                  + Query:SELECT g_0.ID FROM h2extended.TEST.FIRST_SCHEMA.FIRST_TABLE AS g_0

                                  + Model Name:h2extended

                              + Select Columns:'const_col_1' AS const_col

                          + Child 1:

                            ProjectNode

                              + Output Columns:const_col (string)

                              + Cost Estimates:Estimated Node Cardinality: -1.0

                              + Child 0:

                                AccessNode

                                  + Output Columns

                                  + Cost Estimates:Estimated Node Cardinality: -1.0

                                  + Query:SELECT g_0.ID FROM h2extended1.TEST1.SECOND_SCHEMA.SECOND_TABLE AS g_0

                                  + Model Name:h2extended1

                              + Select Columns:'const_col_2' AS const_col

                      + Select Columns:

                        0: v.const_col AS c1

                        1: null AS c2

                        2: null AS c3

                  + Sort Mode:DUP_REMOVE

                • 6. Re: Query optimization with multiple nulls in select on outer query to an inline view
                  shawkins

                  Yes, that is because we are looking at the projected columns of the query and not the output columns of the access node.  There's a new commit under the issue to address that.

                  • 7. Re: Query optimization with multiple nulls in select on outer query to an inline view
                    sanjeev.gour

                    That resolved the problem.

                    Thank you, Steve.