1 2 Previous Next 20 Replies Latest reply on Mar 6, 2014 7:49 AM by Steven Hawkins

    Partitioned UNION

    Tom Arnold Novice

      I am trying to aggregate data from several models using a view and UNION.

       

      I have a query like this where FOO and BAR are two different Oracle databases.

       

      select * from (
        (select item_id, created_at, 'Foo' source from foo.items order by created_at desc limit 5000)
         union all
        (select item_id, created_at, 'Bar' source from bar.items order by created_at desc limit 5000)
      ) x
      where
        source in ('Foo', 'Bar')
      order by created_at desc
      limit 0, 500;

       

      If I adjust the filter to only include results from the second query, I get an exception.

       

      07:31:51,869 ERROR [org.teiid.PROCESSOR] (Worker845_QueryProcessorQueue20901) 8sJdP3mSpZs4 TEIID30019 Unexpected exception for request 8sJdP3mSpZs4.8: java.lang.ArrayIndexOutOfBoundsException

       

      Let me know if you need more info. This is Teiid 8.6.

       

      Thanks,

      Tom

        • 1. Re: Partitioned UNION
          Ramesh Reddy Master

          Can you post/attach the whole stack.

          • 2. Re: Partitioned UNION
            Tom Arnold Novice

            Whoops, here you go.

             

            I noticed that if I remove the ORDER BY from the BAR query it seems to work (although obviously not the results I want).

             

            Edit: I can't seem to post a new reply in this thread because the forum says I've been rate-limited?

             

            09:15:19,265 DEBUG [org.teiid.COMMAND_LOG] (New I/O worker #2) 7rl0TEA59SdU     START USER COMMAND:    startTime=2014-02-24 09:15:19.264    requestID=7rl0TEA59SdU.7    txID=null    sessionID=7rl0TEA59SdU    applicationName=JDBCprincipal=user@teiid-security    vdbName=Foo    vdbVersion=1    sql=select * from (
              (select item_id, created_at, 'Foo' source from foo.products order by created_at desc limit 5000)
               union all
              (select item_id, created_at, 'Bar' source from bar.products order by created_at desc limit 5000)
            ) x
            where
              source in ('Bar')
            order by created_at desc
            limit 0, 500
            09:15:19,279 INFO  [org.teiid.PLANNER] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU
            ============================================================================
            USER COMMAND:
            SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500

             

            ----------------------------------------------------------------------------
            OPTIMIZE:
            SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500

             

            ----------------------------------------------------------------------------
            GENERATE CANONICAL:
            SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500

             

            CANONICAL PLAN:
            TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500})
              Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC})
                Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                  Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar'})
                    Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT, x.source='Foo'}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}})
                      SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true})
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
                          Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC})
                            Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source]})
                              Source(groups=[foo.ITEMS])
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
                          Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC})
                            Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source]})
                              Source(groups=[bar.ITEMS])

             


            ============================================================================
            EXECUTING PlaceAccess

             

            AFTER:
            TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500})
              Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC})
                Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                  Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar'})
                    Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT, x.source='Foo'}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}})
                      SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true})
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
                          Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC})
                            Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source]})
                              Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000})
                                Source(groups=[foo.ITEMS])
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
                          Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC})
                            Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source]})
                              Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000})
                                Source(groups=[bar.ITEMS])

             


            ============================================================================
            EXECUTING AssignOutputElements

             

            AFTER:
            TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
              Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source], OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                  Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar', OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                    Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.source='Foo', x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}, OUTPUT_COLS=[x.source, x.ITEM_ID, x.CREATED_AT]})
                      SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                          Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                            Project(groups=[foo.ITEMS], props={PROJECT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                              Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                                Source(groups=[foo.ITEMS], props={OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                        TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                          Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                            Project(groups=[bar.ITEMS], props={PROJECT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                              Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                                Source(groups=[bar.ITEMS], props={OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})

             


            ============================================================================
            EXECUTING PushSelectCriteria

             

            AFTER:
            TupleLimit(groups=[])
              Sort(groups=[x])
                Project(groups=[x])
                  Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar', OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source], IS_PHANTOM=true})
                    Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.source='Foo', x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}, OUTPUT_COLS=[x.source, x.ITEM_ID, x.CREATED_AT]})
                      SetOperation(groups=[])
                        TupleLimit(groups=[])
                          Sort(groups=[foo.ITEMS])
                            Project(groups=[foo.ITEMS], props={PROJECT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                              Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                                Select(groups=[], props={SELECT_CRITERIA=1 = 0})
                                  Source(groups=[foo.ITEMS], props={OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                        TupleLimit(groups=[])
                          Sort(groups=[bar.ITEMS])
                            Project(groups=[bar.ITEMS], props={PROJECT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                              Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                                Select(groups=[], props={SELECT_CRITERIA=1 = 1})
                                  Source(groups=[bar.ITEMS], props={OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})

             


            ============================================================================
            EXECUTING MergeVirtual

             

            AFTER:
            TupleLimit(groups=[])
              Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
                SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                  TupleLimit(groups=[])
                    Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                      Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
                        Access(groups=[foo.ITEMS])
                          Select(groups=[])
                            Source(groups=[foo.ITEMS])
                  TupleLimit(groups=[])
                    Sort(groups=[bar.ITEMS])
                      Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
                        Access(groups=[bar.ITEMS])
                          Select(groups=[])
                            Source(groups=[bar.ITEMS])

             


            ============================================================================
            EXECUTING CleanCriteria

             

            AFTER:
            TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500, OUTPUT_COLS=null})
              Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
                SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=null})
                  TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=null})
                    Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
                      Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source], OUTPUT_COLS=null})
                        Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=null})
                          Null(groups=[foo.ITEMS])
                  TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=null})
                    Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
                      Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source], OUTPUT_COLS=null})
                        Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=null})
                          Source(groups=[bar.ITEMS], props={OUTPUT_COLS=null})

             


            ============================================================================
            EXECUTING RaiseNull

             

            09:15:19,286 DEBUG [org.teiid.COMMAND_LOG] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU     ERROR USER COMMAND:    endTime=2014-02-24 09:15:19.286    requestID=7rl0TEA59SdU.7    txID=null    sessionID=7rl0TEA59SdU    principal=user@teiid-security    vdbName=Foo    vdbVersion=1    finalRowCount=null
            09:15:19,287 ERROR [org.teiid.PROCESSOR] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU TEIID30019 Unexpected exception for request 7rl0TEA59SdU.7: java.lang.ArrayIndexOutOfBoundsException: -1
                at java.util.ArrayList.elementData(ArrayList.java:400) [rt.jar:1.7.0_51]
                at java.util.ArrayList.get(ArrayList.java:413) [rt.jar:1.7.0_51]
                at org.teiid.query.optimizer.relational.rules.RuleRaiseNull.raiseNullNode(RuleRaiseNull.java:170) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.query.optimizer.relational.rules.RuleRaiseNull.execute(RuleRaiseNull.java:70) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.query.optimizer.relational.RelationalPlanner.executeRules(RelationalPlanner.java:606) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.query.optimizer.relational.RelationalPlanner.optimize(RelationalPlanner.java:255) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.query.optimizer.QueryOptimizer.optimizePlan(QueryOptimizer.java:159) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.Request.generatePlan(Request.java:408) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.Request.processRequest(Request.java:435) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.RequestWorkItem.processNew(RequestWorkItem.java:600) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.RequestWorkItem.process(RequestWorkItem.java:309) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.AbstractWorkItem.run(AbstractWorkItem.java:51) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.RequestWorkItem.run(RequestWorkItem.java:248) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.DQPWorkContext.runInContext(DQPWorkContext.java:273) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.ThreadReuseExecutor$RunnableWrapper.run(ThreadReuseExecutor.java:119) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at org.teiid.dqp.internal.process.ThreadReuseExecutor$3.run(ThreadReuseExecutor.java:214) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51]
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51]
                at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51]

            • 3. Re: Partitioned UNION
              Steven Hawkins Master

              I can confirm this.  Can you log an issue?

               

              Thanks,

              Steve

              1 of 1 people found this helpful
              • 5. Re: Partitioned UNION
                Steven Hawkins Master

                This was easier to address than I had anticipated, so I had logged [TEIID-2862] index out of bounds when a union branch with limit/order by is removed - JBoss Issue Tracker before seeing that you had done the same.  In any case the fix is in for 8.7.

                 

                Thanks,

                Steve

                • 6. Re: Partitioned UNION
                  Tom Arnold Novice

                  Thanks Steve!

                   

                  I have a build setup for master/8.7 but have not had a chance to try it.

                   

                  As a workaround until then, adding a dummy select as the first query seems to avoid the issue.

                   

                  Something like:

                   

                  SELECT * FROM (

                    SELECT null foo_id, null created_at, null source

                    UNION ALL

                    SELECT foo_id, created_at, 'Foo' from foo.items

                    UNION ALL

                    SELECT bar_id, created_at, 'Bar' from bar.items

                  ) x where x.source is not null;

                   

                  Edit: I was getting weird results with this workaround until I added "LIMIT 1" to the dummy query.

                   

                  Tom

                  • 7. Re: Partitioned UNION
                    Tom Arnold Novice

                    On a related note, is there a way to push an order by into the inner queries?

                     

                    Edit (because I've been rate limited):

                     

                    Whoops, yep. This is my query.

                     

                    select * from (

                      (select item_id, created_at, 'Foo' source from foo.items order by created_at desc limit 5000)

                       union all

                      (select item_id, created_at, 'Bar' source from bar.items order by created_at desc limit 5000)

                    ) x

                    where

                      source in ('Foo', 'Bar')

                    order by created_at desc

                    limit 0, 500;

                     

                    I'd prefer to remove the ORDER BY from the nested queries and have Teiid add the ORDER BY from the outer query when it can, just as it does with filters from the outer query. Basically I want to send this, but have Teiid run the query from above.

                     

                    select * from (

                      (select item_id, created_at, 'Foo' source from foo.items limit 5000)

                       union all

                      (select item_id, created_at, 'Bar' source from bar.items limit 5000)

                    ) x

                    where

                      source in ('Foo', 'Bar')

                    order by created_at desc

                    limit 0, 500;

                    • 8. Re: Partitioned UNION
                      Steven Hawkins Master

                      Can you provide an example of what you mean?

                       

                      Steve

                      • 9. Re: Partitioned UNION
                        Steven Hawkins Master

                        Generally we have to treat an ordered limit as an in place operation in the plan.

                         

                        In the case of the second query, what is the advantage of adding the order bys to the union branches?  You'll still have to process the parent order by - although there is an existing issue somewhere for us to process distinct/order by over a union in this manner as sorted sublists rather than a full sort.

                         

                        Steve

                        • 10. Re: Partitioned UNION
                          Tom Arnold Novice

                          I need an ORDER BY in here because there are generally >5000 results from this, and I'm usually interested in the most recent.

                           

                          select item_id, created_at, 'Bar' source from bar.items limit 5000

                           

                          For the whole thing, I want to make a view like this. I have existing code that uses ITEMS. I have V_ITEMS with the same columns as ITEMS, but the data comes from additional sources. I want to be able to switch the software to use V_ITEMS with as few changes as possible. This means that when the software does an ORDER BY on the view I'd like Teiid to run the individual queries with that ORDER BY when possible.

                           

                          create view v_items (...) as

                          select * from (

                            (select item_id, created_at, 'Foo' source from foo.items limit 5000)

                             union all

                            (select item_id, created_at, 'Bar' source from bar.items limit 5000)

                          ) x;

                           

                          Might need to take another shot at explaining this in the morning because my brain is toast.

                           

                          Thanks,

                          Tom

                          • 11. Re: Partitioned UNION
                            Steven Hawkins Master

                            Ok I think I follow you.  You want both the limit and the order by pushed and in this case you'd have the top level limit supersede the nested.  Yes, that makes sense and could be done.  Can you log an issue?

                             

                            Steve

                            • 12. Re: Partitioned UNION
                              Tom Arnold Novice

                              Regarding the limit, I would like even more to be able to do this and have both the ORDER BY and LIMIT pushed. The ORDER BY seems like it should be straightforward, but I'm not sure how the LIMIT would work exactly in this scenario.

                               

                              create view v_items (...) as

                              select * from (

                                (select item_id, created_at, 'Foo' source from foo.items)

                                 union all

                                (select item_id, created_at, 'Bar' source from bar.items)

                              ) x;

                               

                              -- Page 1.

                              select * from v_items order by created_at desc limit 0, 500;

                              -- Page 2.

                              select * from v_items order by created_at desc limit 500, 500;

                               

                              I will write an enhancement issue, at least for the ORDER BY part.

                               

                              Tom

                              • 13. Re: Partitioned UNION
                                Steven Hawkins Master

                                The driver for optimization would be the top level limit.  If you have a scenario such as

                                 

                                union all

                                   ...  limit x

                                   ...  limit x

                                 

                                with an ordering above, the optimizer has no reason to push the affect of the ordering as it actually makes the plan more costly to execute.  However if there is an order by and limit above then even though we potentially have to reorder (or ideally process as sorted sublists), the the cost could be lower (if the top level limit is sufficiently less than the nested limits - or if there are no nested limits) with the affect of both the ordering and the limit pushed.

                                 

                                Offset handling in this case though would basically expand the nested limit, that is a top level limit 500, 500 would be pushed to a branch as limit 1000.  So the benefit would lessen further into the results.

                                1 of 1 people found this helpful
                                • 14. Re: Partitioned UNION
                                  Tom Arnold Novice

                                  I wrote an enhancement issue on this.

                                   

                                  https://issues.jboss.org/browse/TEIID-2878

                                   

                                  Thanks,

                                  Tom

                                  1 2 Previous Next