-
1. Re: Partitioned UNION
rareddy Feb 24, 2014 8:52 AM (in response to tom9729)Can you post/attach the whole stack.
-
2. Re: Partitioned UNION
tom9729 Feb 24, 2014 9:38 AM (in response to rareddy)Whoops, here you go.
I noticed that if I remove the ORDER BY from the BAR query it seems to work (although obviously not the results I want).
Edit: I can't seem to post a new reply in this thread because the forum says I've been rate-limited?
09:15:19,265 DEBUG [org.teiid.COMMAND_LOG] (New I/O worker #2) 7rl0TEA59SdU START USER COMMAND: startTime=2014-02-24 09:15:19.264 requestID=7rl0TEA59SdU.7 txID=null sessionID=7rl0TEA59SdU applicationName=JDBCprincipal=user@teiid-security vdbName=Foo vdbVersion=1 sql=select * from (
(select item_id, created_at, 'Foo' source from foo.products order by created_at desc limit 5000)
union all
(select item_id, created_at, 'Bar' source from bar.products order by created_at desc limit 5000)
) x
where
source in ('Bar')
order by created_at desc
limit 0, 500
09:15:19,279 INFO [org.teiid.PLANNER] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU
============================================================================
USER COMMAND:
SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500----------------------------------------------------------------------------
OPTIMIZE:
SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500----------------------------------------------------------------------------
GENERATE CANONICAL:
SELECT * FROM ((SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000)) AS x WHERE x.source = 'Bar' ORDER BY x.CREATED_AT DESC LIMIT 500CANONICAL PLAN:
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500})
Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC})
Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar'})
Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT, x.source='Foo'}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}})
SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true})
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC})
Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source]})
Source(groups=[foo.ITEMS])
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC})
Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source]})
Source(groups=[bar.ITEMS])
============================================================================
EXECUTING PlaceAccessAFTER:
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500})
Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC})
Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar'})
Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT, x.source='Foo'}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}})
SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true})
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC})
Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source]})
Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000})
Source(groups=[foo.ITEMS])
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000})
Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC})
Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source]})
Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000})
Source(groups=[bar.ITEMS])
============================================================================
EXECUTING AssignOutputElementsAFTER:
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Sort(groups=[x], props={SORT_ORDER=ORDER BY x.CREATED_AT DESC, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Project(groups=[x], props={PROJECT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source], OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar', OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.source='Foo', x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}, OUTPUT_COLS=[x.source, x.ITEM_ID, x.CREATED_AT]})
SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Project(groups=[foo.ITEMS], props={PROJECT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Source(groups=[foo.ITEMS], props={OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Project(groups=[bar.ITEMS], props={PROJECT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Source(groups=[bar.ITEMS], props={OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
============================================================================
EXECUTING PushSelectCriteriaAFTER:
TupleLimit(groups=[])
Sort(groups=[x])
Project(groups=[x])
Select(groups=[x], props={SELECT_CRITERIA=x.source = 'Bar', OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source], IS_PHANTOM=true})
Source(groups=[x], props={NESTED_COMMAND=(SELECT foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source FROM foo.ITEMS ORDER BY foo.ITEMS.CREATED_AT DESC LIMIT 5000) UNION ALL (SELECT bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source FROM bar.ITEMS ORDER BY bar.ITEMS.CREATED_AT DESC LIMIT 5000), SYMBOL_MAP={x.source='Foo', x.ITEM_ID=foo.ITEMS.ITEM_ID, x.CREATED_AT=foo.ITEMS.CREATED_AT}, PARTITION_INFO={x.source=[['Foo'], ['Bar']]}, OUTPUT_COLS=[x.source, x.ITEM_ID, x.CREATED_AT]})
SetOperation(groups=[])
TupleLimit(groups=[])
Sort(groups=[foo.ITEMS])
Project(groups=[foo.ITEMS], props={PROJECT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Select(groups=[], props={SELECT_CRITERIA=1 = 0})
Source(groups=[foo.ITEMS], props={OUTPUT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
TupleLimit(groups=[])
Sort(groups=[bar.ITEMS])
Project(groups=[bar.ITEMS], props={PROJECT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Select(groups=[], props={SELECT_CRITERIA=1 = 1})
Source(groups=[bar.ITEMS], props={OUTPUT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
============================================================================
EXECUTING MergeVirtualAFTER:
TupleLimit(groups=[])
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=[x.ITEM_ID, x.CREATED_AT, x.source]})
SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
TupleLimit(groups=[])
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source], OUTPUT_COLS=['Foo' AS source, foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT]})
Access(groups=[foo.ITEMS])
Select(groups=[])
Source(groups=[foo.ITEMS])
TupleLimit(groups=[])
Sort(groups=[bar.ITEMS])
Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source], OUTPUT_COLS=['Bar' AS source, bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT]})
Access(groups=[bar.ITEMS])
Select(groups=[])
Source(groups=[bar.ITEMS])
============================================================================
EXECUTING CleanCriteriaAFTER:
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=500, OUTPUT_COLS=null})
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
SetOperation(groups=[], props={SET_OPERATION=UNION, USE_ALL=true, OUTPUT_COLS=null})
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=null})
Sort(groups=[foo.ITEMS], props={SORT_ORDER=ORDER BY foo.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
Project(groups=[foo.ITEMS], props={PROJECT_COLS=[foo.ITEMS.ITEM_ID, foo.ITEMS.CREATED_AT, 'Foo' AS source], OUTPUT_COLS=null})
Access(groups=[foo.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=foo, nameInSource=null, uuid=tid:3af63e7dfae5-000194c4-00000000, OUTPUT_COLS=null})
Null(groups=[foo.ITEMS])
TupleLimit(groups=[], props={MAX_TUPLE_LIMIT=5000, OUTPUT_COLS=null})
Sort(groups=[bar.ITEMS], props={SORT_ORDER=ORDER BY bar.ITEMS.CREATED_AT DESC, OUTPUT_COLS=null})
Project(groups=[bar.ITEMS], props={PROJECT_COLS=[bar.ITEMS.ITEM_ID, bar.ITEMS.CREATED_AT, 'Bar' AS source], OUTPUT_COLS=null})
Access(groups=[bar.ITEMS], props={SOURCE_HINT=null, MODEL_ID=Schema name=bar, nameInSource=null, uuid=tid:7079feeb259f-00259c3e-00000000, OUTPUT_COLS=null})
Source(groups=[bar.ITEMS], props={OUTPUT_COLS=null})
============================================================================
EXECUTING RaiseNull09:15:19,286 DEBUG [org.teiid.COMMAND_LOG] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU ERROR USER COMMAND: endTime=2014-02-24 09:15:19.286 requestID=7rl0TEA59SdU.7 txID=null sessionID=7rl0TEA59SdU principal=user@teiid-security vdbName=Foo vdbVersion=1 finalRowCount=null
09:15:19,287 ERROR [org.teiid.PROCESSOR] (Worker5_QueryProcessorQueue50) 7rl0TEA59SdU TEIID30019 Unexpected exception for request 7rl0TEA59SdU.7: java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:400) [rt.jar:1.7.0_51]
at java.util.ArrayList.get(ArrayList.java:413) [rt.jar:1.7.0_51]
at org.teiid.query.optimizer.relational.rules.RuleRaiseNull.raiseNullNode(RuleRaiseNull.java:170) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.query.optimizer.relational.rules.RuleRaiseNull.execute(RuleRaiseNull.java:70) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.query.optimizer.relational.RelationalPlanner.executeRules(RelationalPlanner.java:606) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.query.optimizer.relational.RelationalPlanner.optimize(RelationalPlanner.java:255) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.query.optimizer.QueryOptimizer.optimizePlan(QueryOptimizer.java:159) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.Request.generatePlan(Request.java:408) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.Request.processRequest(Request.java:435) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.RequestWorkItem.processNew(RequestWorkItem.java:600) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.RequestWorkItem.process(RequestWorkItem.java:309) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.AbstractWorkItem.run(AbstractWorkItem.java:51) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.RequestWorkItem.run(RequestWorkItem.java:248) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.DQPWorkContext.runInContext(DQPWorkContext.java:273) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.ThreadReuseExecutor$RunnableWrapper.run(ThreadReuseExecutor.java:119) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at org.teiid.dqp.internal.process.ThreadReuseExecutor$3.run(ThreadReuseExecutor.java:214) [teiid-engine-8.6.0.Final.jar:8.6.0.Final]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] -
3. Re: Partitioned UNION
shawkins Feb 24, 2014 11:08 AM (in response to tom9729)1 of 1 people found this helpfulI can confirm this. Can you log an issue?
Thanks,
Steve
-
4. Re: Partitioned UNION
tom9729 Feb 24, 2014 12:44 PM (in response to shawkins)Wrote this issue: https://issues.jboss.org/browse/TEIID-2861
-
5. Re: Partitioned UNION
shawkins Feb 24, 2014 12:48 PM (in response to tom9729)This was easier to address than I had anticipated, so I had logged [TEIID-2862] index out of bounds when a union branch with limit/order by is removed - JBoss Issue Tracker before seeing that you had done the same. In any case the fix is in for 8.7.
Thanks,
Steve
-
6. Re: Partitioned UNION
tom9729 Feb 24, 2014 7:03 PM (in response to shawkins)Thanks Steve!
I have a build setup for master/8.7 but have not had a chance to try it.
As a workaround until then, adding a dummy select as the first query seems to avoid the issue.
Something like:
SELECT * FROM (
SELECT null foo_id, null created_at, null source
UNION ALL
SELECT foo_id, created_at, 'Foo' from foo.items
UNION ALL
SELECT bar_id, created_at, 'Bar' from bar.items
) x where x.source is not null;
Edit: I was getting weird results with this workaround until I added "LIMIT 1" to the dummy query.
Tom
-
7. Re: Partitioned UNION
tom9729 Feb 24, 2014 7:02 PM (in response to tom9729)On a related note, is there a way to push an order by into the inner queries?
Edit (because I've been rate limited):
Whoops, yep. This is my query.
select * from (
(select item_id, created_at, 'Foo' source from foo.items order by created_at desc limit 5000)
union all
(select item_id, created_at, 'Bar' source from bar.items order by created_at desc limit 5000)
) x
where
source in ('Foo', 'Bar')
order by created_at desc
limit 0, 500;
I'd prefer to remove the ORDER BY from the nested queries and have Teiid add the ORDER BY from the outer query when it can, just as it does with filters from the outer query. Basically I want to send this, but have Teiid run the query from above.
select * from (
(select item_id, created_at, 'Foo' source from foo.items limit 5000)
union all
(select item_id, created_at, 'Bar' source from bar.items limit 5000)
) x
where
source in ('Foo', 'Bar')
order by created_at desc
limit 0, 500;
-
8. Re: Partitioned UNION
shawkins Feb 24, 2014 6:47 PM (in response to tom9729)Can you provide an example of what you mean?
Steve
-
9. Re: Partitioned UNION
shawkins Feb 24, 2014 7:21 PM (in response to shawkins)Generally we have to treat an ordered limit as an in place operation in the plan.
In the case of the second query, what is the advantage of adding the order bys to the union branches? You'll still have to process the parent order by - although there is an existing issue somewhere for us to process distinct/order by over a union in this manner as sorted sublists rather than a full sort.
Steve
-
10. Re: Partitioned UNION
tom9729 Feb 24, 2014 8:38 PM (in response to shawkins)I need an ORDER BY in here because there are generally >5000 results from this, and I'm usually interested in the most recent.
select item_id, created_at, 'Bar' source from bar.items limit 5000
For the whole thing, I want to make a view like this. I have existing code that uses ITEMS. I have V_ITEMS with the same columns as ITEMS, but the data comes from additional sources. I want to be able to switch the software to use V_ITEMS with as few changes as possible. This means that when the software does an ORDER BY on the view I'd like Teiid to run the individual queries with that ORDER BY when possible.
create view v_items (...) as
select * from (
(select item_id, created_at, 'Foo' source from foo.items limit 5000)
union all
(select item_id, created_at, 'Bar' source from bar.items limit 5000)
) x;
Might need to take another shot at explaining this in the morning because my brain is toast.
Thanks,
Tom
-
11. Re: Partitioned UNION
shawkins Feb 25, 2014 8:53 AM (in response to tom9729)Ok I think I follow you. You want both the limit and the order by pushed and in this case you'd have the top level limit supersede the nested. Yes, that makes sense and could be done. Can you log an issue?
Steve
-
12. Re: Partitioned UNION
tom9729 Feb 25, 2014 10:12 AM (in response to shawkins)Regarding the limit, I would like even more to be able to do this and have both the ORDER BY and LIMIT pushed. The ORDER BY seems like it should be straightforward, but I'm not sure how the LIMIT would work exactly in this scenario.
create view v_items (...) as
select * from (
(select item_id, created_at, 'Foo' source from foo.items)
union all
(select item_id, created_at, 'Bar' source from bar.items)
) x;
-- Page 1.
select * from v_items order by created_at desc limit 0, 500;
-- Page 2.
select * from v_items order by created_at desc limit 500, 500;
I will write an enhancement issue, at least for the ORDER BY part.
Tom
-
13. Re: Partitioned UNION
shawkins Feb 27, 2014 8:34 PM (in response to tom9729)1 of 1 people found this helpfulThe driver for optimization would be the top level limit. If you have a scenario such as
union all
... limit x
... limit x
with an ordering above, the optimizer has no reason to push the affect of the ordering as it actually makes the plan more costly to execute. However if there is an order by and limit above then even though we potentially have to reorder (or ideally process as sorted sublists), the the cost could be lower (if the top level limit is sufficiently less than the nested limits - or if there are no nested limits) with the affect of both the ordering and the limit pushed.
Offset handling in this case though would basically expand the nested limit, that is a top level limit 500, 500 would be pushed to a branch as limit 1000. So the benefit would lessen further into the results.
-