-
1. Re: Teiid Impala Multiple Count Distinct Issue
shawkins Oct 5, 2015 5:31 PM (in response to walla2sl)
> Are there any hints available to force that behavior to not push down the aggregate to Impala?
Not currently. The workaround would be to use an anonymous procedure block:
begin
insert into #temp select col1, col2 from table;
select count(distinct col1), count(distinct col2) from #temp;
end
-
2. Re: Teiid Impala Multiple Count Distinct Issue
walla2sl Oct 6, 2015 4:09 PM (in response to shawkins)
I did some more reading in Cloudera's documentation; here is an excerpt:
By default, Impala only allows a single COUNT(DISTINCT columns) expression in each query.
To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the following technique for queries involving a single table:
select v1.c1 result1, v2.c1 result2
from (select count(distinct col1) as c1 from t1) v1
cross join (select count(distinct col2) as c1 from t1) v2;
Ideally, the translator would rewrite the Impala query along the lines of the query above.
-
3. Re: Teiid Impala Multiple Count Distinct Issue
shawkins Oct 7, 2015 8:26 AM (in response to walla2sl)
That works fine as long as there is no group by clause; with one, the cross join could produce a much larger row set than desired. Can you log an issue for the translator to handle the single group case?
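To illustrate the group by concern, here is a sketch only, not the rewrite the translator performs. It assumes a hypothetical grouping column grp on t1. With a group by, each derived table returns one row per group, so cross joining two n-row results yields n * n rows. Joining the derived tables on the grouping column keeps one row per group instead:

```sql
-- Sketch under assumptions: t1(grp, col1, col2), where grp is a hypothetical
-- grouping column that does not appear in the thread.
-- Each derived table holds a single COUNT(DISTINCT ...) and one row per group.
select v1.grp, v1.c1 as result1, v2.c1 as result2
from (select grp, count(distinct col1) as c1 from t1 group by grp) v1
join (select grp, count(distinct col2) as c1 from t1 group by grp) v2
  on v1.grp = v2.grp;  -- equality join: one output row per non-null grp value
```

Both derived tables group the same table by the same column, so they produce the same set of groups and an inner join loses nothing, with one exception: a group whose grp is null does not match under an equality join, so a nullable grouping column would need a null-safe comparison.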
-
4. Re: Teiid Impala Multiple Count Distinct Issue
walla2sl Oct 7, 2015 11:55 AM (in response to shawkins)
Thanks, Steven. I opened [TEIID-3743] Multiple Count Distinct Columns Fails for Impala in the JBoss Issue Tracker.
-
5. Re: Teiid Impala Multiple Count Distinct Issue
shawkins Oct 7, 2015 4:37 PM (in response to walla2sl)
Thanks Scott. The fix will be in 8.12 Final.
-
6. Re: Teiid Impala Multiple Count Distinct Issue
walla2sl Oct 8, 2015 10:06 AM (in response to shawkins)
Awesome! That was fast, thank you!