6 Replies Latest reply on Oct 8, 2015 10:06 AM by walla2sl

Teiid Impala Multiple Count Distinct Issue

walla2sl Oct 5, 2015 11:14 AM

Impala does not currently support multiple count distinct expressions on different columns in the same SQL query.

https://issues.cloudera.org/browse/IMPALA-110

If we execute such a query in Teiid against Impala, it fails with an error.

E.g., the Teiid query:

select count(distinct col1), count(distinct col2) from table;

Would it be possible for the Teiid translator to retrieve the detail rows and compute the count distinct aggregates on its end?

Are there any hints available to force that behavior to not push down the aggregate to Impala?

Much thanks,

Scott

1. Re: Teiid Impala Multiple Count Distinct Issue

shawkins Oct 5, 2015 5:31 PM (in response to walla2sl)

> Are there any hints available to force that behavior to not push down the aggregate to Impala?

Not currently. The workaround would be to use an anonymous procedure block:

begin
insert into #temp select col1, col2 from table;
select count(distinct col1), count(distinct col2) from #temp;
end
2. Re: Teiid Impala Multiple Count Distinct Issue

walla2sl Oct 6, 2015 4:09 PM (in response to shawkins)
I did some more reading in Cloudera's documentation; here is an excerpt:

By default, Impala only allows a single COUNT(DISTINCT columns) expression in each query.

To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the following technique for queries involving a single table:

select v1.c1 result1, v2.c1 result2 from (select count(distinct col1) as c1 from t1) v1 cross join (select count(distinct col2) as c1 from t1) v2;

Ideally, the translator would rewrite the query it sends to Impala along the lines of the above.
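
As a quick sanity check of that rewrite, here is a minimal sketch that runs both forms against an in-memory SQLite database. SQLite is only a stand-in (it accepts the direct form, which Impala rejects), and the sample rows are made up for illustration; t1, col1 and col2 follow the names in the excerpt.

```python
import sqlite3

# Toy stand-in for the Impala table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (col1 integer, col2 text)")
conn.executemany(
    "insert into t1 values (?, ?)",
    [(1, "a"), (1, "b"), (2, "b"), (3, "b"), (3, None)],
)

# Direct form: two count(distinct) expressions in one query.
direct = conn.execute(
    "select count(distinct col1), count(distinct col2) from t1"
).fetchall()

# Rewritten form: one count(distinct) per inline view, combined with a cross join.
rewritten = conn.execute(
    "select v1.c1 result1, v2.c1 result2 "
    "from (select count(distinct col1) as c1 from t1) v1 "
    "cross join (select count(distinct col2) as c1 from t1) v2"
).fetchall()

print(direct, rewritten)
```

Each inline view returns exactly one row, so the cross join also returns exactly one row, and the two result sets match.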
3. Re: Teiid Impala Multiple Count Distinct Issue

shawkins Oct 7, 2015 8:26 AM (in response to walla2sl)

That works fine as long as there is no group by clause. With a group by, each inline view returns one row per group, so the cross join could produce a much larger row set than desired. Can you log an issue for the translator to handle the single group case?
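
To illustrate the group by caveat, here is a small sketch, again using in-memory SQLite as a stand-in. The grp column and the sample rows are hypothetical, and joining the inline views on the grouping column is just one possible alternative (it assumes grp is never null). This is not a description of how the translator behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (grp text, col1 integer, col2 text)")
conn.executemany(
    "insert into t1 values (?, ?, ?)",
    [("x", 1, "a"), ("x", 2, "a"), ("y", 1, "a"), ("y", 1, "b"), ("y", 1, "c")],
)

# One grouped inline view per count(distinct) expression.
v1 = "(select grp, count(distinct col1) as c1 from t1 group by grp) v1"
v2 = "(select grp, count(distinct col2) as c1 from t1 group by grp) v2"

# Reference result: both aggregates computed in a single grouped query.
grouped_direct = conn.execute(
    "select grp, count(distinct col1), count(distinct col2) "
    "from t1 group by grp order by grp"
).fetchall()

# Cross join of the grouped views: pairs every group with every other group.
grouped_cross = conn.execute(
    f"select v1.grp, v1.c1, v2.c1 from {v1} cross join {v2}"
).fetchall()

# Join on the grouping column instead: one row per group.
grouped_joined = conn.execute(
    f"select v1.grp, v1.c1, v2.c1 from {v1} join {v2} on v1.grp = v2.grp "
    "order by v1.grp"
).fetchall()

print(len(grouped_cross), grouped_joined)
```

With two groups, the cross join yields four rows (groups squared), while the join on grp yields the same two rows as the direct grouped query.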
4. Re: Teiid Impala Multiple Count Distinct Issue

walla2sl Oct 7, 2015 11:55 AM (in response to shawkins)

Thanks, Steven. I opened TEIID-3743, "Multiple Count Distinct Columns Fails for Impala", in the JBoss Issue Tracker.
5. Re: Teiid Impala Multiple Count Distinct Issue

shawkins Oct 7, 2015 4:37 PM (in response to walla2sl)

Thanks Scott. The fix will be in 8.12 Final.
6. Re: Teiid Impala Multiple Count Distinct Issue

walla2sl Oct 8, 2015 10:06 AM (in response to shawkins)

Awesome! That was fast, thank you!