> Are there any hints available to force that behavior to not push down the aggregate to Impala?
Not currently. The workaround would be to use an anonymous procedure block:
insert into #temp select col1, col2 from table;
select count(distinct col1), count(distinct col2) from #temp;
I did some more reading in Cloudera's documentation; here's an excerpt:
By default, Impala only allows a single COUNT(DISTINCT columns) expression in each query.
To produce the same result as multiple COUNT(DISTINCT) expressions, you can use the following technique for queries involving a single table:
select v1.c1 result1, v2.c1 result2 from
  (select count(distinct col1) as c1 from t1) v1
    cross join
  (select count(distinct col2) as c1 from t1) v2;
Ideally, the translator would rewrite the query it sends to Impala along the lines of the example above.
That works fine as long as there is no GROUP BY clause. With a GROUP BY, the cross join could produce a much larger row set than desired. Can you log an issue for the translator to handle the single-group case?
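To make the row-count concern concrete, here is a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine, which is an assumption for illustration: SQLite is not Impala and has no single COUNT(DISTINCT) restriction. The table t1, the grouping column g, and the sample rows are made up. Joining the two subqueries on the grouping column is just one possible way to keep one row per group; nothing in this thread says that is what the translator does.

```python
import sqlite3

# In-memory stand-in database (assumption: SQLite instead of Impala).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (g TEXT, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [("a", 1, 10), ("a", 1, 20), ("a", 2, 20), ("b", 3, 30), ("b", 4, 30)],
)

# Reference result: both COUNT(DISTINCT) expressions in one grouped query.
direct = conn.execute(
    "SELECT g, COUNT(DISTINCT col1), COUNT(DISTINCT col2) "
    "FROM t1 GROUP BY g ORDER BY g"
).fetchall()

# Cross join of grouped subqueries: every group is paired with every group,
# so N groups yield N * N rows instead of N.
crossed = conn.execute("""
    SELECT v1.g, v1.c1, v2.c1
    FROM (SELECT g, COUNT(DISTINCT col1) AS c1 FROM t1 GROUP BY g) v1
    CROSS JOIN (SELECT g, COUNT(DISTINCT col2) AS c1 FROM t1 GROUP BY g) v2
""").fetchall()

# Hypothetical alternative: join the subqueries on the grouping column,
# which keeps one row per group.
joined = conn.execute("""
    SELECT v1.g, v1.c1, v2.c1
    FROM (SELECT g, COUNT(DISTINCT col1) AS c1 FROM t1 GROUP BY g) v1
    JOIN (SELECT g, COUNT(DISTINCT col2) AS c1 FROM t1 GROUP BY g) v2
      ON v1.g = v2.g
    ORDER BY v1.g
""").fetchall()

print(len(direct), len(crossed), len(joined))
print(joined == direct)
```

With two groups, the cross join returns four rows while the keyed join matches the direct query. Note that an equality join would drop a group whose key is NULL, so a real rewrite would need to treat that case separately.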
Thanks Scott. The fix will be in 8.12 Final.
Awesome! That was fast, thank you!