2 Replies Latest reply on May 8, 2015 8:29 AM by shawkins

    Pushing GROUP BY across federated join?

    markaddleman

      (copying the community from an email thread)

       

      I think the planner misses an opportunity to push a GROUP BY across a federated join but I'm not certain of my logic.  Consider the query

      with t as (select 2014 as y, 1 as m union select 2015, 2)

      select year(ts), month(ts), count(t.m) from timerange inner join t on t.y=year(ts) and t.m=month(ts) where year(ts)=2015 GROUP BY year(ts), month(ts)

       

      Here, the planner pushes

      SELECT timerange.ts FROM timerange WHERE year(timerange.ts) = 2015

      to the timerange translator but I think it can push

      SELECT year(ts), month(ts) FROM timerange GROUP BY year(ts), month(ts) 

       

      The query plan is attached.

       

      Pushing the GROUP BY is important for my use case.  The timerange table has a timestamp column and a duration column.  The semantics of the table are the timestamp column represents the start of a time range and the duration represents the length of a time range which can only be computed by examining the GROUP BY. 

       

      What are your thoughts?  Is it proper to push the GROUP BY in this case?

       

      -------------------------------------------------------------------

      Response from Steven Hawkins:

       

      It would only make sense to push the grouping in a limited set of circumstances.  If there were a count distinct (although we don't currently do that), or we know the join cardinality and that the year/month combinations are unique (however if they were unique then there isn't really a reason to push the grouping), etc.  Otherwise suppose you don't push the grouping and have repeated year month entries from ts that match in t.  This should produce a count of at least two.  If you push the grouping, then there would just be 1 row and the count would be off.

       

      Also with https://issues.jboss.org/browse/TEIID-3464 we'll just inline with clauses like the one you have here.

        • 1. Re: Pushing GROUP BY across federated join?
          markaddleman

          > suppose you don't push the grouping and have repeated year month entries from ts that match in t.  This should produce a count of at least two.  If you push the grouping, then there would just be 1 row and the count would be off.

           

          Yes, of course, you're right. 

           

          > we know the join cardinality and that the year/month combinations are unique (however if they were unique then there isn't really a reason to push the grouping)

           

          The more I think about my use case, I'm trying to use the translator for something it's not:  a means to layer semantics on the query because the client - an MDX server - doesn't understand the underlying semantics of the data.  What I really need is a layer that sits before the planner to massage the SQL taking into account the client's limitations.  This might be a perfect opportunity to try out TEIID-3369.  I haven't explored the API yet but if I can introduce hints, I'm pretty sure I can achieve what I want in a cleaner way.

          • 2. Re: Pushing GROUP BY across federated join?
            shawkins

            Thanks Mark for converting this into a forum posting.

             

            Yes it may make more sense to handle any semantic mismatch above us rather than trying to get the engine to compensate.