5 Replies Latest reply on Jan 26, 2016 10:05 AM by jie tao

    unable to evaluate Hive source function

    jie tao Novice

      In my Rest-enabled virtual procedure I used the Hive UDF get_json_object in my query and got this exception as I called the procedure via Web service:

       

      "org.teiid.jdbc.TeiidSQLException: TEIID30328 Unable to evaluate XMLELEMENT(NAME event, XMLFOREST(Hive13.\"\\\"\"test\".\"event_log\\\"\"\".game AS game, Hive13.\"\\\"\"test\".\"event_log\\\"\"\".lang AS lang, get_json_object(Hive13.\"\\\"\"test\".\"event_log\\\"\"\".json, '$.product') AS product)): TEIID30341 Function Hive13.get_json_object is marked in the function metadata as a function that must be evaluated at the source.","details": "javax.ws.rs.WebApplicationException: org.teiid.jdbc.TeiidSQLException: TEIID30328 Unable to evaluate XMLELEMENT(NAME event, XMLFOREST(Hive13.\"\\\"\"test\".\"event_log\\\"\"\".game AS game, Hive13.\"\\\"\"test\".\"event_log\\\"\"\".lang AS lang, get_json_object(Hive13.\"\\\"\"test\".\"event_log\\\"\"\".json, '$.product') AS product)): TEIID30341 Function Hive13.get_json_object is marked in the function metadata as a function that must be evaluated at the source.


      I defined this UDF in my Hive source model as following:


      <<procedure>>

      get_json_object

       

      json: string (4000)

      field: string (4000)

      result: string (4000)

       

      where the first two parameter with Direction "IN" and the last one with "RETURN".

       

      I call the function in my virtual procedure in this way:

       

      select get_json_object(event_log.json,'$.product') from event_log;

        • 1. Re: unable to evaluate Hive source function
          Ramesh Reddy Master

          can you provide the query plan?

          • 2. Re: unable to evaluate Hive source function
            jie tao Novice

            The following is the Execution Plan I got from Designer while preview the virtual procedure. Is this what you need? If not, where I get it?

             

            <?xml version="1.0" encoding="UTF-8"?><node name="ProjectNode"><property name="Relational Node ID"><value>4</value></property><property name="Output Columns"><value>xml_out (xml)</value></property><property name="Statistics"><value>Node Output Rows: 0</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 9</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 1</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: -1.0</value></property><property name="Child 0"><node name="ProjectNode"><property name="Relational Node ID"><value>5</value></property><property name="Output Columns"><value>xml_out (xml)</value></property><property name="Statistics"><value>Node Output Rows: 0</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 9</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 1</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: -1.0</value></property><property name="Child 0"><node name="PlanExecutionNode"><property name="Relational Node ID"><value>6</value></property><property name="Output Columns"><value>xml_out (xml)</value></property><property name="Statistics"><value>Node Output Rows: 0</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 9</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 1</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: -1.0</value></property><property name="Execution Plan"><node name="Program"><property name="Instruction 0"><node name="CREATE CURSOR"><property name="Result Set"></property><property name="Query"><node name="ProjectNode"><property name="Relational Node ID"><value>1</value></property><property name="Output Columns"><value>xml_out (xml)</value></property><property name="Statistics"><value>Node Output Rows: 0</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 0</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 1</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: 1.0</value></property><property name="Child 0"><node name="GroupingNode"><property name="Relational Node ID"><value>2</value></property><property name="Output Columns"><value>agg0 (xml)</value></property><property name="Statistics"><value>Node Output Rows: 0</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 0</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 1</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: 1.0</value></property><property name="Child 0"><node name="AccessNode"><property name="Relational Node ID"><value>3</value></property><property name="Output Columns"><value>game (string)</value><value>lang (string)</value><value>json (string)</value></property><property name="Statistics"><value>Node Output Rows: 512</value><value>Node Next Batch Process Time: 0</value><value>Node Cumulative Next Batch Process Time: 1</value><value>Node Cumulative Process Time: 0</value><value>Node Next Batch Calls: 2</value><value>Node Blocks: 1</value></property><property name="Cost Estimates"><value>Estimated Node Cardinality: -1.0</value></property><property name="Query"><value>SELECT g_0.game, g_0.lang, g_0.json FROM Hive13."\""test"."event_log\""" AS g_0</value></property><property name="Model Name"><value>Hive13</value></property></node></property><property name="Grouping Mapping"><value>anon_grp0.agg0=XMLAGG(XMLELEMENT(NAME event, XMLFOREST(Hive13."\""test"."event_log\""".game AS game, Hive13."\""test"."event_log\""".lang AS lang, get_json_object(Hive13."\""test"."event_log\""".json, '$.product') AS product)))</value></property><property name="Sort Mode"><value>false</value></property></node></property><property name="Select Columns"><value>XMLELEMENT(NAME events, anon_grp0.agg0) AS xml_out</value></property></node></property></node></property><property name="Output Columns"><value>xml_out (xml)</value></property></node></property></node></property><property name="Select Columns"><value>HiveView.getJsonObject.xml_out</value></property></node></property><property name="Select Columns"><value>X_X.xml_out</value></property><property name="Data Bytes Sent"><value>0</value></property><property name="Planning Time"><value>42</value></property></node>

            • 3. Re: unable to evaluate Hive source function
              Steven Hawkins Master

              The problem is that the optimizer isn't looking to individually push the subexpressions in XMLAGG, which is where get_json_object is referenced.  The processor also cannot simply evaluate get_json_object since the Hive translator does not support selectWithFrom - we'll need to check to see if Hive supports something like dual as that would be an easy way to address this.

               

              Otherwise you would need to create a view layer with the NO_UNNEST hint to project the get_json_object so that the planner can push it to the source.

              • 4. Re: unable to evaluate Hive source function
                Steven Hawkins Master

                To expand on the last reply, you would need a query that looks like:

                 

                select xmlelement(...) from /*+ no_unnest */ (select get_json_object(Hive13."\""test"."event_log\""".json, '$.product') AS product ... ) v

                 

                So that the optimizer would push the needed subexpression.

                 

                I've also logged [TEIID-3934] Additional must pushdown support - JBoss Issue Tracker to continue the work of [TEIID-3227] Add support for sub-expression pushing of must pushdown functions - JBoss Issue Tracker which only handled the subexpressions in aggregate functions.

                • 5. Re: unable to evaluate Hive source function
                  jie tao Novice

                  I did as you said. Works! Thx a lot.