8 Replies Latest reply on Oct 13, 2014 8:14 AM by shiveeta.mattoo Branched to a new discussion.

    Queries regarding Teiid Texttable function

    shiveeta.mattoo

      I am using Teiid in embedded mode to connect to flat files using Teiid's Flat File Connector.

      There are couple of issues/clarifications I am facing while using the configurations for the TEXTTABLE function.

       

      1. NO TRIM -

       

      On specifying the NO TRIM in the Select Query, encounter an exception -

      org.teiid.api.exception.query.QueryParserException: TEIID31100 Parsing error: Encountered "name STRING NO [*]TRIM[*] ,surname" at line 1, column 118.

      Was expecting: "row".

       

      Could you provide any sample usage for the same?

       

      2. QUOTE -

       

      • In case we don't set 'quote' or qualifier character, and the input text contains certain qualifier characters, the tabular output of TEXTTABLE correctly treats the quotes as text itself.
      • However if The quote characters are present in a row which is identified as HEADER, Teiid throws exception -  org.teiid.core.TeiidProcessingException: TEIID30181 HEADER entry missing for column name <n>

       

      Any suggestions or workaround for this?

       

      3. CUSTOM LINE SEPARATOR -

       

      Is there a way we can specify a custom line separator character for the File instead of Default (\r\n) Newline?

        • 1. Re: Queries regarding Teiid Texttable function
          shawkins

          In general see FROM Clause - Teiid 8.9 (draft) - Project Documentation Editor and the grammar BNF for SQL Grammar - Teiid 8.9 (draft) - Project Documentation Editor

           

          1) Note that this option is only for fixed width parsing.

           

          2) Can you provide a full example of what you mean?  The header is parsed the same as any row and then we look in the values for the names of the columns defined in the texttable.

           

          3) Not currently.  That would require an enhancement.

          1 of 1 people found this helpful
          • 2. Re: Re: Queries regarding Teiid Texttable function
            shiveeta.mattoo

            Thanks Steven.

            1) In our application, we need the support for Trim Spaces for Delimited Text Files as well. Also there's a requirement for a Custom Line separator support as well. I will raise these two as enhancements in the community.

             

            2) Here is a full example related to the QUOTE issue -

             

            Input file - (tqualheader.csv)

            "n",s

            "susan","sarandon"

             

            SQL Query - (Being sent to Teiid Embedded server)

            SELECT A.\"n\",A.s FROM (EXEC theadq.getTextFiles('tqualheader.csv')) AS F, TEXTTABLE(F.file COLUMNS \"n\" STRING,s STRING delimiter ',' quote '\u0000' HEADER) AS A

             

            where, theadq is the view name

             

            Exception

            TEIID30181 HEADER entry missing for column name n in file:/E:/Work/tqualheader.csv.

             

            Debug ScreenShots


            Attached are the Screenshots of the code while debugging the flow.

             

            1. NameIndexes -

            NameIndexes-details.png

             

            2. TextTableNode -

            TexttableSQL.png

             

             

            As you can see, since the TEXTTABLE has Column name as just n , instead of "n" , the code in TextTableNode.java throws an exception at line 475 (Integer index = nameIndexes.get(col.getName().toUpperCase());)

             

            Is this indeed a bug or am I missing something here?

            Thanks for your help.

            • 3. Re: Re: Queries regarding Teiid Texttable function
              shawkins

              > Is this indeed a bug or am I missing something here?

               

              I think that there may be something missing.  Why is your quote character \u0000?  It appears from from your data it should just be the default double quote ".  If you remove the quote setting from your example, it will work as expected.

               

              > 1) In our application, we need the support for Trim Spaces for Delimited Text Files as well. Also there's a requirement for a Custom Line separator support as well. I will raise these two as enhancements in the community.

               

              What would be the purpose of trimming delimited entries?  By default non-quoted delimited entries are trimmed.  And a quoted delimited entry such as "  a  " is understood as preserving whitespace.  Since you can always use a trim/rtrim function in a select clause above the texttable there's probably little need to expand the scope of the texttable logic in any case.

               

              A custom line separator is a valid enhancement.

              • 4. Re: Re: Queries regarding Teiid Texttable function
                shiveeta.mattoo

                Hi Steven,

                 

                > QUOTE Issue :

                 

                Let me explain the use case for the QUOTE example I mentioned above. The scenario is that although input text contains DOUBLE QUOTE characters, we want the user to have the option to either treat these characters as 'TextQualifiers/Quote characters' or just as normal characters with no special meaning associated.

                 

                Hence, when we are giving user option to choose Double Quotes as Textqualifier character, things work perfect, using query -

                1. SELECT A.n,A.s FROM (EXEC tql1.getTextFiles('textqualifier.csv')) AS F, TEXTTABLE(F.file COLUMNS n STRING,s STRING delimiter ',' quote '"' HEADER) AS A

                 

                But, in case I want the Quote character to be just treated as String, removing the quote setting as you suggest will not help.

                2. SELECT A.n,A.s FROM (EXEC tql2.getTextFiles('textqualifier.csv')) AS F, TEXTTABLE(F.file COLUMNS n STRING,s STRING delimiter',' HEADER) AS A

                 

                Query #2 simply does what #1 was doing, since default quote character set in TextTableNode is - ". I am seeking a way to get around this default behaviour for my use case.

                 

                \u0000 - is the null or default value of char that I was trying to pass. This value gave me the desired behaviour, when the quote character is present anywhere in the input text except in HEADER. In case the Quote character is present in HEADER, we encounter exception reported above.

                 

                I hope, I am offering some clarity to the issue.

                 

                > TRIM Spaces

                 

                The purpose of the requested enhancement would be to support the scenario where there may be spaces outside the qualified text in case of delimited files.

                In such case, User could have capability to trim/not trim such spaces as desired.

                e.g. - "susan"  ,hello.


                Currently by default, we are always trimming these spaces. The enhancement request was to make this behaviour configurable as per user needs.

                 

                 

                Thank you very much for your prompt responses.

                • 5. Re: Re: Queries regarding Teiid Texttable function
                  shawkins

                  > \u0000 - is the null or default value of char that I was trying to pass. This value gave me the desired behaviour, when the quote character is present anywhere in the input text except in HEADER. In case the Quote character is present in HEADER, we encounter exception reported above.

                   

                  You may be misinterpreting the column name.  Just using as set of double quotes is the standard sql mechanism for complex identifiers. Instead to match the header value of "n", it would require using (shown as above escaped for java):  COLUMNS \"\"\"n\"\"\"


                  > The purpose of the requested enhancement would be to support the scenario where there may be spaces outside the qualified text in case of delimited files.

                   

                  Why would the white space outside of qualification be meaningful?  Wouldn't the whitespace be within the quoting if it were meant to be part of the value?  Or do you mean in this case that there is no / null quote character and that you want to preserve all whitespace?

                  1 of 1 people found this helpful
                  • 6. Re: Re: Queries regarding Teiid Texttable function
                    shiveeta.mattoo

                    Steven Hawkins wrote:

                     

                    >You may be misinterpreting the column name.  Just using as set of double quotes is the standard sql mechanism for complex identifiers. Instead to match the header value of "n", it would require using (shown as above escaped for java):  COLUMNS \"\"\"n\"\"\"


                         Thanks,I will try this out.


                    > Why would the white space outside of qualification be meaningful?  Wouldn't the whitespace be within the quoting if it were meant to be part of the value?  Or do you mean in this case that there is no / null quote character and that you want to preserve all whitespace?

                     

                    Yes, that's right, we may want to preserve the whitespaces, when there is no/ null quote character

                    • 7. Re: Re: Queries regarding Teiid Texttable function
                      shawkins

                      > Yes, that's right, we may want to preserve the whitespaces, when there is no/ null quote character

                       

                      Ok, go ahead and log something for that too.

                      • 8. Re: Re: Queries regarding Teiid Texttable function
                        shiveeta.mattoo

                        shiveeta.mattoo wrote:

                         

                        Steven Hawkins wrote:

                         

                        >You may be misinterpreting the column name.  Just using as set of double quotes is the standard sql mechanism for complex identifiers. Instead to match the header value of "n", it would require using (shown as above escaped for java):  COLUMNS \"\"\"n\"\"\"

                        > I tried the above suggestion and the SQL Query being sent to Teiid was  - SELECT A.\"\"\"n\"\"\" ,A.s FROM (EXEC tqualheader1.getTextFiles('tqualheader.csv')) AS F, TEXTTABLE(F.file COLUMNS \"\"\"n\"\"\"  STRING,s STRING delimiter ',' quote '\u0000' HEADER) AS A
                        Now the exception is not thrown anymore, and the column values are resolved in TextTableNode -> process method, as seen in screenshot below -
                        TexttableNodeProcess.png
                        However the final output of TextTable, results in following :
                        Texttableoutput.png
                        I couldn't figure out exactly where these values are being reset, after correct resolution at Line 334 in TextTableNode.
                        Please advise if you may have any pointers on this issue. Thanks!