1 2 Previous Next 15 Replies Latest reply on Mar 24, 2016 12:38 PM by shawkins

    Troubles reading csv files, maybe due to their weird formatting...

    m.ardito

      Hi,

       

      I'm trying to read data from a csv file, generated by a snmp data collector

       

      I've looked

      - this quickstart teiid-quickstarts/dynamicvdb-datafederation at master · teiid/teiid-quickstarts · GitHub

      - this wiki guide Text to Table with Teiid

       

      I tried to mimic those examples, but in those examples the target text file is always like

       

      SYMBOL,PRICE

      RHT,30.00

      BA,42.75

      MON,78.75

      PNRA,84.97

      SY,24.30

       

      and this is quite different from the file I get form the snmp collector... i.e. like

       

      #Contatore totale

      #Format Version:4.2.3.0

      #Data/Ora esportazione: 2015/10/05 11:13:17

      #Codice Risultato,Informazioni contatore più aggiornate,Indirizzo,Indirizzo nodo,Modello,Nome,Commento,Contatore totale

      <ReturnCode>,<Latest Comm. Time>,<Address>,<MacAddress>,<Model Name>,<Device Name>,<Comment>,<Total Counter>

      0,2015/10/05 11:12:58,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],48173

      0,2015/10/05 11:12:58,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539A78],[RICOH Aficio SP 3510SF V2.07 / ],24594

      0,2015/10/05 11:12:57,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539AF0],[RICOH Aficio SP 3510SF V2.07 / ],34818

      0,2015/10/05 11:12:57,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539B00],[RICOH Aficio SP 3510SF V2.07 / ],5735

      0,2015/10/05 11:12:58,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-53B156],[RICOH Aficio SP 3510SF V2.07 / ],76948

       

      After several failures, I tried several things:

      - removing the first 4 rows,

      - removing the "<" and ">" form the header row

      - removing spaces in headers field names

       

      ...it seems nothing works...

       

      The last try was after converting the csv file to

       

      ReturnCode,LatestCommTime,Address,MacAddress,ModelName,DeviceName,Comment,TotalCounter

      0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],47652

      0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539A78],[RICOH Aficio SP 3510SF V2.07 / ],23202

      0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539AF0],[RICOH Aficio SP 3510SF V2.07 / ],34355

      0,2015/09/17 16:16:33,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539B00],[RICOH Aficio SP 3510SF V2.07 / ],4342

      0,2015/09/17 16:16:33,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-53B156],[RICOH Aficio SP 3510SF V2.07 / ],75335

       

      after loading the vdb

      <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

      <vdb name="Contatori" version="1">

          <description>Contatori lan</description>

          <property name="UseConnectorMetadata" value="true" />

          <model name="ContatoriLan">

              <source name="text-connector" translator-name="file" connection-jndi-name="java:/contatori"/>

          </model>

          <model name="Dati" type="VIRTUAL">

              <metadata type="DDL"><![CDATA[

                  CREATE VIEW Stampanti (

                  updated date,

                  address string

                  )

                  AS 

                     SELECT x.LatestCommTime, x.Address

                      FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                          TEXTTABLE(f.file COLUMNS LatestCommTime date, Address string HEADER) AS x;

              ]]>

                      </metadata>

          </model>

      </vdb>

       

      I get

       

      11:31:23,323 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (MSC service thread 1-3)  TEIID40120 VDB Contatori.1 will be removed from the repository

      11:31:23,326 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (MSC service thread 1-3)  TEIID40119 VDB Contatori.1 removed from the repository

      11:31:23,329 INFO  [org.teiid.SECURITY] (MSC service thread 1-3)  TEIID40009 Admin [null] is terminating this session: 3byYHBqd0NyU.

      11:31:23,332 INFO  [org.teiid.RUNTIME] (MSC service thread 1-3)  TEIID50026 VDB "Contatori.1[ContatoriLan{text-connector=text-connector, file, java:/contatori}, Dati{}]" undeployed.

      11:31:23,338 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (MSC service thread 1-3)  TEIID40118 VDB Contatori.1 added to the repository - is reloading false

      11:31:23,341 INFO  [org.teiid.RUNTIME] (MSC service thread 1-3)  TEIID50029 VDB Contatori.1 model "ContatoriLan" metadata is currently being loaded. Start Time: 26/02/16 11.31

      11:31:23,345 INFO  [org.teiid.RUNTIME] (teiid-async-threads - 4)  TEIID50030 VDB Contatori.1 model "ContatoriLan" metadata loaded. End Time: 26/02/16 11.31

      11:31:23,345 INFO  [org.teiid.RUNTIME] (MSC service thread 1-3)  TEIID50029 VDB Contatori.1 model "Dati" metadata is currently being loaded. Start Time: 26/02/16 11.31

      11:31:23,352 INFO  [org.teiid.RUNTIME] (teiid-async-threads - 3)  TEIID50030 VDB Contatori.1 model "Dati" metadata loaded. End Time: 26/02/16 11.31

      11:31:23,356 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (teiid-async-threads - 3)  TEIID40003 VDB Contatori.1 is set to ACTIVE

       

      From squirrel, connecting to the vdb looks fine, but as soon as I query anything like "select * from Stampanti" I get

       

      11:34:40,311 WARN  [org.teiid.PROCESSOR] (Worker10_QueryProcessorQueue237) DG5IdTWJSfjH TEIID30020 Processing exception for request DG5IdTWJSfjH.11 'TEIID30176 Could not convert value for column LatestCommTime in the row ending on text line 2 in file:/mnt/common/TEST/20150917161732_Lan_T.CSV.'. Originally TeiidProcessingException Date.java:140. Enable more detailed logging to see the entire stacktrace.

       

      then I tried changing the "date" to "timestamp" in the vdb  like

       

      <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

      <vdb name="Contatori" version="1">

          <description>Contatori lan</description>

          <property name="UseConnectorMetadata" value="true" />

          <model name="ContatoriLan">

              <source name="text-connector" translator-name="file" connection-jndi-name="java:/contatori"/>

          </model>

          <model name="Dati" type="VIRTUAL">

              <metadata type="DDL"><![CDATA[

              CREATE VIEW Stampanti (

                  updated timestamp,

                  address string

                  )

                  AS 

                     SELECT x.LatestCommTime, x.Address

                      FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                          TEXTTABLE(f.file COLUMNS LatestCommTime timestamp, Address string HEADER) AS x;

              ]]>

                      </metadata>

          </model>

      </vdb>

       

      and now I get

      11:37:49,892 WARN  [org.teiid.PROCESSOR] (Worker10_QueryProcessorQueue262) xIOTB4045PN0 TEIID30020 Processing exception for request xIOTB4045PN0.6 'TEIID30176 Could not convert value for column LatestCommTime in the row ending on text line 2 in file:/mnt/common/TEST/20150917161732_Lan_T.CSV.'. Originally TeiidProcessingException 'Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]' Timestamp.java:235. Enable more detailed logging to see the entire stacktrace.

       

      ... I can't randomly try everything that comes to mind, but also I can't preprocess every csv in order to be read in this way...

      It is simpler to load that in a mysql database, and then query it (which works perfectly, btw..), althought I was trying to avoid this intermediate step, from the start.

       

      What is the best approach to use those .csv, files, if possible without "preprocessing" them?

      is it possible to add some paramenter or function to

      - specify which are the header and data start rows (like the excel connector allows)

      - use even those weird header column name names (like <headername>) in some way?

       

      what you suggest?

      I could find so many differently formatted .csv files, I need to learn all tips & tricks of teiid-fu to be able to use them, or find other ways...

       

      Thanks,

      Marco

        • 1. Re: Troubles reading csv files, maybe due to their weird formatting...
          m.ardito

          wuth those csv preprocessed to look like

          ReturnCode,LatestCommTime,Address,MacAddress,ModelName,DeviceName,Comment,TotalCounter

          0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],47652

          0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539A78],[RICOH Aficio SP 3510SF V2.07 / ],23202

          0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539AF0],[RICOH Aficio SP 3510SF V2.07 / ],34355

          0,2015/09/17 16:16:33,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539B00],[RICOH Aficio SP 3510SF V2.07 / ],4342

          0,2015/09/17 16:16:33,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-53B156],[RICOH Aficio SP 3510SF V2.07 / ],75335

           

          I now can use csv dates, since I used parsetimestamp in the vdb... but can't imagine how to treat those weird <headers>

           

          <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

          <vdb name="Contatori" version="1">

              <description>Contatori lan</description>

              <property name="UseConnectorMetadata" value="true" />

              <model name="ContatoriLan">

                  <source name="text-connector" translator-name="file" connection-jndi-name="java:/contatori"/>

              </model>

              <model name="Dati" type="VIRTUAL">

                  <metadata type="DDL"><![CDATA[

                  CREATE VIEW Stampanti (

                      updated timestamp,

                      address string

                      )

                      AS 

                         SELECT PARSETIMESTAMP(x.LatestCommTime, 'yyyy/MM/dd HH:mm:ss'), x.Address

                          FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                              TEXTTABLE(f.file COLUMNS LatestCommTime string, Address string HEADER) AS x;

                  ]]>

                          </metadata>

              </model>

          </vdb>

          • 2. Re: Troubles reading csv files, maybe due to their weird formatting...
            rareddy

            I do not think there is anyway to remove the "<>" in the header fields, however one you build the table with "<" ">" in headers, create view without them and use that for your client application.

             

            Ramesh..

            • 3. Re: Troubles reading csv files, maybe due to their weird formatting...
              shawkins

              If you want to use a different column name than the header, then with [TEIID-3346] File Source | Dot in file column names causes VDB deployment errors - JBoss Issue Tracker you can use the column header option:

               

              SELECT PARSETIMESTAMP(x.LatestCommTime, 'yyyy/MM/dd HH:mm:ss'), x.Address

                              FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                                  TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<Latest Comm. Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

              • 4. Re: Troubles reading csv files, maybe due to their weird formatting...
                m.ardito

                Sorry Ramesh, maybe I was not clear enough...

                 

                if I leave the headers like <headername>,

                <ReturnCode>,<LatestCommTime>,<Address>

                0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],47652

                0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539A78],[RICOH Aficio SP 3510SF V2.07 / ],23202

                 

                the vdb loads but sql fails, at least if I use a vdb like this (modified to keep those weird headers)

                 

                <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

                <vdb name="Contatori" version="1">

                    <description>Contatori lan</description>

                    <property name="UseConnectorMetadata" value="true" />

                    <model name="ContatoriLan">

                        <source name="text-connector" translator-name="file" connection-jndi-name="java:/contatori"/>

                    </model>

                    <model name="Dati" type="VIRTUAL">

                        <metadata type="DDL"><![CDATA[

                        CREATE VIEW Stampanti (

                            updated timestamp,

                            address string

                            )

                            AS 

                               SELECT PARSETIMESTAMP(x.<LatestCommTime>, 'yyyy/MM/dd HH:mm:ss'), x.<Address>

                                FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                                    TEXTTABLE(f.file COLUMNS <LatestCommTime> string, <Address> string HEADER) AS x;

                        ]]> </metadata>

                    </model>

                </vdb>

                 

                then I get this:

                 

                15:02:49,389 WARN  [org.teiid.RUNTIME] (teiid-async-threads - 3)  TEIID50036 VDB Contatori.1 model "Dati" metadata failed to load. Reason:TEIID30386 org.teiid.api.exception.query.QueryParserException: TEIID31100 Parsing error: Encountered "PARSETIMESTAMP(x[*].[*]<LatestCommTime" at line 8, column 39.

                Was expecting: "and" | "between" | "in" | "is" | "like" | "like_regex" | "not" | "or" | "order" | "similar" ...: org.teiid.metadata.ParseException: TEIID30386 org.teiid.api.exception.query.QueryParserException: TEIID31100 Parsing error: Encountered "PARSETIMESTAMP(x[*].[*]<LatestCommTime" at line 8, column 39.

                Was expecting: "and" | "between" | "in" | "is" | "like" | "like_regex" | "not" | "or" | "order" | "similar" ...

                        at org.teiid.query.parser.QueryParser.parseDDL(QueryParser.java:472) [teiid-engine-8.12.3.jar:8.12.3]

                        at org.teiid.metadata.MetadataFactory.parse(MetadataFactory.java:746) [teiid-api-8.12.3.jar:8.12.3]

                        at org.teiid.query.metadata.DDLMetadataRepository.loadMetadata(DDLMetadataRepository.java:40) [teiid-engine-8.12.3.jar:8.12.3]

                        at org.teiid.runtime.AbstractVDBDeployer$MetadataRepositoryWrapper.loadMetadata(AbstractVDBDeployer.java:84) [teiid-runtime-8.12.3.jar:8.12.3]

                        at org.teiid.query.metadata.ChainingMetadataRepository.loadMetadata(ChainingMetadataRepository.java:55) [teiid-engine-8.12.3.jar:8.12.3]

                        at org.teiid.jboss.VDBService$6.run(VDBService.java:395) [teiid-jboss-integration-8.12.3.jar:8.12.3]

                        at org.teiid.jboss.VDBService$7.run(VDBService.java:446) [teiid-jboss-integration-8.12.3.jar:8.12.3]

                        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_91]

                        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_91]

                        at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_91]

                        at org.jboss.threads.JBossThread.run(JBossThread.java:122)

                Caused by: org.teiid.api.exception.query.QueryParserException: TEIID31100 Parsing error: Encountered "PARSETIMESTAMP(x[*].[*]<LatestCommTime" at line 8, column 39.

                Was expecting: "and" | "between" | "in" | "is" | "like" | "like_regex" | "not" | "or" | "order" | "similar" ...

                        at org.teiid.query.parser.QueryParser.convertParserException(QueryParser.java:214) [teiid-engine-8.12.3.jar:8.12.3]

                        ... 11 more

                 

                15:02:49,408 INFO  [org.jboss.as.server] (HttpManagementService-threads - 81)  JBAS015865: Replaced deployment "contatori-vdb.xml" with deployment "contatori-vdb.xml"

                15:02:49,440 INFO  [org.jboss.as.repository] (HttpManagementService-threads - 81)  JBAS014901: Content removed from location /opt/teiid-8.12.3/standalone/data/content/10/e46e431c7cbd9e9b50389fc953fe2cad6d5900/content

                • 5. Re: Troubles reading csv files, maybe due to their weird formatting...
                  m.ardito

                  Thanks Steven, I tried to apply your suggestion (if I got it right):

                   

                  vdb is now

                  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

                  <vdb name="Contatori" version="1">

                      <description>Contatori lan</description>

                      <property name="UseConnectorMetadata" value="true" />

                      <model name="ContatoriLan">

                          <source name="text-connector" translator-name="file" connection-jndi-name="java:/contatori"/>

                      </model>

                      <model name="Dati" type="VIRTUAL">

                          <metadata type="DDL"><![CDATA[               

                          CREATE VIEW Stampanti (

                              updated timestamp,

                              address string

                              )

                              AS 

                                 SELECT PARSETIMESTAMP(x.LatestCommTime, 'yyyy/MM/dd HH:mm:ss'), x.Address

                                  FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                                       TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestCommTime>' string, Address HEADER '<Address>' string HEADER 5) AS x

                          ]]>

                                  </metadata>

                      </model>

                  </vdb>

                   

                  while data is now back to its standard headers

                   

                  #Contatore totale

                  #Format Version:4.2.3.0

                  #Data/Ora esportazione: 2015/10/05 11:13:17

                  #Codice Risultato,Informazioni contatore più aggiornate,Indirizzo,Indirizzo nodo,Modello,Nome,Commento,Contatore totale

                  <ReturnCode>,<LatestCommTime>,<Address>

                  0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],47652

                  0,2015/09/17 16:16:35,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-539A78],[RICOH Aficio SP 3510SF V2.07 / ],23202

                   

                  but loading this vdb fails:

                   

                  16:16:53,740 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-8)  JBAS015876: Starting deployment of "contatori-vdb.xml" (runtime-name: "contatori-vdb.xml")

                  16:16:53,763 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (MSC service thread 1-6)  TEIID40118 VDB Contatori.1 added to the repository - is reloading false

                  16:16:53,767 INFO  [org.teiid.RUNTIME] (MSC service thread 1-6)  TEIID50029 VDB Contatori.1 model "ContatoriLan" metadata is currently being loaded. Start Time: 26/02/16 16.16

                  16:16:53,770 INFO  [org.teiid.RUNTIME] (MSC service thread 1-6)  TEIID50029 VDB Contatori.1 model "Dati" metadata is currently being loaded. Start Time: 26/02/16 16.16

                  16:16:53,771 INFO  [org.teiid.RUNTIME] (teiid-async-threads - 2)  TEIID50030 VDB Contatori.1 model "ContatoriLan" metadata loaded. End Time: 26/02/16 16.16

                  16:16:53,775 INFO  [org.teiid.RUNTIME] (teiid-async-threads - 1)  TEIID50030 VDB Contatori.1 model "Dati" metadata loaded. End Time: 26/02/16 16.16

                  16:16:53,780 WARN  [org.teiid.PLANNER.RESOLVER] (teiid-async-threads - 1)  TEIID31080 Dati.Stampanti validation error: TEIID31100 Parsing error: Encountered "f.file COLUMNS LatestCommTime [*]'<LatestCommTime>'[*] string," at line 1, column 168.

                  Was expecting: "string" | "varbinary" | "varchar" | "boolean" | "byte" | "tinyint" | "short" | "smallint" | "char" | "integer" ...

                  16:16:53,788 INFO  [org.teiid.RUNTIME] (teiid-async-threads - 1)  TEIID40073 The metadata for the VDB Contatori.1 is loaded, however it is not valid. Check models for errors. Correct the metadata and re-deploy.

                  16:16:53,791 INFO  [org.jboss.as.server] (HttpManagementService-threads - 94)  JBAS015859: Deployed "contatori-vdb.xml" (runtime-name : "contatori-vdb.xml")

                  16:16:53,792 INFO  [org.teiid.RUNTIME.VDBLifeCycleListener] (teiid-async-threads - 1)  TEIID40003 VDB Contatori.1 is set to FAILED

                  • 6. Re: Troubles reading csv files, maybe due to their weird formatting...
                    shawkins

                    Sorry, yes there is an issue with the command to string logic that is omitting the header keyword.  So this query would work when sent from the client, but not when used to define a view.  This needs corrected.

                    • 7. Re: Troubles reading csv files, maybe due to their weird formatting...
                      m.ardito

                      Ok, no problem, if it will be corrected, I'll try that again, because it could be really useful...

                       

                      btw, is there any limit to the text files size, when using such connector?

                      We have, in another department, other csv files quite big - several megabytes, and up to 40 -which I get from a partner (no other option until now...) updated at least weekly,

                      which are not currenlty managed through teiid, but if they could be integrated, too, it would also help a lot...

                       

                      Thanks

                      • 8. Re: Troubles reading csv files, maybe due to their weird formatting...
                        shawkins

                        The issue will be addressed with [TEIID-4006] Texttable column header option toString incorrect - JBoss Issue Tracker

                         

                        There isn't a fixed limit to the size that can be processed.  Since we are re-parsing every time you would just want to ensure that caching is used for larger sizes.

                        • 9. Re: Troubles reading csv files, maybe due to their weird formatting...
                          m.ardito

                          Thanks, I'll test that in ther next release, and then mark this as solved, hopefully.

                           

                          About size, yesterday I successfully read my 40mb CSV file, after solving an encoding-related issue: this file comes from a partner, and encoding is not specified, turns out to be "ansi", ie windows-1252 or cp-1252.

                          Adding that property loads the file without problems. I'll explore also caching features, before this hits production.

                           

                          Marco

                          • 10. Re: Troubles reading csv files, maybe due to their weird formatting...
                            m.ardito

                            And I can confirm that in latest 9.0.0.Alpha2 also this now works...

                             

                            As for other issues, I'll wait next 8.13.3 stable to mark this as solved.

                             

                            So many thanks, Teiid devs.

                             

                            Marco

                            • 11. Re: Troubles reading csv files, maybe due to their weird formatting...
                              m.ardito

                              I still have some problem: it works with the latest test vdb above, which has

                               

                              TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestCommTime>' string, Address HEADER '<Address>' string HEADER 5) AS x

                               

                              but it does not if (as it is really) my fields have spaces, dots. eg:

                               

                              TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestComm Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

                              TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestComm.Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

                              TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestComm. Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

                               

                              I also tried to use ESCAPE \ or ESCAPE '\'  with

                              TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<LatestComm\ Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

                               

                              but this doesn't work either....

                               

                              I'm sure I'm doing something wrong, but what? I can't find a similar example in the TEXTTABLE reference...

                               

                              [edit]

                              Yes, this is still failing with my real use case which is, like said in the starting post,

                               

                              #Contatore totale

                              #Format Version:4.2.3.0

                              #Data/Ora esportazione: 2015/10/05 11:13:17

                              #Codice Risultato,Informazioni contatore più aggiornate,Indirizzo,Indirizzo nodo,Modello,Nome,Commento,Contatore totale

                              <ReturnCode>,<Latest Comm. Time>,<Address>,<MacAddress>,<Model Name>,<Device Name>,<Comment>,<Total Counter>

                              0,2015/10/05 11:12:58,[192.168.x.y],[00-00-00-00-00-00],[Aficio SP 3510SF],[3510SF-39E733],[RICOH Aficio SP 3510SF V2.01 / ],48173

                              [/edit]

                               

                              Marco

                              • 12. Re: Troubles reading csv files, maybe due to their weird formatting...
                                shawkins

                                I think you just have a typo.  LatestCommTime in the header is <Latest Comm. Time> - with a space between Latest and Comm:

                                 

                                TEXTTABLE(f.file COLUMNS LatestCommTime HEADER '<Latest Comm. Time>' string, Address HEADER '<Address>' string HEADER 5) AS x

                                • 13. Re: Troubles reading csv files, maybe due to their weird formatting...
                                  m.ardito

                                  Maybe I was tired, or too confused, but now I just tried to correct the vdb xml, setting the view query like this

                                   

                                  SELECT PARSETIMESTAMP(x.LatestCommTime, 'yyyy/MM/dd HH:mm:ss'), x.Address

                                  FROM (EXEC ContatoriLan.getTextFiles('*.CSV')) AS f,

                                  TEXTTABLE(f.file COLUMNS

                                  LatestCommTime HEADER '<Latest Comm. Time>' string,

                                  ...

                                  HEADER 5) AS x

                                   

                                  the vdb loaded active, but it failed in squirrel, with log line:

                                   

                                  2016-03-23 18:21:42,317 WARN  [org.teiid.PROCESSOR] (Worker229_QueryProcessorQueue58387) 8ozdZCYIDdZc TEIID30020 Processing exception for request 8ozdZCYIDdZc.5 'TEIID30181 HEADER entry missing for column name LatestCommTime in file:/mnt/common/TEST/20150917161732_Lan_T.CSV.'. Originally TeiidProcessingException TextTableNode.java:503. Enable more detailed logging to see the entire stacktrace.

                                   

                                  ...I was about to write about the fail, but then realized I had two CSV files there, and the log was referring to one in particular...

                                   

                                  and there was the cause: my fault! Struggling to find what failed before (before seeing the typo you spotted), I changed the headers in the CSV files: I tried removing dots and spaces, from files headers, and in that way it worked, but then I restored the original headers (with dots and spaces) only in one file... I forgot about the second one: 20150917161732_Lan_T.CSV.

                                   

                                  Now it works, again...

                                   

                                  Thanks

                                  Marco

                                  • 14. Re: Troubles reading csv files, maybe due to their weird formatting...
                                    m.ardito

                                    Confirmed working in 8.13.3 stable.

                                     

                                    Thanks

                                    Marco

                                    1 2 Previous Next