Google spreadsheet: text cells size issue...
m.ardito Sep 21, 2016 10:49 AMAnother one on Google Spreadsheets,
I am developing something to put later in production, and I'm starting to get real data (ie: not test data) from my google spreadsheets.
the model is like
<model name="gssmodel"> <property name="importer.useFullSchemaName" value="false"/> <source name="gformsA" translator-name="google-spreadsheet-override" connection-jndi-name="java:/gssmodel"/> </model>
and the view like
CREATE VIEW gformsAView( ... "Descrizione segnalazione" clob, ... ) AS SELECT ... gsA."Descrizione segnalazione", ... FROM (call gssmodel.native(' worksheet=A; query=SELECT A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U' )) w, ARRAYTABLE(w.tuple COLUMNS ... "Descrizione segnalazione" clob, ... ) AS gsA WHERE gsA."timestamp" IS NOT NULL
And it works nicely, but I recently found a text cell size issue:
- The google spreadsheet has a "description" column which contains a sort of "description text" for ech row item, formatted there as "Automatic".
- Some rows have long texts in the "description" column: more than 4000 chars, and also "rich text" inside (bullet lists, special chars for quotes, etc, I guess some user pastes there texts copied elsewhere...
- my Teiid model "gssmodel" retrieves a DDL with the column as "string" type (4000 chars limit in Teiid) ("NATIVE_TYPE 'STRING')"
- SquirrelSQL retrieves the data as "string", witha "column_size" of 4000
- but selecting that column returns all real text length as in the google spreadsheet, eg: more than 4000 chars.eg:
Then, since I needed a materialized view of the spreadsheet data, I created it:
- this view gets the column value through the "gssmodel.native" method, then using ARRAYTABLE to convert that column values
- but the "description" column offered by the view, always returns a maximum of 4000 chars
- I tried to define the column (within ARRAYTABLE definition) as "string" or "clob" and:
- the returned text length is always 4000 (using unix format for line feeds)
- the returned text seems to be missing (fortunately) some "rich text" charaters causing issues with POstgres/ODBC recordset, later...
- I also tried to define the view column as blob, and then use the "TO_CHARS" string function ("Return a clob from the blob with the given encoding.")
but this always failed because the Teiid engine seems to always try to use a "CONVERT" function before, which fails... with something like:
if my view query was something like
SELECT TO_CHARS(gsA."Descrizione segnalazione", 'UTF-8') FROM gsA
I got something this:
"2016-09-16 10:01:06,903 ERROR [org.teiid.PROCESSOR.MATVIEWS] (Worker5518_QueryProcessorQueue51833) YQVO+HaeqoHu TEIID30015 Failed to load materialized view table #MAT_GFORMSVIEW.GFORMSAPITOVIEW.: org.teiid.api.exception.query.ExpressionEvaluationException: TEIID30328 Unable to evaluate TO_CHARS(convert(array_get(w.tuple, 7), blob), 'UTF-8'): TEIID30384 Error while evaluating function convert
at org.teiid.query.eval.Evaluator.evaluate(Evaluator.java:728)"
so I guessed this couldn't work...
Since Google spreadsheet seem to have no limit, now, as cell text length (Files you can store in Google Drive - Drive Help ), I guess the Teiid "string" format could be a bit short...
But somehow, although the model DDL defines the column as "string", Squirrel can get all field content, as if it was a clob... perhaps.
I could live with that, but I need to let my custom view to behave the same, if possible (mostly because I need materialization)... How can I do? (Of course I can post more details, try whatever, to reproduce the issue...)
Thanks