9 Replies Latest reply on Sep 7, 2017 6:22 PM by shawkins

    VARCHAR(4000)

    lukyer

      What is the recommended way to integrate data sources with foreign tables which use datatypes like TEXT? It is automatically converted into string with maximum length 4000. However it is very normal that such a columns have much greater size. I know i can cast it to bigger string (e.g. varchar(10000)) but according to docs it is not recommended because of in-memory item storing. Why does translator (at least postgresql one) convert these large datatypes into cut string(4000) instead of some kind of unlimited blob not suffering from complete in-memory processing?

       

      Thanks.

        • 1. Re: VARCHAR(4000)
          rareddy

          You need to design them as clobs, what is type it is coming as?

          • 2. Re: VARCHAR(4000)
            lukyer

            Source table from postgresql P_TABLE has column P_COLUMN of postgresql datatype TEXT.

             

            CREATE VIEW test_view AS SELECT P_COLUMN FROM P_TABLE;

             

            results in test_view having one projected column P_COLUMN with datatype string(4000).

            • 3. Re: VARCHAR(4000)
              rareddy

              I do not remember why we are not implicitly creating a clob in this case, may be the Postgres driver is reporting this as string column but Teiid has the capability to take look at native types and make changes. May be driver does not implicitly allow clob types over TEXT types. In this case Teiid would be able to read the data as Clob, but if you are updating then that may be limited 4000 chars for the update, as we would have to convert back to string at the end.

               

              This is a good candidate to change the data type at source model level as if you leave as a string(4000) there is data loss. If you have an option to convert this to clob in Postgres itself that would be ideal, but I know that is not always possible.

              • 4. Re: VARCHAR(4000)
                lukyer

                Another interesting fact is that although teiid reports string(4000) using sql command select * from SYS.columns; in reality i can select, update or insert data much larger using jdbc vdb connection without any problem.
                Why is there such a behaviour? Can i rely on it?

                • 5. Re: VARCHAR(4000)
                  rareddy

                  it shouldn't be the case, IMO. How did you test it, just using a SQL client?

                  • 6. Re: VARCHAR(4000)
                    lukyer

                    i tried sql client, jdbc java connector queries and generated VDB REST endpoint. It works.

                    • 7. Re: VARCHAR(4000)
                      shawkins

                      > Another interesting fact is that although teiid reports string(4000) using sql command select * from SYS.columns; in reality i can select, update or insert data much larger using jdbc vdb connection without any problem.

                      Why is there such a behaviour? Can i rely on it?

                       

                      There are several nuances to the string handling behavior.

                       

                      String length, just like other type constraints, is not strictly enforced for JDBC.  The system max length (not the column length) is enforced if Teiid operates on the value, such that a new string is created.  But yes otherwise we allow values to stream to/from the source unmolested to help with compatibility.  There is a system property that can be set to adjust the system max string length.

                       

                      > Why does translator (at least postgresql one) convert these large datatypes into cut string(4000) instead of some kind of unlimited blob not suffering from complete in-memory processing?

                       

                      That seems to be an issue.  The translator should be importing the value as a clob instead of a string in that case.  Can you log something for that?

                      • 8. Re: VARCHAR(4000)
                        lukyer

                        Okay so i guess it is not a good idea to rely on not enforcing datatype size according to the declaration. In that case i would need some kind of logic preventing datatype crop. For example:

                        if (source_type_length > MAX_STRING_LENGTH) then CLOB else string(MAX_STRING_LENGTH);

                         

                        What would you recommend as a best way how to achieve it? I cannot let "hidden cropping" problem to occure. I need to integrate foreign databases with any datatypes with any data inside.

                        • 9. Re: VARCHAR(4000)
                          shawkins

                          > What would you recommend as a best way how to achieve it? I cannot let "hidden cropping" problem to occure. I need to integrate foreign databases with any datatypes with any data inside.

                           

                          After [TEIID-5050] CLOB instead of STRING - JBoss Issue Tracker  we'll automatically map to clob.  Before then you would need to modify the translator on your own, or make the setting for org.teiid.maxStringLength sufficiently long to handle your case.