
    Large size of Blob data in Cassandra

    haifen_bi

      I am using the Teiid Cassandra translator (Teiid version 8.10.1) against a Cassandra DB. I got the following error when accessing blob data in Cassandra:

       

      org.teiid.core.types.TransformationException: TEIID10076 Invalid conversion from type class java.lang.Object with value 'java.nio.HeapByteBuffer[pos=88 lim=92 cap=109]' to type class org.teiid.core.types.BinaryType

      org.teiid.core.types.basic.ObjectToAnyTransform.transform(ObjectToAnyTransform.java:111)

      org.teiid.core.types.DataTypeManager.transformValue(DataTypeManager.java:941)

      org.teiid.dqp.internal.datamgr.ConnectorWorkItem.correctTypes(ConnectorWorkItem.java:543)

      org.teiid.dqp.internal.datamgr.ConnectorWorkItem.handleBatch(ConnectorWorkItem.java:410)

      org.teiid.dqp.internal.datamgr.ConnectorWorkItem.more(ConnectorWorkItem.java:210)

      sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

      sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

      sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

      java.lang.reflect.Method.invoke(Method.java:606)

      org.teiid.dqp.internal.datamgr.ConnectorManager$1.invoke(ConnectorManager.java:209)

      com.sun.proxy.$Proxy133.more(Unknown Source)

      org.teiid.dqp.internal.process.DataTierTupleSource.getResults(DataTierTupleSource.java:301)

      org.teiid.dqp.internal.process.DataTierTupleSource$1.call(DataTierTupleSource.java:110)

      org.teiid.dqp.internal.process.DataTierTupleSource$1.call(DataTierTupleSource.java:107)

      java.util.concurrent.FutureTask.run(FutureTask.java:262)

      org.teiid.dqp.internal.process.FutureWork.run(FutureWork.java:58)

      org.teiid.dqp.internal.process.DQPWorkContext.runInContext(DQPWorkContext.java:276)

      org.teiid.dqp.internal.process.ThreadReuseExecutor$RunnableWrapper.run(ThreadReuseExecutor.java:119)

      org.teiid.dqp.internal.process.ThreadReuseExecutor$3.run(ThreadReuseExecutor.java:210)

      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

      java.lang.Thread.run(Thread.java:745)

       

      The Cassandra server version is 2.0.

      The schema of the table I tried to access is:

       

      CREATE TABLE testcass.applicationbinarydata (

        key text PRIMARY KEY,

        data blob

      )

       

      Does this version of Teiid (8.10.1) support Cassandra 2.0, or is this a bug in Teiid?

       

      Please help and thanks in advance!

       

      Haifen Bi

        • 1. Re: Large size of Blob data in Cassandra
          rareddy

          For some reason the blob type from Cassandra is mapped to VARBINARY, which may be why you are seeing this issue. See teiid/CassandraMetadataProcessor.java at master · teiid/teiid · GitHub

           

          You can work around this by defining DDL for the model in your VDB that declares the column as blob; a sketch follows.
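          For illustration, the override DDL might look like this (a sketch only; the table is the one from your schema, and declaring it this way in the model is an assumption about your VDB setup):

          -- sketch: declare the column as blob instead of the inferred varbinary
          CREATE FOREIGN TABLE applicationbinarydata (
            "key" string,
            data blob,
            PRIMARY KEY ("key")
          ) OPTIONS (NAMEINSOURCE 'applicationbinarydata');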

           

          Also, I see the handling of it is wrong here, which can lead to memory issues: teiid/CassandraQueryExecution.java at master · teiid/teiid · GitHub

           

          You should log a JIRA.

          • 2. Re: Large size of Blob data in Cassandra
            shawkins

            The varbinary mapping is there because we are directly reading the bytes, and thus are already not memory safe.

             

            > You can workaround by defining the DDL for the model in your VDB to blob.

             

            That won't make a difference as the extraction logic is based upon the type being returned by Cassandra:

             

            case BLOB:

              values.add(row.getBytes(i));
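              // getBytes returns a java.nio.ByteBuffer, which is added as-is and later fails the conversion to BinaryType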

              break;

             

            The default case (which is now apparently used just for custom types) should be used instead, as it correctly converts the value to a byte array.
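            A minimal sketch of that conversion, assuming the DataStax 2.x driver's Row API (the helper class and method names here are hypothetical, not the actual patch):

            import java.nio.ByteBuffer;

            import com.datastax.driver.core.Row;

            // Hypothetical helper: copy the ByteBuffer's readable region into a
            // byte[] instead of passing the raw buffer downstream.
            public final class BlobColumnReader {

                static byte[] readBytes(Row row, int column) {
                    ByteBuffer buffer = row.getBytes(column);
                    if (buffer == null) {
                        return null; // null column value in Cassandra
                    }
                    byte[] bytes = new byte[buffer.remaining()];
                    // duplicate() so the row's own buffer position is left untouched
                    buffer.duplicate().get(bytes);
                    return bytes;
                }
            }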

            • 3. Re: Large size of Blob data in Cassandra
              haifen_bi

              Teiid has a size limitation for VARBINARY: it is 8K. But our blob sizes in Cassandra are much larger than that. Is it possible to increase the max size of VARBINARY?

              • 4. Re: Large size of Blob data in Cassandra
                shawkins

                The size limitation is for varbinary values created in Teiid; it doesn't extend to values read from the source. If the custom/blob values from Cassandra can be arbitrarily sized, though, then it probably makes sense to represent them as blobs instead; a sketch of that follows.
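                For illustration, here is a sketch of representing the value as a blob, using the JDK's SerialBlob (whether the translator would use exactly this type and helper is an assumption):

                import java.nio.ByteBuffer;
                import java.sql.Blob;
                import java.sql.SQLException;

                import javax.sql.rowset.serial.SerialBlob;

                // Sketch: wrap the column bytes in a java.sql.Blob so arbitrarily
                // sized values are not forced through the varbinary path.
                public final class BlobWrapping {

                    static Blob toBlob(ByteBuffer buffer) throws SQLException {
                        byte[] bytes = new byte[buffer.remaining()];
                        buffer.duplicate().get(bytes);
                        return new SerialBlob(bytes); // SerialBlob keeps the bytes in memory
                    }
                }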

                • 5. Re: Large size of Blob data in Cassandra
                  shawkins
                  • 6. Re: Large size of Blob data in Cassandra
                    haifen_bi

                    Great, looking forward to the fix.

                     

                    Thank you for all the help!

                    • 7. Re: Large size of Blob data in Cassandra
                      haifen_bi

                      Thanks for the quick fix!

                       

                      I have applied the fix manually and tested it. Reading blob data from Cassandra is working now. However, I am getting the following exception when inserting blob data:

                       

                      Internal Exception: org.teiid.jdbc.TeiidSQLException: TEIID30504 CASSLOG: Invalid STRING constant (javax.sql.rowset.serial.SerialBlob@7a3b044b) for data of type  blob

                      Error Code: 30504

                      Call: INSERT INTO "CASSLOG.applicationbinarydata" ("key", "TEIID_MULTI_DATA_SOURCE_COLUMN", "data") VALUES (?, ?, ?)  bind => [Test3, CASSLOG, javax.sql.rowset.serial.SerialBlob@7a3b044b]

                      Query: InsertObjectQuery({applicationbinarydata [[Test3, CASSLOG]: -520003258]})

                      Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid STRING constant (javax.sql.rowset.serial.SerialBlob@7a3b044b) for data of type blob

                              at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)

                              at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:259)

                              at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:175)

                              at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)

                              at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:36)

                              at org.teiid.resource.adapter.cassandra.CassandraConnectionImpl.executeQuery(CassandraConnectionImpl.java:88)

                      • 8. Re: Large size of Blob data in Cassandra
                        shawkins

                        Yes, the literal logic currently handles only the varbinary case and not blobs. That will require another commit to TEIID-3514; a sketch of the kind of handling needed follows.
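                        For reference, a minimal sketch assuming the DataStax 2.x driver: materialize the Blob's bytes and bind them as a ByteBuffer rather than letting the value be rendered into the CQL text, which is what triggers the "Invalid STRING constant" error. The helper itself is hypothetical; the table is the one from the schema above.

                        import java.nio.ByteBuffer;
                        import java.sql.Blob;
                        import java.sql.SQLException;

                        import com.datastax.driver.core.BoundStatement;
                        import com.datastax.driver.core.PreparedStatement;
                        import com.datastax.driver.core.Session;

                        // Hypothetical sketch: bind the blob bytes as a ByteBuffer so the
                        // driver sends them as a blob value, not a string literal.
                        public final class BlobInsert {

                            static void insert(Session session, String key, Blob data) throws SQLException {
                                byte[] bytes = data.getBytes(1, (int) data.length()); // Blob offsets are 1-based
                                PreparedStatement ps = session.prepare(
                                        "INSERT INTO testcass.applicationbinarydata (key, data) VALUES (?, ?)");
                                BoundStatement bound = ps.bind(key, ByteBuffer.wrap(bytes));
                                session.execute(bound);
                            }
                        }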