5 Replies Latest reply on Jun 27, 2019 9:10 AM by rareddy

    Bigquery connection taking too much time while deploying vdb

    pranitag

      Bigquery connection taking too much time while deploying vdb:

       

      We are using 10.2.1 version of teiid and to connect to Big Query we are using Simba driver and successfully able to connect to Bigquery:

      Using jdbc-ansi translator.

      We are using org.teiid.adminapi.jboss.AdminFactory.AdminImpl to deploy VDB.

       

      Datasource details:

        <datasource jndi-name="java:/DSBIGQUERY20190510143305716" pool-name="DSBIGQUERY20190510143305716">

                          <connection-url>jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=abc;OAuthType=0;OAuthServiceAcctEmail=abc@project.iam.gserviceaccount.com;;OAuthPvtKeyPath=C:\Softwares\BigQueryChangesTeiid\teiid-10.2.1\standalone\abc.json;datasetId=abc_dataset;OAuthServiceAcctEmail=abc@project.iam.gserviceaccount.com;;OAuthPvtKeyPath=C:\Softwares\BigQueryChangesTeiid\teiid-10.2.1\standalone\abc.json;datasetId=abc_dataset;</connection-url>

                          <driver>simba</driver>

                          <pool>

                              <max-pool-size>20</max-pool-size>

                          </pool>

                      </datasource>

       

      If there is large amount of data present on bigquery it is taking more than 10 minutes to deploy the VDB.

       

      And even if we are providing 'datasetId' in connection url it is fetching metadata of all datasets present under given projectID and service account email.I think which might causing delay while deploying VDB

       

      Could you please help me with this. As it is taking too much time and very annoying every time.

        • 1. Re: Bigquery connection taking too much time while deploying vdb
          rareddy

          The issue is on the deployment of the VDB, if any model/schemas defined without metadata, then the system will download the metadata from source using the using JDBC Metadata. This can take arbitrary amount of time based on type of source and type of import properties (for filtering) that are defined. If you want to avoid doing this then define the medata on the model. For example see the metadata with DDL fragment here

           

          <vdb name="reuse" version="1">

              <model visible="true" type="VIRTUAL" name="sales">

                  <source name="west" translator-name="jdbc-ansi"  connection-jndi-name="java://bigQueryDS">

                   <metadata type = "DDL"><![CDATA[

                      CREATE FOREIGN TABLE Customer (

                          id integer PRIMARY KEY,

                          firstname varchar(25),

                          lastname varchar(25),

                          dob timestamp

                      );

                   ]]>

                   </metadata>

              </model>

          </vdb>

          • 2. Re: Bigquery connection taking too much time while deploying vdb
            pranitag

            Thanks Ramesh,

            I will try define model in deployment.

            But I have concern for other Databases like Mysql,MSSQL,Flat File we are not defining metadata DDL with model in deployment and it is not taking this much time. This issue is only with Bigquery using Simba driver.

             

            Below is the vdb format we are using for Big query connection:

            <vdb name="bq" version="1"><description>VDB for: bq, Version: 1</description><connection-type>BY_VERSION</connection-type><model name="bq" type="PHYSICAL" visible="true"><property name="importer.tableTypes" value="TABLE"></property><property name="importer.importKeys" value="false"></property><source name="bq" translator-name="jdbc-ansi" connection-jndi-name="java:/DSBIGQUERY"></source></model></vdb>

             

            And for reference below is the format for mysql connection:

            <vdb name="mysqlTest" version="1"><description>VDB for: mysqlTest, Version: 1</description><connection-type>BY_VERSION</connection-type><model name="mysqlTest" type="PHYSICAL" visible="true"><property name="importer.tableTypes" value="TABLE"></property><property name="importer.importKeys" value="false"></property><source name="mysqlTest" translator-name="customtranslator" connection-jndi-name="java:/DSMYSQL"></source></model><model name="mysqlTestNative" type="PHYSICAL" visible="true"><source name="mysqlTestNative" translator-name="mysql-native" connection-jndi-name="java:/DSMYSQL"></source></model><translator name="mysql-native" type="customtranslator"><property name="SupportsDirectQueryProcedure" value="true"></property></translator></vdb>

             

            Please suggest.

            • 3. Re: Bigquery connection taking too much time while deploying vdb
              rareddy

              When we write explicit translators for any source we do try to optimize it. Second, I am not sure how Simba's driver's metadata facilities are like? Or this could be entirely something else. Try putting the metadata in that just one model and see if that helps.

              • 4. Re: Bigquery connection taking too much time while deploying vdb
                pranitag

                Thanks Ramesh,

                After adding Metadata in VDB, deployment issue has been resolved.

                 

                I need one more favor for BigQuery connection :

                While connecting to Bigquery using simba driver we are using Google Service Account  for authentication

                As given in https://www.simba.com/products/BigQuery/doc/JDBC_InstallGuide/content/jdbc/bq/authenticating/serviceaccount.htm

                 

                So we need to provide OAuthServiceAcctEmail and OAuthPvtKeyPath as below in connection URL:

                 

                jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=abc;OAuthType=0;OAuthServiceAcctEmail=abc@project.iam.gserviceaccount.com;;OAuthPvtKeyPath=C:\Softwares\BigQueryChangesTeiid\teiid-10.2.1\standalone\abc.json;datasetId=abc_dataset;OAuthServiceAcctEmail=abc@project.iam.gserviceaccount.com;;OAuthPvtKeyPath=C:\Softwares\BigQueryChangesTeiid\teiid-10.2.1\standalone\abc.json;datasetId=abc_dataset;

                 

                 

                And OAuthPvtKeyPath should be kept on server where teiid has been deployed.

                In our case, our services which are consuming teiid services are deployed on different server and teiid is kept on different server.

                So if we need to provide dynamic OAuthPvtKeyPath and OAuthServiceAcctEmail each time we ned to first upload OAuthPvtKey json file on teiid server.

                 

                Is there any way  provide remote file location while generating connection URL?

                 

                Thanks in advance,

                Pranita Gurav

                • 5. Re: Bigquery connection taking too much time while deploying vdb
                  rareddy

                  Can you not use the same file name location always even the contents change?