Teiid on top of Apache HBase

Version 9

    Overview

    This article demonstrates how to use the HBase Translator with a Phoenix Data Source to access data in HBase. It covers the following topics:

    • Prepare Sample Data
    • Access HBase Data via Teiid Embedded
    • Integrate HBase Data via dynamic VDB and Teiid Driver
    • Model HBase Data with Teiid Designer

    Prepare Sample Data

    This section is a prerequisite for the sections that follow. It gives step-by-step procedures for setting up HBase with Phoenix, creating a table and putting sample data in HBase, and setting up the Phoenix data source.

    Setup HBase

    • Unpack the HBase distribution, for example
    $ tar -xvf hbase-0.98.8-hadoop2-bin.tar.gz
    $ cd hbase-0.98.8-hadoop2/
    • Download Phoenix 4.x from the Phoenix Downloads Page and install it by copying phoenix-core.jar to the HBase lib directory, for example
    $ tar -xvf phoenix-4.2.1-bin.tar.gz
    $ cp phoenix-4.2.1-bin/phoenix-core-4.2.1.jar hbase-0.98.8-hadoop2/lib/
    • Start HBase, then use the HBase shell to create the Customer table and put sample data in it, for example
    $ ./bin/start-hbase.sh
    $ ./bin/hbase shell
    hbase(main):002:0> create 'Customer', 'customer', 'sales'
    hbase(main):003:0> put 'Customer', '101', 'customer:name', 'John White'
    ...

    The steps above create the Customer table in HBase. The table structure looks like this:

    hbase-table-1.png

    NOTE - The HBase Customer table has 2 column families, customer and sales. Each has 2 column qualifiers: name and city in customer, product and amount in sales.
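
The layout described in the note can be pictured as a nested map: row key, then column family, then column qualifier, then value. The following is a minimal Java sketch of that idea. It uses only the single put shown in the shell session above as data. The class HBaseLayoutSketch and its methods are hypothetical names for illustration and are not part of the HBase client API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy in-memory picture of an HBase table: row key -> family -> qualifier -> value. */
final class HBaseLayoutSketch {
    private final Map<String, Map<String, Map<String, String>>> rows = new LinkedHashMap<>();

    /** Mirrors the shell form: put 'Customer', rowKey, 'family:qualifier', value. */
    public void put(String rowKey, String column, String value) {
        String[] parts = column.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected family:qualifier, got " + column);
        }
        rows.computeIfAbsent(rowKey, k -> new LinkedHashMap<>())
            .computeIfAbsent(parts[0], k -> new LinkedHashMap<>())
            .put(parts[1], value);
    }

    /** Returns the cell value, or null when the row, family, or qualifier is absent. */
    public String get(String rowKey, String family, String qualifier) {
        Map<String, Map<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return null;
        }
        Map<String, String> qualifiers = families.get(family);
        return qualifiers == null ? null : qualifiers.get(qualifier);
    }

    public static void main(String[] args) {
        HBaseLayoutSketch customer = new HBaseLayoutSketch();
        // Same cell as: put 'Customer', '101', 'customer:name', 'John White'
        customer.put("101", "customer:name", "John White");
        System.out.println(customer.get("101", "customer", "name"));
        // A cell that was never written is simply absent (null here).
        System.out.println(customer.get("101", "sales", "amount"));
    }
}
```

The point of the sketch is that a row does not need a value for every qualifier, which is why the Phoenix mapping in the next step has to name each family and qualifier explicitly.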

    Setup Phoenix Data Source

    • Copy phoenix-[version]-client.jar and setup.cli to $JBOSS_HOME, for example
    $ cd $JBOSS_HOME
    $ cp setup.cli ./
    $ cp .../phoenix-4.2.1-bin/phoenix-4.2.1-client.jar ./
    • Execute the CLI commands to set up the Phoenix Data Source
    $ ./bin/jboss-cli.sh --connect --file=setup.cli
    • Use the Phoenix command line to execute customer-schema.sql, which maps the Customer table in HBase, for example
    $ cd $PHOENIX_HOME
    $ ./bin/sqlline.py localhost .../src/teiidfiles/customer-schema.sql

    NOTE - For more details about Phoenix Data Sources and mapping a Phoenix table to an existing HBase table, please refer to the Teiid documentation.
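
The content of customer-schema.sql is not reproduced in this article. As a rough sketch of what such a mapping can look like, the Java snippet below builds a Phoenix CREATE VIEW statement over an existing HBase table. The row key column name PK is taken from the queries used later in this article. The VARCHAR types, the choice of a view rather than a table, and the class and method names are assumptions for illustration, not the actual file content.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that renders a Phoenix mapping statement for an existing HBase table. */
final class PhoenixMappingSketch {

    /**
     * Builds a CREATE VIEW statement. Each entry in familyQualifierPairs has the
     * form "family:qualifier". Names are double-quoted so Phoenix keeps their case.
     */
    static String mappingDdl(String table, String rowKeyColumn, String... familyQualifierPairs) {
        List<String> columns = new ArrayList<>();
        // Assumption: the HBase row key is exposed as a VARCHAR primary key.
        columns.add(rowKeyColumn + " VARCHAR PRIMARY KEY");
        for (String pair : familyQualifierPairs) {
            String[] parts = pair.split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("expected family:qualifier, got " + pair);
            }
            columns.add("\"" + parts[0] + "\".\"" + parts[1] + "\" VARCHAR");
        }
        return "CREATE VIEW \"" + table + "\" (" + String.join(", ", columns) + ")";
    }

    public static void main(String[] args) {
        // Families and qualifiers as described for the Customer table.
        System.out.println(mappingDdl("Customer", "PK",
                "customer:name", "customer:city", "sales:product", "sales:amount"));
    }
}
```

The generated statement could then be placed in a .sql file and run through sqlline.py as in the step above. Consult the Teiid and Phoenix documentation for the mapping the example actually uses.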

    Access HBase Data via Teiid Embedded

    Teiid Embedded is a lightweight version of Teiid. It contains an easy-to-use JDBC driver that can embed the query engine in any Java application. Embedded mode supplies almost all Teiid features without a JEE container, which makes it convenient for users who want to integrate Teiid with their own application. The following steps show how to access HBase data via Teiid Embedded.

    Setup bitronix Data Source

    We use a bitronix data source in Embedded mode. Below is sample code for setting one up under a given JNDI name.

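    // Requires: import bitronix.tm.resource.jdbc.PoolingDataSource;
    // Pooled data source registered under the given JNDI name
    PoolingDataSource pds = new PoolingDataSource();
    pds.setUniqueName(jndiName);
    // LrcXADataSource wraps a plain (non-XA) JDBC driver
    pds.setClassName("bitronix.tm.resource.jdbc.lrc.LrcXADataSource");
    pds.setMaxPoolSize(5);
    pds.setAllowLocalTransactions(true);
    pds.getDriverProperties().put("user", "");
    pds.getDriverProperties().put("password", "");
    // The Phoenix URL points at the ZooKeeper host and port used by HBase
    pds.getDriverProperties().put("url", "jdbc:phoenix:127.0.0.1:2181");
    pds.getDriverProperties().put("driverClassName", "org.apache.phoenix.jdbc.PhoenixDriver");
    pds.init();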

    Setup Teiid EmbeddedServer

    The following code snippet shows how to set up the Teiid EmbeddedServer and open a JDBC connection

    EmbeddedServer server = new EmbeddedServer();
    // Register the HBase translator under the name referenced by the VDB
    HBaseExecutionFactory executionFactory = new HBaseExecutionFactory();
    executionFactory.start();
    server.addTranslator("translator-hbase", executionFactory);
    EmbeddedConfiguration config = new EmbeddedConfiguration();
    // A mock TransactionManager is enough here because no real transactions are needed
    config.setTransactionManager(SimpleMock.createSimpleMock(TransactionManager.class));
    server.start(config);
    // Deploy the dynamic VDB, then connect to it by VDB name
    server.deployVDB(new FileInputStream(new File("src/test/resources/hbase-vdb.xml")));
    Connection conn = server.getDriver().connect("jdbc:teiid:hbasevdb", null);

    The following code executes JDBC queries through the Teiid Embedded server connection

    TestHBaseUtil.executeQuery(conn, "SELECT * FROM Customer");
    TestHBaseUtil.executeQuery(conn, "SELECT city, amount FROM Customer");
    TestHBaseUtil.executeQuery(conn, "SELECT DISTINCT city FROM Customer");
    TestHBaseUtil.executeQuery(conn, "SELECT city, amount FROM Customer WHERE PK='105'");

    NOTE - For the complete setup, including Maven dependencies, the VDB, etc., please refer to the unit test code TestHBaseExecution.

    Integrate HBase Data via dynamic VDB and Teiid Driver

    This section shows how to integrate HBase data via a dynamic VDB and the Teiid Driver. The complete architecture looks like this:

    Untitled.png

    The figure above has 4 parts:

    • HBase - As in the previous sections, the Customer table exists in HBase and holds the sample data
    • hbasevdb - A Virtual Database (VDB) that defines the logical schema model over HBase; hbasevdb is the VDB name
    • JVM - The JBoss server runs on the JVM. It is the container for the VDB and provides the interfaces (such as JDBC and REST) for accessing data
    • User Application - The application that wants to access HBase data via Teiid

    The following steps show how a Java application accesses HBase data via the Teiid Driver and JDBC.

    Start the server

    Open a command line and navigate to the "bin" directory under the root directory of the JBoss server

    For Linux: ./standalone.sh -c standalone-teiid.xml
    For Windows: standalone.bat -c standalone-teiid.xml

    Deploy VDB

    Copy the following 2 files to the "standalone/deployments" directory under the root directory of the JBoss server

    hbase-vdb.xml
    hbase-vdb.xml.dodeploy

    JDBC Client

    Following the Teiid JDBC Support document, use the following parameters to create a JDBC connection

    • JDBC_DRIVER = "org.teiid.jdbc.TeiidDriver"
    • JDBC_URL = "jdbc:teiid:hbasevdb@mm://localhost:31000;version=1"
    • JDBC_USER = "user"
    • JDBC_PASS = "password"

    Executing the following SQL statements with the JDBC connection created above retrieves the Customer table data from HBase.

    SELECT * FROM Customer
    SELECT city, amount FROM Customer
    SELECT DISTINCT city FROM Customer
    SELECT city, amount FROM Customer WHERE PK='105'
    SELECT * FROM Customer WHERE PK BETWEEN '105' AND '108'
    SELECT * FROM Customer WHERE PK='105' AND name='John White'
    SELECT * FROM Customer ORDER BY PK
    SELECT * FROM Customer ORDER BY name, city DESC
    SELECT name, city, COUNT(PK) FROM Customer GROUP BY name, city
    SELECT name, city, COUNT(PK) FROM Customer GROUP BY name, city HAVING COUNT(PK) > 1
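
The connection parameters and queries above can be combined into a small JDBC client. The following is a minimal sketch that uses only java.sql from the JDK. It assumes the Teiid client JAR is on the classpath and the server and VDB from the previous steps are running. The class name TeiidJdbcClientSketch and the helpers jdbcUrl and printQuery are hypothetical names introduced for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

final class TeiidJdbcClientSketch {
    static final String JDBC_DRIVER = "org.teiid.jdbc.TeiidDriver";
    static final String JDBC_USER = "user";
    static final String JDBC_PASS = "password";

    /** Assembles a URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>;version=<n>. */
    static String jdbcUrl(String vdb, String host, int port, int version) {
        return "jdbc:teiid:" + vdb + "@mm://" + host + ":" + port + ";version=" + version;
    }

    /** Runs one query and prints every row as label=value pairs. */
    static void printQuery(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            int columnCount = meta.getColumnCount();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columnCount; i++) {
                    if (i > 1) {
                        row.append(", ");
                    }
                    row.append(meta.getColumnLabel(i)).append('=').append(rs.getString(i));
                }
                System.out.println(row);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(JDBC_DRIVER);
        } catch (ClassNotFoundException e) {
            // Without the Teiid client JAR there is nothing to connect with.
            System.err.println("Teiid JDBC driver not found on the classpath");
            return;
        }
        String url = jdbcUrl("hbasevdb", "localhost", 31000, 1);
        try (Connection conn = DriverManager.getConnection(url, JDBC_USER, JDBC_PASS)) {
            printQuery(conn, "SELECT * FROM Customer");
            printQuery(conn, "SELECT city, amount FROM Customer WHERE PK='105'");
        }
    }
}
```

Any of the other statements listed above can be passed to printQuery in the same way.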

    Model HBase Data with Teiid Designer

    This section shows how to model the HBase Customer table with Teiid Designer.

    Define Teiid Model Project

    Define a Teiid Model Project named HBaseCustomerExample with a sources folder in the project root, as shown in the figure below.

    hbasecustomer-create-project.png

    Create JDBC Connection

    Select Generic JDBC and create a Connection Profile with the Phoenix JDBC URL and driver, as shown in the figure below

    hbasecustomer-connection.png

    Create Source model for JDBC data source

    Import a JDBC data source as a Source Model, using the Connection Profile created above and the default JDBC Metadata Processor, as shown in the figure below

    hbasecustomer-import-1.png

    Click Next. In the Select Database Metadata wizard page, select TABLE, as shown in the figure below

    hbasecustomer-import-2.png

    Click Next. In the Select Database Objects wizard page, select the Customer table, as shown in the figure below

    hbasecustomer-import-3.png

    Click Next, define the Model name and Save folder, then click Finish. The Source Model is created and looks like the figure below

    hbasecustomer-import-4.png

    Preview Data

    All execution capabilities in Teiid Designer (Preview Data, VDB execution) require a connection to a running JBoss Data Virtualization Server. Make sure your server is started, then select the Customer table and click Preview Data. The HBase data is shown in a dialog, as below

    hbasecustomer-preview.png

    Define and execute VDB

    The Define VDB action allows you to create a VDB artifact for deployment to a JBoss Data Virtualization Server. The Define VDB dialog is shown below

    hbasecustomer-vdb.png

    The Execute VDB action allows you to execute the VDB. As with Preview Data above, the HBase data is shown in the result.