Overview
This article demonstrates how to use the HBase Translator with a Phoenix Data Source to access data in HBase. It covers:
- Prepare Sample Data
- Access HBase Data via Teiid Embedded
- Integrate HBase Data via dynamic VDB and Teiid Driver
- Model HBase Data with Teiid Designer
Prepare Sample Data
This section is a prerequisite for the sections that follow. It gives step-by-step procedures for setting up HBase with Phoenix, creating a table and loading sample data into HBase, and setting up the Phoenix data source.
Setup HBase
- Follow the HBase quickstart steps to install a single-node, standalone instance of HBase, for example
$ tar -xvf hbase-0.98.8-hadoop2-bin.tar.gz
$ cd hbase-0.98.8-hadoop2/
- Download Phoenix 4.x from the Phoenix Downloads Page and install it by copying phoenix-core.jar to the HBase lib directory, for example
$ tar -xvf phoenix-4.2.1-bin.tar.gz
$ cp phoenix-4.2.1-bin/phoenix-core-4.2.1.jar hbase-0.98.8-hadoop2/lib/
- Start HBase, connect to it via the shell, then create the table and put the sample data from customer_sample_data.txt, for example
$ ./bin/start-hbase.sh
$ ./bin/hbase shell
hbase(main):002:0> create 'Customer', 'customer', 'sales'
hbase(main):003:0> put 'Customer', '101', 'customer:name', 'John White'
...
The steps above create the Customer table in HBase. The table structure is as follows.
NOTE - The HBase Customer table has 2 column families, customer and sales. Each has 2 column qualifiers: name and city in customer, product and amount in sales.
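To make the family:qualifier layout concrete, the following is a minimal sketch of a helper that renders HBase shell put commands for one row. The class HBasePutCommands and its methods are hypothetical names introduced here for illustration; they are not part of HBase or of the sample files, and values containing single quotes are not escaped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of HBase): renders HBase shell `put` commands. */
final class HBasePutCommands {

    private HBasePutCommands() {}

    /** Formats one cell as a shell command: put 'table', 'row', 'family:qualifier', 'value'. */
    static String put(String table, String rowKey, String family, String qualifier, String value) {
        return String.format("put '%s', '%s', '%s:%s', '%s'", table, rowKey, family, qualifier, value);
    }

    /** Renders one `put` per cell; each key of `cells` must have the form "family:qualifier". */
    static List<String> putRow(String table, String rowKey, Map<String, String> cells) {
        List<String> commands = new ArrayList<>();
        for (Map.Entry<String, String> cell : cells.entrySet()) {
            // Split only on the first colon so the qualifier may itself contain colons.
            String[] column = cell.getKey().split(":", 2);
            commands.add(put(table, rowKey, column[0], column[1], cell.getValue()));
        }
        return commands;
    }
}
```

Calling HBasePutCommands.put("Customer", "101", "customer", "name", "John White") yields the same put command as the shell session above.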
Setup Phoenix Data Source
- Copy phoenix-[version]-client.jar and setup.cli to $JBOSS_HOME, for example
$ cd $JBOSS_HOME
$ cp setup.cli ./
$ cp .../phoenix-4.2.1-bin/phoenix-4.2.1-client.jar ./
- Execute the CLI commands to set up the Phoenix Data Source
$ ./bin/jboss-cli.sh --connect --file=setup.cli
- Use the Phoenix command line to execute customer-schema.sql, which maps the Customer table in HBase, for example
$ cd PHOENIX_HOME
$ ./bin/sqlline.py localhost .../src/teiidfiles/customer-schema.sql
NOTE - For more details about Phoenix Data Sources and mapping a Phoenix table to an existing HBase table, please refer to the Teiid documentation.
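The contents of customer-schema.sql are not shown in this article. As a rough sketch of what such a mapping might look like, the helper below builds a Phoenix CREATE TABLE statement whose double-quoted identifiers match the HBase table name, column families and qualifiers. The class PhoenixMappingDdl is hypothetical, and the row key column name (PK, borrowed from the queries later in this article) and the VARCHAR types are assumptions; the real script may differ.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: builds a Phoenix DDL statement that maps an existing HBase table. */
final class PhoenixMappingDdl {

    private PhoenixMappingDdl() {}

    /**
     * Builds a CREATE TABLE statement. Identifiers are double-quoted so that
     * Phoenix keeps their case and they match the names used in HBase.
     * Assumption: every column is mapped as VARCHAR.
     */
    static String createTable(String table, String rowKeyColumn, Map<String, List<String>> families) {
        StringJoiner columns = new StringJoiner(", ");
        // The HBase row key becomes the Phoenix primary key column.
        columns.add(quote(rowKeyColumn) + " VARCHAR PRIMARY KEY");
        for (Map.Entry<String, List<String>> family : families.entrySet()) {
            for (String qualifier : family.getValue()) {
                // Each HBase column is addressed as "family"."qualifier".
                columns.add(quote(family.getKey()) + "." + quote(qualifier) + " VARCHAR");
            }
        }
        return "CREATE TABLE IF NOT EXISTS " + quote(table) + " (" + columns + ")";
    }

    private static String quote(String identifier) {
        return "\"" + identifier + "\"";
    }
}
```

For the Customer table, the families map would hold customer with name and city, and sales with product and amount.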
Access HBase Data via Teiid Embedded
Teiid Embedded is a lightweight version of Teiid. It contains an easy-to-use JDBC driver that can embed the Query Engine in any Java application. Embedded mode supplies almost all Teiid features without a JEE container, which makes it a convenient way for users to integrate Teiid with their applications. The following steps show how to access HBase data via Teiid Embedded.
Setup bitronix Data Source
We use a bitronix Data Source in Embedded mode. Below is sample code for setting one up with a given JNDI name.
pds = new PoolingDataSource();
pds.setUniqueName(jndiName);
pds.setClassName("bitronix.tm.resource.jdbc.lrc.LrcXADataSource");
pds.setMaxPoolSize(5);
pds.setAllowLocalTransactions(true);
pds.getDriverProperties().put("user", "");
pds.getDriverProperties().put("password", "");
pds.getDriverProperties().put("url", "jdbc:phoenix:127.0.0.1:2181");
pds.getDriverProperties().put("driverClassName", "org.apache.phoenix.jdbc.PhoenixDriver");
pds.init();
Setup Teiid EmbeddedServer
The following code snippet shows how to set up the Teiid EmbeddedServer and a JDBC connection
EmbeddedServer server = new EmbeddedServer();
HBaseExecutionFactory executionFactory = new HBaseExecutionFactory();
executionFactory.start();
server.addTranslator("translator-hbase", executionFactory);
EmbeddedConfiguration config = new EmbeddedConfiguration();
config.setTransactionManager(SimpleMock.createSimpleMock(TransactionManager.class));
server.start(config);
server.deployVDB(new FileInputStream(new File("src/test/resources/hbase-vdb.xml")));
conn = server.getDriver().connect("jdbc:teiid:hbasevdb", null);
The following code executes JDBC queries with the Teiid Embedded Server and connection
TestHBaseUtil.executeQuery(conn, "SELECT * FROM Customer");
TestHBaseUtil.executeQuery(conn, "SELECT city, amount FROM Customer");
TestHBaseUtil.executeQuery(conn, "SELECT DISTINCT city FROM Customer");
TestHBaseUtil.executeQuery(conn, "SELECT city, amount FROM Customer WHERE PK='105'");
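The implementation of TestHBaseUtil.executeQuery is not shown here. As a minimal sketch of what such a helper could look like, the class below runs a query on any JDBC connection and prints the column labels and rows. QueryPrinter and its methods are hypothetical names that rely only on the standard java.sql API; this is not the actual test utility.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a query helper: prints the result of a JDBC query. */
final class QueryPrinter {

    private QueryPrinter() {}

    /** Joins the values of one row with ", "; null values are printed as "null". */
    static String formatRow(List<?> values) {
        List<String> cells = new ArrayList<>();
        for (Object value : values) {
            cells.add(String.valueOf(value));
        }
        return String.join(", ", cells);
    }

    /** Executes the query, prints a header line and every row, and returns the row count. */
    static int executeQuery(Connection conn, String sql) throws SQLException {
        // try-with-resources closes the statement and result set even on failure.
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            int columnCount = meta.getColumnCount();

            List<Object> header = new ArrayList<>();
            for (int i = 1; i <= columnCount; i++) {   // JDBC columns are 1-based
                header.add(meta.getColumnLabel(i));
            }
            System.out.println(formatRow(header));

            int rows = 0;
            while (rs.next()) {
                List<Object> values = new ArrayList<>();
                for (int i = 1; i <= columnCount; i++) {
                    values.add(rs.getObject(i));
                }
                System.out.println(formatRow(values));
                rows++;
            }
            return rows;
        }
    }
}
```

With the embedded connection from the previous step, QueryPrinter.executeQuery(conn, "SELECT * FROM Customer") would print every row of the Customer table.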
NOTE - For the complete setup, including Maven dependencies, VDB, etc., please refer to the unit test code TestHBaseExecution.
Integrate HBase Data via dynamic VDB and Teiid Driver
In this section we show how to integrate HBase data via a dynamic VDB and the Teiid Driver. The complete architecture looks like this:
The figure above has 4 parts:
- HBase - As in the sections above, the Customer table exists in HBase and holds the sample data
- hbasevdb - A Virtual Database named hbasevdb. It defines the logical schema model that integrates HBase
- JVM - The JBoss server runs on the JVM. It is the container for the VDB and provides the interfaces (such as JDBC and REST) for accessing data
- User Application - The application that wants to access HBase data via Teiid
The following steps show how a Java application accesses HBase data via the Teiid Driver and JDBC.
Start the server
Open a command line and navigate to the "bin" directory under the root directory of the JBoss server
For Linux: ./standalone.sh -c standalone-teiid.xml
For Windows: standalone.bat -c standalone-teiid.xml
Deploy VDB
Copy the following 2 files to the "/standalone/deployments" directory
hbase-vdb.xml
hbase-vdb.xml.dodeploy
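The copy step can also be scripted. The sketch below copies a VDB file into a deployments directory and creates the matching empty .dodeploy marker file next to it. VdbDeployer is a hypothetical helper written for this article, not a Teiid or JBoss API, and it assumes the deployments directory already exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies a dynamic VDB into a deployments directory. */
final class VdbDeployer {

    private VdbDeployer() {}

    /**
     * Copies the VDB file into the deployments directory and creates an empty
     * "<file name>.dodeploy" marker beside it. Returns the path of the marker.
     */
    static Path deploy(Path vdbFile, Path deploymentsDir) throws IOException {
        Path target = deploymentsDir.resolve(vdbFile.getFileName().toString());
        // Overwrite a previously deployed copy of the same VDB, if any.
        Files.copy(vdbFile, target, StandardCopyOption.REPLACE_EXISTING);

        Path marker = deploymentsDir.resolve(target.getFileName() + ".dodeploy");
        Files.write(marker, new byte[0]);   // creates (or truncates) the empty marker file
        return marker;
    }
}
```

A call such as VdbDeployer.deploy(Paths.get("hbase-vdb.xml"), deploymentsDir) would produce the two files listed above, where deploymentsDir points at the standalone/deployments directory of your server.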
JDBC Client
Following the Teiid JDBC Support document, use the parameters below to create a JDBC Connection
- JDBC_DRIVER = "org.teiid.jdbc.TeiidDriver"
- JDBC_URL = "jdbc:teiid:hbasevdb@mm://localhost:31000;version=1"
- JDBC_USER = "user"
- JDBC_PASS = "password"
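A minimal sketch of a client that uses these parameters is shown below. TeiidJdbcClient and buildUrl are hypothetical names; only the driver class, URL, user and password come from the list above. Actually opening the connection assumes the Teiid JDBC driver jar is on the classpath and the server is running.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical client: opens a JDBC connection to the hbasevdb VDB. */
final class TeiidJdbcClient {

    static final String JDBC_DRIVER = "org.teiid.jdbc.TeiidDriver";
    static final String JDBC_USER = "user";
    static final String JDBC_PASS = "password";

    private TeiidJdbcClient() {}

    /** Assembles a URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>;version=<n>. */
    static String buildUrl(String vdb, String host, int port, int version) {
        return "jdbc:teiid:" + vdb + "@mm://" + host + ":" + port + ";version=" + version;
    }

    /** Loads the Teiid driver and connects with the parameters listed above. */
    static Connection connect() throws ClassNotFoundException, SQLException {
        Class.forName(JDBC_DRIVER);   // fails if the Teiid driver jar is missing
        String url = buildUrl("hbasevdb", "localhost", 31000, 1);
        return DriverManager.getConnection(url, JDBC_USER, JDBC_PASS);
    }
}
```

The caller is responsible for closing the returned connection, for example with try-with-resources.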
Executing the following SQL statements with the JDBC Connection created above extracts the Customer table data from HBase.
SELECT * FROM Customer
SELECT city, amount FROM Customer
SELECT DISTINCT city FROM Customer
SELECT city, amount FROM Customer WHERE PK='105'
SELECT * FROM Customer WHERE PK BETWEEN '105' AND '108'
SELECT * FROM Customer WHERE PK='105' AND name='John White'
SELECT * FROM Customer ORDER BY PK
SELECT * FROM Customer ORDER BY name, city DESC
SELECT name, city, COUNT(PK) FROM Customer GROUP BY name, city
SELECT name, city, COUNT(PK) FROM Customer GROUP BY name, city HAVING COUNT(PK) > 1
Model HBase Data with Teiid Designer
In this section we show how to model the HBase Customer table with Teiid Designer.
Define Teiid Model Project
Define a Teiid Model Project named HBaseCustomerExample with a sources folder in the project root, as in the figure below.
Create JDBC Connection
Select Generic JDBC and create a Connection Profile with the Phoenix JDBC URL and driver, as in the figure below
Create Source model for JDBC data source
Import a JDBC data source as a Source Model using the Connection Profile created above and the default JDBC Metadata Processor, as in the figure below
Click Next. In the Select Database Metadata wizard, select TABLE, as in the figure below
Click Next. In the Select Database Objects wizard, select the Customer table, as in the figure below
Click Next, define the Model name and Save folder, then click Finish. The Source Model is created and looks like the figure below
Preview Data
All execution capabilities in Teiid Designer (Preview Data, VDB execution) require a connection to a running JBoss Data Virtualization Server. Make sure your server is started, then select the Customer table and click Preview Data. The HBase data is shown in a dialog, as below
Define and execute VDB
The Define VDB action allows you to create a VDB artifact for deployment to a JBoss Data Virtualization Server. The Define VDB dialog is shown below
The Execute VDB action allows you to execute the VDB. As with Preview Data above, the HBase data is shown.