Starting with the 8.12 release, Teiid supports Actian Vector in Hadoop. Actian Vector is a data store based on HDFS that uses vector processing and columnar storage and provides full ANSI SQL support for accessing the data. For more information about Actian Vector in Hadoop, see "Actian Vector - Agile SMP Analytics Database" on the Actian site. Note that the same database is also bundled in their Vortex platform; see "Actian Vortex - High Performance SQL Analytics in Hadoop".
In this article, I will show how you can integrate the data in Actian Vector in Hadoop with Teiid. For this purpose, Teiid 8.12 added a translator for Vector in Hadoop.
- I downloaded Teiid 8.12 or later and followed the installation procedure in "Installation Guide - Teiid 8.12 (draft)".
- Vector in Hadoop works with any Hadoop distribution. I chose the Hortonworks Sandbox as the base platform and downloaded its VirtualBox image from http://hortonworks.com/products/hortonworks-sandbox/#install. I installed version 2.2 because Actian recommended it at the time. Double-check the current recommendation, as it may have changed with later releases of their software.
- Start the Hortonworks Sandbox and download the Actian Vector in Hadoop bits onto it. For this exercise, I downloaded Actian Vortex Express, as it is the free version. The software comes as a TAR file; to install it, follow the directions in "Actian Analytics Platform - Express Hadoop SQL Edition 2.0".
- Note that when I ran "./express_install.sh" as described there, it did not work correctly. To make it work, I had to create an "actian" user in an "actian" group, add this user to the sudoers, log in to the sandbox as this user, and then run "sudo ./express_install.sh" to install the software successfully.
- Then, using the guides http://esd.actian.com/Express/AH_Tutorial_HSE_2.0.pdf and http://esd.actian.com/Express/AH_QuickStart_HSE_2.0.pdf, I installed the "demo" quick start example with the "director" and "knime" applications.
- Open port 16967 for forwarding on the Hortonworks Sandbox; otherwise, just make sure that the Hortonworks VM is reachable from the host machine where Teiid is running. You can use the "ping" command to check.
- Download the JDBC driver for Vector in Hadoop from Actian's ESD (Electronic Software Distribution) site.
- Using SQuirreL, validate that you can connect to the Vector in Hadoop instance on the Hortonworks Sandbox VM.
- Follow the directions in <jboss-eap>/docs/teiid/datasources/actian-vector/readme.txt to set up the data source in JBoss EAP for the Vector in Hadoop instance installed on the Hortonworks platform.
- Deploy the following Dynamic VDB:
<vdb name="actian" version="1">
    <model visible="true" name="demo">
        <source name="demo" translator-name="actian-vector" connection-jndi-name="java:jboss/datasources/vectorDS"/>
    </model>
</vdb>
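Before connecting to the VDB, it can help to confirm that the ports involved are actually reachable, since "ping" only shows that the VM answers, not that port 16967 is forwarded. The following is a minimal Java sketch and not part of the Teiid or Actian tooling. The class name PortCheck and the host arguments are made up for illustration, and 31000 is assumed to be the Teiid JDBC port (the default, unless it was changed in your installation).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not shipped with Teiid or Actian.
class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: both hosts default to localhost (e.g. VirtualBox port forwarding).
        String vectorHost = args.length > 0 ? args[0] : "localhost";
        String teiidHost = args.length > 1 ? args[1] : "localhost";

        // 16967 is the Vector in Hadoop port mentioned in the steps above.
        System.out.println("Vector " + vectorHost + ":16967 reachable: "
                + isReachable(vectorHost, 16967, 2000));
        // 31000 is assumed to be the Teiid JDBC port (default setting).
        System.out.println("Teiid " + teiidHost + ":31000 reachable: "
                + isReachable(teiidHost, 31000, 2000));
    }
}
```

With Java 11 or later this can be run directly as "java PortCheck.java <sandbox-host> <teiid-host>". A "false" for port 16967 usually points at the port forwarding on the VM rather than at Teiid.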
Then, using a JDBC client such as SQuirreL, connect to the VDB above with the Teiid JDBC driver and issue SQL queries against it.
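If you would rather test from code than from a GUI client, the same connection can be sketched in plain JDBC. This is only an illustration under a few assumptions: Teiid listens on localhost at its default JDBC port 31000, the credentials "user"/"user" are placeholders for whatever Teiid user you configured, and the class name TeiidVdbClient is made up. Because I do not want to guess the table names in the "demo" quick start, the sketch lists the tables of the "demo" model through the standard JDBC metadata API instead of running a specific query.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical client for illustration; requires the Teiid JDBC driver jar on the classpath.
class TeiidVdbClient {

    /** Builds a Teiid JDBC URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>. */
    static String buildUrl(String vdb, String host, int port) {
        return "jdbc:teiid:" + vdb + "@mm://" + host + ":" + port;
    }

    public static void main(String[] args) {
        // "actian" is the VDB name from the Dynamic VDB above; host and port are assumptions.
        String url = buildUrl("actian", "localhost", 31000);
        String user = "user";      // placeholder, replace with your Teiid user
        String password = "user";  // placeholder, replace with your Teiid password

        try (Connection conn = DriverManager.getConnection(url, user, password);
             // "demo" is the model name from the VDB; null means "all table types".
             ResultSet tables = conn.getMetaData().getTables(null, "demo", "%", null)) {
            while (tables.next()) {
                System.out.println(tables.getString("TABLE_SCHEM") + "."
                        + tables.getString("TABLE_NAME"));
            }
        } catch (SQLException e) {
            // Typical causes: driver jar missing, VDB not deployed, wrong credentials.
            System.err.println("Could not query " + url + ": " + e.getMessage());
        }
    }
}
```

Once the table names are known, any of them can be queried with an ordinary Statement over the same connection, exactly as you would from SQuirreL.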
When using Teiid Designer, all the installation and Teiid configuration steps are the same, but to build the VDB, use the JDBC importer to read the metadata from Vector in Hadoop and then build the rest of the VDB. Make sure you use "actian-vector" as the translator name for the source model. Once the VDB is deployed, you access it the same way as the Dynamic VDB.