In this article, I will walk through an example, where I will show to reading Microsoft Excel document into a relational table using Teiid. Then issue SQL queries to Teiid, to get the contents of Excel document. Once the search results are available in Teiid, you can further integrate/enhance this data with other data sources. The prerequisites for this example are to have some familiarity with Teiid. For the purposes of this article I am using Teiid 8.7.CR1 (or later).
Teiid uses Apache POI as underlying framework to access the Excel documents, so this works on all the platforms. Previous versions supported reading Excel documents in Teiid, however they required user to be on Windows platform and use "Excel" ODBC driver to access the document, then Teiid used "jdbc-odbc" bridge along with "jdbc" translator to access the contents. Although, it worked all these years, there was always been an issue about its workings and unpredictable behavior and un-common metadata definitions thus resulted in confusing set of queries. This old option is till available, however this new implementation tries to overcome all the above issues and provide a simple and easy way to access Excel documents.
You can find more documentation about this translator in Teiid documentation at Microsoft Excel Translator
For the purpose of this example, I am using sample Excel document called "names.xlsx". Data inside the document looks like below. A copy of this file is also attached to this article.
To read this Excel file in Teiid, a VDB needs to be created, which contains a source model that uses the "excel" translator. In Teiid there are two different ways you can create VDB. Dynamic VDB or Designer VDB. In Step 1, you only need to do create either Option 1 or Option 2
Step 1: Building a VDB
Option 1: Using Teiid Designer (Skip to Option 2 for Dynamic VDB)
- Install Teiid 8.7.CR1 (or Later)
- Install Teiid Designer 8.5 or later
- Start Teiid Designer, switch to "Teiid Designer" perspective.
- Configure the Teiid Server, make sure you have made connection to Teiid Server.
- Create a new "Teiid Model Project" and name it "ExcelDemo".
- The demo project is attached.
Build Excel Source Model
- Choose "File -> import", then choose "Teiid Designer/Teiid Connection >> Source Model".
- Next ..
- On the "Data the Source to use for Import" dialog box, click "New..", on the Create Data Source dialog should look like
- OK, then "Next >"
- In this dialog box make sure your "translator" is set to "excel". Set the import property 'Excel File' to 'names.xlsx' and 'Header Row Number' to '1' as shown in below image
- Click "Next>", then you will see the page below. Enter 'excel' for the Target Model Name.
- Click "Next>" and you will see the next page (shown below). This is the DDL for the model table 'Sheet1' to be generated.
- Click 'Next' to see the items to be created. Then click "Finish" to complete the import.
- When finished, you will see the table below in ModelExplorer.
- Right click on project name -> New -> Teiid VDB, name it "exceldemo", and add "excel" model to it. Then click "Finish". Selecting the VDB you should see image like
- Deploy to server. You can right click on VDB and deploy. Make sure you created data sources correctly.
- Go to Testing step.
Option 2: Building with Dynamic VDB (Skip this if you did Option 1)
- There two steps in this process, first create a data source that points to the directory of Excel document, so that File resource adapter can provide a File access to Excel document. Then create VDB itself.
- Edit "standalone-teiid.xml", and under "resource-adapter" subsystem section add the following XML for creation of file data source. Edit the below fragment for correct directory paths that represent your environment directory structure.
Data Source for Flat File
<resource-adapter id="file-ds"> <module slot="main" id="org.jboss.teiid.resource-adapter.file" /> <transaction-support>NoTransaction</transaction-support> <connection-definitions> <connection-definition class-name="org.teiid.resource.adapter.file.FileManagedConnectionFactory" jndi-name="java:/fileDS" enabled="true" pool-name="file-ds"> <config-property name="ParentDirectory">/exceldocs</config-property> </connection-definition> </connection-definitions> </resource-adapter>
Start JBoss EAP + Teiid Server using command
./standalone.sh -c standalone-teiid.xml
Create the following dynamic VDB, name the file as "exceldemo-vdb.xml"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <vdb name="exceldemo" version="1"> <model name="excel"> <property name="importer.headerRowNumber" value="1"/> <property name="importer.ExcelFileName" value="names.xlsx"/> <source name="connector" translator-name="excel" connection-jndi-name="java:/fileDS"/> </model> </vdb>
- Using either web-console or CLI, deploy the above VDB. See directions at Deploying VDBs
Step 3: Testing
- Make sure you have active VDB by checking the Console or Log files or by using web-console http://localhost:9990/console/App.html#vdb-runtime
- Now using a tools like SquirreL or Eclipse Data Tools, using Teiid JDBC driver connect the "exceldemo" VDB. For connection URL, see Connecting to a Teiid Server - Teiid 8.7 (draft) - Project Documentation Editor
See all the names of the tables exposed by this VDB
select * from SYS.Tables
You will see a table called "Sheet1" under "excel" schema (model), now you can issue a query like
select * from excel.Sheet1
will return results
Teiid translator during the creation of metadata would have created a implicit row called "ROW_ID", using this column you can issue queries like
select * from excel.Sheet1 where row_id = 2; select * from excel.Sheet1 where row_id in (2,3); select * from excel.Sheet1 where row_id > 2 and row_id < 5;
Hopefully this article gave how to read Excel data with Teiid. Let us know if you have any comments or suggestions in Teiid forums.