6 Replies Latest reply on Mar 3, 2017 2:12 PM by rareddy

HDFS as a source from Teiid

sanjay_chaturvedi Feb 28, 2017 11:12 AM

We have a file stored in an HDFS Hadoop cluster. Is there any way in Teiid to use this file as a source to create a view model?

We have successfully accessed Hadoop objects using Hive, but will it work the same way for a JSON file distributed over the cluster?

Please assist.

Thanks,

Sanjay

1. Re: HDFS as a source from Teiid

rareddy Feb 28, 2017 11:35 AM (in response to sanjay_chaturvedi)

Sanjay,

Unfortunately, there is no HDFS-based resource adapter, which is what one needs here. Once one exists, you can use TEXTTABLE to parse the contents if they are in CSV, or XMLTABLE for JSON. I do not think this is very complicated to do; take a look at the Developer's Guide on how to write a resource adapter. This may be a good opportunity for you to contribute back to the Teiid community.

See the corresponding JIRA [TEIID-3647] Create native connector to interact with HDFS as a datasource - JBoss Issue Tracker

Ramesh..
2. Re: HDFS as a source from Teiid

sanjay_chaturvedi Mar 1, 2017 8:42 AM (in response to rareddy)

Thanks Ramesh, I would love to do so.
By the way, can we connect to Apache Drill in that case? Any pointers, please.
Do we have a resource adapter and translator for it, or can we use some alternative or similar components?

Thanks.
3. Re: HDFS as a source from Teiid

rareddy Mar 1, 2017 9:42 AM (in response to sanjay_chaturvedi)

Apache Drill is an interesting proposition. It is a query engine somewhat similar to Teiid, but with distributed execution capabilities. So, if you are just trying to call it using SQL, then it may be possible with the existing "jdbc-simple" or "jdbc-ansi" translators (we have not tried it).
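Before wiring Drill into Teiid, it may help to confirm that the driver connects at all from plain JDBC. The following is only a minimal sketch under stated assumptions: the drill-jdbc-all jar is on the classpath, a Drillbit is listening on the default user port 31010, and the class and method names (DrillJdbcCheck, directUrl) are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillJdbcCheck {

    // Builds a URL for a direct connection to a single Drillbit.
    // (A ZooKeeper-based URL of the form jdbc:drill:zk=... is the other common option.)
    static String directUrl(String host, int port) {
        return "jdbc:drill:drillbit=" + host + ":" + port;
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: host comes from the command line, default is localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        String url = directUrl(host, 31010);

        // Relies on the driver registering itself with DriverManager
        // once its jar is on the classpath.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM sys.version")) {
            while (rs.next()) {
                // Print the first column of each row as a smoke test.
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

If this runs cleanly, any remaining failure is more likely to sit in the data source or translator configuration than in basic connectivity.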

However, there is a roadmap idea in Teiid about how we can either leverage Apache Drill or contribute the Teiid optimizer engine to the Apache Drill community. This is a very long pole, as it would essentially change the Teiid architecture completely, so there are no decisions or concrete ideas yet on the future direction.

Ramesh..
4. Re: HDFS as a source from Teiid

sanjay_chaturvedi Mar 3, 2017 11:40 AM (in response to rareddy)

Hi Ramesh,

Thanks for the info.

I also tried to connect to Drill from Teiid Designer using the JDBC importer. The translators I used were jdbc-ansi and jdbc-simple, but both ended with the following error:
Caused by: java.lang.NullPointerException
at oadd.org.apache.calcite.avatica.AvaticaConnection.isReadOnly(AvaticaConnection.java:176)
at org.apache.drill.jdbc.impl.DrillConnectionImpl.isReadOnly(DrillConnectionImpl.java:452)
at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.<init>(BaseWrapperManagedConnection.java:199)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnection.<init>(LocalManagedConnection.java:62)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:336)

I know it is coming from Drill, but I am completely stuck on this. It would be great to have some assistance with it.
I used drill-jdbc-all-1.9.0.jar for this. This single jar is enough to connect to Drill, as it bundles the required dependencies.

Thanks,
Sanjay
5. Re: HDFS as a source from Teiid

sanjay_chaturvedi Mar 3, 2017 2:09 PM (in response to sanjay_chaturvedi)

An update: the JDBC importer still does not work.
However, assuming the connection can somehow be established, I found that the source query prefixes each column name with the table name, and in that form it does not work with Drill.
For example:

This works in Drill:
SELECT PNRId FROM dfs.compleat.pnr_view

But this does not work in Drill:
SELECT dfs.compleat.pnr_view.PNRId FROM dfs.compleat.pnr_view

I tried giving the column name in the source as just PNRId, but it always adds the table name as a prefix. Is there any way to skip that? I know this will cause ambiguity when multiple tables share a column name, but still, any guesses?
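To narrow this down outside Teiid, the query variants can be generated and run one by one against Drill. The sketch below only builds the SQL strings; the class and method names are hypothetical, and the alias-qualified form is an untested assumption about what Drill might accept, not something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnQualifierVariants {

    // Returns the different ways of referring to one column of one table,
    // keyed by a short label, in insertion order.
    static Map<String, String> variants(String table, String column) {
        Map<String, String> queries = new LinkedHashMap<>();
        // Form reported to work in Drill.
        queries.put("unqualified", "SELECT " + column + " FROM " + table);
        // Form generated by the translator, reported to fail in Drill.
        queries.put("table-qualified",
                "SELECT " + table + "." + column + " FROM " + table);
        // Assumption: a short table alias may avoid the multi-part prefix.
        queries.put("alias-qualified",
                "SELECT t." + column + " FROM " + table + " AS t");
        return queries;
    }

    public static void main(String[] args) {
        variants("dfs.compleat.pnr_view", "PNRId")
                .forEach((label, sql) -> System.out.println(label + ": " + sql));
    }
}
```

Running each string through a plain JDBC statement against Drill would show which forms the server accepts, which is useful input for anyone writing a Drill-specific translator.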

Thanks,
Sanjay
6. Re: HDFS as a source from Teiid

rareddy Mar 3, 2017 2:12 PM (in response to sanjay_chaturvedi)

If this driver does not follow reasonably strict JDBC rules, then a dedicated Apache Drill translator is required before it can be used.