-
1. Re: WFS translator & spatial
shawkins Sep 16, 2014 10:04 AM (in response to tom9729)> Just curious if anyone has looked at, or is interested in a translator to talk to catalogs over WFS.
To my knowledge no, but it does look like a good integration strategy.
> I guess this might also be a good time to discuss what would need to happen to add first-class support for spatial queries to Teiid.
There has been some interest there. However given the rigidity of our type system and the breath of the spec we haven't made a concerted effort.
> Adopting a subset of the PostGIS functions for spatial filters and converting to/from WKT.
You are correct that It would be easiest to circumvent our lack of struct support and use textual representation like http://en.wikipedia.org/wiki/GeoJSON
> Add several capabilities to the translator API. Probably needs to be a little more specific than "SupportsSpatial".
Yes it will depend upon the variations in what the possible sources support.
> Add a "geometry" column type. This could be based on the Geometry model from JTS.
This could be done as a subtype of clob/json, but yes ideally it would need to be its own type. However you can see from the changes when varbinary was added that there is quite a bit that goes into introducing a type.
> Possibly add support for the engine to handle spatial filters. This would probably need to leverage something from GeoTools.
If there is an existing library to leverage, then we'd definitely be interested in going that direction.
If you want to issues and even a git fork to collaborate on, then we can flesh this out more.
-
2. Re: Re: WFS translator & spatial
tom9729 Nov 3, 2014 12:07 AM (in response to shawkins)I'm looking at adding a Geometry type and have a few design questions. For reference, I'm working outward from org.teiid.core.types. Also I've been looking at the Simple Features spec. This is the basis of PostGIS and some of Oracle SDO, so it seems to be a good place to start. FYI, JTS is an LGPL Java implementation of the spec.
An initial focus will be on adding support for this to the JDBC translator. Every database has its own spatial types exposed as structs, but this seems to be a tedious approach because it would require vendor classes for each translator (e.g., oracle.sql). An alternative is to use vendor functions like AsWKB to encode geometries as WKB and then read them with JTS. This appears to be the approach taken by the GeoTools MySQL module. One caveat with this approach is that the SRID cannot be encoded in WKB, but it could be pulled from table metadata instead. First step here would be to figure out how to rewrite queries like "SELECT GEOMETRY" to "SELECT ASWKB(GEOMETRY)".
Does it make sense to expose Geometry directly, or should I look at wrapping it?
Is it okay to add a dependency against JTS to common-core?
I've got a branch here but haven't checked anything in yet. This is a side project so work might be a little slow.
Thanks,
Tom
-
3. Re: Re: WFS translator & spatial
shawkins Nov 4, 2014 9:34 AM (in response to tom9729)> Does it make sense to expose Geometry directly, or should I look at wrapping it?
If there isn't too much of a performance hit, I'd be in favor of letting the runtime type be an extension of blob with the aswkb encoding. We have done similar things for xml with subtypes (doc, element, text, etc.) and json.
> Is it okay to add a dependency against JTS to common-core?
Ideally we wouldn't add a direct dependency. Especially anything that would be required on the client side. The type system isn't were we need it to be for easy extensibility - especially with complex java object types.
-
4. Re: WFS translator & spatial
tom9729 Nov 4, 2014 11:23 PM (in response to shawkins)The WKB blob approach should work. I will start with that. Not concerned with performance this early.
Would this be a new runtime type? Extend BlobType? As a client connecting with JDBC I should see BLOB? Any thoughts on exposing Geometry (WKB + SRID) over JDBC as a UDT? For example: http://docs.oracle.com/javase/tutorial/jdbc/basics/sqlcustommapping.html
(+1 for most acronyms used in a sentence...)
Quick sketch of WKB conversions that will need to happen.
- Translators will need to normalize geometry values to WKB bytes.
- JDBC - Use source functions (like AsWKB).
- MongoDB - Use Geotools geojson module to convert GeoJSON to JTS Geometry, then JTS to go to WKB.
- WFS - Use Geotools to get to JTS Geometry, then JTS to go to WKB.
- Geometry constructor functions will need to convert literal values to WKB bytes. Subset of these would be useful: http://postgis.net/docs/reference.html#Geometry_Constructors. JTS can do most of these.
An important piece of information that will have to be carried around is the SRID, but I think this can be attached to the column metadata as a property. Should have an assertion that the SRID of the column and any literal values being used are the same. Later, could use Geotools to reproject the literal value in situations where the SRID does not match.
Thanks,
Tom
Edit: While I have your attention, I think I found a typo (i.e., line repeated twice): teiid/JDBCSQLTypeInfo.java at master · teiid/teiid · GitHub
- Translators will need to normalize geometry values to WKB bytes.
-
5. Re: WFS translator & spatial
shawkins Nov 5, 2014 8:49 AM (in response to tom9729)> Would this be a new runtime type? Extend BlobType? As a client connecting with JDBC I should see BLOB? Any thoughts on exposing Geometry (WKB + SRID) over JDBC as a UDT? For example:
The JDBC sql type mapping takes a infrastructure that doesn't exist yet in the Teiid logic and I believe is targeted at structs.
In ClobType there is a type designation to differentiate between text and json. Blob could be done similarly. Older clients would just see it as a blob value.
> Edit: While I have your attention, I think I found a typo (i.e., line repeated twice): teiid/JDBCSQLTypeInfo.java at master · teiid/teiid · GitHub
Yes we can remove the extra. It doesn't affect anything because it's just overwriting the previous map entries.
-
6. Re: WFS translator & spatial
tom9729 Nov 9, 2014 2:21 PM (in response to shawkins)I've been spinning my wheels on this for a bit and am looking for some guidance. I started going down the route of adding geometry as a subtype of Blob, but ran into a couple of issues.
What I changed in the JDBC translator.
- Modified MetadataProcessor to recognize Oracle and MySQL geometry columns. For reference, MySQL exposes geometry columns as BINARY with a runtime type of "geometry". Oracle exposes STRUCT with a type of "mdsys.sdo_geometry".
- Modified the SQLConversionVisitor to call visitGeometry() for geometry DerivedColumns. The idea is that translators would override this and wrap any selects that return geometry types in SQL to return WKB blob. For example, Oracle would wrap the expression with SDO_UTILS.TO_WKBGEOMETRY(expr).
Both of these require somewhere to store the information that the column is a geometry. I think there are two ways to do this.
- Storing the blob encoding (i.e., WKB) as a property in the column metadata. I would like to report the runtime type to the client as "geometry", but I'm not sure if the column properties are available there.
- Adding a runtime type "geometry". It seems like this would require adding a new class like GeometryType, which I assume we want to avoid.
Thoughts?
Thanks,
Tom
Edit: Committed initial bit using column property to detect WKB.
Edit2: Here's an unfinished patch that adds a runtime type like "blob.WKB" and uses that instead of the property. This is simpler and has the advantage of showing up in the JDBC metadata, although for some reason the SQL type has changed from BLOB to JAVA_OBJECT.
-
7. Re: WFS translator & spatial
rareddy Nov 10, 2014 10:30 AM (in response to tom9729)Tom,
Both of these require somewhere to store the information that the column is a geometry. I think there are two ways to do this.
- Storing the blob encoding (i.e., WKB) as a property in the column metadata. I would like to report the runtime type to the client as "geometry", but I'm not sure if the column properties are available there.
- Adding a runtime type "geometry". It seems like this would require adding a new class like GeometryType, which I assume we want to avoid.
To add a new runtime type, yes you would need to add data type like "GeometryType", but as Steve said above it would require quite bit of changes in to introduce new type through out, but they are required. As he suggested take a look at VARBINARY support jira and its commits to see the scope of the changes. It may be OK to add the type, but can avoid the jar dependency to the client. The other possibility is what you are saying, a blob type. Here, you can introduce a metadata property that defines a particular column as WKB blob, or introduce "type" in BlobType's class as Steve suggests. In metadata processor when you create the column, you can add the property or type.
What are others vendors doing in exposing these types through JDBC layer in the client side? Are there any efforts in standardizing these types, any proposals we can lean on? otherwise they will show up as object or blob types in the client, and we need to introduce some type specific classes in the client, where user can use these to extract the data, but that will introduce strong dependency on Teiid.
Ramesh..
-
8. Re: WFS translator & spatial
shawkins Nov 10, 2014 1:56 PM (in response to rareddy)> What are others vendors doing in exposing these types through JDBC layer in the client side?
Mostly structs, which have some standard JDBC access routines and as Tom was mentions can be used with type mapping to map to objects.
> To add a new runtime type, yes you would need to add data type like "GeometryType", but as Steve said above it would require quite bit of changes in to introduce new type through out, but they are required.
Given that there are so many subtypes for geometry, how much benefit would there be with a single new type?
How much of the typing issue would be resolved by general struct support?
-
9. Re: WFS translator & spatial
shawkins Nov 10, 2014 2:58 PM (in response to shawkins)> Edit2: Here's an unfinished patch that adds a runtime type like "blob.WKB" and uses that instead of the property.
That's getting pretty close to adding a full type, but you'd want to use a name without a . such as glob or gemometryWKB, etc.
> This is simpler and has the advantage of showing up in the JDBC metadata, although for some reason the SQL type has changed from BLOB to JAVA_OBJECT.
If you want runtime/design time type checking then yes adding full new type makes sense. It would be rounded out with cast support to/from varbinar/blob. It is possible that additional client code would be needed, but ideally you'd want to see a fall back to blob rather than object.
Generally I haven't added a full type unless the runtime instance class was different (xml / varbinary), but not for JSON nor the xml subtypes.
For clarification types.dat is a bridge to the Designer side of the world, which would need a corresponding (by uuid / name) entry to have the type available for use.
Steve
-
10. Re: WFS translator & spatial
tom9729 Nov 10, 2014 9:25 PM (in response to shawkins)> That's getting pretty close to adding a full type, but you'd want to use a name without a . such as glob or gemometryWKB, etc.
I went with "blob.WKB" as an idea for a general way to handle subtypes. Basically, for situations where you want to know that the column holds WKB but to have everything else just treat it as blob. In the branch I currently have "blob$WKB". Something like "geometry" would be preferable however.
There are several places where it's helpful to know that it's geometry and not just a normal blob:
- In the SQL string visitor to rewrite references in the select. For example, most JDBC databases seem to return geometry as a proprietary struct with a corresponding JDBC class. I think a better approach is to just instruct the database to return the geometry encoded as WKB which is a standard format. For example, rewrite "SELECT SHAPE ..." to "SELECT AsWKB(SHAPE) ...".
- Not required, but it would be good to tell the client through the runtime type JDBC info to simplify autodiscovery.
- Runtime type checking of SQL functions obviously.
> If you want runtime/design time type checking then yes adding full new type makes sense. It would be rounded out with cast support to/from varbinar/blob. It is possible that additional client code would be needed, but ideally you'd want to see a fall back to blob rather than object.
> Generally I haven't added a full type unless the runtime instance class was different (xml / varbinary), but not for JSON nor the xml subtypes.
MySQL geometries show up as SQL type BINARY, type name "geometry". I would like to emulate that in Teiid because it vastly simplifies the interface for the client. If that means adding a new runtime type (which will likely just extend BlobType) then so be it. The client gets back WKB which they can load into a JTS Geometry object, or something else completely.
(Note on MySQL geometry: I've seen some examples that read this directly and I'm guessing this is stored as EWKB natively. EWKB is a PostGIS extension to the spec that stores the SRID with the geometry. It's probably best not to rely on this though.)
> For clarification types.dat is a bridge to the Designer side of the world, which would need a corresponding (by uuid / name) entry to have the type available for use.
Something was breaking all of the test cases in system metadata init, which loads that file. I think it was not having the type referenced in types.dat, so I fudged up a row based on blob.
> Given that there are so many subtypes for geometry, how much benefit would there be with a single new type?
The databases that I've looked at all seem to expose one type for geometry. I don't know too much about how the subtypes are handled. MySQL for example will return null if you try to do something like calling AREA on a LINE. Would be good to look at the JTS Geometry type. Oracle has a similar type called JGeometry. MySQL does not seem to have anything in the driver, so I assume people just use JTS.
> What are others vendors doing in exposing these types through JDBC layer in the client side? Are there any efforts in standardizing these types, any proposals we can lean on? otherwise they will show up as object or blob types in the client, and we need to introduce some type specific classes in the client, where user can use these to extract the data, but that will introduce strong dependency on Teiid.
There is the OGC Simple Features SQL spec which I linked to a couple of posts up. JTS is an implementation of that. I'm not sure about the history but it seems like the spec affected some of the work done on Oracle spatial, and basically defined PostGIS.
I think the path forward is to add a type called GEOMETRY that is a blob of geometry encoded as WKB. It should be exposed to the client as SQL type BLOB / data type GEOMETRY (similar to MySQL). The client can read the BLOB using standard methods and decode the WKB data to whatever format they want. For example, JTS WKBReader. This requires adding a runtime type to the engine (I think), but does not require the client to use any Teiid classes.
The translators will need to convert to/from WKB using source functions (in the JDBC case) or libraries like JTS/Geotools.
The next step towards being compliant with the OGC spec (which will be useful for Geoserver support) is to expose some metadata views for geometry columns. The spec defines a table called GEOMETRY_COLUMNS that stores information about geometry columns (like SRID). There's another table that defines valid SRIDs and some other stuff. GIS tools seem to use these views for autodiscovery, and MySQL requires this to be created and maintained manually.
Next is implementing some of the SQL functions defined in the spec. In cases like JDBC, most of these can probably be pushed down. For the others, I think minimally JTS will be required in the engine to implement. License is compatible and the library isn't too large (and it would not be required on the client).
Geotools adds some nice stuff and would be required for a WFS translator, but is probably not necessary in the engine.
Thanks!
Tom
-
11. Re: WFS translator & spatial
shawkins Dec 8, 2014 1:47 PM (in response to tom9729)Just wanted to check to see how your progress has been. Since we are early in 8.10 it's possible that we can evaluate this work for the current cycle.
-
12. Re: WFS translator & spatial
tom9729 Dec 8, 2014 8:42 PM (in response to shawkins)I've been busy with other things but hopefully can pick it back up this week.
Thanks,
Tom
-
13. Re: WFS translator & spatial
tom9729 Dec 27, 2014 2:20 AM (in response to tom9729)Haven't made much progress since last post, but I squashed my commits to simplify review. What I have seems to work for MySQL, but I seem to have broke one or more of the unit tests in the process. Will keep working on it..
https://github.com/Tom9729/teiid/commit/6608ab38e4d0557e14f3f48356ec79ca71b76d9e
Thanks,
Tom
-
14. Re: WFS translator & spatial
tom9729 Jan 2, 2015 3:08 AM (in response to shawkins)Quick update..
- Added GeometryType and necessary boilerplate. This is an extension of BlobType and doesn't really do much right now besides hold shapes encoded as WKB.
- Added GEOMETRY type to grammar for DDL.
- Implemented ST_AsText() and ST_GeomFromText() in the engine using JTS library.
- Implemented ST_Intersects() and ST_Contains() as pushdown.
- JDBC translator - Changed to recognize source column types like SDO_GEOMETRY and report them as GEOMETRY. Changed to wrap GEOMETRY selects in the source-equivalent of ST_AsBinary() to avoid having to deal with proprietary structs. (Ditto for the other direction with literals.) So far this only works on MySQL but should be a matter of adding converts for the others.
Example query against my MySQL DB:
select mkt_id, shape from cola_markets where ST_Intersects(shape, ST_GeomFromText('POLYGON ((30 0, 50 50, 0 50, 0 0, 30 0))'));
"shape" is a GEOMETRY that comes back as WKB encoded BLOB. User can use JTS WKBReader to get this into a format to work with.
Commits are here: https://github.com/Tom9729/teiid/commits/TEIID-2384
Next focus:
- Fixing the tests - I broke a great deal of the integration tests by adding types and functions. It looks like there are a bunch of dumps of the system schema tables. I think I inadvertently caused some of the UUIDs to change. What's standard procedure here? Just update the "expected" from the results?
- Adding more tests...
- Adding support to one of the non-JDBC translators; probably MongoDB.
- Adding some quickstarts.
Thanks,
Tom