5 Replies Latest reply on Dec 12, 2012 9:17 AM by rareddy

    How different is TEIID from ModeShape ?

    capoorhimanshu

      Hi,

       

      I was considering Teiid as a solution for my data virtualization and data federation requirements. I implemented some of the sample code, and Teiid fit the project well.

      However, I recently came across ModeShape and found that it provides data virtualization functionality (i.e. connectors for different data sources) as well as data federation features.

       

      So how different are these projects?

       

      Yes, they take different underlying approaches, i.e. Teiid is relational and ModeShape is graph-oriented, but how do they differ in the functional aspects of data virtualization and federation?

       

      Hope to hear from the experts soon.

       

      Thanks

      Himanshu Kapoor

        • 1. Re: How different is TEIID from ModeShape ?
          shawkins

          > So how different are these projects?

           

          There are several ways to answer this, but the initial answer is that they are completely different.  ModeShape was developed more recently and targeted federation/MetaMatrix concepts at the metadata/JCR space.  ModeShape was developed starting with a fresh codebase and continues to evolve independently of Teiid.

           

          The other similarities, such as being targeted for AS deployment and other JBoss integration, etc., have little to do with the APIs and paradigms each project provides.  What really matters is how you are going to utilize the projects.  If you are dealing primarily with consumers who want a relational virtualization and are dealing mostly with real-time integration, then Teiid is a good choice.

           

          Steve

          • 2. Re: How different is TEIID from ModeShape ?
            rhauch

            (I'm the ModeShape project lead and founder, so my description of Teiid might not match those of the Teiid community.)

             

            Teiid and ModeShape are very different.

             

            First and foremost, ModeShape is an "elastic and strongly-consistent hierarchical database." Perhaps the most important two words in that definition are "hierarchical database", which mean that ModeShape does persist its data, and all data is in the form of nodes and properties that make up a hierarchical (or tree-like or graph-like) structure. The hierarchical nature of the data informs all of ModeShape's behavior and how clients interact with it, ranging from the programmatic API (we implement the standard JCR API) to the structure and makeup of the hierarchy and how much validation and schema enforcement is performed. Finally, it's very easy to add processes to or remove processes from a ModeShape cluster (there is no "master"), even when ModeShape is configured to "store" data in an in-memory data grid. ModeShape has other features, like versioning, events, queries, locking, etc.

             

            ModeShape is perhaps better classified as a particular kind of NoSQL database. Like other NoSQL databases, ModeShape can be used with little (or no) schema. But ModeShape is hierarchical rather than a document-oriented, column-oriented, or key-value store. It also is strongly consistent (meaning it uses ACID transactions), whereas many NoSQL databases are weakly-consistent (aka, eventually-consistent) and therefore require your application to know how to handle conflicts.

             

            Quite simply put, applications need to choose the right kind of data storage technology that suits their needs. Lots of data is naturally hierarchical, and applications are still far easier to write with ACID transactions. These are the kinds of use cases where ModeShape excels.

             

            On the other hand, Teiid is very much a relational technology. IMO, it is a relational database in almost every sense, including how it works internally and how clients interact with it. The only way it is not a conventional relational database is that it doesn't persist data itself, but instead always federates the data that is stored and persisted in a wide variety of other databases, services and systems.

             

            Now, both ModeShape and Teiid share some features that on the surface are very similar. Both support (read-only) queries, though only Teiid statements can create, update, or delete data. Both have web-service front-ends, though ModeShape's are quite narrowly focused while Teiid (and Teiid Designer) can generate custom services that expose the specific tables/views/procedures of the database.

             

            And both have "federation", though they differ wildly in what that means. ModeShape's federation exists to augment the persistence mechanism, so that clients can more tightly integrate the data they are storing in ModeShape with related data that exists outside of ModeShape. For example, ModeShape is often used to store (among other things) files and documents with metadata about those artifacts. Often the files are stored (and versioned) within ModeShape. Sometimes some of the files need to be managed elsewhere, yet there still needs to be a single view of all the files and data that applications can expect. (Other use cases include storing web content for a CMS while also exposing to the CMS access to live systems.)

             

            Hopefully this helps describe how ModeShape and Teiid differ. Best regards,

             

            Randall

            • 3. Re: How different is TEIID from ModeShape ?
              capoorhimanshu

              Hi Randall,

               

              Thanks for the reply.  I understand from the post that Teiid is a purely relational approach and ModeShape is a hierarchical (NoSQL) approach. So in Teiid the schema is fixed, and in ModeShape the schema can evolve. However, I still have doubts about how they differ from a functional point of view, and I also have some technical queries.

               

              Technical queries:

               

              • You mentioned above that ModeShape persists its data. By "persists data", do you mean that it stores metadata about the various data sources (e.g. text files, databases and so on), or that it stores the actual data? I went through some of the slides available on the internet regarding ModeShape, and they say that in the case of federation it leaves the data where it is, which would mean it does not store the data: http://www.slideshare.net/rhauch/an-overview-of-modeshape#btnNext (please refer to slide 21).

              • Does it replicate all of the external data source's actual data in memory, or store it on disk and keep references to it in memory, or does it just store the metadata?

               

               

              Functional Queries:

               

              • Teiid has models and VDBs. When a SQL query over a VDB reaches the Teiid runtime engine, it parses the query, delegates it to the actual external federated systems (the external metadata is stored in XML files) and then returns the result. Teiid also has caching mechanisms, which store the data in Infinispan, but they are optional, not mandatory.

               

                             In the case of ModeShape, are there any specific virtual views available to end users through APIs, or does it expose the repository as a whole? For example, say there are Table 1, Table 2 and Table 3 in MSSQL and Table 4, Table 5 and Table 6 in Oracle. In Teiid I can define two VDBs, say VDB 1 on Table 1, Table 2 and Table 4, and VDB 2 on Table 3, Table 5 and Table 6, and can expose them as separate entities to the end user. The end user can then write queries over these VDBs, and Teiid takes care of the rest.

               

                             So can we define logical virtual views over the repository (maybe not in a relational way, that's fine), or does it expose the repository as a whole?

               

              • Are there any real-time update capabilities in ModeShape?

               

                             In Teiid, if I choose to cache the data, there are options for scheduling a refresh of the data from the actual data sources at certain time intervals. Teiid also exposes some events, which I can hook up with a third-party library to notify Teiid to load the data as soon as the external data source changes (i.e. almost real time).

               

              Hope to hear from you soon. Thanks in advance.

               

              Regards

              Himanshu Kapoor

              • 4. Re: How different is TEIID from ModeShape ?
                rhauch

                I fear that you're trying to compare ModeShape one-for-one with Teiid. That's like trying to understand what Cassandra does by putting it in terms of a relational database: Cassandra is not a relational database, and so you'll never really understand the benefits of Cassandra by just thinking about relational-like features.

                 

                Think of ModeShape as a kind of database that has unique features. Also, ModeShape 3 is far different than earlier versions. Check out this (far more recent) presentation: http://www.slideshare.net/rhauch/modeshape-3-overview

                 

                 

                > You mentioned above that ModeShape persists its data. By "persists data", do you mean that it stores metadata about the various data sources (e.g. text files, databases and so on), or that it stores the actual data? I went through some of the slides available on the internet regarding ModeShape, and they say that in the case of federation it leaves the data where it is, which would mean it does not store the data: http://www.slideshare.net/rhauch/an-overview-of-modeshape#btnNext (please refer to slide 21).

                 

                ModeShape is a data store. You put data in, and it stores it so you can get it back out. And that data is always in the form of nodes with properties and child nodes. ModeShape stores all of its own data in Infinispan, which can be configured as an in-memory data grid (which distributes multiple copies of each node across the cluster), a replicated in-memory "cache" (where every node is copied to every machine in the cluster), or as a local in-memory "cache". In the distributed data grid and large enough replicated configurations, the copies are your backup, whereas with smaller replicated configurations and local configurations you should use one of Infinispan's cache stores to persist the data to disk, to a relational database, to cloud storage, to Cassandra, etc.

                 

                So why would you use ModeShape then if ModeShape is just storing the data in Infinispan which might be storing the data in a relational database, Cassandra, etc.? Firstly, you don't have to persist data like that: the best and most scalable option is to use Infinispan as an in-memory data grid. But if you do choose to persist your data conventionally, you would use ModeShape because it adds value on top of the conventional storage: elastic clustering, hierarchical structure, flexible schemas (unstructured to very structured), events, versioning, full-text search, querying, storage of large files and strings, and the ability to store your data and then add schemas where needed.

                 

                Also, federation is entirely optional. When you use federation in ModeShape 3, you still have your own data, but you can also add external data to a repository. And when you do, the external system still owns the data; ModeShape only caches the data as clients use it, and any changes made via ModeShape are sent back to the external system (assuming the connector is writable).
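
                As a rough illustration of the "cache on use, write back to the owner" behaviour described above, here is a toy sketch in plain Java. It is not ModeShape's connector API: `ExternalSystem` and `FederatedView` are made-up names, and real connectors deal with nodes rather than string values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an external system that owns its data.
class ExternalSystem {
    final Map<String, String> data = new HashMap<>();
    int reads = 0; // counts how often the external system is actually hit

    String read(String key) {
        reads++;
        return data.get(key);
    }

    void write(String key, String value) {
        data.put(key, value);
    }
}

// Toy federated view (an assumption for illustration, not ModeShape code).
class FederatedView {
    private final ExternalSystem external;
    private final boolean writable;
    private final Map<String, String> cache = new HashMap<>();

    FederatedView(ExternalSystem external, boolean writable) {
        this.external = external;
        this.writable = writable;
    }

    // External data is copied into the cache only when a client asks for it.
    String get(String key) {
        if (!cache.containsKey(key)) {
            cache.put(key, external.read(key));
        }
        return cache.get(key);
    }

    // Changes are sent back to the owning system; a read-only connector rejects them.
    void set(String key, String value) {
        if (!writable) {
            throw new UnsupportedOperationException("connector is read-only");
        }
        external.write(key, value);
        cache.put(key, value);
    }
}
```

                The point of the sketch is only that the external system stays the owner of the data: the cache fills lazily, and a write goes to the owner first.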

                 

                > Does it replicate all of the external data source's actual data in memory, or store it on disk and keep references to it in memory, or does it just store the metadata?

                Again, it stores whatever you put in ModeShape. It doesn't matter to ModeShape if you consider that information to be data or metadata. It's all just data to ModeShape.

                 

                (Every ModeShape repository does have a "system" area in which ModeShape stores its own metadata: things like the node types, namespaces, versions, configuration information, etc. Applications are welcome to access it, but it's read-only to them.)

                 

                 

                > In the case of ModeShape, are there any specific virtual views available to end users through APIs, or does it expose the repository as a whole? For example, say there are Table 1, Table 2 and Table 3 in MSSQL and Table 4, Table 5 and Table 6 in Oracle. In Teiid I can define two VDBs, say VDB 1 on Table 1, Table 2 and Table 4, and VDB 2 on Table 3, Table 5 and Table 6, and can expose them as separate entities to the end user. The end user can then write queries over these VDBs, and Teiid takes care of the rest.

                > So can we define logical virtual views over the repository (maybe not in a relational way, that's fine), or does it expose the repository as a whole?

                 

                ModeShape doesn't have "tables" or "columns" - it has a tree-structure of nodes, where each node has properties and child nodes. Most applications use the JCR API to access and update these nodes programmatically. Please see our documentation for an overview of how this works and how queries play into this: https://docs.jboss.org/author/display/MODE/Introduction+to+JCR
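
                To make the "nodes with properties and child nodes" shape concrete, here is a toy model in plain Java. It is deliberately not the JCR API (see the documentation linked above for that); `ToyNode`, `ToyNodeDemo` and the sample paths are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: NOT the JCR API. Every node has a name, properties, and child nodes.
class ToyNode {
    private final String name;
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private final Map<String, ToyNode> children = new LinkedHashMap<>();

    ToyNode(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    ToyNode setProperty(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    Object getProperty(String key) {
        return properties.get(key);
    }

    // Returns the existing child with this name, or creates it.
    ToyNode child(String childName) {
        return children.computeIfAbsent(childName, ToyNode::new);
    }

    // Resolves a relative path such as "docs/report.pdf"; returns null if a segment is missing.
    ToyNode find(String relativePath) {
        ToyNode current = this;
        for (String segment : relativePath.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            current = current.children.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}

class ToyNodeDemo {
    // Builds a small tree: / -> docs -> report.pdf (with two properties).
    static ToyNode buildSample() {
        ToyNode root = new ToyNode("");
        root.child("docs").child("report.pdf")
            .setProperty("mimeType", "application/pdf")
            .setProperty("author", "himanshu");
        return root;
    }

    public static void main(String[] args) {
        ToyNode report = buildSample().find("docs/report.pdf");
        System.out.println(report.getName() + " by " + report.getProperty("author"));
    }
}
```

                Navigation here is by path, much like a file system, rather than by joining tables; that is the difference in paradigm this reply is describing.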

                 

                So if you're wanting to access existing data in MySQL and Oracle, use Teiid.

                 

                But if you're writing a new application that will need a database, choose the data storage technology that best suits the requirements. Relational databases may be best for some applications, but you'd be surprised how many times a relational database is chosen simply because that's what's been done in the past. Did you ever create a relational database to hold hierarchically-organized data? It's absolutely horrible. File systems are hierarchical and allow you to easily navigate to find related data. The same is true with ModeShape, except that instead of folders and blob-like files you organize a structure of nodes with properties that can store anything.

                • 5. Re: How different is TEIID from ModeShape ?
                  rareddy

                  Teiid is a data *integration* engine: it can integrate data from disparate sources, e.g. relational databases, files, OLAP, SAP, Salesforce, web services, XML and custom sources.

                  Teiid is a data *federation* engine: it provides ways to create canonical views on top of any data, e.g. a single view of your data or logical views of your data, no matter where the original data comes from.

                  Teiid provides an API, through its translators, to expose any data source as a relational source.
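
                  A toy sketch of the "logical view over sources, nothing stored" idea, in plain Java rather than Teiid itself. In Teiid the view would be written in SQL over the source models; the class name, the two "sources" and the SQL in the comment are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Toy illustration only: this is not Teiid's API or query engine.
class ToyFederation {
    // Roughly what a federated view such as
    //   SELECT c.name, o.amount FROM crm.customers c JOIN erp.orders o ON o.customer_id = c.id
    // expresses (schema and table names are invented for this example).
    // Each source is a Supplier, so rows are fetched at query time and never stored here.
    static List<String> customerOrders(Supplier<Map<Integer, String>> customers,
                                       Supplier<List<int[]>> orders) {
        Map<Integer, String> namesById = customers.get();
        List<String> rows = new ArrayList<>();
        for (int[] order : orders.get()) {          // order = {customerId, amount}
            String name = namesById.get(order[0]);
            if (name != null) {                     // inner join: skip orders without a customer
                rows.add(name + ":" + order[1]);
            }
        }
        return rows;
    }
}
```

                  Because the view is recomputed from the sources on every call, a change in either source shows up in the next query, which is the property the caching discussion elsewhere in this thread trades off against.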

                   

                  The relational aspect of Teiid comes from its choice of SQL as both the transformation glue language between data sources and the query language for end-user applications. Following that paradigm, Teiid also supports JDBC and ODBC interfaces on top.

                   

                  http://teiid.blogspot.com/2009/06/relational-data-integration-engine.html

                   

                  Teiid is not a database itself; it does not store any data. Granted, it does have caching capabilities, but those should be considered ephemeral and used for performance reasons. To integrate seamlessly with end-user applications, Teiid exposes itself as a database facade.

                   

                  Ramesh..