
Lost on the MetaMatrix?

Posted by slaboure Jul 13, 2007

 

A few weeks ago we finally closed the acquisition of MetaMatrix. MetaMatrix is a pioneer in federated data services and metadata management. I am not sure you all understood what MetaMatrix is about when we initially released the formal press announcement back in April. At least I didn't ;) Consequently, now that we have closed this acquisition, I thought I would take the time to describe in more technical terms what MetaMatrix is all about.

 

For the busy developers, here is the “(Eiffel Tower) Elevator Pitch”®:

“MetaMatrix provides a way to aggregate disparate (and possibly heterogeneous) data sources (databases, mainframes, XML documents, etc.) and make them look like a single unified virtual database that you can access through standard interfaces like ODBC or JDBC, as an XML document (XQuery interface), or through Web Services. You can obviously perform a multitude of transformations on these various back-end database schemas and even perform joins between multiple heterogeneous sources. MetaMatrix works both with Read and Write/Update/Delete operations.”

 

Now, for the less busy ones, let's go through a typical MetaMatrix use case. Let's say your company stores its customer information in an Oracle DB. Over time, around 10 applications (Java, C, etc.) have been deployed and directly leverage this data source. At some point, your company merges with a competitor. This competitor stores its customer information in a DB2 DB and has a similar number of applications directly leveraging it.

 

When working out the IT integration plan, the CIO makes the following decisions:

  • Over the next 24 months, existing applications will be migrated to a new, single schema that will accommodate the specifics of both legacy DBs; with roughly 20 applications across the two companies, this means that on average about one application will be migrated every month.
  • A set of new applications will be developed ASAP so that the company can start rolling out new services to the combined customers of the merged entities. Given that each system has its own concept of essential business metadata like "customer" and "account", this will have to somehow be unified.

 

The typical problems faced in that kind of (realistic) scenario are the following:

  1. You cannot migrate all existing applications in one shot, which means various applications will be using the old and the new schema at the same time;
  2. You don't want to develop brand new applications on top of what's now considered a pair of legacy schemas: you want to use the new schema for all new applications, otherwise you are just making your migration issue worse;
  3. You might not be able to replicate data contained in the “old DBs” into a new fresh DB: data inconsistencies or synchronization delays are not compatible with most applications, which means you must keep a single repository of the data.

 

See where I am going? One easy way to solve that problem is to use MetaMatrix. These are the typical steps you would follow:

  • Your architects would define a clean and new DB schema (probably based on the concepts of the two legacy ones);
  • Using the MetaMatrix Eclipse-based tools, architects will graphically implement the new schema i) by capturing the 2 legacy schemas in a graphical representation and then ii) by defining any required transformation to perform the mapping. Architects are also able to configure a number of settings such as the security scheme, caching, etc.;
  • Once defined, the new schema and transformation logic are stored in MetaMatrix's metadata repository;
  • The team developing the new applications (based on JBoss AS and Hibernate, obviously) uses the MetaMatrix JDBC driver and the newly defined schema. At runtime, MetaMatrix loads the referenced schema from the repository (it can load and run several schemas at the same time, obviously) and acts as a SQL database;
  • In parallel, the team dedicated to the migration of existing applications will focus on one application at a time and upgrade each to the new data model according to their migration schedule: each application can be migrated to the new schema independently of the others.
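
To make the idea a bit more concrete, here is a minimal sketch of "two legacy schemas behind one virtual schema". It is not MetaMatrix code: two in-memory SQLite databases stand in for the Oracle and DB2 sources, a temporary SQL view stands in for the modeled transformations, and every table and column name (legacy_a.cust, legacy_b.client, customer, etc.) is made up for the illustration.

```python
import sqlite3


def build_unified_view():
    """Return a connection exposing one 'customer' view over two legacy stores."""
    conn = sqlite3.connect(":memory:")
    # Two attached in-memory databases stand in for the two legacy sources.
    conn.execute("ATTACH DATABASE ':memory:' AS legacy_a")
    conn.execute("ATTACH DATABASE ':memory:' AS legacy_b")
    # Each source has its own (hypothetical) idea of what a customer looks like.
    conn.execute("CREATE TABLE legacy_a.cust (cust_id INTEGER, full_name TEXT)")
    conn.execute(
        "CREATE TABLE legacy_b.client "
        "(client_no INTEGER, first_name TEXT, last_name TEXT)"
    )
    conn.execute("INSERT INTO legacy_a.cust VALUES (1, 'Ada Lovelace')")
    conn.execute("INSERT INTO legacy_b.client VALUES (7, 'Alan', 'Turing')")
    # The view plays the role of the new unified schema plus its mappings.
    # It is a TEMP view because SQLite only lets temp views span databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer AS
        SELECT 'A-' || cust_id AS id, full_name AS name
          FROM legacy_a.cust
        UNION ALL
        SELECT 'B-' || client_no, first_name || ' ' || last_name
          FROM legacy_b.client
        """
    )
    return conn


conn = build_unified_view()
# A new application only ever sees the unified 'customer' schema.
for row in conn.execute("SELECT id, name FROM customer ORDER BY id"):
    print(row)
```

The new applications query customer and never learn that the rows come from two differently shaped tables, while the legacy applications keep working against their original tables untouched.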

 

Once all applications have been migrated to the new schema (if such a thing is possible in the first place), the company can either decide to keep running things this way or migrate the database itself to the new schema, possibly getting rid of the MetaMatrix mediating layer. However, keeping this mediator can have several advantages, for example:

  • During the 24 months of the migration program, maybe other mergers take place or the new schema itself has to be modified/improved, hence leading to several versions of the unified schema running in parallel;
  • Even if the physical database itself is being migrated to the new schema, it might still make sense to keep MetaMatrix in the middle with a null “identity” mapping. That way you can make schema changes without affecting your applications. At worst you will need modeling changes (all the benefits of model-driven architecture).
  • Even after the DB is migrated to the new schema, it is likely that requests from other departments will come up, asking to combine both departments' databases for reporting purposes. Rather than providing another copy or an extract of the data, MetaMatrix can provide a view that combines both departments' data for a variety of reporting purposes.
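
The “identity” mapping idea can be sketched the same way. Again, this is only an illustration with SQLite views, not how MetaMatrix is implemented, and the names (legacy_cust, customer_v2, APP_QUERY) are invented: the application keeps running the same query while the storage underneath is migrated and the mapping collapses to a plain pass-through.

```python
import sqlite3

# The one query the (hypothetical) application runs, before and after migration.
APP_QUERY = "SELECT id, name FROM customer ORDER BY id"


def run_app(conn):
    """Run the application's query against whatever 'customer' currently maps to."""
    return conn.execute(APP_QUERY).fetchall()


conn = sqlite3.connect(":memory:")

# Phase 1: 'customer' is a mapping that transforms a legacy table.
conn.execute("CREATE TABLE legacy_cust (cust_id INTEGER, full_name TEXT)")
conn.execute("INSERT INTO legacy_cust VALUES (1, 'Ada Lovelace')")
conn.execute(
    "CREATE VIEW customer AS "
    "SELECT 'A-' || cust_id AS id, full_name AS name FROM legacy_cust"
)
before = run_app(conn)

# Phase 2: the data is physically migrated to a table shaped like the new schema,
# and the mapping is replaced by an identity (pass-through) view.
conn.execute("CREATE TABLE customer_v2 (id TEXT, name TEXT)")
conn.execute("INSERT INTO customer_v2 SELECT id, name FROM customer")
conn.execute("DROP VIEW customer")
conn.execute("DROP TABLE legacy_cust")
conn.execute("CREATE VIEW customer AS SELECT id, name FROM customer_v2")
after = run_app(conn)

print(before == after)
```

The point of keeping the pass-through layer is that the next schema change only touches the view definition, not APP_QUERY.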

 

What has been described above is a typical scenario where several relational databases are seen as a single one through a relational interface. However, MetaMatrix also provides the following features:

  • On the back-end: ability to aggregate relational and non-relational data sources. Typical examples include mainframe APIs (through adapters), XML documents and even Excel documents.
  • On the front-end: ability to represent the aggregated information not only as a relational source but also as an XML source and perform queries through the XQuery interface or as a Web Service, for example.
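
As a rough illustration of the front-end side, the same unified rows can be handed out either as relational tuples or as an XML document. The sketch below uses Python's standard xml.etree module, and the element names (customers, customer, name) are my own invention, not a MetaMatrix format.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Render (id, name) rows from the unified schema as one XML document."""
    root = ET.Element("customers")
    for cust_id, name in rows:
        customer = ET.SubElement(root, "customer", id=cust_id)
        ET.SubElement(customer, "name").text = name
    return ET.tostring(root, encoding="unicode")


# Rows as a relational client would see them (sample data, made up).
unified_rows = [("A-1", "Ada Lovelace"), ("B-7", "Alan Turing")]

# The same information as an XML client would see it.
xml_doc = rows_to_xml(unified_rows)
print(xml_doc)
```

An XML consumer can then navigate the document (with XPath or XQuery in a real deployment) without knowing that the data started out in relational tables.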

 

Hence MetaMatrix is not just a one-to-many relational aggregation layer, but really a many-to-many aggregation layer, relational or not.

 

As you can guess, MetaMatrix products will be open sourced at JBoss.org. We've already opened up a forum there, so you can start discussing new approaches for using MetaMatrix, including its roadmap and schedule for open sourcing.

 

Onward,

 

 

 

Sacha

The Background

 

This week is an important week for the JBoss division: we've released our first Enterprise Platform. A few months back, I had blogged about how we had decided to split our release work into two “branches”: the JBoss.org releases (i.e. JBoss as you always knew it) and the Enterprise releases (the only ones we will sell support for).

 

As you can imagine (if you cannot, I am telling you), such a new release and productization scheme required quite a few changes in engineering. While we have been able to leverage a lot of what RHT has built over the last few years to put the RHEL/Fedora model in place, there was no ready-to-consume pattern we could apply to make it happen. For example, JBoss software is by definition OS-agnostic. What seems like a simple requirement a priori has immediate consequences on which tools we were able to reuse as-is (or not) from the RHEL team. We are going to run a post-mortem of this first Enterprise Platform in the near future so we can improve our processes and tools for the next platforms to be released this year (including the EAP 5.0 and SOA 4.2 platforms – more on this below).

EAP 4.2

 

EAP 4.2 features many new components including JBoss 4.2.0, Tomcat 6, Hibernate 3.2.4SP1, Seam 1.2.0 as well as newcomers such as our new Web Services stack, a preview of EJB3 and the new transaction monitor acquired from Arjuna.

 

When defining what would make it into EAP 4.2 we had two main goals in mind:

  1. We wanted to offer a stabilized EE1.4-based environment for which we could commit to provide support for the next 5 years (with backward compatible fixes), and
  2. We wanted to provide a stepping stone towards EE5/EAP 5.0 (to be released later this year) by bundling into EAP 4.2 as many of the already finalized EE5-based modules as possible.

 

EAP 4.2 has been tested on many different OSes (HP-UX, RHEL, Solaris, Windows), JVMs (BEA, HP, Sun) and DBs (MS SQL, MySQL, Oracle, PostgreSQL) and will be available in 7 languages. From a dependency standpoint, EAP 4.2 will be used as the foundation for the future SOA Platform 4.2 and JBoss Communication Platform.

 

I told you, big engineering changes :)

Next...

 

Now that EAP 4.2 is out, we can fully focus our efforts on EAP 5.0, which will feature JBoss AS 5.0.

 

Practically, most of the work that remains to be done for JBoss AS 5.0 has little to do with the implementation of the EE5 services themselves (our TCK completion is already in the high nineties percent, without too much time spent on it); it is mostly around the new JBoss Microcontainer and some long-overdue refactoring we wanted to do (invokers, interceptors, metadata, etc.). With its new core (and its long-awaited profile service), a new administration console and JBoss Messaging as its default broker, JBoss AS 5.0 will set a new standard in the JBoss AS releases. So stay tuned!

 

Onward,

 

 

 

 

 

Sacha
