Version 1

    Three levels of "caching" discussed:

     

    1) First is a passive cache (passive from the user's standpoint), similar to the current result-set cache; perhaps a result object cache in the future. As queries execute and flow through the system, similar queries hit the cache and are fulfilled from it. Ideally this cache would be distributed among nodes in a clustered environment, but perhaps not initially. (Note: I added this after our discussion.) This level has the least user involvement and the least flexibility from the user's standpoint, and is the most automated. The only control is config params such as TTL and cache size that the user sets on an instance/system basis.
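    The passive level above could be sketched roughly as follows — a minimal, illustrative TTL- and size-bounded cache keyed by query text (class and parameter names are hypothetical, not from the product):

```python
import time
from collections import OrderedDict

class PassiveResultCache:
    """Sketch of a passive result-set cache keyed by query text.

    ttl_seconds and max_entries stand in for the instance-level config
    parameters (TTL, cache size) mentioned above; names are illustrative.
    """

    def __init__(self, ttl_seconds=60, max_entries=1024):
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self._entries = OrderedDict()  # query text -> (expiry, rows)

    def get(self, sql):
        entry = self._entries.get(sql)
        if entry is None:
            return None                    # miss: query proceeds normally
        expiry, rows = entry
        if time.time() >= expiry:          # stale: evict and miss
            del self._entries[sql]
            return None
        self._entries.move_to_end(sql)     # LRU touch
        return rows

    def put(self, sql, rows):
        if len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        self._entries[sql] = (time.time() + self.ttl, rows)
```

    In a clustered deployment the `_entries` map would be the piece distributed among nodes.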

     

    2) The next level of caching is a "pinned in memory" or "checkbox-enabled" cache/materialization backed by the buffer manager (with the buffer manager possibly backed by something like JBoss Cache or Infinispan), or potentially backed by an in-memory embedded database. Here the user selects virtual tables and decides to materialize/cache them through a simple UI gesture; they would just be materialized to memory. The user has limited control of the materialization process, so this is not suitable for complex materialization schemes. Open question: how is refresh managed and implemented? This is not for large data sets, where the lack of indexing will kill performance; it is more for data sets with intermittently available data sources or data sets that require complex calculations or aggregations. This is a desirable new level of caching/materialization not currently offered in the product.
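    A minimal sketch of the checkbox-enabled level, assuming a hypothetical registry where pin() is the UI gesture and the loader callable stands in for executing the view's transformation against the sources (all names illustrative):

```python
class PinnedTableCache:
    """Sketch of 'checkbox-enabled' in-memory materialization.

    Refresh policy is the open question above; here it is just an
    explicit full reload, with no indexing of the cached rows.
    """

    def __init__(self):
        self._pinned = {}  # view name -> {"loader": callable, "rows": list}

    def pin(self, view_name, loader):
        # The user's UI gesture: materialize the view's rows to memory.
        self._pinned[view_name] = {"loader": loader, "rows": loader()}

    def unpin(self, view_name):
        self._pinned.pop(view_name, None)

    def rows(self, view_name):
        entry = self._pinned.get(view_name)
        return None if entry is None else entry["rows"]

    def refresh(self, view_name):
        # Full reload; fine for small or expensive-to-compute data sets,
        # not for large ones (no indexes, full scan on every access).
        entry = self._pinned[view_name]
        entry["rows"] = entry["loader"]()
```

    In the product this map would live behind the buffer manager rather than in a plain dict.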

     

    3) The third level is a revamping of materialized views as we know them: moving away from the process where the user checks a box in Designer, the matl view source model and DDL scripts are magically generated under the covers, and the scripts are spit out when the vdb is deployed. Within Designer, change the user gesture to identify tables/views for "full" materialization. The gesture causes generation of a matl view source model and a matl view model on top, where the matl view model matches the virtual model from which it was generated. The underlying matl source model may ultimately differ, but that would be user customization; the automated process would generate view/source models that match and have direct transformations by default. Open issue: how to decide when to hit the matl view or not, and how to present that to the user. There are benefits to having a single view for client apps (e.g. clients coded to always hit the foo table); client code should not have to embed the decision of whether to hit the matl view or the source data.
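    The open issue — deciding when to hit the matl view while clients keep hitting one table — might look something like this in the data services layer. This is a sketch only; the directive values and function name are hypothetical:

```python
def route_query(view_name, matview_valid, directive=None):
    """Decide whether a query against a single client-visible view
    (e.g. the foo table) is answered from the matl view or from
    source data. Clients never embed this decision themselves.

    `directive` stands in for a hypothetical per-query override.
    """
    if directive == "NO_MATVIEW":
        return "source"              # explicit override wins
    if directive == "MATVIEW":
        return "matview"
    # Default policy: use the matl view whenever its contents are valid.
    return "matview" if matview_valid else "source"
```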

     

    Changing client code is too disruptive; we should be able to control the shunt in the data services layer. Options discussed include:

     

    - specialized language construct to allow joining/unioning between  matl and source data, with query directive to override the default   

    - specialized materialization transformations that control this  through explicit logic   

    - others?

    Matl view maintenance options discussed include:

    - external scripts to pull data out of source and write to matl source   

    - virtual procedure logic
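    Both maintenance options above share the same core step — pull data out of the source and write it to the matl source. A minimal sketch, where the callables stand in for the external script or virtual procedure doing the real work (names are illustrative):

```python
def refresh_matview(read_source, write_matview):
    """Pull current rows from the source and write them to the
    materialized target, returning the row count so maintenance
    processes can track what was loaded."""
    rows = read_source()
    write_matview(rows)
    return len(rows)
```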

     

    When the materialized source model is created, we need to account for status data. A status table in the materialized schema and source model was suggested. What should it contain? Number of rows materialized? Last updated? What else? This should be available to materialization management processes so users can check this information when performing more complex scenarios, and as a basis for monitoring/auditing.
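    One possible shape for a status-table row, covering just the fields raised above (column names are a guess, not a decided schema):

```python
import datetime
from dataclasses import dataclass
from typing import Optional

@dataclass
class MatviewStatus:
    """Sketch of one row in the suggested status table."""
    view_name: str
    row_count: int = 0                                  # num rows materialized
    last_updated: Optional[datetime.datetime] = None    # last refresh time
    valid: bool = False                                 # usable by the shunt?

def record_refresh(status, row_count):
    """Called by a materialization management process after a load,
    so monitoring/auditing can read current state."""
    status.row_count = row_count
    status.last_updated = datetime.datetime.now(datetime.timezone.utc)
    status.valid = True
```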

     

    This is the most complex/advanced form of materialization, providing the greatest flexibility to the user but requiring the most user involvement.