I am looking at Teiid caching to improve performance and to minimize the time spent on query planning. Our observation is that for large datasets, the time Teiid takes to return data from the source (there is a 1-1 mapping between source and view model) is significantly higher than for small datasets for the same entity. I have also noticed that the first execution of a query for an entity takes a long time (say 18 secs), while subsequent executions of the same query take 3-4 secs with no result-set caching.
To address these problems, I am looking into Teiid caching and query plans, and I have some questions about how Teiid manages them internally.
1. As per the documentation, the source tuples (batches of data fetched) are stored in the buffer manager, which handles memory and disk storage, and the result-set cache stores the result-set cache entries. So what exactly is held in the resultset cache? Is it a mapping from the exact query to the location of the source tuples that hold the data for the select query? If the select resulted in more than one batch (and hence multiple source tuples), how are these multiple source tuples matched in the resultset cache?
2. Based on the documentation, it looks like the resultset cache is already turned on but query results are not cached by default; is this correct? So the way to turn on resultset caching would be to add a hint to the query, like "/*+ cache */ select * from A", or to run the "set resultSetCacheMode true" statement first. If each subsequent query requires the same hint, would setting resultSetCacheMode be a better option for resultset caching?
3. What is the preparedplan-cache used for, and what is held at each node of the prepared plan cache? Is the SQL statement the key and the parsed query plan the value in the data grid? The answer will help us determine how to tune the plan cache.
4. How does Teiid determine whether the scope of the cache is session, user, or VDB? It can be overridden in the query hint, but I don't really want to use query hints because they are not standard JDBC and don't lend themselves to ORM solutions like JPA.
5. What are the implications of moving Teiid memory off the heap? How are references to previous source tuples swapped out if the off-heap memory becomes full?
6. Based on the architecture writeup, Teiid uses cursors for all data access. How do the cursors interact with caching? We use CursoredStream to hold a reference to large data sets and read data in batches from the CursoredStream. Will Teiid cache the results from a CursoredStream opened on the Teiid JDBC connection?
7. How is invalidation of the resultset cache handled when a user adds, updates, or deletes a row of the entity and this affects the cached resultset?
8. If we need to reduce the time taken for query planning, so that the first execution of a query takes about the same time as subsequent ones, would firing the same query with "set noexec on" be a valid option? Are there any alternatives for reducing query planning time, since the noexec approach assumes that I know beforehand which queries I need to prepare plans for?
9. I am assuming that support for Infinispan in server mode would be defined by the Infinispan subsystem. Based on how the cache and buffer manager are used we will think about whether to use Infinispan in embedded mode or client-server mode.
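For reference, the two options in question 2 and the warm-up idea in question 8 could look roughly like the Java sketch below. It is only an illustration under stated assumptions: the class and method names are hypothetical, the JDBC URL in the usage note is a placeholder, and whether planning a query under NOEXEC actually populates the prepared plan cache is exactly what question 8 asks, so the sketch does not answer it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper class; none of these names come from Teiid itself.
class TeiidCacheSketch {

    // Question 2, option A: prefix a single query with the cache hint.
    static String withCacheHint(String sql) {
        return "/*+ cache */ " + sql;
    }

    // Question 2, option B: append the resultSetCacheMode execution property
    // to the JDBC URL so that no per-query hint is needed on this connection.
    static String withResultSetCacheMode(String jdbcUrl) {
        return jdbcUrl + ";resultSetCacheMode=true";
    }

    // Question 8: ask Teiid to plan a list of known queries without running them.
    // Assumption: the plans produced here are reused later; this is unverified.
    static void warmPlans(Connection conn, List<String> queries) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("SET NOEXEC ON");
            try {
                for (String sql : queries) {
                    try (PreparedStatement ps = conn.prepareStatement(sql)) {
                        ps.execute(); // planned only, because NOEXEC is on
                    }
                }
            } finally {
                // Always restore normal execution, even if planning fails.
                st.execute("SET NOEXEC OFF");
            }
        }
    }
}
```

With a placeholder URL such as jdbc:teiid:myvdb@mm://localhost:31000, option B would be passed to DriverManager.getConnection as withResultSetCacheMode(url), which keeps the application SQL free of hints and so fits JPA-generated queries better than option A.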
Thanks in advance for reading through this long list of questions.