iNodes and DataBlocks
bpiepers, Feb 17, 2016 8:49 AM

I'm currently debugging a situation where Teiid breaks up a particular "WITH" query into at least two separate queries that each yield millions of records. I also see differences in behavior between connecting clients and am trying to work out what the problem is. To understand the situation better, could someone here answer the following questions? (JDV 6.0 with Teiid 8.4.6)
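For context, the query is shaped roughly like this (the real query and table names are omitted; this is a simplified, hypothetical stand-in just to show the pattern):

```sql
-- Hypothetical stand-in for the actual query: a WITH (common table
-- expression) joined back against a source table. "src" is a placeholder
-- source model name.
WITH recent_orders AS (
    SELECT customer_id, SUM(total) AS order_total
    FROM src.orders
    WHERE order_date > {d '2016-01-01'}
    GROUP BY customer_id
)
SELECT c.name, r.order_total
FROM src.customers c
JOIN recent_orders r ON c.id = r.customer_id
```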
1. When the buffer manager is working on these queries, I see a lot of statements that seem to deal with storing intermediate results somewhere:
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (BufferManager Cleaner) Allocating inode 11002 to 90 30600
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker0) Allocating storage data block 939524119 of size 16384 to 24007
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (Worker0_QueryProcessorQueue10826) q46uoHEnQCqA.0 Blocking on source request(s).
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker0) Assigning storage data block 939524119 of size 16384
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (BufferManager Cleaner) 91 30599 writing batch to storage, total writes: 29934
14:35:59,459 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker0) freeing inode 4579 for 90 24007
What do these statements mean, and what is JDV doing at that point?
2. If we cancel the query on the client side because it takes too long, JDV does not stop logging these buffer statements. We see entries like this:
14:39:32,075 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) 90 reading batch 23661 from storage, total reads: 17003
14:39:32,075 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) Getting object at block 134217741 1 90 23661
14:39:32,075 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) Allocating inode 1936 to 90 23661
14:39:32,075 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) adding object 88 17830
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) 90 reading batch 23665 from storage, total reads: 17004
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) Getting object at block 268435469 1 90 23665
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) Starting memory buffer cleaner
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (Worker1_QueryProcessorQueue11969) Allocating inode 2352 to 90 23665
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker1) Assigning storage data block 13 of size 16384
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker1) freeing inode 1520 for 90 23657
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker1) Assigning storage data block 134217741 of size 16384
14:39:34,133 DEBUG [org.teiid.BUFFER_MGR] (FileStore Worker1) freeing inode 1936 for 90 23661
So this is very similar to the above. Again: what is JDV doing here, and why does it keep looping over these "batches"? I have seen our test and acceptance environments stay busy like this throughout the entire night. I understand that buffering is necessary in some situations, but I fail to understand why it takes so long and why the looping does not stop when no client is connected anymore.
3. The specific query we execute does not need to be split into several queries. I know it is impossible to instruct JDV to push the entire query down to the source as-is, but I am keen to know whether there are other aspects that influence the query translator/optimizer when it decides to split up queries. Assuming a specific translator (so not the simple jdbc translator) and correct drivers, do factors such as adding primary keys, or foreign keys between tables in the VDB models (even if they are not defined on the underlying database), also play a role? Can they make the query engine more efficient, compared to just importing the tables/views as they are defined in the underlying database?
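To illustrate the kind of metadata I mean, this is roughly what declaring such keys looks like in a DDL-based model (hypothetical table and column names; here the keys exist only in the VDB metadata, not in the source database):

```sql
-- Hypothetical VDB model DDL: primary and foreign keys declared on the
-- foreign tables purely as metadata hints for the query planner, even
-- though the underlying database does not define these constraints.
CREATE FOREIGN TABLE customers (
    id integer PRIMARY KEY,
    name varchar(100)
);

CREATE FOREIGN TABLE orders (
    id integer PRIMARY KEY,
    customer_id integer,
    FOREIGN KEY (customer_id) REFERENCES customers (id)
);
```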