I am posting a conversation with Steve about how to tune/calculate the minimum required heap size in Teiid. I will try to get this info back into the document, but if you are interested, read on and ask questions. - Ramesh.
(09:11:25 AM) rareddy: Hi, seeing OOM errors with queries
(09:12:23 AM) rareddy: Do not have caching on, default settings, but they are using one humongous query
(09:12:48 AM) rareddy: 100+ columns and 3 million rows
(09:13:09 AM) rareddy: have any suggestions to tune?
(09:25:24 AM) shawkins: this is with eds? how many concurrent queries, what is the vm size, etc.
(09:25:39 AM) rareddy: 1
(09:25:43 AM) rareddy: 1303
(09:26:11 AM) rareddy: yes it is on EDS
(09:26:44 AM) shawkins: is this one of their union all queries?
(09:26:53 AM) rareddy: yes
(09:27:31 AM) rareddy: I asked for their average column size, they have not replied yet
(09:28:20 AM) shawkins: so the minimum memory footprint would be something like (connector batch size)*(number of sources)*(driver size of rows) +
(09:28:58 AM) rareddy: maybe it is going over that limit
(09:29:25 AM) rareddy: I told them to reduce the driver batch size, but not the connector
(09:29:47 AM) shawkins: (processor batch size)*(max reserve batch columns)*(average java value size) +
(09:30:12 AM) shawkins: at least one more processing batch in flux
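The minimum-footprint formula Steve gives above can be sketched as a quick calculation. This is a sketch only: the parameter names are illustrative rather than actual Teiid setting names, and the "one more batch in flux" term is approximated as a single processor batch of values.

```python
def min_heap_bytes(connector_batch_size, num_sources, driver_row_bytes,
                   processor_batch_size, max_reserve_batch_columns,
                   avg_java_value_bytes):
    """Rough minimum heap footprint per the formula in the chat (a sketch)."""
    # (connector batch size) * (number of sources) * (driver size of rows)
    source_buffers = connector_batch_size * num_sources * driver_row_bytes
    # (processor batch size) * (max reserve batch columns) * (average java value size)
    reserve = processor_batch_size * max_reserve_batch_columns * avg_java_value_bytes
    # at least one more processing batch in flux (approximated here)
    in_flux = processor_batch_size * avg_java_value_bytes
    return source_buffers + reserve + in_flux
```

Plugging in the numbers used later in the chat (1024-row connector batches, 3 sources, 47-byte values across 100 columns, 512-row processor batches, 16384 reserve batch columns) puts the minimum footprint around 400 MB.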
(09:31:08 AM) shawkins: do you mean the connector batch size?
(09:31:28 AM) rareddy: driver size of rows == fetch size ?
(09:32:01 AM) shawkins: I'm not sure what you mean
(09:32:51 AM) shawkins: connector batch size is a buffer manager setting. it gets turned into a jdbc fetch size, but there is no direct translator setting to manipulate fetch size
(09:32:57 AM) rareddy: maxRowsFetchSize?
(09:34:32 AM) shawkins: that is for our jdbc client
(09:35:50 AM) rareddy: but the engine needs to pack that much buffer going out, so that does not matter?
(09:37:49 AM) rareddy: what setting did you mean above by 'driver size of rows', in the source JDBC driver?
(09:37:58 AM) shawkins: that setting currently doesn't matter unless it is set to less than the processing batch size, because the code currently doesn't allow for more than 1 batch to contribute to the client batch. That is something that has been a long standing todo
(09:39:04 AM) rareddy: ok, so defaults to "processBatchSize".
(09:39:10 AM) shawkins: in theory it should matter that one of our clients sets its fetch size to something like 100000 rows and then expects the engine to send that in a single shot. we need to limit that to something practical
(09:39:32 AM) rareddy: yep, that makes sense
(09:39:43 AM) shawkins: back to issue, do they have value caching turned off?
(09:40:06 AM) shawkins: that would be the recommendation as they are dealing with such large data sets
(09:40:10 AM) rareddy: They are not connecting through ODBC, so no caching on from client side
(09:40:38 AM) shawkins: canonical value caching is currently controlled through a system property
(09:41:05 AM) rareddy: please explain,
(09:41:19 AM) shawkins: RTFM
(09:41:29 AM) rareddy: LOL, will do
(09:42:14 AM) rareddy: back to above question, then canonical value cache is not off
(09:42:51 AM) shawkins: in short the recommendation to them would be to 1. ensure that canonical value caching is turned off (which is more for avoiding the lookup costs). However this should have the effect of making their memory footprint worse, not better.
(09:47:30 AM) shawkins: 2. approximate what 1024*(number of sources)*(java row size) should be. The back of the napkin math will be something like (32bit?5:7)+4*(raw bytes) per column
(09:51:46 AM) shawkins: their previous estimate to us seemed to be along the lines of 100 megs per million rows or 100 bytes per row for 10 columns - or about 10 raw bytes per column. scaling to 100 columns with 3 sources on 64bit gives - 1024 * 3 * (47) * 100 = 14 megs
(09:52:43 AM) shawkins: if they are using much larger strings, then that estimate is too low. but what it should confirm is that the problem is not with the connector batch size
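Steve's back-of-the-napkin numbers can be reproduced like this (a sketch that just encodes the per-value estimate from the chat, (32bit?5:7)+4*(raw bytes), and the connector-side product 1024 * sources * java row size):

```python
def java_value_bytes(raw_bytes, is_64bit=True):
    # per-value estimate from the chat: (5 on 32-bit, 7 on 64-bit) + 4 * raw bytes
    return (7 if is_64bit else 5) + 4 * raw_bytes

def connector_footprint_bytes(batch_size, num_sources, columns, raw_bytes_per_value):
    # (connector batch size) * (number of sources) * (java row size)
    return batch_size * num_sources * columns * java_value_bytes(raw_bytes_per_value)
```

With 10 raw bytes per value on 64-bit this gives 47 bytes per value, and 1024 * 3 * 47 * 100 comes to roughly 14 MB, matching the figure above.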
(09:53:46 AM) rareddy: How about this part of equation? (processor batch size)*(max reserve batch columns)*(average java value size)
(09:54:12 AM) rareddy: they have 100+ columns on each row
(09:54:46 AM) shawkins: right that's the next part, but I forgot to ask. If this is a scenario where the client is using a forward only resultset, then we don't expect the buffer to come into play here unless their client is lagging
(09:55:13 AM) rareddy: yes, it is ODBC
(09:55:40 AM) rareddy: so, Forward only
(09:56:41 AM) rareddy: doesn't that mean we proactively ship results before the 'more results' call comes in?
(09:56:45 AM) shawkins: in any case then, TEIID-1463 partially addresses this issue as we started limiting the size of the results buffer
(09:57:19 AM) shawkins: no we don't ship the results, but prior to TEIID-1463 we would just keep throwing them into the results buffer as more results become available
(09:58:17 AM) rareddy: this is in EDS, so we may be seeing some of that
(09:59:02 AM) shawkins: so on the reserve batches we have 512*47*16384 ~= 394 megs
(09:59:07 AM) rareddy: but in that scenario, when the memory bounds reached we use disk right?
(09:59:56 AM) shawkins: yes, but the memory bound is determined from the maxReserveBatchColumns. so if it's set inappropriately, then there's a problem
(10:00:51 AM) rareddy: looks like with less than one batch they seem to exceed the count
(10:02:40 AM) shawkins: From our observation/expectations then for a 64 bit platform, we have a max of 1300 - 200 (or more for the AS/EDS footprint) - 400 (for buffer/plan processing), which leaves us with about 700 megs of unaccounted for memory.
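The reserve-batch and headroom numbers from the last few lines work out as follows (a sketch using the figures quoted in the chat: a 512-row processor batch, maxReserveBatchColumns of 16384, 47 bytes per value, and a ~1300 MB heap):

```python
processor_batch_size = 512
max_reserve_batch_columns = 16384
avg_java_value_bytes = 47          # 10 raw bytes per value on 64-bit

# reserve batches: 512 * 16384 * 47 bytes, ~394 MB
reserve_mb = (processor_batch_size * max_reserve_batch_columns
              * avg_java_value_bytes) // 10**6

max_heap_mb = 1300                 # the reported VM size, rounded
as_footprint_mb = 200              # AS/EDS footprint (or more)
buffer_mb = 400                    # buffer/plan processing (~394 rounded up)
headroom_mb = max_heap_mb - as_footprint_mb - buffer_mb   # ~700 MB left over
```

If the heap is exhausted anyway, that unaccounted-for ~700 MB is what needs diagnosing.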
(10:03:41 AM) rareddy: you only counted for 10 rows
(10:03:49 AM) shawkins: If their raw estimate is around 10 bytes per column, then it would be good to get a diagnosis of what is in the heap.
(10:03:55 AM) shawkins: no
(10:04:22 AM) shawkins: what do you mean
(10:04:30 AM) rareddy: times 10 on the first would be good, that is still only at 140 MB
(10:04:47 AM) shawkins: I'm still not sure what you mean
(10:05:06 AM) rareddy: They have more than 100 columns per row
(10:05:56 AM) shawkins: that's fine. I'm giving these numbers using the estimate of 100 columns
(10:06:55 AM) rareddy: oh I see *100 above, I was reading text
(10:08:38 AM) shawkins: In other words with canonical value caching disabled and an estimate of 10 "raw" bytes per value, it looks to us like they should be fine. However if the estimate is off by a factor of 2, or just 20 "raw" bytes per value, then that would push them out of memory
(10:09:44 AM) rareddy: it would be twice as much; possibly they may be in between. They did say 100+
(10:10:02 AM) shawkins: are you talking bytes or columns
(10:10:12 AM) rareddy: columns
(10:10:22 AM) shawkins: right, I'm talking bytes
(10:10:48 AM) rareddy: both I mean, it is over 100 columns + more bytes per column
(10:11:57 AM) rareddy: if we were to keep the "Value Caching" on, how much damage more?
(10:11:57 AM) shawkins: sure, what I'm trying to get at here is that the biggest determinant is the bytes per value. We already consider the number of columns as a primary factor in buffering, so it's not as big a deal as it seems
(10:12:49 AM) rareddy: I see, because they only matter in the Connector Fetch size buffers.
(10:12:57 AM) rareddy: processing does not matter.
(10:13:27 AM) rareddy: Also, we did not account how much the native driver itself is holding
(10:13:36 AM) shawkins: in other words, assuming 64 bit, have them cut maxReserveBatchColumns in proportion to 10/(actual bytes per value)
(10:14:08 AM) rareddy: did not follow the last comment
(10:14:22 AM) rareddy: why 10?
(10:14:48 AM) rareddy: because we estimate by 10 bytes per column
(10:14:57 AM) shawkins: because when we ran through the numbers with 10, we came to the conclusion there should be 700 megs of extra memory
(10:15:05 AM) rareddy: got it
(10:15:26 AM) shawkins: it's not quite as simple of a scaling as that, but it's a quick explanation
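As a rough illustration of the 10/(actual bytes per value) scaling (Steve notes above that it is not quite this simple; this sketch assumes the 16384 default for maxReserveBatchColumns used earlier in the chat):

```python
def scaled_max_reserve_batch_columns(actual_raw_bytes_per_value,
                                     default_setting=16384,
                                     assumed_raw_bytes=10):
    """Cut maxReserveBatchColumns in proportion to 10/(actual bytes per value)."""
    return int(default_setting * assumed_raw_bytes / actual_raw_bytes_per_value)
```

For example, at 20 raw bytes per value the setting would be halved to 8192, keeping the reserve-batch memory bound roughly where the 10-byte estimate assumed it was.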
(10:16:30 AM) rareddy: ok, now back to value caching, how does this come into picture?
(10:16:35 AM) shawkins: leaving value caching turned on means that as the buffer fills up then each comparable value is looked up in a canonical cache so that if it exists we only hold one reference
(10:17:23 AM) shawkins: for millions of rows with unique values, the benefit of value caching diminishes greatly as you are more likely to get cache misses
(10:18:00 AM) rareddy: this is good with joins, they may be doing simple unions
(10:19:16 AM) shawkins: it is good in any situation where the values you are dealing with are constrained and you are dealing with a smaller memory environment (our default AS/EDS). but in a large production environment, the customer should adjust the heap up and is probably dealing with a larger volume of data
(10:20:53 AM) shawkins: TEIID-1509 covers changing the default for value caching
(10:23:36 AM) rareddy: cool, that explains for at least part of 700 mb even if the column size is 10 right?
(10:23:50 AM) shawkins: another complicating factor in this would be lobs. Since we directly hold all lob references, they are a black hole of memory. We're not sure what the driver may be holding and then on the odbc side we're materializing them
(10:24:52 AM) shawkins: my numbers were assuming that value caching was disabled. if it's enabled, then our earlier numbers should have been a little lower
(10:25:46 AM) rareddy: yes, EDS does not support lobs yet, and it would be a client side setting how big buffer they provide to read lobs
(10:26:43 AM) rareddy: (when caching on) lower for overall numbers + caching might be high (did you say it only caches the indexes?)
(10:29:45 AM) rareddy: will the situation be so much worse with concurrent clients? The maxReserveBatchColumns is system wide, so only the connector side should matter in that case
(10:29:58 AM) shawkins: canonical value caching means that all values in Teiid (whether they are from a source or created by us through a function/udf) have a lookup performed against a canonical cache. the cache itself adds no practical memory overhead (only small datatype values are held directly) anything else is held by a weak reference
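The canonical-cache idea described above (one shared instance per distinct value, held weakly so the cache itself retains nothing) can be illustrated in a few lines. This is only a sketch of the concept, not Teiid's implementation; the `str` subclass is just a Python device to make values weak-referenceable.

```python
import weakref

class _Canonical(str):
    """str subclass so instances can be weakly referenced."""

_cache = weakref.WeakValueDictionary()

def canonicalize(value):
    # look the value up in the canonical cache; on a hit, return the
    # shared instance so only one reference is held for equal values
    candidate = _Canonical(value)
    existing = _cache.get(candidate)
    if existing is not None:
        return existing
    # miss: remember this instance weakly (dropped once unreferenced elsewhere)
    _cache[candidate] = candidate
    return candidate
```

With millions of mostly unique values, nearly every call is a miss, which is why the chat recommends turning the cache off for this workload: you pay the lookup cost without the sharing benefit.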
(10:31:29 AM) shawkins: with concurrent clients, yes it will be the maxActivePlans that prevents too much access at the source level from happening at the same time
(10:33:27 AM) rareddy: cool, there is a cap we can calculate for multi client situation then
(10:34:35 AM) shawkins: unfortunately what we just saw was that the maxActivePlan setting is stuck at 20, see TEIID-1505
(10:35:05 AM) rareddy: yep, but we can calculate for 20 for now.
(10:35:21 AM) shawkins: sure
(10:35:40 AM) rareddy: This has been a good lesson on how to size the heap, I will make sure this info gets into Van's document
(10:36:07 AM) rareddy: Thanks Professor.
(10:38:27 AM) shawkins: there's just too many assumptions. I was mentioning to van the other day that more and more properties are going to have default settings of 0, so that we can calculate a system default rather than having something fixed in the config. There's also a greater chance that the user may not have to mess with the value altogether.
(10:40:30 AM) rareddy: that sounds like a really good goal, but there would still be variables like the number of sources, column sizes, and number of rows
(10:40:49 AM) rareddy: so it has to be a runtime tuning tool
(10:41:18 AM) rareddy: constantly adjusting based on the volume and available heap
(10:41:24 AM) shawkins: right, we just want to get at a better minimum set of information that the user should/must enter and then we'll do the rest
(10:42:33 AM) rareddy: maybe we can write a simple utility that asks these questions and spits out some of these settings?
(10:42:54 AM) rareddy: I mean, in the meantime
(10:44:02 AM) shawkins: what I'm getting at is that we may need to set maxReserveBatchColumns to 0, but add a config item for average bytes per source value.
(10:44:23 AM) shawkins: and really, it's not all values we can be specific to strings/lobs
(10:45:03 AM) rareddy: right, that sounds good