> In comparison if the data is not yet cached it can be 3 to 10 times slower than normal JDBC. Is this normal?
I don't think that we've captured an exact amount of overhead. Of course OData is expected to add overhead, but that does seem excessive.
> Are there steps we can take to optimize this to get closer to normal JDBC performance? Should we increase the memory size of the server (right now they have 8GB)?
It would be best to identify what is contributing most to the slowdown.
> I don't see any big loads on the server when using OData, using top and iotop.
Is it possible that this is more of a network issue?
> Any suggestions on how to tackle this would be quite helpful.
It would be good to isolate the testing as much as possible. I'll start locally using a dummy source that provides configurable row amounts and see what the overhead is just over localhost.
> Is it possible that this is more of a network issue?
Running over localhost I can confirm that it's not a network issue. With the default batch size of 256 rows, I'm seeing that a single client takes between 3 and 16 times longer over OData than over socket-based JDBC (from small to large results). As this removes any meaningful execution and network time, this should be representative of the additional IO overhead of OData. There is likely not much that can be done to improve things on the low end. I'll do more digging to see what can be done on the high end, possibly adaptive batching for larger results. In general you won't be able to get close to JDBC performance, as the JDBC messages are much more compact and JDBC uses a prefetch.
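To illustrate why adaptive batching could matter on the high end, here is a minimal sketch that only counts round trips. It is not how Teiid or Olingo size batches; the class name, the method `planBatches`, the doubling growth factor and the 8192 row cap are all assumptions made up for this example. Only the 256 row starting size comes from the default mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Teiid's or Olingo's implementation.
class AdaptiveBatchSketch {

    // Returns the size of each batch (one per round trip) needed to deliver totalRows.
    // With growthFactor == 1 every batch keeps the initial size (fixed batching).
    static List<Integer> planBatches(int totalRows, int initialSize, int maxSize, int growthFactor) {
        if (initialSize <= 0 || maxSize <= 0 || growthFactor < 1) {
            throw new IllegalArgumentException("sizes must be positive and growthFactor >= 1");
        }
        List<Integer> batches = new ArrayList<>();
        int size = initialSize;
        int remaining = totalRows;
        while (remaining > 0) {
            int batch = Math.min(size, remaining);
            batches.add(batch);
            remaining -= batch;
            // Grow the next batch, but never beyond the (assumed) cap.
            size = Math.min(maxSize, size * growthFactor);
        }
        return batches;
    }

    public static void main(String[] args) {
        int rows = 100_000;
        System.out.println("fixed 256 rows: " + planBatches(rows, 256, 256, 1).size() + " round trips");
        System.out.println("adaptive:       " + planBatches(rows, 256, 8192, 2).size() + " round trips");
    }
}
```

Small results still pay for the first 256 row request either way, which matches the point that little can be done on the low end; the saving only shows up once the result spans many batches.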
That is a considerable overhead. By IO do you mean disk IO? If that is the case we could opt for faster storage. If by IO you also mean CPU: at the moment we have a 2 core machine, which can be scaled to 16 if necessary. If OData can utilize multiple cores, that should increase its speed (another core for each batch?).
I know that getting JDBC performance is not possible, but getting half of that should be feasible if the OData component is done well? I might be making stupid assumptions now.
Thanks for the investigation you already did, I appreciate that!
> That is a considerable overhead. By IO do you mean disk IO?
I mean OData input/output in a generic sense. There is very little disk overhead from what I'm seeing.
> If by IO you also mean CPU: at the moment we have a 2 core machine, which can be scaled to 16 if necessary. If OData can utilize multiple cores, that should increase its speed (another core for each batch?).
More cores will not help compared to JDBC, as the OData processing on both the client and the server is effectively single threaded per request. About a quarter of the overhead I'm seeing comes from reading the response JSON (I was just using the fork of the Simple JSON parser in Teiid). Worse, while that processing occurs nothing is happening on the server side. With JDBC there is a prefetch, so that while the results are being read on the client side the server is already delivering the next batch.
> I know that getting JDBC performance is not possible, but getting half that should be feasible if the OData component is done good?
At a minimum there is too much JSON overhead, in both production and consumption, for that to be feasible. The most improvement would take changes on both the consumption and production sides. For example, Olingo could write the next token as the first field in the response object, and the client could use stream processing with an immediate asynchronous call to get the next results while processing the rest of the response. I'm not sure many clients would operate in that manner, though.
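A rough sketch of that client-side pattern, assuming the next token is already known before the rows of the current page are processed. Everything here is hypothetical: `Page`, `readAll` and `fakeSource` are names invented for the example, the in-memory source stands in for the real HTTP call, and no JSON parsing is shown. It is not Olingo's or Teiid's client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of overlapping the next request with local processing.
class PrefetchSketch {

    // One page of results; nextToken is null on the last page.
    static final class Page {
        final List<String> rows;
        final String nextToken;
        Page(List<String> rows, String nextToken) {
            this.rows = rows;
            this.nextToken = nextToken;
        }
    }

    // Reads every page, requesting page N+1 before processing the rows of page N.
    // fetch maps a token (null for the first page) to a page. Returns the row count.
    static int readAll(Function<String, Page> fetch, Consumer<String> handler) {
        int count = 0;
        CompletableFuture<Page> pending = CompletableFuture.supplyAsync(() -> fetch.apply(null));
        while (pending != null) {
            Page page = pending.join();
            String next = page.nextToken;
            // Fire the next request immediately so the server works while we process rows.
            pending = (next == null) ? null : CompletableFuture.supplyAsync(() -> fetch.apply(next));
            for (String row : page.rows) {
                handler.accept(row);
                count++;
            }
        }
        return count;
    }

    // Stand-in for a paged endpoint: the token is simply the offset of the next row.
    static Function<String, Page> fakeSource(int totalRows, int batchSize) {
        return token -> {
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(totalRows, start + batchSize);
            List<String> rows = new ArrayList<>();
            for (int i = start; i < end; i++) {
                rows.add("row-" + i);
            }
            return new Page(rows, end < totalRows ? String.valueOf(end) : null);
        };
    }

    public static void main(String[] args) {
        int n = readAll(fakeSource(1000, 256), row -> { /* consume the row */ });
        System.out.println("rows read: " + n);
    }
}
```

The rows are still handled in order on the calling thread; only the fetch of the following page runs concurrently, which is roughly the effect the JDBC prefetch gives.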