4 Replies Latest reply on Apr 30, 2018 9:59 AM by shawkins

    OData performance considerations

    marc.kusters

      We want to expose our VDBs, but at the moment we are unsure whether to use OData or ODBC/JDBC.

       

      In our own measurements, OData has a fairly high initial startup time. In comparison if the data is not yet cached it can be 3 to 10 times slower than normal JDBC. Is this normal? How large should the overhead of OData be compared to JDBC? Can anyone give more specific numbers?

       

      We tried different OData batch sizes: the default is 256, and we also tried 50,000. The larger size is considerably faster for large data sets, but for smaller data sets the startup delay is a lot bigger.

      Are there steps we can take to optimize this and get closer to normal JDBC performance? Should we increase the memory size of the servers (right now they have 8 GB)?

      I don't see any big loads on the server when using OData, using top and iotop.

       

      Is there a "right" way to test OData performance and compare it with JDBC? We tried to test using Power BI, but the tool is quite inefficient at fetching the data. Building a custom OData test in SOAtest seems to give the best results for now, but implementing skiptokens is a lot more work.
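For a tool-independent measurement, a minimal sketch of a paging client, assuming the service returns OData v4 JSON where each page carries its rows in "value" and an absolute "@odata.nextLink" (which holds the skiptoken) until the last page. The function names and the service URL are hypothetical, not part of Teiid or any OData library:

```python
import json
import time
import urllib.request


def http_get_json(url):
    """Fetch one OData page and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def read_all_pages(first_url, get_json=http_get_json):
    """Follow next links until the feed is exhausted.

    Returns (row_count, page_count, elapsed_seconds).
    """
    rows = 0
    pages = 0
    url = first_url
    started = time.perf_counter()
    while url:
        page = get_json(url)
        rows += len(page.get("value", []))
        pages += 1
        # Assumption: the server-generated link (carrying the $skiptoken) is
        # in "@odata.nextLink" and is absent on the last page.
        url = page.get("@odata.nextLink")
    return rows, pages, time.perf_counter() - started


# Hypothetical usage; replace the URL with your own entity set:
# print(read_all_pages("http://localhost:8080/odata4/myvdb/mymodel/mytable"))
```

Running the same query through a plain JDBC loop that only iterates the result set and comparing the elapsed times keeps the client-side work roughly comparable on both sides.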

       

      Any suggestions on how to tackle this would be quite helpful.

        • 1. Re: OData performance considerations
          shawkins

          > In comparison if the data is not yet cached it can be 3 to 10 times slower than normal JDBC. Is this normal?

           

          I don't think that we've captured an exact amount of overhead.  Of course it's expected to add overhead, but that does seem excessive.

           

          > Are there steps we can take to optimize this and get closer to normal JDBC performance? Should we increase the memory size of the servers (right now they have 8 GB)?

           

          It would be best to identify what is contributing most to the slowdown.

           

          > I don't see any big loads on the server when using OData, using top and iotop.

           

          Is it possible that this is more of a network issue?

           

          > Any suggestions on how to tackle this would be quite helpful.

           

          It would be good to isolate the testing as much as possible.  I'll start locally using a dummy source that provides configurable row amounts and see what the overhead is just over localhost.

          • 2. Re: OData performance considerations
            shawkins

            > Is it possible that this is more of a network issue?

             

            Running over localhost I can confirm that it's not a network issue.  With the default batch size of 256 rows, I'm seeing that for a single client OData takes between 3 and 16 times longer (for small to large results) than socket-based JDBC.  As this removes any meaningful execution and network time, this should be representative of the additional IO overhead of OData.  There is likely not much that can be done to improve things on the low end.  I'll do more digging to see what can be done on the high end, possibly adaptive batching for larger results.  In general you won't be able to get close to JDBC performance, as the JDBC messages are much more compact and JDBC utilizes a prefetch.
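To put the batch sizes mentioned in this thread in perspective, a small arithmetic sketch that only counts requests. It assumes the batch size is the number of rows returned per response and that every page is a separate HTTP request; it says nothing about how expensive each request is. The helper name is made up for illustration:

```python
import math


def round_trips(total_rows, batch_size):
    """Number of HTTP requests needed to page through a result."""
    # Even an empty result costs one request.
    return max(1, math.ceil(total_rows / batch_size))


for total_rows in (100, 10_000, 1_000_000):
    for batch_size in (256, 50_000):
        print(total_rows, batch_size, round_trips(total_rows, batch_size))
```

Under those assumptions a million rows take 3907 requests at a batch size of 256 and 20 at 50,000, while a 100-row result is a single request either way.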

            • 3. Re: OData performance considerations
              marc.kusters

              That is a considerable overhead. By IO do you mean disk IO? If that is the case we could opt for faster storage. If by IO you also mean CPU, at the moment we have a 2-core machine which can be scaled to 16 if necessary; if OData can utilize multiple cores, that should increase its speed (for each batch another core?).

              I know that getting JDBC performance is not possible, but getting half of that should be feasible if the OData component is done well? I might be making stupid assumptions now.

               

              Thanks for the investigation you already did, I appreciate that!

              • 4. Re: OData performance considerations
                shawkins

                > That is a considerable overhead. By IO do you mean disk IO?

                 

                I mean OData input/output in a generic sense.  There is very little disk overhead from what I'm seeing. 

                 

                > If by IO you also mean CPU, at the moment we have a 2-core machine which can be scaled to 16 if necessary; if OData can utilize multiple cores, that should increase its speed (for each batch another core?).

                 

                More cores will not help compared to JDBC, as the OData processing on both the client and the server is effectively single threaded per request. About a quarter of the overhead I'm seeing comes from reading the response JSON (I was just using the fork of the Simple JSON parser in Teiid).  Worse, while that processing occurs nothing is happening on the server side.  With JDBC there is a prefetch, so while the results are being read on the client side, the server is already delivering the next batch.
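To illustrate what a prefetch buys, a minimal client-side sketch that requests page N+1 before handling page N. This is not how the Teiid JDBC driver is implemented; `get_page` and `handle_rows` are hypothetical callables supplied by the caller:

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_prefetch(first_url, get_page, handle_rows):
    """Overlap fetching page N+1 with handling page N.

    get_page(url) must return (rows, next_url_or_None);
    handle_rows(rows) consumes one page. Both are assumptions of this sketch.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(get_page, first_url)
        while pending is not None:
            rows, next_url = pending.result()
            # Ask for the next page before touching the current rows, so the
            # request is in flight while handle_rows runs.
            pending = pool.submit(get_page, next_url) if next_url else None
            handle_rows(rows)
```

Note that an OData client can only start the next request once it knows the next link, so in this form the overlap covers the row handling but not the download and parsing of the current response.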

                 

                > I know that getting JDBC performance is not possible, but getting half of that should be feasible if the OData component is done well?

                 

                At a minimum there is too much JSON overhead, in both production and consumption, for that to be feasible.  The biggest improvement would require changes on both the consumption and production sides.  For example, Olingo could write the next token as the first field in the response object, and the client could use stream processing with an immediate asynchronous call to get the next results while processing the rest of the response. I'm not sure many clients would operate in that manner, though.