4 Replies Latest reply on Nov 2, 2015 12:42 PM by shawkins

    Teiid MemoryBuffer: memory management and swap

    jduke123

      Hi everyone,

       

      I'll start with explaining the scenario.

       

      I need to fill a target DB (postgresql9.4) with data coming from different sources.

      DV is installed on a RHEL7. I am using the 8.7.1 version.

       

      I need to move data on a daily schedule, so the total amount of data should not normally be a problem.

      What concerns me is the worst-case scenario, in which I need to align all tables, and it needs to be done in the fastest way possible.

       

      I invoke virtual procedures to move the data. Each one is basically an insert from a select over a time frame.
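To make the "insert from a select over a time frame" shape concrete, here is a minimal Java sketch of how such a statement could be built and run over plain JDBC. It is only an illustration under assumptions: the table names, the timestamp column, and the class `TimeFrameCopy` are hypothetical and do not come from the poster's virtual procedures, and the half-open window (`>=` lower bound, `<` upper bound) is one possible convention.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Hypothetical helper; names are illustrative, not taken from the thread.
final class TimeFrameCopy {

    // Accept only simple (optionally dotted) identifiers, since they are
    // concatenated into the SQL text rather than bound as parameters.
    static String requireIdentifier(String name) {
        if (name == null || !name.matches("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)*")) {
            throw new IllegalArgumentException("Not a simple identifier: " + name);
        }
        return name;
    }

    // Builds an INSERT ... SELECT restricted to the window [from, to).
    static String insertSelectSql(String target, String source, String tsColumn) {
        String col = requireIdentifier(tsColumn);
        return "INSERT INTO " + requireIdentifier(target)
                + " SELECT * FROM " + requireIdentifier(source)
                + " WHERE " + col + " >= ? AND " + col + " < ?";
    }

    // Runs one window and returns the update count reported by the driver.
    static int copyWindow(Connection connection, String sql, Timestamp from, Timestamp to)
            throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setTimestamp(1, from);
            ps.setTimestamp(2, to);
            return ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Only prints the statement text; no database connection is opened here.
        System.out.println(insertSelectSql("target_tbl", "source_tbl", "ts"));
    }
}
```

Binding the window bounds as parameters keeps the statement text identical from one window to the next, so only the two timestamps change per call.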

       

      I have assigned up to 8 GB to the DV JVM.

      The other parameters that I have changed are listed below (first configuration, second configuration).

       

      ### Memory Management

      buffer-service-max-reserve-kb                   : -1        6291456

      buffer-service-max-processing-kb             : -1         204800

      buffer-service-max-file-size                      : 2048        32

       

      ### Scalability

      buffer-service-processor-batch-size             : 256        1024

      buffer-service-max-storage-object-size         : 8388608    16777216

       

      ### Disk Usage

      buffer-service-max-buffer-space            : 500 500

       

      The 500 MB of disk space derives from the size of the partition where I have installed DV.

      I can increase the size of the partition, but I would like to do that only if it is the only way to handle my scenario.

       

      I tried with the default teiid-cache definition and with the configuration below:

       

      <cache-container name="teiid-cache" default-cache="resultset">

                      <local-cache name="resultset-repl" batching="true">

                          <locking isolation="READ_COMMITTED"/>

                          <transaction mode="NON_XA"/>

                          <eviction strategy="LRU" max-entries="204800"/>

                          <expiration lifespan="7200000"/>

                      </local-cache>

                      <local-cache name="resultset" batching="true">

                          <locking isolation="READ_COMMITTED"/>

                          <transaction mode="NON_XA"/>

                          <eviction strategy="LRU" max-entries="20480"/>

                          <expiration lifespan="720000"/>

                      </local-cache>

                      <local-cache name="preparedplan" batching="true">

                          <locking isolation="READ_COMMITTED"/>

                          <eviction strategy="LIRS" max-entries="5120"/>

                          <expiration lifespan="28800"/>

                      </local-cache>

        </cache-container>

       

       

      I tested with 12 active plans.

      In all configurations I reached a point where the buffer fills the available disk space and the excess plans start being aborted.

       

      From the documentation I have not yet understood:

      1. When does teiid-buffer decide to swap?

      2. How does it decide the amount of RAM to use?

      3. Is teiid-cache involved in this kind of transaction?

      4. Is there a way to pause or queue the excess plans instead of aborting them?

       

       

      Thank you,

      AC

        • 1. Re: Teiid MemoryBuffer: memory management and swap
          rareddy

          Andrea,

           

          There were a few buffer manager configuration fixes we made in DV 6.2, so make sure you have the latest patches applied. In my experience, running the DV default configuration has worked best, because increasing one setting has an impact on the others. So you need to be sure what is affecting your use case.

           

          From the above description, I do not understand your use case exactly. I see you want to move the data, but is that a single query from a single client, or multiple different clients? The size of the box matters, as does the number of CPUs available for processing. You may want to take a look at "max-active-plans" to increase concurrent plan processing, but increasing this value without enough hardware to support it is useless.

          1. When does teiid-buffer decide to swap?

          Every plan gets a certain amount of processing memory based on configuration. When the given "batch" of data exceeds that amount, disk swap occurs.

           

          2. How does it decide the amount of RAM to use?

          See the "Memory Management" page of the Teiid 9.0 (draft) documentation.

           

          3. Is teiid-cache involved in this kind of transaction?

          The cache is used to cache results for the result set cache or for materialization; it is independent of the processing memory configuration.

           

          4. Is there a way to pause or queue the excess plans instead of aborting them?

          I am not sure where the plans are being thrown out; typically they get queued. I am not sure what behavior you are seeing or how you are interpreting it. Can you explain more about what you are seeing, maybe with a log file?

           

          Ramesh..

          • 2. Re: Teiid MemoryBuffer: memory management and swap
            shawkins

            More than likely what you are seeing was addressed by https://issues.jboss.org/browse/TEIID-3050

             

            For 8.7.1 if you are issuing a query like "insert into ... select ..." and the source supports insertion with an iterator, then the engine will attempt to give the entire insert to the translator so that a transaction is not needed at the user query level.

             

            > I am not sure where the plans are being thrown out; typically they get queued. I am not sure what behavior you are seeing or how you are interpreting it. Can you explain more about what you are seeing, maybe with a log file?

             

            Exceeding expected disk utilization will result in failures as there isn't a graceful mechanism to handle the failures.  As more queries are terminated disk space becomes available and normal processing would resume.

            • 3. Re: Teiid MemoryBuffer: memory management and swap
              jduke123

              @Ramesh Reddy

               

              I have just seen that a patch was submitted last week. I'll install it ASAP. Thank you for pointing it out.

              So far this has been installed: jboss-eap-6.4.0-installer + jboss-eap-6.4.3-patch + jboss-dv-installer-6.2.0.redhat-3.

               

              I need to keep a target DB (postgresql 9.4) up to date with tables from different sources, among them DB2 and MSSQL.

              I need to update the target every day. In this case there should be no problems, because the amount of data is not excessive.

              However, the first time, and in the worst-case scenario, I have to move all the data in the fastest way possible.

              I have 60 tables to move with different record sizes.

               

              >> Every plan gets a certain amount of processing memory based on configuration. When the given "batch" of data exceeds that amount, disk swap occurs.

                   Is this amount controlled by max-reserve-kb, max-processing-kb and max-active-plans? If yes, in what way? In a test I assigned 200 MB to max-processing-kb, but the maximum used memory shown in management reached only 180 MB with 12 active plans, and the JVM was stable at 2 GB.

               

              >> The cache is used to cache results for the result set cache or for materialization; it is independent of the processing memory configuration.

                   Thus, in my case, can I change the parameters in standalone.xml without affecting anything?

               

              >> I am not sure where the plans are being thrown out; typically they get queued. I am not sure what behavior you are seeing or how you are interpreting it. Can you explain more about what you are seeing, maybe with a log file?

                  

                   ERROR [org.teiid.BUFFER_MGR] (FileStore Worker1) TEIID30016 Error transferring block 6,056 of cache group 55 to storage : org.teiid.common.buffer.impl.OutOfDiskException: Max buffer space of 524,288,000 bytes has been exceed with an allocation of 262,144 bytes for a total of 524,386,304. The current operation will be aborted.
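The numbers in this log line can be checked by hand. The following Java sketch reproduces the arithmetic only; it is not Teiid's implementation. It assumes that the configured 500 for buffer-service-max-buffer-space is in megabytes (which matches the 524,288,000 bytes in the message) and that the reported total is the space already in use plus the new allocation. The class and method names are hypothetical.

```java
// Hypothetical arithmetic check; this is not Teiid code.
final class BufferSpaceCheck {

    // Converts a limit given in megabytes to bytes (1 MB = 1024 * 1024 bytes).
    static long limitBytes(long maxBufferSpaceMb) {
        return maxBufferSpaceMb * 1024L * 1024L;
    }

    // True when the requested allocation would push usage past the limit.
    static boolean exceeds(long usedBytes, long allocationBytes, long maxBufferSpaceMb) {
        return usedBytes + allocationBytes > limitBytes(maxBufferSpaceMb);
    }

    public static void main(String[] args) {
        long total = 524_386_304L;             // "for a total of" in the log
        long allocation = 262_144L;            // 256 KB block being written
        long usedBefore = total - allocation;  // 524,124,160 bytes already in use

        System.out.println(limitBytes(500));                       // 524288000
        System.out.println(exceeds(usedBefore, allocation, 500));  // true
        System.out.println(total - limitBytes(500));               // 98304 bytes over the limit
    }
}
```

Under those assumptions, the buffer was within one 256 KB block of the 500 MB ceiling, and the final allocation overshot it by 96 KB.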

               

              @Steven Hawkins

               

              >> For 8.7.1 if you are issuing a query like "insert into ... select ..." and the source supports insertion with an iterator, then the engine will attempt to give the entire insert to the translator so that a transaction is not needed at the user query level.

              I have to check if postgresql can use iterators on insert.

              Up to now I can decide how large the result set is through a parameter that gives me the correct time frame to request, i.e. if I want to move 1K records, I ask for the next timestamp that is 1K records away from the current record.
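The windowing idea described above can be sketched in Java as follows. This is an illustration under assumptions, not the poster's procedure: it works on an in-memory array of already sorted, distinct timestamps (as epoch milliseconds) instead of querying the source, and the class name `WindowPlanner` is hypothetical. With duplicate timestamps, a window could hold more or fewer than the requested number of records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; in practice the next boundary would come from a query.
final class WindowPlanner {

    // Returns the timestamps at which a new window starts, so that each
    // half-open window [start, nextStart) covers at most recordsPerWindow rows.
    static List<Long> boundaries(long[] sortedTimestamps, int recordsPerWindow) {
        if (recordsPerWindow <= 0) {
            throw new IllegalArgumentException("recordsPerWindow must be positive");
        }
        List<Long> result = new ArrayList<>();
        // Every recordsPerWindow-th record opens the next window.
        for (int i = recordsPerWindow; i < sortedTimestamps.length; i += recordsPerWindow) {
            result.add(sortedTimestamps[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] timestamps = {10, 20, 30, 40, 50};
        // Windows of two records: [10,30), [30,50), and the remainder from 50.
        System.out.println(boundaries(timestamps, 2)); // [30, 50]
    }
}
```

Smaller windows mean each insert moves less data at once, at the cost of more statements to run.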

               

              >> Exceeding expected disk utilization will result in failures as there isn't a graceful mechanism to handle the failures.  As more queries are terminated disk space becomes available and normal processing would resume.

                   Do you mean that the exception is thrown, the query is queued, and it will restart from where it left off whenever memory space is available?

               

              Thank you,

              Andrea

              • 4. Re: Teiid MemoryBuffer: memory management and swap
                shawkins

                > I have to check if postgresql can use iterators on insert.

                 

                Yes, it does, so for large amounts of data you'd want TEIID-3050. That would take raising a case with GSS to perform a backport.

                 

                > Do you mean that the exception is thrown, the query is queued, and it will restart from where it left off whenever memory space is available?

                 

                It will not restart; it will just error out.