4 Replies Latest reply on Dec 22, 2017 3:02 PM by markaddleman

    Teiid on AWS Lambda?

    markaddleman

      Hi folks -

       

      I'm starting a new project in which I'd like to marry Teiid and AWS Lambda. My thoughts are:

      • Teiid would respond to OData requests from clients
      • All Teiid caching would be externalized to a stateful AWS service (perhaps Aurora)
      • Don't rely on Teiid/Wildfly clustering, instead rely on AWS Lambda management for load balancing
      • Probably will need some form of translator metadata caching to reduce the time to Teiid ready

       

      Part of the decision to put Teiid on Lambda is the assumption that Teiid clustering support does not include spreading a query plan across multiple Teiid instances.  Is that true? 

      Ultimately, my question is, has anyone run Teiid on Lambda before?  Are there some challenges that make this a really dumb idea?

        • 1. Re: Teiid on AWS Lambda?
          shawkins

          > Don't rely on Teiid/Wildfly clustering, instead rely on AWS Lambda management for load balancing

           

          Yes, that is basically the same conclusion we reached for OpenShift.  We should be able to start producing a community Docker / OpenShift image that supports clustering in OpenShift via a JGroups extension, but that still won't utilize domain mode.  All of the load balancing would be handled by OpenShift routes.

           

          > Probably will need some form of translator metadata caching to reduce the time to Teiid ready

           

          Are you thinking that you want the internal serialized form of the metadata to be on the persistent store as well?

           

          > Part of the decision to put Teiid on Lambda is the assumption that Teiid clustering support does not include spreading a query plan across multiple Teiid instances.  Is that true?

           

          Yes, that is currently true.  However, it is possible to engineer that type of solution by layering a Teiid instance on top of other Teiid instances, using multi-source or other partitioning to spread work across them.

           

          > Ultimately, my question is, has anyone run Teiid on Lambda before?  Are there some challenges that make this a really dumb idea?

           

          Teiid has certainly been run in AWS, but I'm not aware of any usage as a Lambda service.  What do you expect for startup/response time?  It seems like there is a lot of overhead you would want to skip in that type of environment.

          • 2. Re: Teiid on AWS Lambda?
            markaddleman

            Hi Steven.  It's good to be chatting with you again!

             

            > Are you thinking that you want the internal serialized form of the metadata to be on the persistent store as well?

             

            It's been a while since I've looked at different Teiid APIs.  My original thought was to use a delegating translator to read each translator's metadata from the cache or delegate to the real translators on cache miss.
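
A minimal sketch of the cache-or-delegate idea described above, in plain Java with no Teiid dependency. Every name here (`MetadataStore`, `InMemoryMetadataStore`, `CachingMetadataLoader`) is hypothetical and is not part of the Teiid translator API; the `Function` passed in merely stands in for whatever the real translator does to import metadata, and the in-memory store stands in for an external service such as Aurora.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class MetadataCacheDemo {

    /** Hypothetical stand-in for an external, stateful store (for example a table in Aurora). */
    interface MetadataStore {
        Optional<String> get(String sourceName);
        void put(String sourceName, String ddl);
    }

    /** In-memory implementation, present only to make the sketch runnable. */
    static final class InMemoryMetadataStore implements MetadataStore {
        private final Map<String, String> entries = new ConcurrentHashMap<>();

        @Override
        public Optional<String> get(String sourceName) {
            return Optional.ofNullable(entries.get(sourceName));
        }

        @Override
        public void put(String sourceName, String ddl) {
            entries.put(sourceName, ddl);
        }
    }

    /** Consults the store first and calls the real loader only on a cache miss. */
    static final class CachingMetadataLoader {
        private final MetadataStore store;
        // Assumption: stands in for the real translator's (slow) metadata import.
        private final Function<String, String> realLoader;
        private int misses;

        CachingMetadataLoader(MetadataStore store, Function<String, String> realLoader) {
            this.store = store;
            this.realLoader = realLoader;
        }

        String load(String sourceName) {
            Optional<String> cached = store.get(sourceName);
            if (cached.isPresent()) {
                return cached.get();
            }
            misses++;
            String ddl = realLoader.apply(sourceName);
            store.put(sourceName, ddl); // populate the cache for the next cold start
            return ddl;
        }

        int misses() {
            return misses;
        }
    }

    public static void main(String[] args) {
        CachingMetadataLoader loader = new CachingMetadataLoader(
                new InMemoryMetadataStore(),
                source -> "CREATE FOREIGN TABLE " + source + "_t (id integer);");
        loader.load("orders");
        loader.load("orders"); // served from the store, the real loader is not called again
        System.out.println("misses: " + loader.misses());
    }
}
```

In a real deployment the store would have to outlive the Lambda execution environment, and the cached entry would need some invalidation rule for when the source schema changes; neither is addressed by this sketch.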

             

            Perhaps storing the internal serialized form would be the most straightforward thing to do.  I suppose an alternative would be to put all the metadata into DDL.  I don't recall how Teiid behaves when DDL is available for a physical source - does it skip consulting the translator for metadata at startup?

             

            > Part of the decision to put Teiid on Lambda is the assumption that Teiid clustering support does not include spreading a query plan across multiple Teiid instances.  Is that true?

             

            > Yes, that is currently true.  However, it is possible to engineer that type of solution by layering a Teiid instance on top of other Teiid instances, using multi-source or other partitioning to spread work across them.

             

            To be clear, I don't need distributed query plans.

            > Ultimately, my question is, has anyone run Teiid on Lambda before?  Are there some challenges that make this a really dumb idea?

             

            > Teiid has certainly been run in AWS, but I'm not aware of any usage as a Lambda service.  What do you expect for startup/response time?  It seems like there is a lot of overhead you would want to skip in that type of environment.

            I'd be willing to tolerate a five-second delay at startup.  Besides loading metadata, what additional overhead concerns you?

             

            When I've used Teiid in embedded mode, I know I can get startup time below a second but that was under a pretty specific use case.

            • 3. Re: Teiid on AWS Lambda?
              shawkins

              > Hi Steven.  It's good to be chatting with you again!

               

              Likewise.  I hope we'll be able to meet your needs on this.

               

              > Perhaps storing the internal serialized form would be the most straightforward thing to do.  I suppose an alternative would be to put all the metadata into DDL.  I don't recall how Teiid behaves when DDL is available for a physical source - does it skip consulting the translator for metadata at startup?

               

              The full server has the logic for saving the serialized object form, but embedded does not.  Either way, with the serialized form or with DDL, the source is not consulted unless it is declared as needed in the VDB.

               

              > I'd be willing to tolerate a five-second delay at startup.  Besides loading metadata, what additional overhead concerns you?

               

              Creating the various thread/resource pools and allocating buffer memory - although you have a lot of control over that when using embedded.  There is also preloading of materialized views for system tables (mostly ODBC), which could be disabled.  There is a metadata validation phase that you'd likely want to skip as well.  Some profiling would help determine which changes would be worthwhile.
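
As a rough illustration of the profiling suggested above, the following plain-Java sketch times named startup phases so the expensive ones stand out. `PhaseTimer` is a hypothetical helper, not a Teiid class, and the phase bodies are empty placeholders; in practice each one would wrap the corresponding real initialization step.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StartupProfileDemo {

    /** Hypothetical helper: accumulates elapsed nanoseconds per named phase. */
    static final class PhaseTimer {
        // LinkedHashMap keeps phases in the order they were first timed.
        private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

        void time(String phase, Runnable work) {
            long start = System.nanoTime();
            try {
                work.run();
            } finally {
                // Record the time even if the phase throws.
                nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
            }
        }

        Map<String, Long> nanosByPhase() {
            return nanosByPhase;
        }

        long totalNanos() {
            long total = 0;
            for (long nanos : nanosByPhase.values()) {
                total += nanos;
            }
            return total;
        }
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Placeholder phases named after the overheads listed in the post above.
        timer.time("thread/resource pools", () -> { });
        timer.time("buffer memory allocation", () -> { });
        timer.time("system materialized views", () -> { });
        timer.time("metadata validation", () -> { });

        timer.nanosByPhase().forEach((phase, nanos) ->
                System.out.println(phase + ": " + nanos / 1_000_000.0 + " ms"));
        System.out.println("total: " + timer.totalNanos() / 1_000_000.0 + " ms");
    }
}
```

Wall-clock timing like this is coarse; a JVM profiler would give a more reliable picture, but a per-phase breakdown is often enough to decide which step to disable first.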

               

               

              • 4. Re: Teiid on AWS Lambda?
                markaddleman

                > Creating the various thread/resource pools and allocating buffer memory - although you have a lot of control over that when using embedded

                 

                Then it seems that embedded is the way to go.  As long as the startup time is less than 5 seconds, I'm not going to worry about profiling.  As the project progresses and we become more concerned about startup time, we'll profile, tune and potentially make feature requests.
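
A minimal sketch of how the five-second budget could be checked, assuming the embedded engine is built once per Lambda execution environment and reused by later warm invocations (Lambda keeps static state alive while an environment stays warm). `LazyEngine` and `handleRequest` are hypothetical names, and the `String` produced by the factory is only a placeholder for a real embedded Teiid instance.

```java
import java.time.Duration;
import java.util.function.Supplier;

class ColdStartDemo {

    /** Hypothetical holder: builds a value once and remembers how long the build took. */
    static final class LazyEngine<T> {
        private final Supplier<T> factory;
        private T instance;
        private Duration startupTime;
        private int builds;

        LazyEngine(Supplier<T> factory) {
            this.factory = factory;
        }

        synchronized T get() {
            if (instance == null) {
                long start = System.nanoTime();
                instance = factory.get();
                startupTime = Duration.ofNanos(System.nanoTime() - start);
                builds++;
            }
            return instance;
        }

        synchronized int builds() {
            return builds;
        }

        /** True only if the engine has been built and the build fit within the budget. */
        synchronized boolean withinBudget(Duration budget) {
            return startupTime != null && startupTime.compareTo(budget) <= 0;
        }
    }

    // Static field: initialized on the cold start, reused by warm invocations.
    // Assumption: the placeholder String stands in for an embedded engine instance.
    static final LazyEngine<String> ENGINE = new LazyEngine<>(() -> "engine-ready");

    /** Hypothetical handler body; a real Lambda handler would delegate to something like this. */
    static String handleRequest(String query) {
        return ENGINE.get() + ": " + query;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("first request"));  // cold: builds the engine
        System.out.println(handleRequest("second request")); // warm: reuses it
        System.out.println("builds: " + ENGINE.builds());
        System.out.println("within 5s budget: " + ENGINE.withinBudget(Duration.ofSeconds(5)));
    }
}
```

Logging the result of `withinBudget` on each cold start would show when startup time begins to approach the limit, which is the point at which profiling and tuning become worthwhile.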

                 

                Thanks