4 Replies Latest reply on Jan 31, 2017 7:33 AM by Marco Ardito

    Connection from web app to teiid works but sometimes... fails.

    Marco Ardito Master

      I'm sorry for the poor title description, I don't know how I could better choose one for this issue...

      I'm experimenting an issue, since a few weeks, about the authentication of a web application which uses teiid as backend.

       

      The complete flow is:

      application => php framework => php => unixodbc => postgresql driver => network (LAN) => teiid => network (LAN) => back-end data sources (a virtual database composed by several mysql, sqlserver, xls files)

       

      When it happens, a simple page reload (with post parameters submitted by a form) gets a normal result, no errors whatsoever.

      And maybe any other application page that loads data from the back-end db works fine... until... for some reason, it could happen again.

       

      from the web application side, the error is reported as:

      "[unixODBC]Communication closed during authentication; Connection reset by peer., SQL state 28000 in SQLConnect"

       

      The error occours just for the connection, which the web framework initializes even before loading the page: so it doesn't matter if the page is doing an sql query or not. It happens anyway.

       

      I already asked on unixodbc mailing lists, with some possible hint that a MSSQL server could be causing this... but I don't think so:

      - I created a copy of the teiid virtual database, excluding all MSSQL servers from its models

      - changed the web server config to load, through unixodbc, the new virtual database

      - it happens again..

       

      After some more digging, I tried to observe the  page reloading every few seconds, and looking at the live server.log from teiid (9.0.2)

      and noted that when it happens, at the very same time teiid is logging this (where x.x.x.x is web server address and y.y.y.y is teiid address):

       

      2017-01-27 15:33:13,380 WARNING [io.netty.channel.AbstractChannel] (NIO0)  Force-closing a channel whose registration task was not accepted by an event loop: [id: 0xaa75aeab, /x.x.x.x:60290 => /y.y.y.y:35432]: java.util.concurrent.RejectedExecutionException: event executor terminated
              at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:707)
              at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:299)
              at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:690)
              at io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:421)
              at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:60)
              at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:48)
              at io.netty.channel.MultithreadEventLoopGroup.register(MultithreadEventLoopGroup.java:64)
              at io.netty.bootstrap.ServerBootstrap$ServerBootstrapAcceptor.channelRead(ServerBootstrap.java:251)
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
              at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
              at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
              at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
              at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
              at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
              at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
              at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
              at java.lang.Thread.run(Thread.java:745)
      
      2017-01-27 15:33:13,421 SEVERE [io.netty.util.concurrent.DefaultPromise.rejectedExecution] (NIO0)  Failed to submit a listener notification task. Event loop shut down?: java.util.concurrent.RejectedExecutionException: event executor terminated
              at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:707)
              at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:299)
              at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:690)
              at io.netty.util.concurrent.DefaultPromise.execute(DefaultPromise.java:671)
              at io.netty.util.concurrent.DefaultPromise.notifyLateListener(DefaultPromise.java:641)
              at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:138)
              at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)
              at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)
              at io.netty.bootstrap.ServerBootstrap$ServerBootstrapAcceptor.channelRead(ServerBootstrap.java:251)
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
              at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
              at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
              at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
              at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
              at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
              at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
              at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
              at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
              at java.lang.Thread.run(Thread.java:745)
      

       

      while, on the web server side, unixodbc(postgresql unicode driver) is logging something like this (x.y.z.w is the teiid address here):

       

      DSN info: DSN='dsnname',server='x.y.z.w',port='35432',dbase='vdbname',user='user',passwd='xxxxx'
                onlyread='No',protocol='7.4',showoid='No',fakeoidindex='No',showsystable='No'
                conn_settings='', conn_encoding='(null)'
                translation_dll='',translation_option=''
      conn = 0x8058d3c8, PGAPI_Connect(DSN='dsnname', UID='user', PWD='xxxxx')
      Driver Version='09.02.0100,201306020000'
      Global Options: fetch=10000, socket=4096, unknown_sizes=0, max_varchar_size=255, max_longvarchar_size=8190
                      disable_optimizer=0, ksqo=0, unique_index=1, use_declarefetch=0
                      text_as_longvarchar=1, unknowns_as_longvarchar=0, bools_as_char=1 NAMEDATALEN=64
                      extra_systable_prefixes='dd_;', conn_settings='' conn_encoding=''
      CONN ERROR: func=original_CC_connect, desc='', errnum=210, errmsg='Communication closed during authentication'
                  ------------------------------------------------------------
                  henv=0x8058c2b8, conn=0x8058d3c8, status=0, num_stmts=16
                  sock=0x8058b160, stmts=0x80589280, lobj_type=-999
                  ---------------- Socket Info -------------------------------
                  socket=38, reverse=0, errornumber=10, errormsg='Connection reset by peer.'
                  buffer_in=2153300704, buffer_out=2153316120
                  buffer_filled_in=0, buffer_filled_out=0, buffer_read_in=0
      CONN ERROR: func=PGAPI_Connect, desc='Error on CC_connect', errnum=210, errmsg='Communication closed during authentication'
                  ------------------------------------------------------------
                  henv=0x8058c2b8, conn=0x8058d3c8, status=0, num_stmts=16
                  sock=0x8058b160, stmts=0x80589280, lobj_type=-999
                  ---------------- Socket Info -------------------------------
                  socket=38, reverse=0, errornumber=10, errormsg='Connection reset by peer.'
                  buffer_in=2153300704, buffer_out=2153316120
                  buffer_filled_in=0, buffer_filled_out=0, buffer_read_in=0
      DSN info: DSN='dsnname',server='x.y.z.w',port='35432',dbase='vdbname',user='user',passwd='xxxxx'
                onlyread='No',protocol='7.4',showoid='No',fakeoidindex='No',showsystable='No'
                conn_settings='', conn_encoding='(null)'
                translation_dll='',translation_option=''
      conn = 0x8058d9c8, PGAPI_Connect(DSN='dsnname', UID='user', PWD='xxxxx')
      Driver Version='09.02.0100,201306020000'
      Global Options: fetch=10000, socket=4096, unknown_sizes=0, max_varchar_size=255, max_longvarchar_size=8190
                      disable_optimizer=0, ksqo=0, unique_index=1, use_declarefetch=0
                      text_as_longvarchar=1, unknowns_as_longvarchar=0, bools_as_char=1 NAMEDATALEN=64
                      extra_systable_prefixes='dd_;', conn_settings='' conn_encoding=''
          [ PostgreSQL version string = '8.1.4' ]
          [ PostgreSQL version number = '8.1' ]
      conn=0x8058d9c8, query='select oid, typbasetype from pg_type where typname = 'lo''
          [ fetched 1 rows ]
          [ Large Object oid = 14939 ]
          [ Client encoding = 'UTF8' (code = 6) ]
      conn=0x8058d9c8, PGAPI_Disconnect
      

       

      then, as said, if I reload the page, everything works fine...

       

      the teiid server has a low load (reported as 0.06 0.06 0.05), the web server even less, but both have an uptime of over 90 days...

       

      It is a bit annoying... can you spot what is happening, why, and how to solve?

       

      Marco

        • 1. Re: Connection from web app to teiid works but sometimes... fails.
          Steven Hawkins Master

          Unfortunately this is the first time we have seen this particular stack trace as well.  Issues involving this message have been logged against Netty with a similar description of behavior, but have simply been closed.  The Netty team seem to indicate this should only happen when the EventLoopGroup is shutdown.  Teiid would only initiate that shutdown when the transport closes, which doesn't seem to apply here.  Would it be possible to obtain a trace log from the Teiid server around the time of the issue to see if there is anything else that could be contributing?

          • 2. Re: Connection from web app to teiid works but sometimes... fails.
            Marco Ardito Master

            Would it be possible to obtain a trace log from the Teiid server around the time of the issue to see if there is anything else that could be contributing?

            Of course, any kind of info you need: just remember me how to activate/get that "trace log".. I imagine you need a logger with level of TRACE but for which context?

            Atm I have pretty much the standard settings I guess...

            • 3. Re: Connection from web app to teiid works but sometimes... fails.
              Steven Hawkins Master

              Yes a trace level log of either the whole org.teiid context, or in particular we're interested in org.teiid.RUNTIME, org.teiid.ODBC, and org.teiid.TRANSPORT

              • 4. Re: Connection from web app to teiid works but sometimes... fails.
                Marco Ardito Master

                ... I'm sorry,

                but in the meanwhile I did a stop/start of teiid instance on that server... and now the problem is no more happening... (but also )

                 

                tbh, I thought the only way to change the log level to trace was altering the standalone-teiid.xml file directly, so I thought to restart the instance and verify if the problem was still there... you know how it ended.

                what a n00b

                 

                Only now, under Configuration: Subsystems => Subsystem: Logging => Log Categories

                I've found that I could edit them from here... so Maybe I could have avoided the restart, and while I would still have the problem, you would have more data about it.

                 

                I've never used this web console part, before, so... my fault.

                Next time, when it happens again, I should probably just edit the setting here (I hope wildfly doesn't need a reload, as it would be similar to stopping/starting teiid as I did now)

                 

                Thanks, anyway,

                Marco