4 Replies Latest reply on Jun 23, 2016 12:27 PM by m.ardito

    something horrible happened to my teiid server... ?

    m.ardito

      I think something like this already happened to me once in the past... on older versions...

       

      Now, on my Teiid test server (9beta1), I think I was replacing some of my VDBs when it happened, but I can't remember exactly when.

       

      Suddenly the web interface no longer showed any subsystems, of any kind...?!?

      I tried reloading the server, stopping and starting it, rebooting... nothing helped.

       

      It seemed that all the WildFly subsystems were gone, including Teiid. When I reloaded the server I watched the log lines on the command line: after system init, all the deployments were processed, but nothing worked anymore.

       

      I renamed the folder where it was installed, to inspect it later.

       

      Then I unzipped the same original WildFly+Teiid download, copied the .properties files, and started the new instance.

      Then I used my .cli script to recreate all the RAs and DSs, and redeployed my VDBs.

       

      All fine. Almost. The VDBs run, but now my webserver => php => unixodbc => teiid chain often fails to connect, and reports:

      Message: odbc_exec(): SQL error: [unixODBC]Could not send Query(connection dead); Could not send Query(connection dead), SQL state 08S01 in SQLExecDirect

       

      I reload the page until it works, and then, after a while, a new refresh gives the error again.

      It's not completely reproducible so far, but as of now the server is completely unreliable, after several weeks of nice stability and good performance.

       

      It seems to be something related to connection timeouts, but I can't recall whether there was some setting in the "broken" install that I also need to set on the new one.

       

      What could it be, what should I check, what should I try?

       

      [edit] The connection timeouts seem to be gone after a webserver restart (probably ODBC had cached persistent connections?) [/edit]

       

      Marco

        • 1. Re: something horrible happened to my teiid server... ?
          shawkins

          > [edit] The connection timeouts seem to be gone after a webserver restart (probably ODBC had cached persistent connections?) [/edit]

           

          So things look better now?

           

          For JDBC pools there are test queries, and a full flush can happen on a stale connection exception. Does there seem to be anything like that for unixODBC?
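
To make the "test query plus flush" idea concrete, here is a minimal Python sketch. It is not unixODBC or Teiid code: `ConnectionDead`, `FakeConnection` and `ValidatingPool` are hypothetical names invented for this illustration, and `SELECT 1` is only an assumed test query. The cached connection is validated before each use and replaced when validation fails.

```python
class ConnectionDead(Exception):
    """Hypothetical stand-in for a driver error such as SQLSTATE 08S01."""


class FakeConnection:
    """Toy connection used only to demonstrate the pattern."""

    def __init__(self):
        self.alive = True

    def execute(self, sql):
        if not self.alive:
            raise ConnectionDead("Could not send Query(connection dead)")
        return f"ok: {sql}"


class ValidatingPool:
    """Caches one connection and validates it with a test query before use."""

    def __init__(self, connect, test_query="SELECT 1"):
        self._connect = connect        # factory that opens a new connection
        self._test_query = test_query  # assumed cheap validation query
        self._conn = None

    def _fresh(self):
        self._conn = self._connect()
        return self._conn

    def acquire(self):
        """Return a connection that has just passed the test query."""
        if self._conn is None:
            return self._fresh()
        try:
            self._conn.execute(self._test_query)
            return self._conn
        except ConnectionDead:
            # Flush the stale cached connection and open a new one.
            return self._fresh()

    def execute(self, sql):
        return self.acquire().execute(sql)
```

With a real driver, the factory would open an actual connection and the `except` clause would catch that driver's own error type. Whether a given ODBC stack offers an equivalent built-in setting is exactly the open question here.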

          • 2. Re: something horrible happened to my teiid server... ?
            m.ardito
            > So things look better now?

             

            Well, they look just fine now, but I'm talking about a completely new install.

             

            What really scares me is what happened to the previous one. Unfortunately I can't start two WildFly/Teiid instances on that machine at the moment (by the way, is there any relevant info about this?), so I can't run tests there now, or see how things are running.

             

            I just know that while I was routinely changing (replacing) a VDB, the WildFly web interface was suddenly missing all subsystems, and to my knowledge there was no way to recover, as I said above.
            Also, from the jboss-cli, as far as I remember, the deployments were "there", but there was nothing else: nothing to build resources with, no RA, no DS, and no VDB, of course. SQuirreL SQL suddenly couldn't connect anymore... a complete mess in a few seconds.

             

            Something similar already happened in the past, when I had to rebuild each RA, DS and VDB by hand. At the time I was less experienced, so I attributed it to my noobiness and thought I had accidentally made some big unknown mistake.

             

            Because of that past issue, I later learned how to build a sort of "fast setup" CLI script. Having a backup of that and of all the VDBs, plus the zip of the WildFly/Teiid bundle right there on the server, I was able to completely replace my install with a fresh one in just a couple of minutes (the webserver ODBC connection trouble just made me unsure whether there were other problems, but that was independent, as I now realize, and it is solved).

             

            Since I kept the "broken" install folder, everything is still there, and I also tried a diff utility to spot what could have changed in some file or other. Of course the fresh install and the broken one differ in the server logs, the configuration XML history, and the pseudorandom folder names where the server keeps the web-deployed XML/binaries (under standalone/data/content), but all the rest of the files and folders just had newer creation datetimes. The utility (WinMerge), which scanned both complete folder trees searching for "full file modifications", only encountered some binary files it was not able to compare (some binary content under standalone/data/content), but I'm not sure that is related.
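
For files a GUI diff tool cannot compare, a byte-level comparison of the two trees is one option. The following is only a sketch using Python's standard `filecmp` module. `compare_trees` is a name made up for this example, and the two arguments would be the broken and the fresh install directories.

```python
import filecmp
import os


def compare_trees(left, right):
    """Return (only_left, only_right, different) lists of relative paths."""
    only_left, only_right, different = [], [], []

    def walk(rel):
        l_dir = os.path.join(left, rel)
        r_dir = os.path.join(right, rel)
        cmp = filecmp.dircmp(l_dir, r_dir)
        only_left.extend(os.path.join(rel, n) for n in cmp.left_only)
        only_right.extend(os.path.join(rel, n) for n in cmp.right_only)
        # shallow=False forces a byte-by-byte comparison, so binary files
        # are handled the same way as text files.
        _, mismatch, errors = filecmp.cmpfiles(
            l_dir, r_dir, cmp.common_files, shallow=False
        )
        different.extend(os.path.join(rel, n) for n in mismatch + errors)
        for sub in cmp.common_dirs:
            walk(os.path.join(rel, sub))

    walk("")
    return sorted(only_left), sorted(only_right), sorted(different)
```

Files that cannot be read at all are reported together with the differing ones. Randomly named content folders will still show up as "only on one side", so the output needs the same manual filtering as any other diff.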

             

            I don't know how I can investigate (not that I have much spare time) to spot what could have happened. Only the server.log? But I think Teiid is not responsible here, since all the base subsystems, apart from Teiid, were also missing.
            Is there any other WildFly log file to check for traces of trouble?
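
As a starting point for going through a large server.log, warnings and errors can be counted per logger category. This is a rough Python sketch that assumes each entry starts with a date, a time, a level and a bracketed category (for example `2016-06-20 18:02:12,002 ERROR [org.example.b] (main) ...`). If the log pattern was customized, the regular expression needs adjusting. `summarize` is a name invented for this example.

```python
import re
from collections import Counter

# Assumed entry shape: "<date> <time> <LEVEL> [<category>] (<thread>) message"
LINE = re.compile(r"^\S+ \S+ +(?P<level>[A-Z]+) +\[(?P<category>[^\]]+)\]")


def summarize(lines, levels=("WARN", "ERROR", "FATAL")):
    """Count log entries per (level, category) for the given levels."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        # Continuation lines such as stack traces do not match and are skipped.
        if m and m.group("level") in levels:
            counts[(m.group("level"), m.group("category"))] += 1
    return counts
```

Passing an open file object works, since iterating over it yields lines; `summarize(f).most_common(10)` then lists the noisiest categories first.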

             

            Marco

            • 3. Re: something horrible happened to my teiid server... ?
              rareddy

              Did your diff find anything? Also, what happens if you try to start the server in the broken install dir? Were you using the web console or the CLI to replace the VDBs?

               

              Is there a way you can share the whole broken directory of WildFly?

              • 4. Re: something horrible happened to my teiid server... ?
                m.ardito

                > Did your diff find anything?

                Yes, what I described above was the result... I'll check again.

                > Also, what happens if you try to start the server in the broken install dir? Were you using the web console or the CLI to replace the VDBs?

                 

                I'll try again next week. Unless I was drunk that evening, judging by the log shown at WildFly startup, it starts, it "finds" the already deployed files, loads the web interface and... nothing more. No errors, no warnings, it just stops there.

                 

                I want to find the cause, of course (and prevent another similar issue), so next week I'll schedule some tests on that setup (before trying anything I'll zip the folder in order to preserve it as it was when I replaced it).
                I'll find a way to transfer it (WeTransfer?), I just need to check for and remove passwords, sensitive info, etc.
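
For the password cleanup, a small script can do a first pass before a manual review. The Python sketch below makes a loud assumption: that secrets appear either as `<password>...</password>` elements or as `password="..."` attributes. Anything stored in another form (differently named properties, encoded values, log output) is not caught, so it does not replace checking the files by hand. `redact` is a hypothetical helper name.

```python
import re

# Assumed shapes only: <password>secret</password> and password="secret"
ELEMENT = re.compile(r"(<password>)(.*?)(</password>)", re.IGNORECASE | re.DOTALL)
ATTRIBUTE = re.compile(r"""(password\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def redact(text, mask="REDACTED"):
    """Replace password element bodies and attribute values with a mask."""
    text = ELEMENT.sub(lambda m: m.group(1) + mask + m.group(3), text)
    return ATTRIBUTE.sub(
        lambda m: m.group(1) + m.group(2) + mask + m.group(2), text
    )
```

Running it on copies of the configuration files, never on the originals, keeps the preserved folder intact.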

                 

                Thanks for your willingness to help.