5 Replies Latest reply on Jun 10, 2016 12:29 PM by Steven Hawkins

    weird results from simple left join...

    Marco Ardito Master

      My vdb has serveral models, linked mostly to mysql db (teiid is 9.0.0.Beta1)

       

      I'm trying a simple query like

       

      SELECT id, rag_soc, pi, cf, cod_ass

      FROM apifdbunica.ag

      LEFT JOIN personale_interno ON ag.id_utente_inserimento = personale_interno.idanagrafica

      ORDER BY rag_soc

      LIMIT 2000,20

       

      but I get different rows then from

       

      SELECT id, rag_soc, pi, cf, cod_ass

      FROM apifdbunica.ag

      ORDER BY rag_soc

      LIMIT 2000,20

       

      eg: the simplest query, without any join gives rows like

       

       

      while linking records through left join gets:

       

       

      What is the cause of this and how to solve?

      Am I doing anything wrong???

       

      [edit]

      I noted this: some "rag_soc" values are uppercased, while others not, and

      if I use

      ORDER BY UCASE(rag_soc)

      in both queries, it seems that I get the same results...

       

      Marco

        • 1. Re: weird results from simple left join...
          Steven Hawkins Master

          Is rag_soc case insensitive in the database?  Unfortunately that would cause the different results as the collation is different.  This is similar to [TEIID-4129] Wrong results with cross model join on 2 varchar fields - JBoss Issue Tracker  While there are workarounds, they aren't great.  I would like to add more logic around the case insensitive flag in our metadata (which requires detection logic on import for each source), but at the very least I need to add a sanity check in the join processing to detect when the expected ordering is not honored.

          • 2. Re: weird results from simple left join...
            Marco Ardito Master

            Yes, the ag mysql table is "utf8_general_ci", and the field ag.rag_soc is VARCHAR

             

            The left joined table fields are

            - ag.id_utente_inserimento is INT

            - personale_interno.idanagrafica is "latin1_swedish_ci" INT

             

            Here the join is done on integer ID codes, not on varchars, that is the ORDER BY...

             

            But there is something more weird that happens when I am not using ORDER BY UCASE(rag_soc)

            without that UCASE, look at what happens, changing the OFFSET

             

            offset 2000

             

            offset 2020

             

            offset 2040

             

            ... and so on.

             

            The last three records are always repeated, while the previous 17 change (it seems) as expected...

             

            ???

             

            As said, using UCASE for sorting resulst seems to solve the issue (I need to keep an eye on this)

            but I cannot understand if it is just a collation issue in the join, in the sort, in both... or else?

             

            Marco

            • 3. Re: weird results from simple left join...
              Steven Hawkins Master

              > but I cannot understand if it is just a collation issue in the join, in the sort, in both... or else?

               

              What does the query plan look like?  We should be pushing the sum of the the limit/offset to the left hand side using the rag_soc column for the ordering (since it's a left outer join we won't need more than those rows), then performing the join in Teiid possibly sorting on the join column.

              • 4. Re: weird results from simple left join...
                Marco Ardito Master

                Here is the server.log (zipped) output after issuing  in SquirrelSQL:

                 

                "SET SHOWPLAN DEBUG;

                SELECT id, rag_soc, pi, cf, cod_ass

                FROM apifdbunica.ag

                LEFT JOIN apifdbgesp.anagrafica as personale_interno ON ag.id_utente_inserimento = personale_interno.idanagrafica

                ORDER BY (rag_soc)

                LIMIT 2030,20"

                 

                is this what you asked for?

                 

                Marco

                • 5. Re: weird results from simple left join...
                  Steven Hawkins Master

                  Yes that confirms the plan I was describing and means that yes you can get the results that you describe.  To prevent this you can use upper/lower, disable order by pushdown, or use the translator settings described in the other issue - but you would also need to set org.teiid.requireTeiidCollation=true