3 Replies Latest reply on Mar 5, 2013 12:21 PM by adamw

    getRevisionNumberForDate SQL

    whiteside

      Very often in our application, we have to get the entity state at a given time.  To do this, we use two steps:

      1. get the revision for that time.

      2. get the entity at that revision. 

       

      To complete step 1 above, we are using AuditQuery.getRevisionNumberForDate(date).  As our revision table grows very large, I see a significant performance problem.  When I take a look at the SQL that's being generated it looks like the following:

       

      select max(this_.id) as y0_

                from

        MyRevisionEntity this_

                where

                          this_.timestamp<=1360108800000

       

      I'm kind of an index newbie, but I don't think there's a good way to create an index for a query like that.  For example, a much faster query might look like this:

       

      select id from MyRevisionEntity where timestamp=(

          select max(timestamp) from AEISSRevisionEntity where timestamp<=1359579669001

      )

       

      That query can at least make use of indexes for very fast performance. 

       

      Am I doing something goofy?  I'm pretty new to database tuning and could be missing something simple.

       

      Thanks very much!

        • 1. Re: getRevisionNumberForDate SQL
          adamw

          Wouldn't an index on timestamp work in both cases? Did you maybe do some tests?

           

          Or maybe a compound (timestamp, id) index?

           

          Adam

          • 2. Re: getRevisionNumberForDate SQL
            whiteside

            I did try adding a compound index like you say, but it didn't help.

             

            The problem is the '<='.  The database has to check every entry that matches that condition to find maximum ID.  Since we know the revision IDs are sequential, we can be smarter and just find the maximum timestamp that is less that our number.  This means we are only looking at one row, rather than many.

             

            I changed a method in the Envers class org.hibernate.envers.revisioninfo.RevisionInfoQueryCreator

             

            From:

            public Criteria getRevisionNumberForDateQuery(Session session, Date date) {

                   return session.createCriteria(revisionInfoEntityName).setProjection(Projections.max(revisionInfoIdName))

                                                                        .add(Restrictions.le(revisionInfoTimestampName, timestampAsDate ? date : date.getTime()));

            }

             

            To:

            public Criteria getRevisionNumberForDateQuery(Session session, Date date) {

              return session

              .createCriteria(revisionInfoEntityName)

              .add(Restrictions.le(revisionInfoTimestampName, timestampAsDate ? date : date.getTime()))

              .addOrder(Order.desc(revisionInfoTimestampName))

              .setProjection(Projections.property(revisionInfoIdName))

              .setMaxResults(1);

            }

             

             

            We have several million rows in our RevisionEntity table.  This change reduced our query time from several seconds to several milliseconds.

            • 3. Re: getRevisionNumberForDate SQL
              adamw

              Hmm, but while we assume that a revision "A" is newer than "B" if it has a greater revision number, that doesn't have to be true for timestamps, right? The timestamps may be even identical?

               

              Adam