Monday, March 7, 2011

MySQL Performance: Leveraging SQL_CALC_FOUND_ROWS and FOUND_ROWS inside JPA-QL

There are lots of articles on the Web about how to optimize paginated displays. The basic problem is: in order to paginate well, you must display the total number of hits to the user:

1...30 (of 54,342)


But finding that total number of hits can be slow. In particular doing select count(*) in MySQL is notoriously slow. There are lots of possible solutions. A promising, MySQL-specific one is select FOUND_ROWS(). Using this via native SQL is easy. But exposing it inside JPA-QL (which generates its SQL automatically) is tricky. The rules for using it are:

  1. Clients must run a normal query, using LIMIT for pagination, and prefix it as: select SQL_CALC_FOUND_ROWS * from myTable

  2. Immediately afterwards, before any other SQL statements are run, clients call select FOUND_ROWS() to obtain the number of rows there would have been if no LIMIT had been set
A further advantage of SQL_CALC_FOUND_ROWS is that it can be used with having queries, whereas count() cannot.

Here's one way to leverage this inside Hibernate-based JPA. We'll be using Interceptors. Like all the solutions to this problem, this one isn't perfect, but it may suit your purposes:

package com.kennardconsulting.core.hibernate.interceptor;

import org.hibernate.EmptyInterceptor;
import org.hibernate.HibernateException;
import org.hibernate.SessionFactory;

/**
* Hibernate interceptor to make use of MySQL's proprietary <code>SQL_CALC_FOUND_ROWS</code> hint.
* The <code>count()</code> method in MySQL is notoriously slow
* (http://mysqlha.blogspot.com/2009/08/fast-count-for-innodb.html) as it scans the entire index.
* <code>SQL_CALC_FOUND_ROWS</code> was introduced to allow clients to obtain a <code>count()</code>
* in a more performant way.
* <p>
* The rules are:
* <p>
* <ol>
* <li>Clients must run a normal query, using <code>LIMIT</code> for pagination, and prefix it with
* <code>SQL_CALC_FOUND_ROWS</code></li>
* <li>Immediately afterwards, <em>before any other SQL statements are run</em>, clients call
* <code>FOUND_ROWS()</code> to obtain the number of rows there <em>would</em> have been, if no
* limit had been set</li>
* </ol>
* <p>
* A further advantage of <code>SQL_CALC_FOUND_ROWS</code> is that it can be used with
* <code>having</code>, where <code>count()</code> cannot.
* <p>
* Other approaches we tried:
* <p>
* <ul>
* <li>Hibernate supports ScrollableResults, but on MySQL that seems to just bring back the entire
* results and is very slow</li>
* <li>You can try embedding count(*) as an extra column in the query, but that negates the indexes</li>
* <li>You can try 'show table status where name = Member' but that varies wildly and is
* disconcerting for the user (for 60,000 rows, +/- 3,000 rows)</li>
* <li>You can make 'count()' setCacheable( true )</li>
* </ul>
* <strong>Note: this is not a general purpose Interceptor. It assumes it will be used to execute a
* single select (possibly with secondary selects) and then thrown away. It cannot be
* re-used.</strong>
*/

public class MySQLCalcFoundRowsInterceptor
   extends EmptyInterceptor {

   //
   // Private statics
   //

   private final static String   SELECT_PREFIX         = "select ";

   private final static String   CALC_FOUND_ROWS_HINT   = "SQL_CALC_FOUND_ROWS ";

   private final static String   SELECT_FOUND_ROWS      = "select FOUND_ROWS()";

   //
   // Private members
   //

   private SessionFactory      mSessionFactory;

   private byte            mSQLStatementsPrepared;

   private Long            mFoundRows;

   //
   // Constructor
   //

   public MySQLCalcFoundRowsInterceptor( SessionFactory sessionFactory ) {

      mSessionFactory = sessionFactory;
   }

   //
   // Public methods
   //

   @Override
   public String onPrepareStatement( String sql ) {

      switch ( mSQLStatementsPrepared ) {

         case 0: {
            mSQLStatementsPrepared++;

            // First time, prefix CALC_FOUND_ROWS_HINT

            StringBuilder builder = new StringBuilder( sql );
            int indexOf = builder.indexOf( SELECT_PREFIX );

            if ( indexOf == -1 ) {
               throw new HibernateException( "First SQL statement did not contain '" + SELECT_PREFIX + "'" );
            }

            builder.insert( indexOf + SELECT_PREFIX.length(), CALC_FOUND_ROWS_HINT );
            return builder.toString();
         }

         case 1: {
            mSQLStatementsPrepared++;

            // Before any secondary selects, capture FOUND_ROWS. If no secondary selects are
            // ever executed, getFoundRows() will capture FOUND_ROWS just-in-time when called
            // directly

            captureFoundRows();
            return sql;
         }

         default:
            // Pass-through untouched
            return sql;
      }
   }

   public long getFoundRows() {

      if ( mFoundRows == null ) {
         captureFoundRows();
      }

      return mFoundRows;
   }

   //
   // Private methods
   //

   private void captureFoundRows() {

      // Sanity checks

      if ( mFoundRows != null ) {
         throw new HibernateException( "'" + SELECT_FOUND_ROWS + "' called more than once" );
      }

      if ( mSQLStatementsPrepared < 1 ) {
         throw new HibernateException( "'" + SELECT_FOUND_ROWS + "' called before '" + SELECT_PREFIX + CALC_FOUND_ROWS_HINT +"'" );
      }

      // Fetch the total number of rows

      mFoundRows = ( (Number) mSessionFactory.getCurrentSession().createSQLQuery( SELECT_FOUND_ROWS ).uniqueResult() ).longValue();
   }
}


To use this Interceptor inside JPA-QL:

SessionFactory sessionFactory = ((org.hibernate.Session) mEntityManager.getDelegate()).getSessionFactory();
MySQLCalcFoundRowsInterceptor foundRowsInterceptor = new MySQLCalcFoundRowsInterceptor( sessionFactory );
Session session = sessionFactory.openSession( foundRowsInterceptor );

try {
   org.hibernate.Query query = session.createQuery( ... )   // Note: JPA-QL, not createNativeQuery!
   query.setFirstResult( ... );
   query.setMaxResults( ... );

   List entities = query.list();
   long foundRows = foundRowsInterceptor.getFoundRows();

   ...
   
} finally {

   // Disconnect() is good practice, but close() causes problems. Note, however, that
   // disconnect could lead to lazy-loading problems if the returned list of entities has
   // lazy relations

   session.disconnect();
}


Hope that helps!

2 comments:

Anonymous said...

I think the interceptor has a bug.
I had to add a boolean field and change this lines:

...

// check not already capturing
if (!capturingFoundRows)
captureFoundRows();

...

// Set to true before capturing
capturingFoundRows = true;
mFoundRows = ...

Unknown said...

I have created a more complete solution based off of this solution that works very well with Spring+Hibernate+MySQL

See:
http://stackoverflow.com/a/26499062/1304288