2 Replies Latest reply on Mar 3, 2012 12:51 PM by andersbohn

    Best approach for database text index in JBoss 7 with Hibernate Search/Infinispan/Lucene?

    andersbohn

      We've got three tables with data from an LDAP directory (external to the project).

      The LDAP drivers stamp each row with a timestamp when they update it.

       

      In a JBoss 7.1.0.Final EAR deployment, we want to make these available as Lucene full-text-searchable entities.

       

      Presumably we'll set up a timer (an EJB @Schedule method) that scans the database for new changes and reloads them into the cache.
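
A minimal sketch of that polling step, using only the JDK. It assumes each row carries the timestamp written by the LDAP driver and keeps a high-water mark of the newest timestamp already applied. The class `ChangeScanner`, the `Row` type and the map standing in for the cache are all hypothetical names for illustration, not part of Hibernate Search or Infinispan.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of timestamp-based change polling; every name here is hypothetical. */
class ChangeScanner {

    /** Stand-in for one table row: id, text to index, and the LDAP driver's timestamp. */
    public static final class Row {
        final long id;
        final String text;
        final Instant modified;

        public Row(long id, String text, Instant modified) {
            this.id = id;
            this.text = text;
            this.modified = modified;
        }
    }

    // Stand-in for the searchable cache or index, keyed by row id.
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    // High-water mark: the newest timestamp already applied to the cache.
    private Instant lastSeen = Instant.EPOCH;

    /**
     * One timer tick. A real deployment would fetch only the changed rows with a
     * query along the lines of "... WHERE modified > :lastSeen" (an assumption);
     * here the caller passes the table contents and the filter runs in memory.
     * Returns the number of rows reloaded.
     */
    public synchronized int scan(List<Row> table) {
        Instant newest = lastSeen;
        int reloaded = 0;
        for (Row row : table) {
            if (row.modified.isAfter(lastSeen)) {
                cache.put(row.id, row.text); // add or replace the cached entry
                reloaded++;
                if (row.modified.isAfter(newest)) {
                    newest = row.modified;
                }
            }
        }
        lastSeen = newest;
        return reloaded;
    }

    public String get(long id) {
        return cache.get(id);
    }
}
```

Two limits of this sketch: a row written later with a timestamp equal to the current high-water mark would be skipped, and deleted rows are never noticed, so both cases would need separate handling.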

       

      Which would be the better approach:

       

      1) Use JPA entities and Hibernate Search

          a) configure hibernate.search.default.directory_provider=infinispan

          b) at startup, build the index via Search.getFullTextEntityManager(entityManager).createIndexer()

          c) scan for updates and reload/evict them via the JPA EntityManager

       

      2) Set up Infinispan Query with an indexed cache

          a) at startup, load the data (JPA or JDBC) into the cache; Infinispan does the Lucene indexing

          b) scan for updates and add/update them in the cache

       

      I have parts of both working in test code (not yet from within AS 7).

       

      Note:

      - We'll never update the database from the application; it's read-only

      - the largest table has only ~35,000 rows, with 3-10 text columns to index

      - for now, this will run standalone only (no distribution necessary)

       

      Any advice much appreciated :-)

        • 1. Re: Best approach for database text index in JBoss 7 with Hibernate Search/Infinispan/Lucene?
          sannegrinovero

          Hi Anders,

          Using Hibernate Search looks like the simplest solution since you're loading from a database, and the MassIndexer will do most of the hard work.

           

          With Infinispan Query there is no MassIndexer, so besides loading from the database yourself you'll also have to make sure you can scan the full table at startup. That's not hard at all, but it quickly becomes more complex if you want to minimize boot time by using multiple threads for the initial indexing: multiple threads speed things up a lot with Hibernate Search, and a similar feature is not available in Infinispan Query yet (contributions are welcome, and it's not too hard since Search already provides most of the reusable logic).
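
To illustrate the kind of hand-rolled multi-threaded initial load this refers to, here is a rough sketch using plain java.util.concurrent. It is not the MassIndexer and not an Infinispan API: the class `ParallelInitialLoad`, the source map standing in for database reads, and the concurrent map standing in for the indexed cache are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of a multi-threaded initial load; names are hypothetical, not a library API. */
class ParallelInitialLoad {

    /**
     * Splits the row ids into batches and processes the batches on a fixed thread pool.
     * The source map stands in for database reads and must not be modified during the
     * load; the returned concurrent map stands in for the indexed cache.
     */
    public static Map<Long, String> load(Map<Long, String> source, int threads, int batchSize) {
        if (threads <= 0 || batchSize <= 0) {
            throw new IllegalArgumentException("threads and batchSize must be positive");
        }
        Map<Long, String> index = new ConcurrentHashMap<>();
        List<Long> ids = new ArrayList<>(source.keySet());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int start = 0; start < ids.size(); start += batchSize) {
                List<Long> batch = ids.subList(start, Math.min(start + batchSize, ids.size()));
                pending.add(pool.submit(() -> {
                    for (Long id : batch) {
                        index.put(id, source.get(id)); // read one row, then index it
                    }
                }));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait for the batch and surface any failure
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("initial load interrupted", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("a batch failed", e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        return index;
    }
}
```

For a table of roughly 35,000 rows, a call such as `ParallelInitialLoad.load(rows, 4, 500)` would submit about 70 batches to four threads; the thread and batch counts are arbitrary starting points, not tuned values.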

           

          Using Hibernate Search seems the simpler approach; if you want to try Infinispan to take advantage of in-memory caching, just enable an Infinispan second-level cache for Hibernate.

          • 2. Re: Best approach for database text index in JBoss 7 with Hibernate Search/Infinispan/Lucene?
            andersbohn

            Hi Sanne,

             

            thank you very much for your answer.

             

            It turns out the tables are not that easily mapped to the actual searches, and in the name of decoupling, I am currently experimenting with option 2. In neither solution can I get Infinispan Query working in JBoss 7, but I'll post about that separately.