1 Reply Latest reply on Nov 15, 2015 1:06 AM by gustavonalle

    Lucene over Infinispan + postgres

    asafz

      Hi,

      We are running Lucene over Infinispan in our production environment, with Postgres as the backing store. Infinispan runs inside Tomcat, is replicated between 2 Tomcat instances, and persists to a shared Postgres database.

      We found very little information about this topology, so I'm not sure whether it is supported or recommended.

      The problem is that the Postgres tables related to Infinispan keep growing out of proportion to the size of the data they hold.

      The table names start with public.ISPN_BUCKET_TABLE_* and they can grow by a few gigabytes a day, even though the real data is much smaller.

      We found that running a Postgres vacuum on the tables can reduce their size, but running a manual vacuum every few hours is not reasonable.

       

      - Is that a bug in the Lucene + Infinispan + Postgres combination, or wrong usage on our side?

      - Do you think I should use Infinispan over files instead of Postgres?

        • 1. Re: Lucene over Infinispan + postgres
          gustavonalle

          - Is that a bug in the Lucene + Infinispan + Postgres combination, or wrong usage on our side?

           

          Lucene continuously merges multiple segments into one according to defined policies, and during this process, up to 3x more space than the index size can be used temporarily.

          If your system is under heavy write load, merges will be more frequent, so it is reasonable to see the extra space you describe: up to several GB, depending on the size of the index.

          I suggest you try the property "hibernate.search.default.indexwriter.merge_max_size" to avoid big merges. You will end up with more segments in your index, at a performance penalty on searches (which may not be significant for you), but less temporary space will be used during routine merges.
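
          As a rough illustration of the suggestion above, the property can be set programmatically on a java.util.Properties object. This is only a sketch: the class name MergeSizeConfig and the value 512 are made up for the example, the value is assumed to be expressed in MB, and how the properties reach Hibernate (hibernate.properties, persistence.xml, or code) depends on your setup.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hibernate Search or Infinispan.
final class MergeSizeConfig {
    // Property name taken from the suggestion above.
    static final String MERGE_MAX_SIZE =
            "hibernate.search.default.indexwriter.merge_max_size";

    private MergeSizeConfig() {}

    // Builds a Properties object carrying the merge size limit.
    // The unit is assumed to be MB; pick the value by experiment.
    static Properties searchProperties(int maxMergeSizeMb) {
        Properties props = new Properties();
        props.setProperty(MERGE_MAX_SIZE, Integer.toString(maxMergeSizeMb));
        return props;
    }
}
```

          For example, MergeSizeConfig.searchProperties(512) yields a single entry that can be merged into the rest of the Hibernate configuration. A smaller limit should mean less transient space per merge, but more segments to search.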

           

          EDIT: If you are not using Hibernate Search to maintain the Lucene indexes, the equivalent is to set the limit directly on the Lucene merge policy, for example "setMaxMergeMB( value )" if using LogByteSizeMergePolicy, or "setMaxMergedSegmentMB( value )" if using TieredMergePolicy.
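
          A minimal sketch of the plain Lucene route, assuming Lucene is on the classpath. The class name CappedMergePolicies and the sizes used are illustrative only, and which policy applies depends on how your IndexWriter is configured.

```java
import org.apache.lucene.index.LogByteSizeMergePolicy;
import org.apache.lucene.index.TieredMergePolicy;

// Hypothetical helper class, shown only to illustrate the two setters.
final class CappedMergePolicies {
    private CappedMergePolicies() {}

    // TieredMergePolicy: caps the approximate size of a segment
    // produced by a normal (background) merge.
    static TieredMergePolicy tiered(double maxSegmentMb) {
        TieredMergePolicy policy = new TieredMergePolicy();
        policy.setMaxMergedSegmentMB(maxSegmentMb);
        return policy;
    }

    // LogByteSizeMergePolicy: segments larger than this size are
    // left out of normal merges.
    static LogByteSizeMergePolicy logByteSize(double maxMergeMb) {
        LogByteSizeMergePolicy policy = new LogByteSizeMergePolicy();
        policy.setMaxMergeMB(maxMergeMb);
        return policy;
    }
}
```

          The resulting policy would then be passed to IndexWriterConfig.setMergePolicy(...) before the IndexWriter is created.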