8 Replies Latest reply on Mar 15, 2017 1:04 PM by william.burns

    Question about infinispan query indexing.

    seto

      Hi. I'd like to ask about the index mode. What is the preferred index mode for CacheMode.LOCAL and CacheMode.DIST_SYNC when indexing auto-config is used? Should both use Index.LOCAL?

      I also have a question about filesystem storage. If two caches use filesystem storage and index the same class with the same default.indexBase folder, will that cause a problem? Should each cache have its own indexBase?

        • 1. Re: Question about infinispan query indexing.
          seto
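          import java.io.Serializable;

          import org.hibernate.search.annotations.Field;
          import org.hibernate.search.annotations.Indexed;
          import org.hibernate.search.annotations.IndexedEmbedded;

          // DataRef is the poster's own key-based reference type (not shown in the thread).
          @Indexed
          public class TestDataObject implements Serializable {

              @Field
              private String name;

              private DataRef<TestDataObject> childDataRef;

              public String getName() {
                  return name;
              }

              public void setName(String name) {
                  this.name = name;
              }

              public DataRef<TestDataObject> getChildDataRef() {
                  return childDataRef;
              }

              public void setChildDataRef(DataRef<TestDataObject> childDataRef) {
                  this.childDataRef = childDataRef;
              }

              // Resolves the referenced child so it can be embedded in this object's index entry.
              @IndexedEmbedded(depth = 2)
              public TestDataObject getChild() {
                  return childDataRef != null ? childDataRef.get() : null;
              }
          }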
          

          I also use DataRef<>, which holds a key, to refer to another object from within an object, so that the whole graph is not serialized.

          But this creates a problem: the parent is not re-indexed when the child is modified.

          I'll add a map that links the parent and the child. Is there a way to index an object by a specified key? I can only find MassIndexer.

          • 2. Re: Question about infinispan query indexing.
            seto

            I know that putting the parent again will trigger indexing again. But I think it will add extra performance cost on a clustered cache. I didn't actually change the data; I only need a way to re-index a parent object.

            • 3. Re: Question about infinispan query indexing.
              seto

              Also, I use auto-config for indexing. For a distributed cache, it uses org.infinispan.query.indexmanager.InfinispanIndexManager. I want the Infinispan caches that hold the indexes to be persistent, shared, and use eviction, so that I don't need to re-index the data every time I reboot the server.

              So I configured the caches as below.

               

              @ConfigureCache("LuceneIndexesMetadata")
              @Produces
              public Configuration luceneIndexesMetadataConfig() {
                  ConfigurationBuilder configurationBuilder = new ConfigurationBuilder();
                  distributeConfig(persistenceConfig(configurationBuilder));
                  return configurationBuilder.build();
              }

              @ConfigureCache("LuceneIndexesData")
              @Produces
              public Configuration luceneIndexesDataConfig() {
                  ConfigurationBuilder configurationBuilder = new ConfigurationBuilder();
                  distributeConfig(persistenceConfig(configurationBuilder));
                  return configurationBuilder.build();
              }

              @ConfigureCache("LuceneIndexesLocking")
              @Produces
              public Configuration luceneIndexesLockingConfig() {
                  ConfigurationBuilder configurationBuilder = new ConfigurationBuilder();
                  distributeConfig(persistenceConfig(configurationBuilder));
                  return configurationBuilder.build();
              }
              
              private ConfigurationBuilder distributeConfig(ConfigurationBuilder builder) {
                  builder
                      .clustering()
                          .cacheMode(CacheMode.DIST_SYNC);
                  return builder;
              }

              private ConfigurationBuilder persistenceConfig(ConfigurationBuilder configurationBuilder) {
                  configurationBuilder
                      .memory()
                          .evictionType(EvictionType.MEMORY)
                          .storageType(StorageType.BINARY)
                          .size(500 * 1024 * 1024)
                      .persistence()
                          .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
                              .async().enable()
                              .preload(preload)
                              .shared(shared)
                              .fetchPersistentState(fecthPersistentState)
                              .ignoreModifications(ignoreModifications)
                              .purgeOnStartup(purgeOnStartup)
                              .table()
                                  .createOnStart(createOnStart)
                                  .dropOnExit(dropOnExit)
                                  .tableNamePrefix("ISPN_STRING_TABLE")
                                  .idColumnName("ID_COLUMN").idColumnType("VARCHAR(255)")
                                  .dataColumnName("DATA_COLUMN").dataColumnType("BLOB")
                                  .timestampColumnName("TIMESTAMP_COLUMN").timestampColumnType("BIGINT")
                              .connectionPool()
                                  .connectionUrl(connectionUrl)
                                  .username(username)
                                  .password(password)
                                  .driverClass(driverClass);
                  return configurationBuilder;
              }
              

               

              But this causes a problem for the LuceneIndexesLocking cache.

              Caused by: java.lang.IllegalArgumentException: Size of Class class org.infinispan.remoting.transport.jgroups.JGroupsAddress cannot be determined using given entry size calculator :class org.infinispan.container.entries.PrimitiveEntrySizeCalculator

              • 4. Re: Question about infinispan query indexing.
                seto

                And also, queries do not work with the OFF_HEAP storage type: the result list size is 0. They work with the BINARY storage type. I mean the indexed caches, not the Lucene caches. The Lucene caches do not work with my custom configs, only with the default config.

                • 5. Re: Question about infinispan query indexing.
                  gustavonalle

                  Replying to all your questions here:

                   

                  Hi. I'd like to ask about the index mode. And what's the preferred index mode for CacheMode.LOCAL and CacheMode.DIST_SYNC

                   

                  For CacheMode.LOCAL, either index mode can be used; the effect is the same.

                  For CacheMode.DIST_SYNC, it depends on whether shared or non-shared indexes are used.

                   

                  For shared-indexes, see http://infinispan.org/docs/dev/user_guide/user_guide.html#effect_of_the_index_mode

                  and for non-shared see http://infinispan.org/docs/dev/user_guide/user_guide.html#effect_of_the_index_mode_2
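
                  A minimal sketch of where the index mode is set, assuming the Infinispan 9.0-era programmatic API that the configuration earlier in this thread uses (later releases changed the indexing configuration). The class and method names are hypothetical, and which Index value suits a DIST_SYNC cache is left as a parameter because it depends on the shared or non-shared choice described in the linked guide sections.

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.cache.Index;

// Hypothetical helper class; only the builder calls come from the Infinispan API.
final class IndexModeConfigSketch {

    /** Local cache: per the reply above, the chosen index mode has the same effect. */
    static Configuration localIndexed() {
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.clustering().cacheMode(CacheMode.LOCAL);
        builder.indexing().index(Index.LOCAL).autoConfig(true);
        return builder.build();
    }

    /** Distributed cache: the caller picks the mode after deciding on shared vs non-shared indexes. */
    static Configuration distributedIndexed(Index mode) {
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.clustering().cacheMode(CacheMode.DIST_SYNC);
        builder.indexing().index(mode).autoConfig(true);
        return builder.build();
    }
}
```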

                   

                   

                  And also I used DataRef<> with key to refer another object in the object to avoid serialization of the whole graph.

                   

                  But then there's a problem, the parent won't be indexed when the child is modified.

                   

                  @IndexedEmbedded effectively flattens the object graph into the parent's index entry. From the index's point of view, there is no parent and child, only a parent carrying the child's attributes.

                  What you are describing is essentially relational data modeling of an entity (against itself), which is not supported by Infinispan.
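
                  The flattening can be illustrated with a plain-Java sketch. This is not Infinispan or Hibernate Search code: Node, toDocument and the map standing in for an index entry are hypothetical, and the "child." prefix only mimics how embedded fields are named. It shows why a later change to the child does not reach a parent entry that was built earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of index flattening; all names here are hypothetical.
final class FlattenSketch {

    /** Stand-in for an indexed entity that embeds a child of the same type. */
    static final class Node {
        String name;
        Node child;

        Node(String name, Node child) {
            this.name = name;
            this.child = child;
        }
    }

    /** Builds a flat "index entry" for the parent: its own field plus child fields up to maxDepth levels. */
    static Map<String, String> toDocument(Node parent, int maxDepth) {
        Map<String, String> doc = new LinkedHashMap<>();
        String prefix = "";
        Node current = parent;
        for (int depth = 0; current != null && depth <= maxDepth; depth++) {
            doc.put(prefix + "name", current.name);
            prefix = prefix + "child.";
            current = current.child;
        }
        return doc;
    }

    public static void main(String[] args) {
        Node child = new Node("child-v1", null);
        Node parent = new Node("parent", child);

        Map<String, String> indexed = toDocument(parent, 2); // snapshot taken when the parent is put
        child.name = "child-v2";                             // the child changes afterwards

        System.out.println(indexed);                 // still holds child-v1
        System.out.println(toDocument(parent, 2));   // rebuilding the parent entry picks up child-v2
    }
}
```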

                   

                  Is there a way to index an object with a specified key? I can only find MassIndexer.

                  I know that put the parent again will trigger the indexing again. But it will produce other performance cost for cluster cache I think. Actually I didn't change the the data. I only need a way to re-index a parent object.

                   

                  Whenever a cache.put(K,V) happens on an indexed cache, the value is indexed according to the annotations on its class.

                  Since the underlying index in Infinispan is Lucene, there is no way to re-index only one field of an entity: you need to re-index the whole entity.
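
                  One way to act on this, sketched under assumptions: keep a child-to-parents map in the application and put each parent again whenever its child is written. The ConcurrentMap below stands in for an indexed Infinispan Cache (which implements ConcurrentMap); ParentReindexSketch, link and putChild are hypothetical names, not Infinispan API. In a cluster, every re-put is a regular write with the usual replication cost.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical bookkeeping helper; the map-typed cache is a stand-in for an indexed cache.
final class ParentReindexSketch<K, V> {
    private final ConcurrentMap<K, V> cache;
    private final Map<K, Set<K>> parentsByChild = new ConcurrentHashMap<>();

    ParentReindexSketch(ConcurrentMap<K, V> cache) {
        this.cache = cache;
    }

    /** Records that the entry under parentKey embeds the entry under childKey. */
    void link(K parentKey, K childKey) {
        parentsByChild.computeIfAbsent(childKey, k -> ConcurrentHashMap.newKeySet()).add(parentKey);
    }

    /** Stores the child, then puts each known parent again. Returns how many parents were re-put. */
    int putChild(K childKey, V child) {
        cache.put(childKey, child);
        int reput = 0;
        for (K parentKey : parentsByChild.getOrDefault(childKey, Collections.<K>emptySet())) {
            V parent = cache.get(parentKey);
            if (parent != null) {
                cache.put(parentKey, parent); // same value; on an indexed cache the write triggers indexing
                reput++;
            }
        }
        return reput;
    }
}
```

                  This sketch only follows one level of linkage; with depth = 2 the grandparents would need the same treatment.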

                   

                  But it will causes problem for LuceneIndexesLocking cache.

                  Caused by: java.lang.IllegalArgumentException: Size of Class class org.infinispan.remoting.transport.jgroups.JGroupsAddress cannot be determined using given entry size calculator :class org.infinispan.container.entries.PrimitiveEntrySizeCalculator

                   

                  This seems to be an issue with evictionType(EvictionType.MEMORY); we are investigating. In any case, the LuceneIndexesLocking cache doesn't need eviction: it is very small (most of the time it contains one entry), it is emptied when the index is closed, and it is recreated when the cache is started and data is indexed. So you could remove this config.
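
                  A minimal sketch of a locking-cache configuration without eviction, assuming the same Infinispan 9.0-era builder API as the configuration quoted in this thread. The class name is hypothetical, and DIST_SYNC is kept only because the original configuration used it. The memory() section is simply left untouched, so no eviction is configured.

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Hypothetical holder class; only the builder calls come from the Infinispan API.
final class LockingCacheConfigSketch {

    /** Configuration for the LuceneIndexesLocking cache with clustering only and no memory()/eviction settings. */
    static Configuration luceneIndexesLockingConfig() {
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.clustering().cacheMode(CacheMode.DIST_SYNC);
        return builder.build();
    }
}
```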

                   

                  And also, query is not working with OFF_HEAP storage type. The result list size is 0. It's working with BINARY storage type. I mean the indexed caches. Not lucene caches. Lucene caches is not working with my custom configs. It's working with the default config.

                   

                  Query should not be affected by OFF_HEAP usage, since indexes are stored elsewhere. Could you provide more details? Are there any errors related to indexing when putting data into the cache? Could you provide the full code, including the configuration for all caches (apart from the indexing one) and how the cache manager is created?

                  • 6. Re: Question about infinispan query indexing.
                    william.burns

                    Seto Kaiba wrote:

                     

                     

                    But it will causes problem for LuceneIndexesLocking cache.

                    Caused by: java.lang.IllegalArgumentException: Size of Class class org.infinispan.remoting.transport.jgroups.JGroupsAddress cannot be determined using given entry size calculator :class org.infinispan.container.entries.PrimitiveEntrySizeCalculator

                    This is an issue when using the BINARY storage type with MEMORY-based eviction. Store as Binary used to have special classes it didn't serialize [1], so it never worked with MEMORY-based eviction. We should be fine converting these classes to binary, though. I have created [2] to allow these types to be serialized as well.

                     

                    [1] infinispan/MarshallerConverter.java at abd4d744bdf2a8777295317bfc8a5771caf3dd66 · infinispan/infinispan · GitHub

                    [2] [ISPN-7612] We should marshall all classes when using converter - JBoss Issue Tracker

                    • 7. Re: Question about infinispan query indexing.
                      william.burns

                      I found that your configuration (i.e. using a size of 0 with OFF_HEAP) can cause an issue with pointers that crashes the JVM. I have fixed this issue as well [1]. Unfortunately, we can't reproduce OFF_HEAP causing indexing to fail. Is there any more detail you can provide?

                       

                      [1] [ISPN-7618] Bounded Off Heap can crash when a single entry size is larger than total size - JBoss Issue Tracker

                      • 8. Re: Question about infinispan query indexing.
                        william.burns

                        Also, I added some query tests using OFF_HEAP and BINARY and found some bugs where entries written through the putAll methods on Cache were not being indexed properly. These are fixed in [1]. I hope this was the issue you were encountering.

                         

                        [1] [ISPN-7627] Add test to verify off heap and binary storage work with indexing - JBoss Issue Tracker