
    Wildfly 8.2, Infinispan cache loader and cluster state

    Flemming Harms Novice

      Hi

       

      I hope somebody can help clarify how WildFly 8.2, the Infinispan cache loader, and cluster state work together.

       

      Suppose we have multiple WildFly nodes with the Infinispan configuration below, where a cache loader preloads the state from the database.

       

      What happens when node 2 or 3 comes online and the cache on node 1 has changed in the meantime? Does this setup guarantee that the nodes are consistent, even though the state is loaded from the database?


      Thanks!

      /Flemming

       

      <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                <eviction strategy="NONE" max-entries="-1"/>
                <expiration max-idle="-1"/>
                <string-keyed-jdbc-store singleton="true"  preload="true"  passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                       <write-behind />
                       <property name="key2StringMapper">
                                  org.infinispan.lucene.LuceneKey2StringMapper
                      </property>
                      <string-keyed-table>
                            <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                            <data-column name="DATA_COLUMN" type="bytea"/>
                            <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                        </string-keyed-table>
                </string-keyed-jdbc-store>
               <indexing index="NONE"/>
      </replicated-cache>
                      
      <replicated-cache name="LuceneIndexesData" start="EAGER" mode="SYNC" remote-timeout="330000">
                   <eviction strategy="NONE" max-entries="-1"/>
                   <expiration max-idle="-1"/>
                   <string-keyed-jdbc-store  singleton="true"  preload="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                         <write-behind />
                          <property name="key2StringMapper">
                                  org.infinispan.lucene.LuceneKey2StringMapper
                          </property>
                          <string-keyed-table>
                              <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                              <data-column name="DATA_COLUMN" type="bytea"/>
                              <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                          </string-keyed-table>
                      </string-keyed-jdbc-store>
                      <indexing index="NONE"/>
      </replicated-cache>
      
        • 2. Re: Wildfly 8.2, Infinispan cache loader and cluster state
          Paul Ferraro Master

          For starters, you are using singleton="true", so only a single cluster node (i.e. the coordinator) will ever interact with the cache store.  Is that intentional?  Since any new node coming online is, by definition, not the coordinator, it will never preload or contribute state from its cache store; on startup, it will only receive state from the other nodes.

          To answer your specific question, yes, the cache state will be consistent.  While a given segment of data is being transferred to a newly joining node, any updates to that state will wait to be applied.

          • 3. Re: Wildfly 8.2, Infinispan cache loader and cluster state
            Flemming Harms Novice

            For starters, you are using singleton="true" - so only a single cluster node (i.e. the coordinator) will ever interact with the cache store.  Is that intentional?

            Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

             

            <string-keyed-jdbc-store preload="true" shared="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                <write-behind/>
                <property name="key2StringMapper">
                    org.infinispan.lucene.LuceneKey2StringMapper
                </property>
                <string-keyed-table>
                    <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                    <data-column name="DATA_COLUMN" type="bytea"/>
                    <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                </string-keyed-table>
            </string-keyed-jdbc-store>
            

             

            While a given segment of data is being transferred to a newly joining node, any updates to that state will wait to be applied.

            But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

            • 4. Re: Wildfly 8.2, Infinispan cache loader and cluster state
              Paul Ferraro Master

              Flemming Harms wrote:

              Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

              If you are using write-behind, then you lose any guarantees of consistency between the cache store and the in-memory state.  If this is undesirable, do not enable fetch-state - or do not use write-behind.
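
              As an untested sketch, reusing the attribute names already shown in this thread, the write-through variant of the store would simply omit the <write-behind/> element:

              <string-keyed-jdbc-store shared="true" preload="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                  <!-- no <write-behind/>: writes go straight through to the database -->
                  <property name="key2StringMapper">
                      org.infinispan.lucene.LuceneKey2StringMapper
                  </property>
                  <string-keyed-table>
                      <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                      <data-column name="DATA_COLUMN" type="bytea"/>
                      <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                  </string-keyed-table>
              </string-keyed-jdbc-store>

              The other option is to keep <write-behind/> but set fetch-state="false", accepting that the store may lag behind the in-memory state.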

              Flemming Harms wrote:

              But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

              state-transfer is enabled automatically for replicated caches.  Typically, this is far more effective/efficient than loading state from a shared cache store.
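
              If you want to make it explicit or tune it, the state-transfer element can be declared on the cache itself; the values below are purely illustrative and the exact attribute set depends on your subsystem schema version:

              <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                  <!-- on by default for replicated caches; declared here only to make it explicit -->
                  <state-transfer enabled="true" timeout="60000" chunk-size="512"/>
                  <!-- cache store configuration as in the earlier posts -->
              </replicated-cache>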

              • 5. Re: Wildfly 8.2, Infinispan cache loader and cluster state
                Flemming Harms Novice

                Thanks Paul!

                 

                I actually found out that purge="true" is the default for the cache store, which matters when you expect it to preload data, since the persisted entries are purged at startup.
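
                For reference, a sketch of what setting it explicitly would look like (only the store element shown, rest of the definition unchanged):

                <string-keyed-jdbc-store purge="false" preload="true" shared="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                    <!-- purge="false" keeps the persisted entries across restarts, so preload has something to load -->
                    <!-- key2StringMapper property and string-keyed-table as in the earlier posts -->
                </string-keyed-jdbc-store>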

                • 6. Re: Wildfly 8.2, Infinispan cache loader and cluster state
                  Dan Berindei Expert

                  Flemming Harms wrote:

                  Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

                  If you are using write-behind, then you lose any guarantees of consistency between the cache store and the in-memory state.  If this is undesirable, do not enable fetch-state - or do not use write-behind.

                   

                  Yes, unfortunately write-behind can cause inconsistencies during join, either with shared="true" or with singleton="true".  I have created ISPN-5575 for this, but the problem is rather complex so I'd rather avoid write-behind.

                   

                  Write-behind can also lose updates when a node crashes. singleton="true" actually has some protection against this, as after a coordinator crash the new coordinator will flush all its in-memory data to the store. But this still misses entries that are no longer in memory because they were removed or evicted.

                   

                  Flemming Harms wrote:

                  But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

                  state-transfer is enabled automatically for replicated caches.  Typically, this is far more effective/efficient than loading state from a shared cache store.

                   

                  Agreed, in most cases it's better to leave state transfer enabled. You could disable in-memory state transfer and rely only on a shared store, but only if the store isn't write-behind.
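
                  If you did want to go that route, the shape would be roughly the following (untested sketch, and only safe with a shared, write-through store):

                  <replicated-cache name="LuceneIndexesData" start="EAGER" mode="SYNC" remote-timeout="330000">
                      <!-- rely on the shared store instead of in-memory state transfer -->
                      <state-transfer enabled="false"/>
                      <!-- shared and write-through: shared="true" and no <write-behind/> -->
                      <string-keyed-jdbc-store shared="true" preload="true" purge="false" passivation="false" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                          <!-- key2StringMapper property and string-keyed-table as in the earlier posts -->
                      </string-keyed-jdbc-store>
                  </replicated-cache>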

                   

                  Note that you can also speed up getCache() with await-initial-transfer="false".
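
                  That attribute sits on the state-transfer element; whether your subsystem schema version exposes it there is worth checking, so treat this as a sketch:

                  <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                      <!-- getCache() returns before the initial state transfer finishes; reads may miss entries until it completes -->
                      <state-transfer enabled="true" await-initial-transfer="false"/>
                      <!-- cache store configuration as before -->
                  </replicated-cache>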