
    Wildfly 8.2, Infinispan cache loader and cluster state

    Flemming Harms Novice

      Hi

       

      I hope somebody can help clarify how WildFly 8.2, the Infinispan cache loader, and cluster state work together.

       

      Suppose we have multiple WildFly nodes with the Infinispan configuration below, where a cache loader preloads the state from the database.

       

      What happens when node 2 or 3 comes online and the cache on node 1 has changed in the meantime? Does this setup guarantee that the nodes are consistent, even though the state is loaded from the database?


      Thanks!

      /Flemming

       

      <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                <eviction strategy="NONE" max-entries="-1"/>
                <expiration max-idle="-1"/>
                <string-keyed-jdbc-store singleton="true"  preload="true"  passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                       <write-behind />
                       <property name="key2StringMapper">
                                  org.infinispan.lucene.LuceneKey2StringMapper
                      </property>
                      <string-keyed-table>
                            <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                            <data-column name="DATA_COLUMN" type="bytea"/>
                            <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                        </string-keyed-table>
                </string-keyed-jdbc-store>
               <indexing index="NONE"/>
      </replicated-cache>
                      
      <replicated-cache name="LuceneIndexesData" start="EAGER" mode="SYNC" remote-timeout="330000">
                   <eviction strategy="NONE" max-entries="-1"/>
                   <expiration max-idle="-1"/>
                   <string-keyed-jdbc-store  singleton="true"  preload="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                         <write-behind />
                          <property name="key2StringMapper">
                                  org.infinispan.lucene.LuceneKey2StringMapper
                          </property>
                          <string-keyed-table>
                              <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                              <data-column name="DATA_COLUMN" type="bytea"/>
                              <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                          </string-keyed-table>
                      </string-keyed-jdbc-store>
                      <indexing index="NONE"/>
      </replicated-cache>
      
        • 2. Re: Wildfly 8.2, Infinispan cache loader and cluster state
          Paul Ferraro Master

          For starters, you are using singleton="true", so only a single cluster node (i.e. the coordinator) will ever interact with the cache store.  Is that intentional?  Since any new node coming online is, by definition, not the coordinator, it will never preload or contribute state from its cache store; on startup, it will only receive state from the other nodes.

          To answer your specific question, yes, the cache state will be consistent.  While a given segment of data is being transferred to a newly joining node, any updates to that state will wait to be applied.

          • 3. Re: Wildfly 8.2, Infinispan cache loader and cluster state
            Flemming Harms Novice

            For starters, you are using singleton="true" - so only a single cluster node (i.e. the coordinator) will ever interact with the cache store.  Is that intentional?

            Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

             

            <string-keyed-jdbc-store preload="true" shared="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                <write-behind/>
                <property name="key2StringMapper">
                    org.infinispan.lucene.LuceneKey2StringMapper
                </property>
                <string-keyed-table>
                    <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                    <data-column name="DATA_COLUMN" type="bytea"/>
                    <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                </string-keyed-table>
            </string-keyed-jdbc-store>
            

             

            While a given segment of data is being transferred to a newly joining node, any updates to that state will wait to be applied.

            But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

            • 4. Re: Wildfly 8.2, Infinispan cache loader and cluster state
              Paul Ferraro Master

              Flemming Harms wrote:

              Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

              If you are using write-behind, then you lose any guarantees of consistency between the cache store and the in-memory state.  If this is undesirable, do not enable fetch-state - or do not use write-behind.
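
              As an untested sketch, reusing the attribute names already shown in this thread, the write-through variant of the store would simply omit the <write-behind/> element:

              <string-keyed-jdbc-store shared="true" preload="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                  <!-- no <write-behind/>: writes go straight through to the database -->
                  <property name="key2StringMapper">
                      org.infinispan.lucene.LuceneKey2StringMapper
                  </property>
                  <string-keyed-table>
                      <id-column name="ID_COLUMN" type="VARCHAR(255)"/>
                      <data-column name="DATA_COLUMN" type="bytea"/>
                      <timestamp-column name="TIMESTAMP_COLUMN" type="BIGINT"/>
                  </string-keyed-table>
              </string-keyed-jdbc-store>

              The other option is to keep <write-behind/> but set fetch-state="false", accepting that the store may lag behind the in-memory state.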

              Flemming Harms wrote:

              But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

              state-transfer is enabled automatically for replicated caches.  Typically, this is far more effective/efficient than loading state from a shared cache store.
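
              If you want to make it explicit or tune it, the state-transfer element can be declared on the cache itself; the values below are purely illustrative and the exact attribute set depends on your subsystem schema version:

              <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                  <!-- on by default for replicated caches; declared here only to make it explicit -->
                  <state-transfer enabled="true" timeout="60000" chunk-size="512"/>
                  <!-- cache store configuration as in the earlier posts -->
              </replicated-cache>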

              • 5. Re: Wildfly 8.2, Infinispan cache loader and cluster state
                Flemming Harms Novice

                Thanks Paul!

                 

                I actually found out that purge="true" is the default for the cache store, which matters when you expect it to preload data, since the persisted entries are purged at startup.
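
                For reference, a sketch of what setting it explicitly would look like (only the store element shown, rest of the definition unchanged):

                <string-keyed-jdbc-store purge="false" preload="true" shared="true" passivation="false" fetch-state="true" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                    <!-- purge="false" keeps the persisted entries across restarts, so preload has something to load -->
                    <!-- key2StringMapper property and string-keyed-table as in the earlier posts -->
                </string-keyed-jdbc-store>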

                • 6. Re: Wildfly 8.2, Infinispan cache loader and cluster state
                  Dan Berindei Expert

                  Flemming Harms wrote:

                  Yes and no. I was concerned that our original setup with shared="true" and write-behind could lead to inconsistent data, because the nodes would load state from the database that might not be fully in sync with the cache due to write-behind.

                  If you are using write-behind, then you lose any guarantees of consistency between the cache store and the in-memory state.  If this is undesirable, do not enable fetch-state - or do not use write-behind.

                   

                  Yes, unfortunately write-behind can cause inconsistencies during join, either with shared="true" or with singleton="true".  I have created ISPN-5575 for this, but the problem is rather complex so I'd rather avoid write-behind.

                   

                  Write-behind can also lose updates when a node crashes. singleton="true" actually has some protection against this, as after a coordinator crash the new coordinator will flush all its in-memory data to the store. But this still misses entries that are no longer in memory because they were removed or evicted.

                   

                  Flemming Harms wrote:

                  But does this mean you have to enable <state-transfer enabled="true" />, or can we use a shared cache loader to achieve the same?

                  state-transfer is enabled automatically for replicated caches.  Typically, this is far more effective/efficient than loading state from a shared cache store.

                   

                  Agreed, in most cases it's better to leave state transfer enabled. You could disable in-memory state transfer and rely only on a shared store, but only if the store isn't write-behind.
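
                  If you did want to go that route, the shape would be roughly the following (untested sketch, and only safe with a shared, write-through store):

                  <replicated-cache name="LuceneIndexesData" start="EAGER" mode="SYNC" remote-timeout="330000">
                      <!-- rely on the shared store instead of in-memory state transfer -->
                      <state-transfer enabled="false"/>
                      <!-- shared and write-through: shared="true" and no <write-behind/> -->
                      <string-keyed-jdbc-store shared="true" preload="true" purge="false" passivation="false" datasource="java:jboss/datasources/PostgresDS" dialect="POSTGRES">
                          <!-- key2StringMapper property and string-keyed-table as in the earlier posts -->
                      </string-keyed-jdbc-store>
                  </replicated-cache>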

                   

                  Note that you can also speed up getCache() with await-initial-transfer="false".
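
                  That attribute sits on the state-transfer element; whether your subsystem schema version exposes it there is worth checking, so treat this as a sketch:

                  <replicated-cache name="LuceneIndexesMetadata" start="EAGER" mode="SYNC" remote-timeout="330000">
                      <!-- getCache() returns before the initial state transfer finishes; reads may miss entries until it completes -->
                      <state-transfer enabled="true" await-initial-transfer="false"/>
                      <!-- cache store configuration as before -->
                  </replicated-cache>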