13 Replies Latest reply on Feb 2, 2015 1:26 PM by ma6rl

    Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?

    ma6rl

      I have asked this question over on the Infinispan forum and Stackoverflow

       

      http://stackoverflow.com/questions/28127428/enlisting-a-infinispan-cache-store-in-a-cache-transaction

      https://developer.jboss.org/message/916877#916877

       

      but was hoping you guys might have some answers, given that Infinispan is the persistence tier in ModeShape.

       

      I'm using ModeShape 4.0 with WildFly 8.2 and have created a transactional cache that is used by ModeShape and backed by a JDBC cache store:

       

      <transport lock-timeout="60000"/>
      <replicated-cache name="repo" mode="SYNC">
          <transaction mode="NON_XA" locking="PESSIMISTIC"/>
          <locking isolation="READ_COMMITTED" striping="false"/>
          <string-keyed-jdbc-store shared="true" preload="false" passivation="false" purge="false" datasource="java:jboss/datasources/MyDS">
              <string-keyed-table prefix="modeshape">
                  <id-column name="id" type="VARCHAR(200)"/>
                  <data-column name="datum" type="LONGBLOB"/>
                  <timestamp-column name="version" type="BIGINT"/>
              </string-keyed-table>
          </string-keyed-jdbc-store>
      </replicated-cache>
      </cache-container>
      

       

      My concern, after reading the following in the Infinispan documentation:

      4.5. Cache Loaders and transactional caches

      When a cache is transactional and a cache loader is present, the cache loader won’t be enlisted in the transaction in which the cache is part. That means that it is possible to have inconsistencies at cache loader level: the transaction to succeed applying the in-memory state but (partially) fail applying the changes to the store. Manual recovery would not work with caches stores.

      is that there is no guarantee that, just because a transaction commits or rolls back in the cache, the cache store is also committed or rolled back correctly. Given that ModeShape typically creates or updates multiple cache entries when adding or removing nodes, is there not a chance that over time the cache and the cache store become out of sync? If so, that seems to imply there is no guaranteed way to ensure that all nodes created in ModeShape and stored in Infinispan will survive application restarts.

       

      Any information you can provide on this will be most appreciated.

        • 1. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
          hchiorean

          Unfortunately this is something that the ISPN guys should answer/elaborate on, since we don't have in depth knowledge of the inner-workings of Infinispan.

           

          However, what I do know is that a decision was made a while back that ISPN will not support XA cache stores (which is what I suspect the statement refers to). You can see/read more info here: https://www.mail-archive.com/infinispan-dev@lists.jboss.org/msg06425.html and [ISPN-604] Re-design CacheStore transactions - JBoss Issue Tracker (if you haven't already done so). So to my knowledge, the previous statement applies to a clustered environment and all but the originating node.

          • 2. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
            ma6rl

            Thanks hchiorean

             

            After spending more time researching this and looking through both the Infinispan source code and issue tracker, including the links above (which are very helpful), I've confirmed that cache stores are never enlisted in the cache transaction. The call to the store is made in the commit phase, each cache entry added in the transaction is written in a separate call to the store, and no attempt is made to roll back the store if any of the writes fail.

             

            This issue is only relevant when using the JDBC Cache Store, as none of the other cache store implementations support transactions, and I can see why in the majority of cases it would not work (the cache store resource and the JTA transaction manager are on different nodes). The one scenario where this could work, if cache stores were enlisted in transactions, is a shared JDBC Cache Store, which is what I use.

             

            Given that ModeShape's persistence implementation uses Infinispan to persist nodes, even when transactions are used to ensure that all of the changes are either committed or rolled back, there is no guarantee that the writes to the underlying cache store are committed or rolled back correctly, so the store may end up with a different set of entries from the in-memory cache. The in-memory cache will mask the inconsistencies until the application is restarted. While this may not happen frequently, it will eventually happen, which means you cannot store data in ModeShape with any confidence that it will be there after the application is restarted.

             

            It appears that Infinispan does provide a manual recovery mechanism, as discussed in the above link, although there is some controversy around its usability with cache stores. The above link says it should work, as does the JIRA issue, but the official documentation (see the quote in the first post) says it does not. Either way, this is not an ideal solution when scaling to hundreds of writes a second and billions of nodes, which is our long-term goal.

             

            This is kind of a deal breaker when it comes to using ModeShape as a database for business-critical data that needs to survive an application restart, and it really limits its uses, which I'm sure is not the intention of the ModeShape developers. I understand that this is not something the ModeShape developers can fix directly, but given Infinispan's limitations around this, is there any intention to look at more reliable persistence approaches in the future?

            • 3. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
              hchiorean

              We're happy to consider other persistence alternatives, but we haven't done so up to this point.

              IIRC one of the main reasons we chose Infinispan in the first place was the clustering/distribution capabilities across different types of persistent stores and its out-of-the-box integration with the JBoss family of containers.

              • 4. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                rhauch

                If you're concerned with the lack of XA support within the database cache store, please talk directly with the Infinispan developers because they do not believe it is a problem. After all, Infinispan itself is an XA resource, so it is indeed participating in your application transactions. I don't think it's a foregone conclusion that the cache stores have to be XA resources in order for everything to work properly in a fault-tolerant manner.

                • 5. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                  ma6rl

                  rhauch, I've asked the questions in the Infinispan forum but based on past experience they are not as responsive as you and Horia which is why I was hoping to engage your help.

                   

                  Can you please elaborate on

                  Randall Hauch wrote:

                   

                  I don't think it's a foregone conclusion that the cache stores have to be XA resources in order for everything to work properly in a fault-tolerant manner.

                  I've spent some time going through both the Cache Store framework and the StringBasedJDBCStore implementation, and what I can see is:

                   

                  1. The call to store the entry is done in the commit phase

                  2. Each entry is stored in a separate call to the store which can either succeed or fail

                  3. If an error occurs a Persistence Exception is thrown, and previous calls to the store are not rolled back

                  4. The commit fails and the cache is not updated. At this point I'm a little hazy on what happens to other XA resources in the transaction but I believe those which have already committed are left committed.

                   

                  Given this I'm not sure how a cache-store can be fault tolerant.

                   

                  Part of the problem is that there is no batching in the Cache Store. If the framework passed all of the entries that needed to be persisted to the cache store at the same time, there would at least be an opportunity to perform updates in batch and attempt to clean up if something goes wrong.
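To make the batching idea concrete, here is a minimal sketch of a store that receives the whole write set in one call and undoes its own partial work on failure. It is a toy in-memory model, not Infinispan code: `BatchedStoreSketch`, `BatchingStore`, `writeBatch` and the `failOnKey` failure trigger are all hypothetical names invented for illustration, and the sketch assumes stored values are never null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these types are Infinispan classes. */
final class BatchedStoreSketch {

    /** Hypothetical store that is handed the whole write set of a transaction at once. */
    static final class BatchingStore {
        final Map<String, String> rows = new LinkedHashMap<>();
        private final String failOnKey; // key whose write fails, to simulate an I/O error

        BatchingStore(String failOnKey) {
            this.failOnKey = failOnKey;
        }

        private void writeOne(String key, String value) {
            if (key.equals(failOnKey)) {
                throw new IllegalStateException("simulated failure writing " + key);
            }
            rows.put(key, value);
        }

        /** Applies every entry, or restores the previous rows and rethrows. */
        void writeBatch(Map<String, String> batch) {
            // Previous value per touched key; null means the key was absent.
            Map<String, String> undo = new LinkedHashMap<>();
            try {
                for (Map.Entry<String, String> e : batch.entrySet()) {
                    undo.put(e.getKey(), rows.get(e.getKey()));
                    writeOne(e.getKey(), e.getValue());
                }
            } catch (IllegalStateException failure) {
                // Compensate: put back what was there before this batch started.
                for (Map.Entry<String, String> u : undo.entrySet()) {
                    if (u.getValue() == null) {
                        rows.remove(u.getKey());
                    } else {
                        rows.put(u.getKey(), u.getValue());
                    }
                }
                throw failure;
            }
        }
    }

    public static void main(String[] args) {
        BatchingStore store = new BatchingStore("node-3");
        store.rows.put("node-1", "old");

        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("node-1", "new");
        batch.put("node-2", "child-a");
        batch.put("node-3", "child-b"); // this write fails

        try {
            store.writeBatch(batch);
        } catch (IllegalStateException expected) {
            System.out.println("batch failed: " + expected.getMessage());
        }
        // node-1 is back to "old" and node-2 has been removed again
        System.out.println("store = " + store.rows);
    }
}
```

With a JDBC-backed store the same effect would more naturally come from running the whole batch on one connection with auto-commit disabled and then calling `commit()` or `rollback()`. Note that hand-written compensation like the above can itself fail part way through, so it narrows the window for inconsistency rather than closing it.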

                  • 6. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                    rhauch

                    Infinispan has some fault-handling capability (I believe above the cache stores), so it's meaningless to evaluate the fault tolerance of the total system without taking this into account. We simply cannot keep informed about the inner workings of Infinispan as they release new versions, so you'll have to go to their community to get detailed information.

                    • 7. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                      ma6rl

                      rhauch, I've now had some feedback from the Infinispan community, both on their forum and on Stack Overflow. The general gist is that cache stores are currently not fault tolerant and can over time become out of sync with the cache.

                       

                      From Re: Enlist Cache Store in a Cache Transaction?

                      The short answer is: no.

                      We are thinking of allowing this for a very narrow case: with shared cachestores, where only the node which initiates the tx has the onus of writing all changes. However, because of the way the data is distributed in the cluster, the initiator node will need to retrieve all entries modified in the tx from the respective owners, and persist them.

                      From http://stackoverflow.com/questions/28127428/enlisting-a-infinispan-cache-store-in-a-cache-transaction/28146994#28146994

                      It applies to writes as well, failure to write to the store does not affect rest of the transaction.

                      The reason for this is that the actual persistence API is not transactional. Therefore, with 2-phase commits (in first phase - prepare - all locks are acquired, in second one - commit - the write is executed) the write to the store is executed in the second phase. Therefore, the failure cannot rollback changes on different nodes.

                      That said, there are potential plans to support enlisting cache stores in XA Tx in certain use cases. I've also put forward a proposal that could potentially allow a Cache Store implementation to batch writes and attempt to roll back when participating in an XA Tx is not possible.

                       

                      Given this, do you think it might be useful to update the ModeShape documentation to include this information, so that potential users evaluating ModeShape as a solution are aware of the limitations?

                      • 8. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                        hchiorean

                        Just for completeness, I think the main scenario where this problem (potentially) manifests itself is when running a cluster with multiple cache stores (most likely 1 per node).

                        However, especially when using databases (i.e. the JDBC cache store) this problem should not come up with invalidation mode where there is, in effect, just 1 cache store which should always be in-sync with whatever node made the changes last. I think this is a viable alternative, since the back-end in an invalidation-mode cluster can be independently configured for HA.

                        • 9. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                          ma6rl

                          hchiorean I'm not sure I understand your previous reply. Given the information provided in the links above by the Infinispan community I don't believe you can assume:

                           

                          Horia Chiorean wrote:

                           

                          I think the main scenario where this problem manifests itself (potentially) is when running a cluster with multiple cache stores

                          Horia Chiorean wrote:

                           

                          in effect, just 1 cache store which should always be in-sync with whatever node made the changes last.

                          The current issue with cache stores is that Infinispan only makes a best effort to ensure they remain in sync with the cache (transactional or not).

                           

                          Currently, when using transactions, Infinispan updates the cache store and cache as follows:

                           

                          1. A lock on the key is obtained either at write time (pessimistic) or prepare time (optimistic)

                          2. Infinispan then attempts to update the cache store for each key being modified in the transaction. This is done in the commit stage of the transaction (or via a one phase commit if using pessimistic locking). This step NEVER participates in a transaction and each modification is done as a separate call to the store.

                          3. If all of the modifications to the cache store were successful then the changes to the cache are marked as successfully committed, otherwise the changes to the cache are rolled back. No attempt is ever made to clean up any modifications to the cache store that occurred in the transaction prior to a failure.

                           

                          This means that if an error occurs while writing to a cache store (and where networking and IO are involved, errors always occur), any modifications made prior to the error will remain in the cache store but NOT be reflected in the cache. If the cache is then destroyed (either the single node in a local cache or all nodes in the cluster in a replicated cache are shut down) and the node(s) are restarted, the cache will be rebuilt from the cache store, which may NOT represent the expected state of the application because it became out of sync with the cache before the cache was destroyed.
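The failure mode described above can be reproduced with a small in-memory simulation. This is only a sketch of the sequence as I understand it, not Infinispan code: `StoreDivergenceSketch`, `FlakyStore`, `commit` and the `failOnKey` trigger are hypothetical names, and the store is modelled as a plain map whose writes are independent and immediately durable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these types are Infinispan classes. */
final class StoreDivergenceSketch {

    /** Hypothetical store where every write is its own independent, immediately durable call. */
    static final class FlakyStore {
        final Map<String, String> rows = new LinkedHashMap<>();
        private final String failOnKey; // key whose write fails, to simulate an I/O error

        FlakyStore(String failOnKey) {
            this.failOnKey = failOnKey;
        }

        void write(String key, String value) {
            if (key.equals(failOnKey)) {
                throw new IllegalStateException("simulated failure writing " + key);
            }
            rows.put(key, value);
        }
    }

    /**
     * Store first, one call per key; the in-memory cache is updated only if
     * every store write succeeded. Earlier store writes are never undone.
     */
    static boolean commit(Map<String, String> modifications,
                          Map<String, String> cache,
                          FlakyStore store) {
        try {
            for (Map.Entry<String, String> e : modifications.entrySet()) {
                store.write(e.getKey(), e.getValue());
            }
        } catch (IllegalStateException failure) {
            return false; // cache stays untouched, partial store writes remain
        }
        cache.putAll(modifications);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new LinkedHashMap<>();
        FlakyStore store = new FlakyStore("node-3");

        Map<String, String> tx = new LinkedHashMap<>();
        tx.put("node-1", "parent");
        tx.put("node-2", "child-a");
        tx.put("node-3", "child-b"); // this write fails

        System.out.println("committed = " + commit(tx, cache, store));
        System.out.println("cache     = " + cache);
        System.out.println("store     = " + store.rows);

        // "Restart": the cache is rebuilt from whatever the store holds.
        Map<String, String> rebuilt = new LinkedHashMap<>(store.rows);
        System.out.println("rebuilt   = " + rebuilt);
    }
}
```

Running it shows the transaction reported as failed and the cache empty, while the store (and therefore the cache rebuilt after a restart) contains `node-1` and `node-2` from the transaction that was supposedly rolled back.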

                           

                          This behavior should be expected when using a non-transactional cache store (e.g. the file system), as you cannot reasonably expect file system updates to be rolled back, but when you use a cache store that can support transactions (e.g. JDBC) it is not unreasonable to expect it to participate in a transaction. Given that this is not the case, there is currently no way to configure ModeShape to ensure that the cache store will always correctly represent the application's state.

                           

                          While this may be acceptable in some JCR use cases where consistency is not an issue, there are many cases where a JCR repository does need to be at least eventually consistent, if not strongly consistent, and currently ModeShape cannot guarantee this.

                           

                          As per the links above, the Infinispan community is aware of this and is looking at supporting shared XA cache stores (e.g. JDBC) enlisting in the transaction, but if this is implemented it will most likely be part of 7.x or 8.x, whereas ModeShape currently uses 6.0.2.Final and is unlikely to move forward for a while, given that WildFly 8 and most likely the next EAP also use 6.0.2.Final.

                          • 10. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                            hchiorean

                            My previous statement was related to the way I understand local (non-clustered) caches work: ISPN will effectively register tx synchronizations (either if useSynchronization=true or preemptively as an optimization) for making both cache & cache store changes once a transaction completes. If something fails - i.e. the transaction does not complete successfully - both cache & cache store are rolled back and effectively kept in-sync.

                            When running in invalidation mode with a shared cache store, I assumed (perhaps wrongly) that since there is only 1 cache store instance in the environment, tx recovery would work automatically (not using synchronizations but as part of 2PC), keeping the cache store & caches in sync.

                             

                            EDIT: after reading a bit more on invalidation mode from Consistency guarantees in Infinispan · infinispan/infinispan Wiki · GitHub:

                            Because data is stored in the shared store after the invalidation command was executed on all the nodes, a node might execute the invalidation command first, then read the old value from the shared store, all before the originator managed to update the value in the store. If that happens, the node will keep the stale entry until it expires or there it is updated by another write operation.

                            it would seem that this type of setup isn't reliable at all (regardless of the XA contract of cache stores) since it can very well happen for a node in the cluster to hold onto stale data.
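The interleaving described in that wiki passage can be replayed deterministically with a toy model. This is only a sketch under stated assumptions, not Infinispan code: `InvalidationRaceSketch`, `Node`, `invalidate` and `read` are hypothetical names, the shared store is a plain map, and the three steps are simply executed in the unlucky order.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these types are Infinispan classes. */
final class InvalidationRaceSketch {

    /** Hypothetical cluster node: a local cache in front of a shared store. */
    static final class Node {
        final Map<String, String> local = new HashMap<>();
        private final Map<String, String> sharedStore;

        Node(Map<String, String> sharedStore) {
            this.sharedStore = sharedStore;
        }

        /** Drops the local copy, as an invalidation command would. */
        void invalidate(String key) {
            local.remove(key);
        }

        /** A local miss falls through to the shared store and caches the result. */
        String read(String key) {
            return local.computeIfAbsent(key, sharedStore::get);
        }
    }

    public static void main(String[] args) {
        Map<String, String> sharedStore = new HashMap<>();
        sharedStore.put("k", "v1");

        Node originator = new Node(sharedStore);
        Node reader = new Node(sharedStore);
        reader.read("k"); // reader starts out caching v1

        // The originator updates k to v2, but the steps interleave badly:
        reader.invalidate("k");          // 1. invalidation reaches the reader
        reader.read("k");                // 2. reader reloads from the store, which still holds v1
        sharedStore.put("k", "v2");      // 3. only now does the originator's write reach the store
        originator.local.put("k", "v2");

        System.out.println("store      = " + sharedStore.get("k"));
        System.out.println("originator = " + originator.read("k"));
        System.out.println("reader     = " + reader.read("k")); // stale until expiry or another write
    }
}
```

After step 3 the store and the originator hold `v2` while the reader keeps serving `v1` from its local cache, which is the stale-entry case the wiki describes.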

                             

                            Regarding ModeShape moving to ISPN >= 7 there are/were a couple of issues:

                            - [ISPN-4983] Public API for tracking completion of Infinispan work for a given user transaction - JBoss Issue Tracker - this was the main blocking issue for us, but it seems to have been fixed in 7.1.0.Final

                            - ISPN 7 is not backwards compatible (API-wise) with 6.x. This is currently the main issue for us as moving to this version of Infinispan means giving up support for the Wildfly 8.x series. Once Wildfly 9 goes final, if it uses a version of ISPN >= 7.1.0.Final, we might consider moving, but not before that.

                            • 11. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                              ma6rl

                              Horia Chiorean wrote:

                               

                              If something fails - i.e. the transaction does not complete successfully - both cache & cache store are rolled back and effectively kept in-sync.

                              When running in invalidation mode with a shared cache store, I assumed (perhaps wrongly) that since there is only 1 cache store instance in the environment, tx recovery would work automatically (not using synchronizations but as part of 2PC), keeping the cache store & caches in sync.

                               

                              The critical issue here is that the store is not enlisted in the transaction (1PC or 2PC) and is never rolled back. If you look at the Infinispan code that does this, it actually suspends the Tx prior to writing to the store and resumes it afterwards (CacheWriterInterceptor.java). This is also confirmed in the Infinispan documentation and the links I posted above.

                               

                              Given this, how do you feel about me raising a JIRA issue to track this? I understand that this is not something that can be fixed in the ModeShape code base, as it depends on either new features being implemented in Infinispan or a significant redesign of ModeShape's persistence tier, but it would be good to be aware of the issue and maybe address it in the future.

                               

                              I am also looking at patching Infinispan 6.0.2.Final to see if I can get a cache store to participate in the transaction. I'll let you know how it goes.

                              • 12. Re: Can a Infinispan Cache Store get out of sync with the Cache when using Transactions?
                                hchiorean

                                I'm fine with raising the issue to keep track of this. It would really help though if that issue had link(s) to all the relevant ISPN issues and this discussion. Thanks.