7 Replies Latest reply on Apr 10, 2014 11:03 AM by nfilotto

    How to ensure consistency with a transactional cache and a JDBC Cache Store?

    nfilotto

      Hi all,

       

      Maybe I missed something, but it seems that JDBC cache stores like JdbcStringBasedCacheStore don't commit or roll back anything explicitly. If I understand the logic correctly, this means that with managed connections the commit and the rollback are handled by the Application Server itself; otherwise the store relies on plain auto-commit mode. In that case, we can only expect consistency between the content of the cache and the content of the database if we use the ManagedConnectionFactory configured with a Datasource that provides Managed Connections. BUT if we go a little deeper into the code, it looks like the modifications are applied to the database (in the case of 2PC) only in the commit method, which seems to be called in the afterCompletion phase if useSynchronization is set to true, or in the commit phase if useSynchronization is set to false. In both cases it is already too late to revert the whole transaction if a failure occurs while modifying the content of the database. If my understanding is correct, it could explain this limitation: http://infinispan.org/docs/5.3.x/user_guide/user_guide.html#_cache_loaders_and_transactional_caches

       

      Did I miss something? If not, is there any good reason not to commit and roll back the transaction explicitly, and to apply the modifications in the commit method instead of the prepare method, as was done in JBoss Cache?

       

      Thank you in advance for your answer,

      BR,

      Nicolas

        • 1. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
          nfilotto

          Let me be more concrete to make this clear. Say I have ISPN 5.2.7.Final with the following configuration:

          <infinispan
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
                xsi:schemaLocation="urn:infinispan:config:5.2 http://www.infinispan.org/schemas/infinispan-config-5.2.xsd"
                xmlns="urn:infinispan:config:5.2">
              <global>
                <evictionScheduledExecutor factory="org.infinispan.executors.DefaultScheduledExecutorFactory">
                  <properties>
                    <property name="threadNamePrefix" value="EvictionThread"/>
                  </properties>
                </evictionScheduledExecutor>
                <globalJmxStatistics jmxDomain="exo" enabled="true" allowDuplicateDomains="true"/>
              </global>
              <default>
                <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="20000" writeSkewCheck="false" concurrencyLevel="500" useLockStriping="false"/>
                <transaction transactionManagerLookupClass="org.exoplatform.services.transaction.infinispan.JBossStandaloneJTAManagerLookup" syncRollbackPhase="true" syncCommitPhase="true" transactionMode="TRANSACTIONAL" useSynchronization="true"/>
                <jmxStatistics enabled="true"/>
                <eviction strategy="NONE"/>
                <loaders passivation="false" shared="true" preload="true">
                  <store class="org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore" fetchPersistentState="true" ignoreModifications="false" purgeOnStartup="false">
                    <properties>
                       <property name="stringsTableNamePrefix" value="lk"/>
                       <property name="idColumnName" value="id"/>
                       <property name="dataColumnName" value="data"/>
                       <property name="timestampColumnName" value="timestamp"/>
                       <property name="idColumnType" value="VARCHAR(12)"/>
                       <property name="dataColumnType" value="VARBINARY(65535)"/>
                       <property name="timestampColumnType" value="BIGINT"/>
                       <property name="dropTableOnExit" value="false"/>
                       <property name="createTableOnStart" value="true"/>
                       <property name="connectionFactoryClass" value="org.infinispan.loaders.jdbc.connectionfactory.SimpleConnectionFactory"/>
                       <property name="connectionUrl" value="jdbc:hsqldb:file:db/data"/>
                       <property name="driverClass" value="org.hsqldb.jdbcDriver"/>
                       <property name="userName" value="sa"/>
                       <property name="password" value=""/>
                    </properties>
                    <async enabled="false"/>
                  </store>
                </loaders>
             </default>
          </infinispan>
          

           

          Please note that I intentionally set "VARCHAR(12)" as "idColumnType" to easily cause a failure.

          So now let's say that I have the following code:

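          import javax.transaction.TransactionManager;

          import org.infinispan.Cache;
          import org.infinispan.configuration.cache.Configuration;
          import org.infinispan.configuration.cache.ConfigurationBuilder;
          import org.infinispan.configuration.global.GlobalConfigurationBuilder;
          import org.infinispan.configuration.parsing.ConfigurationBuilderHolder;
          import org.infinispan.configuration.parsing.ParserRegistry;
          import org.infinispan.manager.DefaultCacheManager;
          import org.infinispan.transaction.lookup.TransactionManagerLookup;
          import org.infinispan.util.FileLookupFactory;

          public class TestCacheStore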
          {
             public static void main(String[] args) throws Exception
             {
                ParserRegistry parser = new ParserRegistry(Thread.currentThread().getContextClassLoader());
                ConfigurationBuilderHolder holder = parser.parse(FileLookupFactory.newInstance().lookupFileStrict("cache-store-configuration.xml", Thread.currentThread().getContextClassLoader()));
                GlobalConfigurationBuilder configBuilder = holder.getGlobalConfigurationBuilder();
                ConfigurationBuilder confBuilder = holder.getDefaultConfigurationBuilder();
                TransactionManagerLookup tml = new TransactionManagerLookup()
                {
                   public TransactionManager getTransactionManager() throws Exception
                   {
                      return com.arjuna.ats.jta.TransactionManager.transactionManager();
                   }
                };
                confBuilder.transaction().transactionManagerLookup(tml);
                Configuration conf = holder.getDefaultConfigurationBuilder().build();
                DefaultCacheManager manager = new DefaultCacheManager(configBuilder.build(), conf, true);
                Cache<String, String> c = manager.getCache();
                System.out.println("Show initial content: ");
                for (String key : c.keySet())
                   System.out.println(key + "=" + c.get(key));
                System.out.println("######");
                final TransactionManager tm = c.getAdvancedCache().getTransactionManager();
                System.out.println("TX BEGIN");
                tm.begin();
                c.put("a", "va1");
                c.put("b", "vb1");
                c.put("c", "vc1");
                System.out.println("TX COMMIT");
                tm.commit();
                System.out.println("Show content: ");
                for (String key : c.keySet())
                   System.out.println(key + "=" + c.get(key));
                System.out.println("######");
                System.out.println("TX BEGIN");
                tm.begin();
                c.put("a", "va2");
                c.remove("b");
                c.put("daaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", "vd1");
                try
                {
                   System.out.println("TX COMMIT THAT SHOULD FAIL");
                   tm.commit();
                }
                catch (Exception e)
                {
                   System.err.println("Commit failed " + e.getMessage());
                }
                System.out.println("Show content: ");
                for (String key : c.keySet())
                   System.out.println(key + "=" + c.get(key));
                System.out.println("######");
             }
          }
          

           

          The expected result at the end is "a=va1", "b=vb1", "c=vc1", but you will instead get something like "a=va2", "c=vc1", "daaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa=vd1" in your cache and "a=va2", "c=vc1" in your database (you can see that by simply relaunching the test without dropping the db).
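The database side of that result can be reproduced without Infinispan or a real database. The following is only a minimal sketch under stated assumptions: the class `AutoCommitStoreDemo`, its `put` helper and the in-memory map are hypothetical stand-ins for the "lk" table, and the length check stands in for the VARCHAR(12) id column. It is not the actual JdbcStringBasedCacheStore code. Each statement takes effect immediately, as with JDBC auto-commit, so the two statements that succeed before the failing one stay applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the store table; not Infinispan code.
class AutoCommitStoreDemo {
    // Mirrors idColumnType = VARCHAR(12) from the configuration above.
    static final int MAX_ID_LENGTH = 12;

    // Each call takes effect immediately, like a statement run in auto-commit mode.
    static void put(Map<String, String> table, String id, String data) {
        if (id.length() > MAX_ID_LENGTH) {
            throw new IllegalStateException("id too long for VARCHAR(" + MAX_ID_LENGTH + "): " + id);
        }
        table.put(id, data);
    }

    static Map<String, String> run() {
        Map<String, String> table = new LinkedHashMap<>();
        // State left behind by the first, successful transaction.
        table.put("a", "va1");
        table.put("b", "vb1");
        table.put("c", "vc1");
        try {
            // Second transaction, applied statement by statement at commit time.
            put(table, "a", "va2");
            table.remove("b");
            put(table, "daaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", "vd1"); // fails: id too long
        } catch (IllegalStateException e) {
            // Nothing can be undone here: the first two statements are already applied.
        }
        return table;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints {a=va2, c=vc1}
    }
}
```

The printed content matches the "a=va2", "c=vc1" database state described above, while the expected state was "a=va1", "b=vb1", "c=vc1".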

           

          I hope it is clear now. As a workaround, I implemented a subclass of JdbcStringBasedCacheStore that behaves exactly like the JDBCCacheLoader in JBC, and now I get back the consistency between my db and my cache.

          • 2. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
            mircea.markus

            Hi Nicolas,

             

            Currently the update to the database happens during the commit phase and outside the scope of a transaction: the tx associated with the calling thread (if any) is suspended during the commit phase, the database updates are executed as individual commands, and then the tx is resumed. The reason for this suspension is that, in the general case, we cannot assume that the transaction even exists on the node where the database is updated. E.g. if the transaction originates on node A and, during commit, needs to write some data on node B, then B has neither the TransactionManager instance nor the transaction object from node A.

             

            One way to keep the data consistent is by using recovery[1]: if the DB update fails during the commit, the TM recovery process would detect that and notify the administrator, who could then manually reconcile the state (again, see [1] for the tooling we provide for this). If you don't want manual intervention, I think JBossTM can be configured to replay the transaction automatically on failure.

             

            [1] http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_transaction_recovery

            • 3. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
              nfilotto

              So if I understand your answer correctly, you did that because you ran into transaction issues when using Infinispan in distributed mode with a cache store, right?

              For now, I use Infinispan as a replicated cache, so as a workaround I implemented my own cache store that behaves like the old JDBC Cache Loader in JBC, and it solves my consistency issue.

              In distributed mode, managing the transactions must indeed be very complex. I guess you would need one XAResource per Cluster Node involved in the transaction and dedicated commands to propagate the different states of the transaction to all the nodes involved.

              In the end it would be very complex and not scalable, as we could easily have too many nodes involved. I guess that is the reason why there is nearly no transactional NoSQL db.

               

              Anyway, thank you Markus for your answer.

              • 4. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
                mircea.markus

                "guess you would need one XAResource per Cluster Node involved into the transaction and dedicated commands to propagate the different states of the transaction over all the nodes involved."

                That would be very complex indeed.

                What we can do, though, is a batch update (during commit) to the external cache store: that won't be XA (between ISPN and the store), but at least we can take advantage of the store's batching support for speed.

                • 5. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
                  nfilotto

                  Batch Update or not, if it is done during the commit phase and you get an exception while updating the data such as a DB Deadlock or a constraint violation, your data will be inconsistent anyway.

                  • 6. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
                    mircea.markus
                    "Batch Update or not, if it is done during the commit phase and you get an exception while updating the data such as a DB Deadlock or a constraint violation, your data will be inconsistent anyway."

                    Not really: assuming that you have recovery enabled, your recovery process will pick up the fact that the transaction is in doubt, and you have the chance to reconcile the state manually[1].

                    You're pretty much in the same situation when committing data to the store in prepare: the storage successfully persists it, but a commit might fail for whatever reason, or not even be sent because the originator crashed. Arguably the window for this to happen might be smaller in your use case, though.

                     

                    [1] http://infinispan.org/docs/7.0.x/user_guide/user_guide.html#_transaction_recovery

                    • 7. Re: How to ensure consistency with a transactional cache and a JDBC Cache Store?
                      nfilotto

                      The JDBC calls are done in the prepare phase in the case of 2PC (which is my case in practice), and the connection commit is performed in the commit phase. If it fails during the commit phase, I agree that recovery is required; we have no other way to handle this situation. But the most common issues, like db deadlocks and integrity constraint violations, occur on the JDBC calls, so during the prepare phase, which can still be properly rolled back without the need for manual recovery. I have already tested this a lot with JBC and, believe me, it works well.
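The ordering described here (statements executed in prepare, connection commit in commit, rollback still possible if a statement fails) can be sketched as follows. This is only an illustration under stated assumptions: `PreparePhaseStoreDemo` and its methods are hypothetical names, the in-memory maps stand in for a database connection with auto-commit switched off, and the length check stands in for the VARCHAR(12) id column. With real JDBC the equivalent calls would be `Connection.setAutoCommit(false)`, the statements themselves, then `Connection.commit()` or `Connection.rollback()`. It is not the JBC JDBCCacheLoader code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a store that writes during prepare.
class PreparePhaseStoreDemo {
    static final int MAX_ID_LENGTH = 12; // mirrors VARCHAR(12)

    private final Map<String, String> committed = new LinkedHashMap<>();
    private Map<String, String> pending; // uncommitted work, like an open JDBC transaction

    // Prepare: run every statement against a working copy. A failure throws
    // before anything becomes visible. A null value means "remove this id".
    void prepare(List<String[]> modifications) {
        Map<String, String> work = new LinkedHashMap<>(committed);
        for (String[] m : modifications) {
            String id = m[0];
            String data = m[1];
            if (data == null) {
                work.remove(id);
            } else if (id.length() > MAX_ID_LENGTH) {
                throw new IllegalStateException("id too long for VARCHAR(" + MAX_ID_LENGTH + "): " + id);
            } else {
                work.put(id, data);
            }
        }
        pending = work;
    }

    // Commit: make the prepared work visible (stands in for Connection.commit()).
    void commit() {
        if (pending != null) {
            committed.clear();
            committed.putAll(pending);
            pending = null;
        }
    }

    // Rollback: discard the prepared work (stands in for Connection.rollback()).
    void rollback() {
        pending = null;
    }

    Map<String, String> snapshot() {
        return new LinkedHashMap<>(committed);
    }

    static Map<String, String> run() {
        PreparePhaseStoreDemo store = new PreparePhaseStoreDemo();
        store.prepare(Arrays.asList(
            new String[] {"a", "va1"}, new String[] {"b", "vb1"}, new String[] {"c", "vc1"}));
        store.commit();
        try {
            store.prepare(Arrays.asList(
                new String[] {"a", "va2"},
                new String[] {"b", null},
                new String[] {"daaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", "vd1"})); // fails in prepare
            store.commit();
        } catch (IllegalStateException e) {
            store.rollback(); // the whole transaction can still be reverted
        }
        return store.snapshot();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints {a=va1, b=vb1, c=vc1}
    }
}
```

Because the failing statement surfaces during prepare, the stored content stays at "a=va1", "b=vb1", "c=vc1". A failure in the commit step itself is not covered by this sketch and would still need recovery, as discussed in the thread.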