Exo JBossCache Implementation Code Review

Version 3

Created by galder.zamarreno on Dec 9, 2009 6:03 AM. Last modified by galder.zamarreno on Dec 10, 2009 8:23 AM.

Exo JBoss Cache implementation code review (2009.12.09) based on code in http://anonsvn.jboss.org/repos/exo-jcr/jcr/branches/1.12.0-JBC/

Registration of components in JBoss Cache's component registry

JBossCacheWorkspaceDataContainer constructor does things like this. This is then injected to ExoJCRCacheLoader1 via @Inject:

((CacheSPI<Serializable, Object>)cache).getComponentRegistry().registerComponent(this.persistentContainer,
WorkspaceDataContainer.class);

Rather than doing this, it'd be better if you built and configured ExoJCRCacheLoader1 into JBoss Cache programatically and then you'd have full control on how ExoJCRCacheLoader1 is created and you'd avoid having to deal with the component registry.

Same most like for the following code in JbossCacheIndexChangesFilter:
((CacheSPI<Serializable, Object>)cache).getComponentRegistry().registerComponent(searchManager,
SearchManager.class);

There's an downside though with adding a cache loader programatically which is how you externalize its properties, i.e. database driver, URL, or datasource name...etc. For those components such as interceptors that do not rely on externalized properties, the programatic way should be used. For those needing externalized properties, the choice is less clear.

Use of batching vs JTA transactions (or supporting a managed and non-managed environment)

In general, I see that you're using batches everywhere. Rather than using batches, transactions should be used since exo will be running in a managed environment, i.e. JBoss Application Server.

If batches are used instead of transactions, the managed datasource won't commit/rollback the database accordingly and hence forces your cache loader implementations (i.e. org.exoplatform.services.jcr.impl.storage.jbosscache.JDBCCacheLoader or ExoJCRCacheLoader1) to do the commit/rollback calls. If you use transactions, the application server will make sure that the managed DataSource (i.e. java:DefaultDS) that you're interacting with gets committed or rollbacked accordinly.

See what we did for org.jboss.cache.loader.ConnectionFactory and how we differentiate between a managed and non-managed environment.

You should be doing the same in your JDBC cache loader implementation, use an abstraction layer than gets the connection and commits/rollback it, or not, depending on the environment.

So, to summarise, if exo is going to run within AS, you should use transactions. If exo will run in a standalone mode where there's no transaction manager, then you should use batches.

A good explanation of what a JTA transaction gives you vs what a batching gives you and when to use one or the other is explained in this wiki: BatchModeTransactionManager

You shouldn't use the batch transaction manager as indicated in the JBossCache 3.x code since it's deprecated, but you do need to abstract your batching/transaction calls so that use one or the other depending on the environment. For example, apart from the work on the cache loader to retrieve the connection depending on whether you're in a managed or not managed environment, you could have an abstraction for being/commit/rollback a transaction. When running in standalone or non AS environment, this calls would call startBatch/endBatch and when in managed, they'd get a transaction manager (either from JNDI or via cache.getConfiguration().getRuntimeConfiguration().getTransactionManager()) and call begin/commit/rollback accordingly.

Now, if in standalone mode and using batches, an endBatch should be added to catch section so that it is executed regardless, otherwise if an exception is thrown, batches could be left unclosed
      cache.startBatch();
      this.nodes = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.NODES));
      this.properties = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.PROPS));
      this.locks = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.LOCKS));
      this.refs = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.REFS));
      cache.endBatch(true);

For example:
      cache.startBatch();
      try{
      this.nodes = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.NODES));
      this.properties = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.PROPS));
      this.locks = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.LOCKS));
      this.refs = cacheRoot.addChild(Fqn.fromElements(JBossCacheStorage.REFS));
      cache.endBatch(true);
      } catch(Exception e) {
      cache.endBatch(false);
      throw e;
      }

Finally, although batches are good for an standalone environment, you can also use a JTA TransactionManager like JBoss Transactions in an standalone environment. See org.jboss.cache.transaction.JBossStandaloneJTAManagerLookup for an example on how to get a JTA TransactionManager in an standalone environment. If you use transactions in a standalone environment as well, it will make your code simpler while having all the benefits of a proper transaction manager.

Thread safety in cache loader and interceptor implementations

ExoJCRCacheLoader1 needs to be thread safe as per CacheLoader contract: So, transaction HashMap most likely needs to be a thread safe collection.

CommandInterceptor / AbstractVisitor, are meant to be thread safe too, so:

IndexerInterceptor : don't know where addIndexInterceptorListener and removeIndexInterceptorListener are gonna be called from, but indexInterceptorListners should most likely be a thread safe colllection.

IndexModificationsBuilder: since removedNodes, addedNodes and updateNodes can be modified by different nodes in paralell, these need to be thread safe collections.

Potential performance improvements suggestions

WARNING NOTE!!! These are only suggestions and should only be implemented if see contention in the places I talk about. Contentions can be discovered via thread dumps, profiling tools....etc.

In JBossCacheStorageConnection, I see that you create fqns relatively often. Depending on how often the same Fqn is built, it might be interesting to have a cache of identifier to Fqn, and if the same id is asked again, return it directly from the map. This map would need to be thread safe.

JBossDeveloper

Exo JBossCache Implementation Code Review

Registration of components in JBoss Cache's component registry

Use of batching vs JTA transactions (or supporting a managed and non-managed environment)

Thread safety in cache loader and interceptor implementations

Potential performance improvements suggestions

Comments