4 Replies Latest reply on Oct 31, 2014 2:20 PM by ma6rl

Potential issue with concurrent writes resulting in node loss

ma6rl Oct 28, 2014 11:36 PM

I am seeing an issue with node loss when performing concurrent writes to the same parent node using the ModeShape 4.0.0.Final subsystem in Wildfly 8.1.

I currently have the following node structure which I create at startup:

/default/jobs

I then add 10 child nodes concurrently under jobs using multiple threads via REST API calls (approx. 3 threads writing concurrently). This results in only 5 to 6 of the child nodes being created; the remaining ones appear to be lost.

Each child node is added in its own session/transaction using an EJB. No errors occur when adding the nodes. Below is my Infinispan configuration:

          <cache-container name="modeshape" default-cache="my-repo" module="org.modeshape">
                <transport lock-timeout="60000"/>
                <replicated-cache name="my-repo" mode="SYNC">
                    <transaction mode="NON_XA" locking="PESSIMISTIC"/>
                    <string-keyed-jdbc-store shared="true" preload="false" passivation="false" purge="false" datasource="java:jboss/datasources/MyDS">
                        <string-keyed-table prefix="JDG_MC_SK">
                            <id-column name="id" type="VARCHAR(200)"/>
                            <data-column name="datum" type="LONGBLOB"/>
                            <timestamp-column name="version" type="BIGINT"/>
                        </string-keyed-table>
                    </string-keyed-jdbc-store>
                </replicated-cache>
            </cache-container>

Looking through previous posts/issues, it appears a similar issue was addressed and fixed in ModeShape 3.x a couple of years ago, with the caveat that the Infinispan cache must be configured to use PESSIMISTIC locking. Is this still the case, or is any additional configuration needed?

1. Re: Potential issue with concurrent writes resulting in node loss

hchiorean Oct 29, 2014 4:16 AM (in response to ma6rl)

Do the nodes appear to be lost on other nodes in the cluster (other than the writing node), or on the node on which the writing is performed?
If the problem is local - i.e. local writer/reader - please try with a non-clustered cache and check if the problem is still there.
If the problem is remote - other readers in the cluster cannot see the nodes added by the writer - then something may not be replicating correctly (you should try setting the log levels of ModeShape/ISPN/JGroups to DEBUG and look for anything suspicious in the logs).
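
On Wildfly, the DEBUG logging suggested above could be enabled with logger entries along these lines. This is only a sketch: the categories are assumed from the top-level Java package names of ModeShape, Infinispan and JGroups, and the handlers in the logging subsystem must also let DEBUG messages through (the default console handler is typically limited to INFO, so the output may only show up in the server log file).

```xml
<!-- Goes inside the existing logging <subsystem> element of standalone.xml -->
<logger category="org.modeshape">
    <level name="DEBUG"/>
</logger>
<logger category="org.infinispan">
    <level name="DEBUG"/>
</logger>
<logger category="org.jgroups">
    <level name="DEBUG"/>
</logger>
```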

Another important aspect is that, up until [MODE-2336] Unstable work event listener monitor changes with user transactions - JBoss Issue Tracker (which will be part of 4.1), ModeShape did not handle user transactions correctly. Since you're using EJBs, from ModeShape's perspective these are user transactions, so this fix may be relevant. If possible, try building the latest master and testing with that.
The PESSIMISTIC cache configuration is correct (and required), and we have several tests around concurrent cache writers (albeit local), see modeshape/ConcurrentWriteTest.java at master · ModeShape/modeshape · GitHub, which all work fine.
We also have this issue, [MODE-2280] Child node not found under high concurrency when eviction is enabled and SingleFile store is used - JBoss Is…, which is directly related (as far as we can tell) to a bug in Infinispan (at least when running on Windows).

If you can provide a runnable test case we can run locally, we're happy to open a JIRA for this and investigate it ourselves.
2. Re: Potential issue with concurrent writes resulting in node loss

ma6rl Oct 29, 2014 12:50 PM (in response to hchiorean)

Thanks for the information. The problem is occurring when reading and writing to a single node. I switched to using a local-cache and am still seeing the same issue.
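
For reference, a non-clustered variant of the configuration from the original post might look roughly like the following. This is a sketch rather than the exact configuration that was tested: it assumes the cache name my-repo, drops the transport element (which a purely local container should not need), and leaves the JDBC store as originally posted.

```xml
<cache-container name="modeshape" default-cache="my-repo" module="org.modeshape">
    <!-- local-cache replaces replicated-cache; the transport element is assumed to be unnecessary here -->
    <local-cache name="my-repo">
        <transaction mode="NON_XA" locking="PESSIMISTIC"/>
        <!-- string-keyed-jdbc-store element unchanged from the original post -->
    </local-cache>
</cache-container>
```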

I don't believe that MODE-2280 is the issue here, as I am using a JDBC store and have confirmed that the cache and the store are in sync.

My next step is to build and deploy 4.1 to see if the fixes made around MODE-2336 resolve the issue. I will also look at sending a pull request with a test that recreates the issue if I can.

UPDATE:
After additional digging into the JDBC store and the cache, it appears all of the expected nodes are in the database, but the parent node only has a reference to a subset of the child nodes.

UPDATE 2:
I've tested with both 4.0 and 4.1 (built from master) and am seeing the same issue.
3. Re: Potential issue with concurrent writes resulting in node loss

ma6rl Oct 29, 2014 4:51 PM (in response to hchiorean)

I've created an issue, [MODE-2353] Concurrently creating child nodes for the same parent results in node loss when using Wildfly - JBoss Issue …, and added a PR for a modified version of the modeshape-cdi quickstart that can be used to recreate the issue.
4. Re: Potential issue with concurrent writes resulting in node loss

ma6rl Oct 31, 2014 2:20 PM (in response to ma6rl)

hchiorean tracked down the problem. It turns out that the default isolation level when using the Infinispan subsystem in Wildfly is REPEATABLE_READ, not READ_COMMITTED. To support concurrent writes it needs to be READ_COMMITTED; see http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_isolation_levels for additional information.

The Infinispan cache used by ModeShape needs to be configured as follows in order to safely support concurrent writes to the same ModeShape node when using container-managed or user transactions inside Wildfly:

<transaction mode="NON_XA" locking="PESSIMISTIC"/>
<locking isolation="READ_COMMITTED"/>
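
Put together with the configuration from the original post, the full cache definition would look roughly like this. It is a sketch under a few assumptions: the cache name is taken to be my-repo so that it matches default-cache, and the locking element is placed directly after the transaction element as in the two lines above (the order of the two may need adjusting to satisfy the subsystem schema). Everything else is copied from the original post.

```xml
<cache-container name="modeshape" default-cache="my-repo" module="org.modeshape">
    <transport lock-timeout="60000"/>
    <replicated-cache name="my-repo" mode="SYNC">
        <transaction mode="NON_XA" locking="PESSIMISTIC"/>
        <!-- Overrides the subsystem default of REPEATABLE_READ -->
        <locking isolation="READ_COMMITTED"/>
        <string-keyed-jdbc-store shared="true" preload="false" passivation="false" purge="false" datasource="java:jboss/datasources/MyDS">
            <string-keyed-table prefix="JDG_MC_SK">
                <id-column name="id" type="VARCHAR(200)"/>
                <data-column name="datum" type="LONGBLOB"/>
                <timestamp-column name="version" type="BIGINT"/>
            </string-keyed-table>
        </string-keyed-jdbc-store>
    </replicated-cache>
</cache-container>
```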