Do the JCR nodes appear to be lost on other members of the cluster (other than the one doing the writing), or on the member where the write is performed?
If the problem is local (i.e. local writer/reader), please try with a non-clustered cache and check whether the problem is still there.
If the problem is remote - other readers in the cluster cannot see the nodes added by the writer - then something may not be replicating correctly (you should try setting the log levels of ModeShape/ISPN/JGroups to DEBUG and look for anything suspicious in the logs).
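As a rough sketch of the DEBUG logging suggested above, assuming the server is WildFly and logging is configured through its logging subsystem in standalone.xml, the three categories could be added like this. The category names are the root Java packages of each project; the handler levels are not shown and would also need to allow DEBUG output.

```xml
<!-- Sketch only: place these inside the existing logging subsystem element. -->
<!-- Any console/file handler must also have its level at DEBUG or lower.   -->
<logger category="org.modeshape">
    <level name="DEBUG"/>
</logger>
<logger category="org.infinispan">
    <level name="DEBUG"/>
</logger>
<logger category="org.jgroups">
    <level name="DEBUG"/>
</logger>
```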
Another important aspect is that ModeShape, up until the fix for MODE-2336 ("Unstable work event listener monitor changes with user transactions", which will be part of 4.1), did not handle user transactions correctly. Since you're using EJBs, from ModeShape's perspective these are user transactions, so this fix may be relevant. If possible, try building the latest master and testing with that.
The PESSIMISTIC cache configuration is correct (and required), and we have several tests around concurrent cache writers (albeit local), such as ConcurrentWriteTest.java on the master branch of the ModeShape/modeshape GitHub repository, which all work fine.
We also have MODE-2280 ("Child node not found under high concurrency when eviction is enabled and SingleFile store is used"), which is directly related (as far as we can tell) to a bug in Infinispan (at least when running on Windows).
If you can provide a runnable test case we can run locally, we're happy to open a JIRA for this and investigate it ourselves.
Thanks for the information. The problem is occurring when reading and writing to a single node. I switched to using a local-cache and am still seeing the same issue.
I don't believe that MODE-2280 is the issue here, as I am using a JDBC store and have confirmed that the cache and the store are in sync.
My next step is to build and deploy 4.1 to see whether the fixes made for MODE-2336 resolve the issue. I will also look at sending a pull request with a test that recreates the issue, if I can.
After additional digging into the JDBC store and the cache, it appears that all of the expected nodes are in the database, but the parent node references only a subset of the child nodes.
I've tested with both 4.0 and 4.1 (built from master) and am seeing the same issue.
I've created an issue, [MODE-2353] Concurrently creating child nodes for the same parent results in node loss when using Wildfly - JBoss Issue …, and added a PR with a modified version of the modeshape-cdi quickstart that can be used to recreate the issue.
hchiorean tracked down the problem. It turns out that the default isolation level when using the Infinispan subsystem in Wildfly is REPEATABLE_READ, not READ_COMMITTED. In order to support concurrent writes it needs to be READ_COMMITTED. See http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_isolation_levels for additional information.
The Infinispan cache used by ModeShape needs to be configured as follows in order to safely support concurrent writes to the same ModeShape node when using container-managed or user transactions inside Wildfly:
<transaction mode="NON_XA" locking="PESSIMISTIC"/>
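The line above only covers the transaction and locking mode. The READ_COMMITTED isolation level mentioned earlier is set separately, on the cache's locking element. A minimal sketch of how the two might sit together in the WildFly Infinispan subsystem follows; the container name "modeshape" and cache name "sample" are placeholders rather than values from this thread, and the JDBC store definition is omitted.

```xml
<!-- Sketch only: container and cache names are hypothetical placeholders. -->
<cache-container name="modeshape" default-cache="sample" module="org.modeshape">
    <local-cache name="sample">
        <!-- Override the subsystem default of REPEATABLE_READ -->
        <locking isolation="READ_COMMITTED"/>
        <!-- Transaction settings as given above -->
        <transaction mode="NON_XA" locking="PESSIMISTIC"/>
        <!-- existing store configuration (e.g. the JDBC store) goes here -->
    </local-cache>
</cache-container>
```

The same locking and transaction elements would apply to a clustered cache type if one is used instead of local-cache.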