[ModeShape 5.2.0.Final, Oracle 11g] Corrupted node
illia.khokholkov Nov 8, 2016 12:12 PM
Problem
In our application we use shallow, open-scoped locks. The application ran successfully for some time, but then one of the nodes became permanently corrupted. Manual testing shows the corruption as follows:
- Attempt to lock the node and get an error saying that the node is already locked.
javax.jcr.lock.LockException: The node at '/test' is already locked
    at org.modeshape.jcr.RepositoryLockManager.lock(RepositoryLockManager.java:393)
    at org.modeshape.jcr.JcrLockManager.lock(JcrLockManager.java:276)
    at org.modeshape.jcr.JcrLockManager.lock(JcrLockManager.java:240)
- Attempt to unlock the node, having sufficient privileges to do so, and get an error saying that the node is not locked.
javax.jcr.lock.LockException: The node at location '/test' is not locked
    at org.modeshape.jcr.RepositoryLockManager.unlock(RepositoryLockManager.java:446)
    at org.modeshape.jcr.JcrLockManager.unlock(JcrLockManager.java:308)
    at org.modeshape.jcr.JcrLockManager.unlock(JcrLockManager.java:286)
Notable things about the problematic node (where "lockManager" is an instance of "javax.jcr.lock.LockManager" and "node" is an instance of "javax.jcr.Node"):
- lockManager.isLocked("/test") - returns "false"
- lockManager.holdsLock("/test") - returns "false"
- lockManager.getLockTokens() - returns an empty array
- node.isLocked() - returns "false"
- node.getProperty(JcrLexicon.LOCK_OWNER.getString()) - returns a value representing the owner
- node.getProperty(JcrLexicon.LOCK_IS_DEEP.getString()) - returns "false"
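The observations above amount to a mismatch between what the lock manager reports and what is persisted on the node. A minimal sketch of that comparison follows, written as a pure function so it runs without a repository. The class name, enum, and method are hypothetical; in a real session the two booleans would presumably come from lockManager.isLocked("/test") and from checking whether the lock owner property exists on the node.

```java
// Hypothetical helper: classifies a node's lock state from two observations.
// Nothing here is part of ModeShape or the JCR API.
class LockStateCheck {

    enum LockState {
        UNLOCKED,                 // not reported locked, no lock owner property
        LOCKED,                   // reported locked, lock owner property present
        STALE_LOCK_PROPERTIES,    // not reported locked, but lock owner property present
        MISSING_LOCK_PROPERTIES   // reported locked, but lock owner property absent
    }

    // reportedLocked: what the lock manager says about the node
    // hasLockOwnerProperty: whether the lock owner property is stored on the node
    static LockState classify(boolean reportedLocked, boolean hasLockOwnerProperty) {
        if (reportedLocked && hasLockOwnerProperty) {
            return LockState.LOCKED;
        }
        if (!reportedLocked && !hasLockOwnerProperty) {
            return LockState.UNLOCKED;
        }
        return reportedLocked
                ? LockState.MISSING_LOCK_PROPERTIES
                : LockState.STALE_LOCK_PROPERTIES;
    }

    public static void main(String[] args) {
        // The state described in this post: isLocked is false, lock owner is present.
        System.out.println(classify(false, true));
    }
}
```

Under these assumptions the problematic node falls into the STALE_LOCK_PROPERTIES case, which may be useful as a periodic check across other nodes.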
Notes
We do utilize DB locking:
"clustering" : {
    "clusterName" : "${...}",
    "configuration" : "${...}",
    "locking" : "db"
}
The typical lock usage pattern for node locking looks like this (when locking, lock timeout is set to 10 minutes):
lockManager.lock(...);
try {
    // do something
} finally {
    lockManager.unlock(...);
}
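For readers who want the pattern spelled out, here is a self-contained sketch of it. It does not use the real JCR classes: PathLocks is a hypothetical stand-in for the part of javax.jcr.lock.LockManager used here (the real lock method also takes an ownerInfo argument), and InMemoryLocks exists only so the example runs without a repository. The arguments mirror the setup described above: shallow (isDeep = false), open-scoped (isSessionScoped = false), and a timeout hint of 10 minutes expressed in seconds.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

class LockedSection {

    // Hypothetical stand-in for the subset of javax.jcr.lock.LockManager used here.
    interface PathLocks {
        void lock(String absPath, boolean isDeep, boolean isSessionScoped, long timeoutSeconds);
        void unlock(String absPath);
    }

    static final long LOCK_TIMEOUT_SECONDS = 10 * 60; // 10 minutes

    // Runs the work while holding a shallow, open-scoped lock on absPath.
    static <T> T withLock(PathLocks locks, String absPath, Supplier<T> work) {
        // Lock outside the try block so a failed lock() never reaches unlock().
        locks.lock(absPath, false, false, LOCK_TIMEOUT_SECONDS);
        try {
            return work.get();
        } finally {
            locks.unlock(absPath);
        }
    }

    // In-memory implementation for demonstration only; it ignores the timeout.
    static class InMemoryLocks implements PathLocks {
        final Set<String> held = new HashSet<>();

        @Override
        public void lock(String absPath, boolean isDeep, boolean isSessionScoped, long timeoutSeconds) {
            if (!held.add(absPath)) {
                throw new IllegalStateException("already locked: " + absPath);
            }
        }

        @Override
        public void unlock(String absPath) {
            if (!held.remove(absPath)) {
                throw new IllegalStateException("not locked: " + absPath);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryLocks locks = new InMemoryLocks();
        String result = withLock(locks, "/test", () -> "done");
        System.out.println(result + ", locks still held: " + locks.held.size());
    }
}
```

With this shape the lock is released even when the work throws, which is the behaviour the original try/finally relies on.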
Questions
- How is it possible that "LockManager#lock(...)" and "LockManager#unlock(...)" see inconsistent state and essentially contradict each other?
- How could a single node become corrupted in such a way? So far, out of many nodes, this is the only problematic one. Additionally, I am unable to reproduce this state, so I cannot confirm or deny that this is a bug.
- I need to unlock the node. What are my options? My attempts to remove the "JcrLexicon.LOCK_OWNER" and "JcrLexicon.LOCK_IS_DEEP" properties failed because, per the JCR 2.0 specification, protected properties cannot be removed by the client. I could fork the ModeShape source code to lift the restriction on protected properties, attempt to remove those properties, and see if things get back to normal, but I would rather not do that.
- If nothing else, would the backup/restore procedure work for me, assuming I manually edit the JSON file produced by the backup to remove the properties that should not exist? Speaking of backup/restore, does the version history get preserved? Do I need a clean DB schema (in terms of Oracle), or can I attempt to restore into the existing one?
Any help is greatly appreciated. It would be awesome if rhauch and hchiorean could take a look at this as well. Thank you.