Inconsistent version history after concurrent access
jkreutzfeld May 24, 2017 11:10 AMHi everyone,
i really don't have clue on this problem and i'm hoping you can shed some light on it. After migrating to Modeshape 5.4.0 we sometimes get the issue that a node has a baseVersion reference to a nonexistant version, so any further access to this node via the VersionManager results in an error. I don't know why this happens, but it must be related to a lot of concurrent actions with that node (creating, reading, copying, update through sequencer). I was unable to create a small, self contained sample test for this. I extracted some data from the corrupted node to make this clearer (using a simple JCR client):
This is the node and its properties (note the "jcr:baseVersion"):
karaf@root()> jcr:describe home/jkreutzfeld/Crashtest2/basic_male+\%28Kopie+2\%29.png
home/jkreutzfeld/Crashtest2/basic_male+%28Kopie+2%29.png
basic_male+%28Kopie+2%29.png (nt:file)
jcr:uuid = 74b69d27-9e1b-41f3-9dec-ccebf790b0ba
jcr:uuid = 74b69d27-9e1b-41f3-9dec-ccebf790b0ba
jcr:baseVersion = a69d00a317f1e783cbbfbb-9568-4d30-8fef-c4c24704553a
jcr:predecessors =
jcr:title = basic_male.png
ds:scanDate = 2017-05-24T10:41:32.822+02:00
jcr:createdBy = JKREUTZFELD
ds:checksum = 831dc6d4aa369579142276229e09a82aee11a6dc
jcr:primaryType = nt:file
jcr:created = 2017-05-24T10:41:31.816+02:00
jcr:mixinTypes = [ dsmix:virusChecked, mix:lockable, dsmix:checksum, mix:title, dsmix:thumbnails, mix:versionable ]
jcr:versionHistory = a69d00a317f1e774b69d27-9e1b-41f3-9dec-ccebf790b0ba
jcr:isCheckedOut = false
When i try to checkout the node, i get this error:
karaf@root()> jcr:checkout /home/jkreutzfeld/Crashtest2/basic_male+\%28Kopie+2\%29.png
RepositoryException: Cannot find the base version with the key 'a69d00a317f1e783cbbfbb-9568-4d30-8fef-c4c24704553a' for the node '/home/jkreutzfeld/Crashtest2/basic_male+%28Kopie+2%29.png'. Make sure that any checkin/checkout operations are atomic for that node.
javax.jcr.version.VersionException: Cannot find the base version with the key 'a69d00a317f1e783cbbfbb-9568-4d30-8fef-c4c24704553a' for the node '/home/jkreutzfeld/Crashtest2/basic_male+%28Kopie+2%29.png'. Make sure that any checkin/checkout operations are atomic for that node.
at org.modeshape.jcr.AbstractJcrNode.getBaseVersion(AbstractJcrNode.java:3297)
at org.modeshape.jcr.JcrVersionManager.checkout(JcrVersionManager.java:557)
at org.modeshape.jcr.JcrVersionManager.checkout(JcrVersionManager.java:524)
at de.doubleslash.servicecomponents.tools.contentrepositoryclient.impl.command.CheckoutCommand.executeCheckout(CheckoutCommand.java:97)
at de.doubleslash.servicecomponents.tools.contentrepositoryclient.impl.command.CheckoutCommand.checkout(CheckoutCommand.java:58)
at de.doubleslash.servicecomponents.tools.contentrepositoryclient.impl.command.CheckoutCommand.doExecute(CheckoutCommand.java:45)
at org.apache.karaf.shell.console.AbstractAction.execute(AbstractAction.java:34)
at org.apache.karaf.shell.console.OsgiCommandSupport.execute(OsgiCommandSupport.java:41)
at org.apache.karaf.shell.commands.basic.AbstractCommand.execute(AbstractCommand.java:34)
at org.apache.karaf.shell.compat.CommandTracker$1.execute(CommandTracker.java:112)
at org.apache.karaf.shell.impl.console.osgi.secured.SecuredCommand.execute(SecuredCommand.java:67)
at org.apache.karaf.shell.impl.console.osgi.secured.SecuredCommand.execute(SecuredCommand.java:87)
at org.apache.felix.gogo.runtime.Closure.executeCmd(Closure.java:480)
at org.apache.felix.gogo.runtime.Closure.executeStatement(Closure.java:406)
at org.apache.felix.gogo.runtime.Pipe.run(Pipe.java:108)
at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:182)
at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:119)
at org.apache.felix.gogo.runtime.CommandSessionImpl.execute(CommandSessionImpl.java:94)
at org.apache.karaf.shell.impl.console.ConsoleSessionImpl.run(ConsoleSessionImpl.java:270)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.modeshape.jcr.cache.NodeNotFoundInParentException: Cannot locate child node: a69d00a317f1e783cbbfbb-9568-4d30-8fef-c4c24704553a within parent: a69d00a317f1e774b69d27-9e1b-41f3-9dec-ccebf790b0ba
at org.modeshape.jcr.cache.document.LazyCachedNode.parentReferenceToSelf(LazyCachedNode.java:238)
at org.modeshape.jcr.cache.document.LazyCachedNode.getSegment(LazyCachedNode.java:282)
at org.modeshape.jcr.cache.document.LazyCachedNode.getPath(LazyCachedNode.java:291)
at org.modeshape.jcr.JcrSession.node(JcrSession.java:570)
at org.modeshape.jcr.JcrSession.node(JcrSession.java:534)
at org.modeshape.jcr.JcrSession.node(JcrSession.java:500)
at org.modeshape.jcr.JcrSession.node(JcrSession.java:478)
at org.modeshape.jcr.AbstractJcrNode.getBaseVersion(AbstractJcrNode.java:3293)
... 19 more
Here is the version history for the node and the corresponding versions:
karaf@root()> jcr:list jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7
jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7
├ jcr:versionLabels (nt:versionLabels)
├ jcr:rootVersion (nt:version)
├ 1.0 (nt:version)
└ 1.1 (nt:version)
karaf@root()> jcr:describe jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/jcr:rootVersion
jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/jcr:rootVersion
jcr:rootVersion (nt:version)
jcr:uuid = a69d00a317f1e7ee01bf25-8721-4bf6-91ab-ee50735a6115
jcr:successors = [ a69d00a317f1e795f5f976-3b11-4c0b-b59c-50c39de7a555 ]
jcr:primaryType = nt:version
jcr:created = 2017-05-24T10:43:53.151+02:00
karaf@root()> jcr:describe jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/1.0
jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/1.0
1.0 (nt:version)
jcr:uuid = a69d00a317f1e795f5f976-3b11-4c0b-b59c-50c39de7a555
jcr:successors = [ a69d00a317f1e7cb40dbc3-e391-42f7-aee2-6e26b74c77ba ]
jcr:primaryType = nt:version
jcr:predecessors = [ a69d00a317f1e7ee01bf25-8721-4bf6-91ab-ee50735a6115 ]
jcr:created = 2017-05-24T10:43:53.648+02:00
karaf@root()> jcr:describe jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/1.1
jcr:system/jcr:versionStorage/08/5f/e3/085fe35d39c56090d913f58baceec8d8862b9fa7/1.1
1.1 (nt:version)
jcr:uuid = a69d00a317f1e7cb40dbc3-e391-42f7-aee2-6e26b74c77ba
jcr:successors = [ a69d00a317f1e783cbbfbb-9568-4d30-8fef-c4c24704553a ]
jcr:primaryType = nt:version
jcr:predecessors = [ a69d00a317f1e795f5f976-3b11-4c0b-b59c-50c39de7a555 ]
jcr:created = 2017-05-24T10:43:54.619+02:00
As you can see, there actually is no version with an ID that matches the jcr:baseVersion property of the node.
I could not determine how this could even be possible, I hope anyone got an idea on this.
What are we doing wrong? Could it be possible that a checkin operation fails in a way that the node is updated but the history node was not created correctly?
I'm sorry that this problem is so vague.
Kind regards
Jan