4 Replies Latest reply on Nov 26, 2014 5:57 AM by bes82

Detail questions regarding indexes

bes82 Nov 20, 2014 2:17 AM

  
If I understand the docs (Query and search, ModeShape 4) correctly, it is no problem to define asynchronous and synchronous indexes at the same time in one index manager.
 
Is the following assumption correct:
 
Calling session.save() blocks at least until all affected synchronous indexes have been updated?
 
 
Then I have another question about how to define PATH indexes. Consider the following query:
 
SELECT child.x
FROM [nt:child] AS child 
JOIN [nt:parent] as parent ON ISDESCENDANTNODE(child,parent) 
JOIN [nt:subchild] AS subchild ON ISCHILDNODE(subchild,child) 
WHERE child.y='12345' 
AND parent.z= '67890'
 
Indexes are defined on child.y and parent.z, and they are used. What is not used is a PATH index for the subchild query, which is also defined on all three node types.
 
The stripped down plan looks like this, and I don't understand why the access queries are ordered this way:
 
Project [child]
  Join [subchild,child]
    Join [parent,child] 
      Access [parent]
        IndexUsed
      Access [child]
        IndexUsed
    Access [subchild]
      NoIndexUsed
 
What I would have expected is:
 
Access to child using child.y index, access to parent using either parent.z or PATH index, access to subchild using PATH index.
 
My guess is that, because the subchild access is not a dependent access, no index is used. But why is it not a dependent query?

1. Re: Detail questions regarding indexes

rhauch Nov 20, 2014 9:43 AM (in response to bes82)

If I understand the docs (Query and search, ModeShape 4) correctly, it is no problem to define asynchronous and synchronous indexes at the same time in one index manager.

Yes.

Calling session.save() blocks at least until all affected synchronous indexes have been updated?

Yes.

Then I have another question about how to define PATH indexes. Consider the following query:

SELECT child.x
FROM [nt:child] AS child
JOIN [nt:parent] as parent ON ISDESCENDANTNODE(child,parent)
JOIN [nt:subchild] AS subchild ON ISCHILDNODE(subchild,child)
WHERE child.y='12345'
AND parent.z= '67890'

Indexes are defined on child.y and parent.z, and they are used. What is not used is a PATH index for the subchild query, which is also defined on all three node types.

The stripped down plan looks like this, and I don't understand why the access queries are ordered this way:

Project [child]
  Join [subchild,child]
    Join [parent,child]
      Access [parent]
        IndexUsed
      Access [child]
        IndexUsed
    Access [subchild]
      NoIndexUsed

The 'parent' nodes are found via the index on 'z', and the 'child' nodes are found via the index on 'y'. The two sets are then joined to find tuples for all the correct combinations of 'child' and 'parent' (per the ISDESCENDANTNODE criterion): any 'child' node that is not a descendant of a 'parent' is discarded, as is any 'parent' node that has no descendant in 'child'. Then the final join finds all 'subchild' nodes for each of the remaining 'child' nodes.
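
To make the shape of that plan concrete, here is a minimal sketch in plain Java. It is an illustration only, not ModeShape's implementation: the class and method names are hypothetical, nodes are represented by their path strings, and a naive nested loop stands in for whatever join algorithm the engine really uses. What it does show is that each Access step produces its list independently, and the join criteria are applied only afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the plan above; nodes are plain path strings.
final class JoinPlanSketch {

    // ISCHILDNODE(child, parent): parent is the direct parent of child.
    static boolean isChildNode(String child, String parent) {
        if (child.equals("/")) return false;           // the root has no parent
        int slash = child.lastIndexOf('/');
        if (slash < 0) return false;                   // not an absolute path
        String parentOfChild = (slash == 0) ? "/" : child.substring(0, slash);
        return parentOfChild.equals(parent);
    }

    // ISDESCENDANTNODE(descendant, ancestor): descendant lies strictly below ancestor.
    static boolean isDescendantNode(String descendant, String ancestor) {
        if (descendant.equals(ancestor)) return false;
        String prefix = ancestor.equals("/") ? "/" : ancestor + "/";
        return descendant.startsWith(prefix);
    }

    // The three lists stand for the results of the three independent Access steps.
    // Returns one 'child' path per joined (parent, child, subchild) tuple.
    static List<String> run(List<String> parents,
                            List<String> children,
                            List<String> subchildren) {
        // Join [parent,child]: keep children that are descendants of some parent.
        List<String> joined = new ArrayList<>();
        for (String parent : parents) {
            for (String child : children) {
                if (isDescendantNode(child, parent)) {
                    joined.add(child);
                }
            }
        }
        // Join [subchild,child]: pair each surviving child with its subchildren.
        List<String> result = new ArrayList<>();
        for (String child : joined) {
            for (String subchild : subchildren) {
                if (isChildNode(subchild, child)) {
                    result.add(child);
                }
            }
        }
        return result;
    }
}
```

In this sketch a 'child' appears once per matching 'subchild' and disappears entirely when it has none, which is the usual behaviour of an inner join. Note also that the 'subchild' list is built before the join ever looks at the surviving 'child' paths.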

The only kind of true dependent query that ModeShape supports is in correlated subqueries.

BTW, in your example query the 'subchild' nodes serve no purpose; I presume they do in the real query.
2. Re: Detail questions regarding indexes

bes82 Nov 20, 2014 10:14 AM (in response to rhauch)

The 'subchild' serves one purpose: there has to be such a child, otherwise I don't want to include the child in the result set.

So the problem now is that there is no criterion on subchild, so the access query collects every node in the repository.

I thought that the path index (which exists) would be used to narrow down the number of nodes that have to be fetched. Without it my query performs extremely poorly.

But how do I change that? There is no constraint on subchild I could ask for. I would simply like to get all child nodes that have a child. I could add an index on primaryType for the subchildren, but the type of the subchildren in my real query is nt:unstructured, so that wouldn't help either.
3. Re: Detail questions regarding indexes

rhauch Nov 20, 2014 11:06 AM (in response to bes82)

I'm surprised the implicit child node index is not used.
4. Re: Detail questions regarding indexes

bes82 Nov 26, 2014 5:57 AM (in response to rhauch)

Should I report this as a bug?

Here is a slightly modified query that runs on every ModeShape repository and doesn't use indexes at all with 4.0:

select sys.* from [mode:system] as sys
join [nt:nodeType] as ntx on ISDESCENDANTNODE(ntx,sys)
join [nt:propertyDefinition] as pd on ISCHILDNODE(pd,ntx)