Case Insensitive Indexing and Search?
dharrison Apr 3, 2017 8:10 PMHello,
I'm using ModeShape 5.3.0 and 5.4.0 Final
Is it possible to create a case-insensitive index and search on a mixed-case String property of a Custom Node ?
I've got approximately 20,000 test nodes, spread across two different languages (language_code).
What I would like to do is a JCR SQL2 query, to search for tests, in a specified language, and matching on a partial test name.
The problem is that the name data is mixed case but I would like for the search to work case insensitive.
I imagine I could create a lower cased additional property for indexing and searching, but I hope there is a more elegant solution. Can I tell ModeSHape and the Index and Search case-insensitive?
Thanks!
Example CND
<jcr='http://www.jcp.org/jcr/1.0'>
<nt='http://www.jcp.org/jcr/nt/1.0'>
<mix='http://www.jcp.org/jcr/mix/1.0'>
<example='http://www.modeshape.org/examples/example/1.0'>
[example:Test] > nt:unstructured, mix:created, mix:lastModified, mix:versionable
- example:global_test_id (string)
- example:language_code (string)
- example:name (string)
- example:description (string)
[example:referenceableTest] > example:Test, mix:referenceable
Indexes (defined in repository-config.json)
"indexes" : {
"indexT3LangAndName" : {
"kind" : "value",
"provider" : "lucene",
"nodeType" : "example:Test",
"columns" : "example:language_code (STRING), example:name (STRING)"
}
}
Example JCR SQL2 Queries
Attempt 1:
Query: SELECT [example:global_test_id], [example:language_code], [example:name], [example:description] FROM [example:Test] WHERE [example:language_code]='x-pig-latin' AND [example:name] LIKE '%insurance%'
Problem: If I look for test name like "%insurance%", it is case sensitive does not retrieve those tests with "Insurance"
Attempt 2:
Query: SELECT [example:global_test_id], [example:language_code], [example:name], [example:description] FROM [example:Test] WHERE [example:language_code]='x-pig-latin' AND LOWER(example:name]) LIKE '%insurance%'
Problem: Search was still case sensitive and did not pick up tests based on their lower cased names.