Full-Text search for binary content using ModeShape 4.5.0
roykaushik Apr 12, 2016 6:03 PM
I am pretty new to JCR and ModeShape, and I am using ModeShape 4.5.0 in my application for document storage. All the ModeShape JARs, including tika-core-1.8.jar, are on my classpath. I am able to persist and retrieve binary data successfully; however, when I perform a full-text search on binary content, I do not see the expected results. My ModeShape repository configuration JSON file looks like this:
{
    "name" : "DocumentRepository",
    "workspaces" : {
        "predefined" : ["otherWorkspace"],
        "default" : "default",
        "allowCreation" : true
    },
    "security" : {
        "anonymous" : {
            "roles" : ["readonly", "readwrite", "admin"],
            "useOnFailedLogin" : false
        }
    },
    "storage" : {
        "cacheName" : "WSDocumentCache",
        "binaryStorage" : {
            "type" : "database",
            "driverClass" : "${modeshape.database.driverclass}",
            "username" : "${modeshape.database.username}",
            "password" : "${modeshape.database.password}",
            "url" : "${modeshape.database.url}"
        }
    },
    "textExtraction" : {
        "extractors" : {
            "tikaExtractor" : {
                "name" : "Tika content-based extractor",
                "classname" : "tika"
            }
        }
    }
}
My full-text search query and the corresponding Java code that retrieves the data are as follows:
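import javax.jcr.Value;
import javax.jcr.query.Query;
import javax.jcr.query.QueryManager;
import javax.jcr.query.QueryResult;
import javax.jcr.query.RowIterator;

// JCR-SQL2: select nt:file nodes whose child nt:resource node matches the full-text search
String jql = "SELECT file.* FROM [nt:file] AS file INNER JOIN [nt:resource] AS data " +
        "ON ISCHILDNODE(data, file) WHERE CONTAINS(data.[jcr:data], $searchText)";
QueryManager queryManager = jcrSession.getWorkspace().getQueryManager();
Query query = queryManager.createQuery(jql, Query.JCR_SQL2);
// Bind the search phrase to the $searchText variable
Value tag = jcrSession.getValueFactory().createValue("New screen to be able to add");
query.bindValue("searchText", tag);
QueryResult queryResult = query.execute();
RowIterator rowIter = queryResult.getRows();
// Note: getSize() returns -1 when the number of rows is unknown
logger.info("Total no of rows returned by the query :: " + rowIter.getSize());
// further code to get the rows and the row data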
With the above query, the RowIterator size is always 0, although I have multiple documents already saved in the database that contain the search text "New screen to be able to add". I tried replacing INNER JOIN with LEFT JOIN, but that returns all nt:file nodes regardless of whether the child nt:resource node contains the search text.
I am a bit stuck on this one and need to move ahead, so any quick help would be highly appreciated.
Thanks,
Kaushik