1 Reply Latest reply on Sep 5, 2013 9:15 AM by rhauch

    Full- text based search

    m.jawwad

      Hi,

       

          I am still exploring modeshape with 3.4 and now I am testing the searching features. So far I am not able to run searches based on full-text. Here is my query and result:

       

      Query String:  SELECT * FROM [nt:base] WHERE  CONTAINS(., '*') order by [jcr:title] desc

      Query execution was successfull. Total results 0

      But this did not work . Next I tried this:

       

      Query String: SELECT * FROM [nt:base] WHERE  CONTAINS([nt:base], '*') order by [jcr:title] desc

      Query execution was successfull. Total results 0

      and the result was same. But if I search using another property like :


      Query String: SELECT * FROM [nt:base] WHERE   ( lower([jcr:title]) like '%.log'  ) order by [jcr:title] desc

      Query execution was successfull. Total results 10


      Result are fine. So can anybody tell me what I am doing wrong in the query? also is there any config in the .json file I need to configure??


      Thanks in advance.


      Regards,


       

       

        • 1. Re: Full- text based search
          rhauch

          There are a couple of things wrong:

           

          SELECT * FROM [nt:base] WHERE  CONTAINS(., '*') order by [jcr:title] desc

           

          This query will find nothing because it is invalid for a couple of reasons.

           

          First, the full-text search expression (the '*' literal) is explicitly asking to find nodes where the asterisk character appears in a property. However, since full-text search involves normalizing words (e.g., suffixes like "-ed" and "-ing" are removed, whitespace and punctuation is removed), I suspect all asterisks would be removed when properties are actually indexed. See Section 6.7.19 of the JSR-283 specification for the actual definition of a full-text search expression, but think of a full-text search expression as a set of terms that you would submit to an internet search engine. IOW, the full-text search expression should contain one or more single terms or quoted phrases, each of which may optionally be preceded with a "-" if that term/phrase is to not appear result. Note also that a full-text search expression does not allow wildcard characters. If you want to use wildcards, use LIKE.

           

          Secondly, the first parameter of the 'CONTAINS' function is not valid. Yes, in XQuery you can pass '.' as the first parameter to the 'CONTAINS' function, but this is not valid in JCR-SQL2. The first parameter must either be a valid selector name and property name (e.g., "[nt:base].[jcr:primaryType]"), or a valid selector name with a wildcard for the property name (e.g., "[nt:base].*"). Note that if your FROM clause identifies an alias (e.g., "FROM [nt:base] AS nodes") then the alias will serve as the name of that selector.

           

          A valid example of a full-text search query is:

           

          SELECT * FROM [nt:base] WHERE CONTAINS([nt:base].*, 'log') ORDER BY [jcr:title] DESC

           

          This will find all nodes that have any property containing 'log'. Again, you cannot use a wildcard, so a full-text search cannot be used to find a property that contains some pattern; in such cases, use a LIKE expression.

          SELECT * FROM [nt:base] WHERE  CONTAINS([nt:base], '*') order by [jcr:title] desc

          This query will not work because of the same two reasons cited above: the first and second parameters to the 'CONTAINS' function are both incorrect.

           

          full-text search expression. However, the first parameter of the 'CONTAINS' function is also invalid. The first parameter must either be a valid selector name and property name (e.g., "[nt:base].[jcr:primaryType]"), or a valid selector name with a wildcard for the property name (e.g., "[nt:base].*"). Note that if your FROM clause identifies an alias (e.g., "FROM [nt:base] AS nodes") then the alias will serve as the name of that selector.

           

          BTW, both of these queries should have failed to parse and should not have run. I've logged [#MODE-2027] CONTAINS function in JCR-SQL2 should fail to parse when first parameter is '.' - JBoss Issue Tracker

           

          SELECT * FROM [nt:base] WHERE   ( lower([jcr:title]) like '%.log'  ) order by [jcr:title] desc

           

          As you say, this query will work correctly because it uses a valid LIKE expression.