1 Reply Latest reply on Apr 12, 2013 9:02 AM by rhauch

    Query on date type residual property of nt:unstructured

    l.tagliani

      Hi all,

        I'm trying to test the performance of date range queries. I create a set of 100000 nodes of type nt:unstructured, set several properties on each of them, and then run the query.

       

      Here's a snippet of the code used to create those nodes:

       

       

      {code}

      Session session = repository.login(new SimpleCredentials("admin", "password".toCharArray()), null);

              Node testRoot = session.getRootNode().addNode("testDateRangeQueryXPATH", "nt:unstructured");

              testRoot.setProperty("nt:name", "testDateRangeQueryXPATH");

              session.save();

              // create 100000 elements with one date property and four string properties

              // representing the same timestamp at different precisions:

              // nt:date (date)

              // nt:dateLong (string)

              // nt:dateMiddle (string)

              // nt:dateShort (string)

              // nt:dateMini (string)

              Calendar c = GregorianCalendar.getInstance();

              long size = 100000;

              SimpleDateFormat sdfMini = new SimpleDateFormat("yyyy-MM-dd");

              SimpleDateFormat sdfShort = new SimpleDateFormat("yyyy-MM-dd HH:mm");

              SimpleDateFormat sdfMiddle = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

              SimpleDateFormat sdfLong = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");

              DecimalFormat df = new DecimalFormat("000000");

              for (int i = 0; i < size; i++) {

                  Node newNode = testRoot.addNode("node" + df.format(i), "nt:unstructured");

                  newNode.setProperty("nt:date", c);

                  newNode.setProperty("nt:dateMini", sdfMini.format(c.getTime()));

                  newNode.setProperty("nt:dateShort", sdfShort.format(c.getTime()));

                  newNode.setProperty("nt:dateMiddle", sdfMiddle.format(c.getTime()));

                  newNode.setProperty("nt:dateLong", sdfLong.format(c.getTime()));

                  c.add(Calendar.HOUR_OF_DAY, 1);

                  c.add(Calendar.SECOND, 3);

                  c.add(Calendar.MILLISECOND, 134);

                  if (i % 1000 == 0) {

                      session.save();

                      logger.info("Saved " + i + " nodes");

                  }

              }

              session.save();

              logger.info("Saved " + size + " nodes");

              // wait for indexing to end

              Thread.sleep(5000);

      {code}

       

      And here's the code used to perform the query:

       

      {code}

              // ±YYYY-MM-DDThh:mm:ss.SSSTZD

              String startDate = "2012-03-21T00:00:00.000+01:00";

              String endDate = "2016-03-21T23:59:59.999+01:00";

              String queryString = "//element(*, nt:unstructured)[@nt:date >=xs:dateTime('" + startDate + "') and @nt:date <=xs:dateTime('" + endDate + "')] order by @jcr:score";

              logger.info("Query using date property");

              long startTime = System.currentTimeMillis();

              nit = qm.createQuery(queryString, Query.XPATH).execute().getNodes();

              long elapsedMillis = System.currentTimeMillis() - startTime;

              logger.info("Retrieved " + nit.getSize() + " in " + elapsedMillis + " ms");

              logger.info("First element date : " + sdfLong.format(nit.nextNode().getProperty("nt:date").getDate().getTime()));

              logger.info("===================================================");

              Thread.sleep(5000);

      {code}
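
      As an aside, the start and end literals above are hard-coded in the ±YYYY-MM-DDThh:mm:ss.SSSTZD form. A minimal sketch of producing that form from a Calendar follows, assuming Java 7 or later (for the XXX pattern letter). The class and method names are hypothetical and not part of the test above.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of ModeShape or the original test.
final class XsDateTimeLiteral {

    // Formats a Calendar as yyyy-MM-ddTHH:mm:ss.SSS plus a numeric offset,
    // rendered in the calendar's own time zone.
    static String toIso8601(Calendar calendar) {
        // 'XXX' writes the offset as +01:00 (and as Z for a zero offset).
        SimpleDateFormat format =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", Locale.ROOT);
        format.setTimeZone(calendar.getTimeZone());
        return format.format(calendar.getTime());
    }

    public static void main(String[] args) {
        Calendar start = new GregorianCalendar(TimeZone.getTimeZone("GMT+01:00"));
        start.clear(); // reset every field, including milliseconds
        start.set(2012, Calendar.MARCH, 21, 0, 0, 0);
        System.out.println(toIso8601(start));
    }
}
```

      The resulting string can be placed inside xs:dateTime('...') in the XPath query in the same way as the hard-coded values.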

       

      The query did not run at all; it failed with the following exception:

       

      {code}

      java.lang.NullPointerException: null

                at org.modeshape.jcr.query.lucene.basic.BasicLuceneQueryFactory.findNodesWithNumericRange(BasicLuceneQueryFactory.java:718)

                at org.modeshape.jcr.query.lucene.basic.BasicLuceneQueryFactory.findNodesWithNumericRange(BasicLuceneQueryFactory.java:698)

                at org.modeshape.jcr.query.lucene.LuceneQueryFactory.createQuery(LuceneQueryFactory.java:366)

                at org.modeshape.jcr.query.lucene.LuceneQueryFactory.createQuery(LuceneQueryFactory.java:204)

                at org.modeshape.jcr.query.lucene.basic.BasicLuceneSchema.createQuery(BasicLuceneSchema.java:492)

                at org.modeshape.jcr.query.lucene.LuceneQueryEngine$LuceneAccessQuery.execute(LuceneQueryEngine.java:269)

                at org.modeshape.jcr.query.process.SortValuesComponent.execute(SortValuesComponent.java:65)

                at org.modeshape.jcr.query.process.QueryProcessor.execute(QueryProcessor.java:96)

                at org.modeshape.jcr.query.process.QueryEngine.execute(QueryEngine.java:140)

                at org.modeshape.jcr.query.lucene.LuceneQueryEngine$1.getResults(LuceneQueryEngine.java:153)

                at org.modeshape.jcr.query.JcrQuery.execute(JcrQuery.java:119)

                at it.cbt.wr.core.service.repository.jcr.modeshape.DateRangeTest.testDateRangeQuerySQL(DateRangeTest.java:138)

      {code}

       

      It seems that the ModeShape search component isn't aware of the presence of the nt:date property.

       

      If I perform a query using e.g. the nt:dateMini property, which is actually a string, with this query string

       

      {code}

              // mini (yyyy-MM-dd)

              String miniStartDate = "2012-03-21";

              String miniEndDate = "2016-03-21";

              String miniQueryString = "//element(*, nt:unstructured)[@nt:dateMini >='" + miniStartDate + "' and @nt:dateMini <='" + miniEndDate + "'] order by @nt:date";

      {code}

       

      there isn't any problem.
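
      My guess at why the string variant behaves: zero-padded yyyy-MM-dd strings sort lexicographically in the same order as the dates they represent, so a plain string range selects the same nodes a date range would. A small sketch of that assumption in plain Java (this is not ModeShape code, and the class and method names are hypothetical):

```java
// Hypothetical illustration of lexicographic range checks on date strings.
final class StringDateRange {

    // True when value lies in [start, end] under plain String ordering.
    static boolean inRange(String value, String start, String end) {
        return value.compareTo(start) >= 0 && value.compareTo(end) <= 0;
    }

    public static void main(String[] args) {
        String start = "2012-03-21";
        String end = "2016-03-21";
        System.out.println(inRange("2014-07-01", start, end)); // inside the range
        System.out.println(inRange("2016-03-22", start, end)); // one day past the end
        // Without zero padding the ordering breaks: October sorts before September.
        System.out.println("2012-10-01".compareTo("2012-9-30") < 0);
    }
}
```

      This only holds while every value uses the same fixed-width, zero-padded pattern, which is the case for the four SimpleDateFormat patterns above.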

       

      Is this the expected behavior, or have I misunderstood something?

       

      BR

       

      Luca

        • 1. Re: Query on date type residual property of nt:unstructured
          rhauch

          First of all, where did the "nt:date" and "nt:dateMini" property definitions come from? If you created them yourself, please use a custom namespace for your own properties and property definitions. The "jcr", "nt", and "mix" prefixes are all reserved by the JSR-283 specification for use only with built-in node types. ModeShape also reserves the "mode" namespace prefix for its own built-in node types. (Plus, even in the built-ins, "nt" is for node type names, not properties.)

           

          Secondly, your nodes are of type 'nt:unstructured' and you're not using any mixins, which makes these properties residual. The JCR specification actually allows implementations to not support querying residual properties, but ModeShape does support it. However, as you've found out, it is generally more difficult to do correctly. The cast sometimes works, but often that's not enough.

           

          The idiomatic and proper way to solve this is to use a mixin with definitions for the date property/properties, and add that mixin to your nodes. You can then issue a query that uses the mixin in a FROM clause (perhaps joined with other node types that have other properties you want to read or use in criteria). Here's an example:

           

          SELECT * FROM [my:dated] AS d WHERE d.[my:date]

          BETWEEN CAST( '2012-01-01T00:00:00.000+00:00' AS DATE) AND CAST( '2013-01-01T00:00:00.000+00:00' AS DATE) EXCLUSIVE
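
          To parameterize the bounds, the statement above could be assembled as in the following rough sketch. It assumes a "my:dated" mixin with a "my:date" DATE property has already been registered and added to the nodes (for example with Node.addMixin("my:dated")). The builder class and its method are hypothetical helpers, not ModeShape API.

```java
// Hypothetical helper that assembles the JCR-SQL2 statement shown above.
final class DatedQueryBuilder {

    // Builds a range query on [my:dated]; both bounds are ISO 8601 strings.
    // As in the example above, EXCLUSIVE follows the upper bound only.
    static String betweenQuery(String lower, String upper) {
        return "SELECT * FROM [my:dated] AS d WHERE d.[my:date] BETWEEN "
                + "CAST('" + lower + "' AS DATE) AND "
                + "CAST('" + upper + "' AS DATE) EXCLUSIVE";
    }

    public static void main(String[] args) {
        System.out.println(betweenQuery(
                "2012-01-01T00:00:00.000+00:00",
                "2013-01-01T00:00:00.000+00:00"));
    }
}
```

          The resulting string would then be passed to QueryManager.createQuery with Query.JCR_SQL2 as the language instead of Query.XPATH. Only concatenate bounds you generate yourself, never untrusted input.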

           

          JCR-SQL2 is far more powerful than XPath, so if you haven't looked at it I would strongly recommend you do.

           

          (When executing an XPath query, ModeShape converts it to an abstract query model (AQM), which is just the JCR-QOM object representation and is equivalent to, and transformable to and from, JCR-SQL2. You can see the corresponding JCR-SQL2 query generated for your XPath by looking at the toString representation of the query.)