0 Replies Latest reply on Nov 4, 2012 10:36 AM by rbattenfeld

    SHRINKDESC-130 - Schema Validator

    rbattenfeld

      Hi

       

      I spent quite a long time investigating possible schema validator solutions. Schema validation is part of the Java XML API and is nothing new. What I struggled with was the tremendous parsing time of some of the Java EE descriptors, application_6.xsd for example, which takes up to 30 seconds. I tried for a long time to serialize the parsed schemas, but this failed with every serializer I found (xstream, kryo).

       

      Finally, I profiled the parsing step and realized that the time is spent in HTTP calls resolving external entities (xs:include), mainly downloading xml.xsd. After that, I quickly developed a prototype. The solution is based on the Xerces XNI library and lives in a new package called schema-validator.
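      To illustrate the idea of avoiding the HTTP round trips, here is a minimal sketch that uses only the standard JAXP API (javax.xml.validation with an LSResourceResolver), not the XNI-based code in the branch. The class name LocalSchemaResolver, its register method, and the isValid helper are made up for this illustration, and the sketch assumes the referenced schemas are already available locally as text, keyed by file name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

/** Hypothetical helper: serves included/imported schemas from memory instead of HTTP. */
final class LocalSchemaResolver implements LSResourceResolver {
    private final Map<String, String> schemaTextByFileName = new HashMap<>();
    private final DOMImplementationLS ls;

    LocalSchemaResolver() {
        try {
            // Standard DOM Level 3 bootstrap for the Load/Save feature.
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation available", e);
        }
        if (ls == null) {
            throw new IllegalStateException("No DOM Load/Save implementation available");
        }
    }

    /** Registers a local copy of a schema under its file name, e.g. "xml.xsd". */
    void register(String fileName, String schemaText) {
        schemaTextByFileName.put(fileName, schemaText);
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        if (systemId == null) {
            return null;
        }
        // Match on the last path segment of the requested location.
        String fileName = systemId.substring(systemId.lastIndexOf('/') + 1);
        String schemaText = schemaTextByFileName.get(fileName);
        if (schemaText == null) {
            return null; // unknown schema: the parser falls back to its default lookup
        }
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(schemaText));
        input.setSystemId(systemId);
        input.setPublicId(publicId);
        input.setBaseURI(baseURI);
        return input;
    }

    /** Compiles mainXsd with the given resolver and reports whether xml validates. */
    static boolean isValid(String mainXsd, LSResourceResolver resolver, String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            factory.setResourceResolver(resolver);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(mainXsd)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // schema compilation or validation failed
        }
    }
}
```

      With a resolver like this, an xs:include that points at a remote URL is answered from the registered local copy, so compiling the schema does not have to wait for the network.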

       

      Here is an example of validating an XML file against one of the supported schemas:

       

      final XmlValidator validator = new XmlValidator(SchemaType.XSD);
      validator.loadGrammar("application_6.xsd");
      validator.validate("src/test/resources/test-valid-application-6.xml");

       

      Including initialization, the validation takes about 0.4s instead of 20-30s. You can also keep the instance around for further validations, which will then be even quicker.
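      The benefit of keeping an instance around can be sketched with plain JAXP as well: a compiled javax.xml.validation.Schema is immutable and thread-safe, so it can be cached and only a cheap Validator created per document. This is not the XmlValidator from the branch; the class CachedSchemas and its methods are hypothetical names for the illustration, and the schema is passed in as text to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical cache: compiles each schema once and reuses it for later validations. */
final class CachedSchemas {
    private final Map<String, Schema> compiledByKey = new HashMap<>();
    private int compilations = 0;

    /** Returns the cached schema for key, compiling xsdText on first use only. */
    synchronized Schema get(String key, String xsdText) throws SAXException {
        Schema schema = compiledByKey.get(key);
        if (schema == null) {
            // SchemaFactory is not thread-safe, so it is created inside the lock.
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(xsdText)));
            compiledByKey.put(key, schema);
            compilations++;
        }
        return schema;
    }

    /** Validates xml against the cached schema; a new Validator is created per call. */
    boolean isValid(String key, String xsdText, String xml) {
        try {
            get(key, xsdText).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // schema compilation or validation failed
        }
    }

    /** Number of times a schema was actually compiled (for illustration only). */
    synchronized int compilations() {
        return compilations;
    }
}
```

      The expensive step (compiling the grammar) then happens once per schema, and every later validation pays only for parsing the XML document itself.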

       

      Currently, the validator is not yet integrated into the metadata-parser; it only allows validating XML files against the supported schemas.

       

      The link to the branch is: https://github.com/rbattenfeld/descriptors/tree/SHRINKDESC-130

       

      Let me know what you think and what the possible next steps could be.

      Ralf