0 Replies Latest reply on Nov 4, 2012 10:36 AM by rbattenfeld

    SHRINKDESC-130 - Schema Validator




      I spent quite a long time investigating possible schema validator solutions. Schema validation is part of the Java XML API and is nothing new. What I struggled with was the tremendous parsing time of some of the Java EE descriptor schemas, application_6.xsd for example, which takes up to 30 seconds. I spent a long time trying to serialize the parsed schemas, but this failed with every serializer I found (xstream, kryo).


      Finally, I profiled the parsing step and realized that the time is spent in HTTP calls that resolve external entities (xs:include), mainly downloading xml.xsd. After that, a prototype was quickly developed. The solution is based on the Xerces XNI library and lives in a new package called schema-validator.
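
      To illustrate the idea of answering schema lookups locally instead of over HTTP, here is a minimal sketch. It is not the code from the branch (which uses Xerces XNI); it uses only the standard javax.xml.validation API with an LSResourceResolver. The class name, the two tiny schemas and the http://example.invalid/types.xsd location are made up for the illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.xml.sax.SAXException;

class LocalSchemaResolverDemo {

    // Hypothetical main schema that includes a second schema via an HTTP location.
    static final String MAIN_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:demo' xmlns='urn:demo' elementFormDefault='qualified'>"
      + "<xs:include schemaLocation='http://example.invalid/types.xsd'/>"
      + "<xs:element name='app' type='AppType'/>"
      + "</xs:schema>";

    // Hypothetical included schema; in a real setup this would be a bundled copy.
    static final String TYPES_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:demo' xmlns='urn:demo' elementFormDefault='qualified'>"
      + "<xs:complexType name='AppType'><xs:sequence>"
      + "<xs:element name='module' type='xs:string' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:schema>";

    // Compiles MAIN_XSD, serving the included schema from memory instead of the network.
    static Schema compile() throws Exception {
        Map<String, String> local = new HashMap<>();
        local.put("http://example.invalid/types.xsd", TYPES_XSD);

        DOMImplementationLS ls = (DOMImplementationLS)
            DOMImplementationRegistry.newInstance().getDOMImplementation("LS");

        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver((type, namespaceURI, publicId, systemId, baseURI) -> {
            String xsd = local.get(systemId);
            if (xsd == null) {
                return null; // unknown location: let the parser resolve it as usual
            }
            LSInput input = ls.createLSInput();
            input.setSystemId(systemId);
            input.setStringData(xsd);
            return input;
        });
        return factory.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
    }

    // Returns true if the document validates against the compiled schema.
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Schema schema = compile();
        System.out.println(isValid(schema, "<app xmlns='urn:demo'><module>web.war</module></app>"));
        System.out.println(isValid(schema, "<app xmlns='urn:demo'><bogus/></app>"));
    }
}
```

      Returning null from the resolver for unknown locations keeps the default behaviour, so only the schemas that are available locally are short-circuited.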


      Here is an example of validating an XML file against one of the supported schemas:


      final XmlValidator validator = new XmlValidator(SchemaType.XSD);




      Including initialization, validation takes about 0.4s instead of 20-30s. If you keep the instance around for further validations, those will be even quicker.
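
      The benefit of keeping a compiled schema around can be sketched with the standard JAXP API alone. This is only an illustration under assumptions, not the XmlValidator from the branch: the class name, the cache keyed by schema text and the one-element schema are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCacheDemo {

    // Hypothetical one-element schema used only for this illustration.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='app' type='xs:string'/>"
      + "</xs:schema>";

    // Counts how often a schema is actually compiled.
    static final AtomicInteger COMPILE_COUNT = new AtomicInteger();

    // Compiled Schema objects are thread safe, so they can be shared and reused.
    private static final Map<String, Schema> CACHE = new ConcurrentHashMap<>();

    // Compiles the schema on first use and returns the cached instance afterwards.
    static Schema schemaFor(String xsd) {
        return CACHE.computeIfAbsent(xsd, text -> {
            COMPILE_COUNT.incrementAndGet();
            try {
                return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                        .newSchema(new StreamSource(new StringReader(text)));
            } catch (SAXException e) {
                throw new IllegalArgumentException("Schema does not compile", e);
            }
        });
    }

    // Validators are not thread safe, so a fresh one is created per call.
    static boolean isValid(String xsd, String xml) {
        try {
            schemaFor(xsd).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<app>first</app>"));
        System.out.println(isValid(XSD, "<other/>"));
        System.out.println("compilations: " + COMPILE_COUNT.get());
    }
}
```

      Only the first call pays for schema compilation; later validations reuse the cached Schema and only create a cheap Validator.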


      Currently, the validator is not yet integrated into the metadata-parser; it is a standalone component that validates against the supported schemas.


      The link to the branch is: https://github.com/rbattenfeld/descriptors/tree/SHRINKDESC-130


      Let me know what you think and what the next steps could be.