


    This article discusses the reasoning behind the payload parsing strategies considered for the PicketLink project, along with the resulting design decisions.



    PicketLink is a project that supports both the SAML and WS-Trust specifications. In version 1.x, we used the JAXB2 object model and parsing mechanism provided by the JDK. The following points summarize the constraints and problems we met with that approach:




    • XML Security (XML Digital Signature and XML Encryption), in the current version of the specifications, relies on a DOM representation of the payload. (In his presentation, Sean Mullan talks about StAX as a potential future solution to the performance issues.)
    • We need a Java object model that correctly represents the state of the payload as it came over the wire.
    • Because of XML Security, we had to parse the payload via DOM, apply the XML Security semantics (signature validation and decryption), and then run an XML transformation on the DOM to pass it to JAXB, which produced the object model that PicketLink relies on.
    • JAXB2 parsing is complex and performance intensive, and it gives very little control over either the object model or the namespace prefixes that get written to the wire.

    Performance Alternatives for Parsing


    The JDK provides various solutions to XML parsing:

    • JAXP - DOM and SAX parsing
    • JAXB2
    • StAX


    We can choose among these based on our needs: memory footprint, parsing time, and code complexity.


    • DOM parsing is simple to use, but it builds the whole document in memory, which can be a problem for large documents.
    • SAX parsing is very fast, but the callback-based code can become hard to maintain.
    • JAXB2 is performance intensive, and the binding between Java and XML is not always clear.
    • StAX performs much better than JAXB2 but is slightly slower than SAX. It gives the developer greater control over parsing.


    StAX is built on XML pull parsing, in which the application asks the parser for the next item, whereas SAX uses a push model, in which the parser calls back into the application.
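    The difference between push and pull can be sketched with the JDK's own SAX and StAX APIs. The class name and the sample payload below are assumptions made for illustration; they are not PicketLink code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PushVersusPull {
    // Hypothetical sample payload, used only for illustration.
    static final String XML = "<Conditions><Audience>urn:example:sp</Audience></Conditions>";

    // Push (SAX): the parser drives and calls back into our handler.
    static List<String> elementNamesWithSax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that localName is populated
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(localName);
            }
        });
        return names;
    }

    // Pull (StAX): our code drives and asks the parser for the next item.
    static List<String> elementNamesWithStax(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getLocalName());
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNamesWithSax(XML));
        System.out.println(elementNamesWithStax(XML));
    }
}
```

    Both methods collect the same element names. The practical difference is where the loop lives: in the SAX version it is hidden inside the parser, while in the StAX version it is ordinary application code that can stop, branch, or delegate at any point.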

    StAX Design Considerations


    There are two ways of doing StAX parsing:

    • Streaming (XMLStreamReader)
    • Event-based reading (XMLEventReader)


    StAX streaming (XMLStreamReader) is extremely fast, but the code can become cumbersome: you work with a low-level cursor over the stream and query the reader's state at each position. The StAX event mechanism (XMLEventReader) leads to cleaner code and is only slightly slower than the streaming mechanism.


    If you absolutely need blazing fast parsing, then streaming is the way to go.


    Under all normal circumstances, event-based parsing should suffice.
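    To make the two styles concrete, here is a minimal sketch that reads the same element once with the cursor API and once with the event API. The class and method names are hypothetical; only the javax.xml.stream calls are real JDK APIs.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

class CursorVersusEvents {
    // Hypothetical sample payload, used only for illustration.
    static final String XML = "<Issuer>Test STS</Issuer>";

    // Cursor API: the state lives in the reader and is queried at each position.
    static String issuerWithCursor(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "Issuer".equals(reader.getLocalName())) {
                return reader.getElementText();
            }
        }
        return null;
    }

    // Event API: each step hands back a self-contained XMLEvent object.
    static String issuerWithEvents(String xml) throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "Issuer".equals(event.asStartElement().getName().getLocalPart())) {
                return reader.getElementText();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(issuerWithCursor(XML));
        System.out.println(issuerWithEvents(XML));
    }
}
```

    The event version allocates an object per event, which is where the small overhead comes from. In exchange, events can be passed around, inspected with isStartElement() and similar helpers, and filtered.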


    PicketLink Parsing

    Design Choice 1:

    Choice: As a first pass, we are going to continue using the JAXB2 object model (it is just a bunch of Java objects) for both SAML and WS-Trust, but the parsing will be done using StAX.

    Reason: The SAML object model is so large that it is not productive to handcraft a new one.

    Design Choice 2:

    Choice: We are going to use event-based parsing.

    Reason: We want maintainable parsing code.


    Design Choice 3:

    Choice: We will use an EventFilter that emits only start and end element events.

    Reason: We can write clean parsing code with just these two event types, StartElement and EndElement.





    In this example, our event filter would kick in to provide the following XML events to our parser.




    When we reach the Audience start element, we call getElementText() on the XMLEventReader. This call also consumes the end element for Audience.
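    A minimal sketch of design choice 3, assuming a hypothetical Conditions fragment as input (the fragment, the class name and the trace format are illustrative, not PicketLink code). It wraps an XMLEventReader in a filter that passes only start and end elements, and calls getElementText() when it reaches Audience.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

class StartEndFilterDemo {
    // Hypothetical fragment; the whitespace between elements produces
    // character events that the filter suppresses.
    static final String XML =
          "<Conditions>\n"
        + "  <AudienceRestriction>\n"
        + "    <Audience>urn:example:sp</Audience>\n"
        + "  </AudienceRestriction>\n"
        + "</Conditions>";

    static List<String> trace(String xml) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        EventFilter startAndEndOnly = event -> event.isStartElement() || event.isEndElement();
        XMLEventReader reader = factory.createFilteredReader(
                factory.createXMLEventReader(new StringReader(xml)), startAndEndOnly);

        List<String> seen = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                String name = event.asStartElement().getName().getLocalPart();
                seen.add("start:" + name);
                if ("Audience".equals(name)) {
                    // Called immediately after the start element was returned.
                    // Reads the text and also consumes </Audience>.
                    seen.add("text:" + reader.getElementText());
                }
            } else {
                seen.add("end:" + event.asEndElement().getName().getLocalPart());
            }
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        trace(XML).forEach(System.out::println);
    }
}
```

    In the resulting trace there is no end event for Audience, because getElementText() has already consumed it. The parser therefore must not wait for that end element after reading the text.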

    Other Useful Information:


    1. There is a utility class called StaxParserUtil that provides all the utility methods the project needs for parsing. All of its methods throw a PicketLink ParsingException that wraps the underlying XMLStreamException.
    2. All our parsers implement the PicketLink interface ParserNamespaceSupport, whose important method "supports" takes a QName and reports whether the parser is capable of parsing it.
    3. When you encounter a complex element such as Conditions, it is better to parse it in a separate method or a separate parser.
    4. Calling getElementText() on the XMLEventReader consumes the end element of that particular element.
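    The idea behind point 2 can be sketched as follows. The ElementParser interface here is a simplified, hypothetical stand-in written for this example; it is not the actual ParserNamespaceSupport definition, and the dispatch loop is only one possible way to use such an interface.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class ParserDispatchSketch {
    static final String SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion";

    /** Hypothetical stand-in: a parser that can say which QNames it handles. */
    interface ElementParser {
        boolean supports(QName qname);
        Object parse(XMLEventReader reader) throws XMLStreamException;
    }

    /** Parses a SAML Issuer element and returns its text. */
    static final class IssuerParser implements ElementParser {
        private static final QName ISSUER = new QName(SAML_NS, "Issuer");

        public boolean supports(QName qname) {
            return ISSUER.equals(qname); // compares namespace URI and local part
        }

        public Object parse(XMLEventReader reader) throws XMLStreamException {
            reader.nextEvent(); // consume the Issuer start element
            return reader.getElementText().trim();
        }
    }

    /** Hands the stream to the first parser that supports the next start element. */
    static Object dispatch(String xml, ElementParser... parsers) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            XMLEvent next = reader.peek();
            if (next.isStartElement()) {
                QName name = next.asStartElement().getName();
                for (ElementParser parser : parsers) {
                    if (parser.supports(name)) {
                        return parser.parse(reader);
                    }
                }
            }
            reader.nextEvent(); // not handled: move on
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<saml2:Issuer xmlns:saml2=\"" + SAML_NS + "\">Test STS</saml2:Issuer>";
        System.out.println(dispatch(xml, new IssuerParser()));
    }
}
```

    Matching on the QName rather than on the prefixed name means the parser keeps working when a sender picks a different namespace prefix.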


    StaxParserUtil Class


    This is an important utility class in the PicketLink federation project. It should be the single source for obtaining StAX events as well as for validating start and end elements.
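    As a rough illustration of what such a utility might centralize, here is a sketch with hypothetical method names and a local stand-in for ParsingException. It does not reproduce the real StaxParserUtil API; it only shows the pattern of fetching events, validating them, and wrapping XMLStreamException in one place.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxHelperSketch {
    /** Local stand-in for PicketLink's ParsingException. */
    static class ParsingException extends Exception {
        ParsingException(String message) { super(message); }
        ParsingException(Throwable cause) { super(cause); }
    }

    /** Creates a reader that emits only start and end elements. */
    static XMLEventReader reader(String xml) throws ParsingException {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLEventReader raw = factory.createXMLEventReader(new StringReader(xml));
            return factory.createFilteredReader(raw,
                    event -> event.isStartElement() || event.isEndElement());
        } catch (XMLStreamException e) {
            throw new ParsingException(e);
        }
    }

    /** Fetches the next event and checks that it is the expected start element. */
    static StartElement nextStartElement(XMLEventReader reader, String expected) throws ParsingException {
        try {
            XMLEvent event = reader.nextEvent();
            if (!event.isStartElement()
                    || !expected.equals(event.asStartElement().getName().getLocalPart())) {
                throw new ParsingException("Expected start of " + expected + " but found " + event);
            }
            return event.asStartElement();
        } catch (XMLStreamException e) {
            throw new ParsingException(e);
        }
    }

    /** Fetches the next event and checks that it is the expected end element. */
    static void expectEndElement(XMLEventReader reader, String expected) throws ParsingException {
        try {
            XMLEvent event = reader.nextEvent();
            if (!event.isEndElement()
                    || !expected.equals(event.asEndElement().getName().getLocalPart())) {
                throw new ParsingException("Expected end of " + expected + " but found " + event);
            }
        } catch (XMLStreamException e) {
            throw new ParsingException(e);
        }
    }

    /** Reads the trimmed text of the current element; its end element is consumed too. */
    static String elementText(XMLEventReader reader) throws ParsingException {
        try {
            return reader.getElementText().trim();
        } catch (XMLStreamException e) {
            throw new ParsingException(e);
        }
    }
}
```

    With helpers like these, individual parsers never touch XMLStreamException directly and report every structural problem through a single exception type.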



    Performance Numbers

    Environment: Lenovo T61 with 4 GB RAM, Fedora 13, Sun HotSpot JDK 1.6.

    File to Parse:


    <wst:RequestSecurityToken Context="validatecontext2" xmlns:wst="">
          <saml2:Assertion xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion" ID="ID_cf9efbf0-9d7f-4b4a-b77f-d83ecaafd374" 
            IssueInstant="2010-09-30T19:13:37.911Z" Version="2.0">
            <saml2:Issuer>Test STS</saml2:Issuer>
              <saml2:NameID NameQualifier="urn:picketlink:identity-federation">jduke</saml2:NameID>
              <saml2:SubjectConfirmation Method="urn:oasis:names:tc:SAML:2.0:cm:bearer"/>
            <saml2:Conditions NotBefore="2010-09-30T19:13:37.911Z" NotOnOrAfter="2010-09-30T21:13:37.911Z">
            <ds:Signature xmlns:ds="">
                <ds:CanonicalizationMethod Algorithm=""/>
                <ds:SignatureMethod Algorithm=""/>
                <ds:Reference URI="#ID_cf9efbf0-9d7f-4b4a-b77f-d83ecaafd374">
                    <ds:Transform Algorithm=""/>
                    <ds:Transform Algorithm=""/>
                  <ds:DigestMethod Algorithm=""/>



    JAXB, time spent for 1000 iterations = 4169 ms or 4.169 secs
    STAX, time spent for 1000 iterations = 2347 ms or 2.347 secs
    JAXB, time spent for 10000 iterations = 21733 ms or 21.733 secs
    STAX, time spent for 10000 iterations = 16939 ms or 16.939 secs 
    JAXB, time spent for 5000 iterations = 12613 ms or 12.613 secs
    STAX, time spent for 5000 iterations = 10216 ms or 10.216 secs


    Note: StAX parsing simply bypasses the contents of the <ds:Signature/> element by ignoring its streamed events.
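    One way to bypass a subtree with the event API is to count nesting depth until the matching end element has been read. The sketch below assumes a hypothetical, heavily shortened assertion; the class and method names are illustrative and this is not the code used by the PicketLink parsers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class SkipSubtreeSketch {
    // Hypothetical, shortened payload; the ds prefix is bound to the XML Signature namespace.
    static final String XML =
          "<Assertion xmlns:ds=\"http://www.w3.org/2000/09/xmldsig#\">"
        + "<Issuer>Test STS</Issuer>"
        + "<ds:Signature><ds:SignedInfo><ds:Reference/></ds:SignedInfo></ds:Signature>"
        + "<Subject/>"
        + "</Assertion>";

    /** Discards events until the end of the element whose start was just read. */
    static void skipElement(XMLEventReader reader) throws XMLStreamException {
        int depth = 1; // the start element itself has already been consumed
        while (depth > 0 && reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
        }
    }

    /** Returns the local names of all elements outside any Signature subtree. */
    static List<String> elementsOutsideSignature(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (!event.isStartElement()) {
                continue;
            }
            String name = event.asStartElement().getName().getLocalPart();
            if ("Signature".equals(name)) {
                skipElement(reader); // ignore everything inside the signature
            } else {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementsOutsideSignature(XML));
    }
}
```

    Because the signature contents are never turned into objects, this part of the comparison with JAXB favors StAX: the skipped events cost little more than the tokenizing itself.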


    Test Code:


    // Fragment of a test class. It needs java.io.InputStream, java.io.ByteArrayInputStream,
    // javax.xml.transform.Source and org.w3c.dom.Document, plus PicketLink's
    // DocumentUtil and WSTrustParser.
    private int runs = 1000;
    private String fileName = "parser/perf/wst-batch-validate-one.xml";

    public void testParsingPerformance() throws Exception
    {
       ClassLoader tcl = Thread.currentThread().getContextClassLoader();
       InputStream configStream = tcl.getResourceAsStream( fileName );
       Document doc = DocumentUtil.getDocument( configStream );
       Source source = DocumentUtil.getXMLSource( doc );

       // JAXB way
       long start = System.currentTimeMillis();
       for( int i = 0 ; i < runs; i++ )
          useJAXB( source );
       long elapsedTimeMillis = System.currentTimeMillis() - start;
       System.out.println( "JAXB, time spent for " + runs
             + " iterations = " + elapsedTimeMillis + " ms or " + elapsedTimeMillis/1000F + " secs" );

       // Read the file into memory once so that every StAX run parses the same bytes
       configStream = tcl.getResourceAsStream( fileName );
       byte[] xmlData = new byte[ configStream.available() ]; // This can be a problem on some JVMs
       configStream.read( xmlData );

       // StAX way
       start = System.currentTimeMillis();
       for( int i = 0 ; i < runs; i++ )
          useStax( new ByteArrayInputStream( xmlData ) );
       elapsedTimeMillis = System.currentTimeMillis() - start;
       System.out.println( "STAX, time spent for " + runs
             + " iterations = " + elapsedTimeMillis + " ms or " + elapsedTimeMillis/1000F + " secs" );
    }

    private void useJAXB( Source source ) throws Exception
    {
       // JAXB unmarshalling of the source (body not shown)
    }

    private void useStax( InputStream configStream ) throws Exception
    {
       WSTrustParser parser = new WSTrustParser();
       parser.parse( configStream );
    }



    An important point from David M. Lloyd:


    dmlloyd: with stax you tend to do some of your processing inline with parsing, as well as validation
    dmlloyd: with jaxb the validation and parsing come first, and then the processing is after
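
    The remark about doing processing inline with parsing can be illustrated with a small sketch that checks the validity window of a Conditions element as soon as its start element is read, without building an object model first. The class name, method name and sample payload are assumptions made for this example.

```java
import java.io.StringReader;
import java.time.Instant;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class InlineValidationSketch {
    // Hypothetical, shortened payload, used only for illustration.
    static final String XML =
          "<Assertion><Issuer>Test STS</Issuer>"
        + "<Conditions NotBefore=\"2010-09-30T19:13:37.911Z\""
        + " NotOnOrAfter=\"2010-09-30T21:13:37.911Z\"/>"
        + "</Assertion>";

    /** Reads a required timestamp attribute from the given start element. */
    static Instant instantAttribute(StartElement element, String name) {
        Attribute attribute = element.getAttributeByName(new QName(name));
        if (attribute == null) {
            throw new IllegalArgumentException("Missing attribute " + name);
        }
        return Instant.parse(attribute.getValue());
    }

    /** Validates the Conditions window while parsing and stops reading as soon as it is known. */
    static boolean isWithinConditions(String xml, Instant now) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "Conditions".equals(event.asStartElement().getName().getLocalPart())) {
                StartElement conditions = event.asStartElement();
                Instant notBefore = instantAttribute(conditions, "NotBefore");
                Instant notOnOrAfter = instantAttribute(conditions, "NotOnOrAfter");
                return !now.isBefore(notBefore) && now.isBefore(notOnOrAfter);
            }
        }
        throw new IllegalArgumentException("No Conditions element found");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWithinConditions(XML, Instant.parse("2010-09-30T20:00:00Z")));
    }
}
```

    With a JAXB-style flow, the same check would run only after the whole payload had been unmarshalled. Here the parser can reject an expired assertion without reading the rest of the document.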