Version 2

    The purpose of this document is simply to highlight the differences (and similarities) between "input" and "output" data types, how they are handled in the Smooks runtime, and how they could be handled in the JBT Smooks Editor.

     

     

     

    Smooks Runtime API

    The Smooks runtime API is quite simple from the point of view of specifying the input and output(s), i.e. Source and Result(s):

     

    void filterSource(Source source, Result... results) throws SmooksException

     

    The Source and Result types are standard Java types.  The JDK provides a number of implementations, but the ones of most interest to us are the StreamSource and StreamResult classes.  A number of additional Source and Result implementations are defined as part of the Smooks API:

     

     

    So as you can see, the Smooks API allows you to specify a Source and zero or more Result instances (yes, there's a method that just takes a Source and no Result).  Passing Result objects to the filterSource method is not the only way in which Smooks can produce a "Result".  For more on this, see this wiki page.
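    As a rough illustration of the calling pattern described above, the following sketch uses only the JDK's StreamSource and StreamResult classes.  The filterSource method here is a hypothetical stand-in with the same parameter shape (one Source, zero or more Results); it simply copies characters and is not the Smooks implementation.  The class name SourceResultSketch and the helper filterToString are made up for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SourceResultSketch {

    // Hypothetical stand-in for Smooks.filterSource: it copies the Source
    // characters to every Writer-backed StreamResult it is given.
    static void filterSource(Source source, Result... results) {
        if (!(source instanceof StreamSource) || ((StreamSource) source).getReader() == null) {
            throw new IllegalArgumentException("This sketch only handles Reader-backed StreamSources");
        }
        try {
            String content = readAll(((StreamSource) source).getReader());
            for (Result result : results) {
                if (result instanceof StreamResult && ((StreamResult) result).getWriter() != null) {
                    ((StreamResult) result).getWriter().write(content);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Drain a Reader into a String.
    private static String readAll(Reader reader) throws IOException {
        StringBuilder buffer = new StringBuilder();
        char[] chunk = new char[1024];
        int read;
        while ((read = reader.read(chunk)) != -1) {
            buffer.append(chunk, 0, read);
        }
        return buffer.toString();
    }

    // Convenience wrapper: one Source in, one StreamResult out.
    static String filterToString(String message) {
        StringWriter writer = new StringWriter();
        filterSource(new StreamSource(new StringReader(message)), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) {
        // One Source, one Result.
        System.out.println(filterToString("<order/>"));
        // One Source, no Results: legal because of the varargs parameter.
        filterSource(new StreamSource(new StringReader("<order/>")));
    }
}
```

    With the real API, the stand-in method would be replaced by a call on a Smooks instance; the way the Source and Result objects are constructed stays the same.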

    What Format/Type?

    A common source of confusion in Smooks-related conversations is the concept of data formats, and whether people are talking about input data (Sources) or output data (Results).

     

    Put simply... the Smooks runtime "consumes" a message Source and can "produce" one or more message Results.


    So the Smooks runtime:

    • Consumes a Source
    • Produces Result(s)

     

    We're still missing some information from the conversation however... every Source and every Result has an associated data "Type" or "Format".  When having these "conversations", we need to be clear on what that format is and how we describe it in the Smooks configurations generated by the Smooks Editor.

     

    Put simply (again)... the Smooks runtime "consumes" a message Source of Type X and can "produce" one or more message Results of Type X and/or Y and/or Z, etc.

     

    So when we're talking about messages, we need to be clear about exactly what we mean:

    1. Source or Result (aka Input or Output)
    2. Type/Format (XML, CSV, EDI etc)

     

    Smooks Runtime Configs

    So how do we describe all the Source and Result information we need in the Smooks configuration?  Remember, the Editor needs to be able to reopen existing configs, so we need to store additional Editor-specific information not normally required by the Smooks runtime itself.

     

    Inputs (Sources)

    Message Sources are described through the Reader configuration.  The reader config installs and configures a reader to read a message of a specific type.  For more information on Reader configurations, see the Smooks User Guide.

     

    Additional Editor Config Parameters

    The Editor requires additional information that's not required by the Smooks runtime.

     

    For XML Sources:

    1. A message schema (XSD) that describes the Source message type.  From this, the editor can construct the source data model.  If not supplied, we can't automate mappings etc that generate the Smooks runtime configs.  A future version should support specifying the source message model via a sample XML message. 

     

    For Java Sources:

    1. The source Java Object model
    2. An explicit <reader> config for the Java Reader.  This is normally not required by the Smooks runtime because it is able to "work this out" at runtime based on the Source type supplied to the Smooks.filterSource method (i.e. it can see the JavaSource coming in and knows from that to set up a Java Reader).

     

    For JSON Sources:

    1. A sample JSON message from which the editor can extract the source data model.

     

     

    For EDI and CSV Sources:

    1. We don't really need to ask the user for any additional info, e.g. sample messages.  The reader configs for these sources already require specification of the source message format (fieldset for a CSV message, EDI Mapping model for an EDI message).  We can "construct" the source mapping model from this information.  The only thing that a sample message offers in these cases is sample data for testing the Smooks configs.
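    To illustrate the CSV case, the sketch below shows one way an editor might turn a reader's field list into a flat source model.  The CsvSourceModelSketch class and its fieldsToModel method are hypothetical names, not part of Smooks or the Editor, and a real model would probably carry more than field names.

```java
import java.util.ArrayList;
import java.util.List;

class CsvSourceModelSketch {

    // Hypothetical helper: split the reader's comma-separated field list into
    // an ordered list of field names, skipping blanks and trimming whitespace.
    static List<String> fieldsToModel(String fields) {
        List<String> model = new ArrayList<>();
        for (String field : fields.split(",")) {
            String name = field.trim();
            if (!name.isEmpty()) {
                model.add(name);
            }
        }
        return model;
    }

    public static void main(String[] args) {
        // Field list of the kind a CSV reader config already carries.
        System.out.println(fieldsToModel("firstName,lastName,age"));
    }
}
```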

     

    It's a pity that the <reader> configs don't support <param> elements.  That way, we could have specified the additional editor specific source parameters on the <reader> configs, possibly removing the need for the Editor's extended configuration namespace.  We could use the global <params> section to specify this information if we thought getting rid of the Editor's extended config namespace was a good idea.

    Outputs (Results)

    As already stated, Smooks can produce outputs (Results) in two ways:

     

    [#1]  By populating the Result objects supplied in the Smooks.filterSource method call.

    [#2]  By routing data to external processes during the filtering process i.e. message Splitting & Routing.  Result objects are not used during this process.  We are not planning to support this use case for this release, but it's still important for us to keep it in mind.

    Both of these use cases have a number of things in common in terms of how they are handled by Smooks:

     

    1. Java object bindings are created using the standard <jb:bean> configurations. 
      • In the case of #1 above, the objects can be captured by the Smooks calling code by supplying a JavaResult instance to the Smooks.filterSource method.
      • In the case of #2 above, the beans can be routed to a target process using the JMSRouter or one of the ESB Service Router components.
    2. Character-based data (XML, CSV, EDI etc) is generated using one of the templating components, e.g. <ftl:freemarker>.
      • In the case of #1 above, this is the default behavior.  Smooks looks for a StreamResult instance in the list of Result objects supplied in the Smooks.filterSource call, outputting the templating result to that StreamResult.
      • In the case of #2 above, the templating component configuration (e.g. <ftl:freemarker>) is explicitly configured to "bind" the templating result String into the bean context with a specific beanId.  This templating result bean can then be routed like any other Java object.  See this example.

     

    The following is an example of a freemarker template configuration where the templating result is output to any StreamResult that may have been supplied to the Smooks.filterSource method (i.e. #1 above):

     

    <ftl:freemarker applyOnElement="#document">
        <ftl:template>... template...</ftl:template>
    </ftl:freemarker>
    

     

    And then an example of where we use a FreeMarker configuration to generate a split message, binding the result into the Smooks Bean Context and then routing that "bean" to a JMS Queue (i.e. #2 above):

     

    (see full example)

    <ftl:freemarker applyOnElement="order-item">
        <ftl:template>... template...</ftl:template>
        <ftl:use>
            <!-- Bind the templating result into the bean context, from where
                 it can be accessed by the JMSRouter (configured below). -->
            <ftl:bindTo id="orderItem_xml"/>
        </ftl:use>
    </ftl:freemarker>
    
    <!-- At each "order-item", route the "orderItem_xml" to the ActiveMQ JMS Queue... -->
    <jms:router routeOnElement="order-item" beanId="orderItem_xml" destination="smooks.exampleQueue" />
    

     

    The most important things to note in the above config snippet are:

     

    1. The <ftl:bindTo id="orderItem_xml"/> config on the <ftl:freemarker> config.
    2. The beanId="orderItem_xml" attribute config on the <jms:router> configuration.
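    The relationship between those two settings can be pictured as a simple keyed lookup: the template binds its output under an id, and the router later reads the bean with the same id.  The sketch below models the bean context as a plain Map and the JMS queue as a List.  BeanContextSketch, bindTo, route and routed are hypothetical names for illustration only and do not reflect how Smooks implements its bean context or the JMSRouter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BeanContextSketch {

    // Stand-in for the bean context: beanId -> bean.
    private final Map<String, Object> beans = new HashMap<>();
    // Stand-in for the target JMS queue.
    private final List<Object> queue = new ArrayList<>();

    // What the bindTo setting amounts to: store the templating result by id.
    void bindTo(String beanId, Object bean) {
        beans.put(beanId, bean);
    }

    // What the router's beanId setting amounts to: look the bean up and send it.
    void route(String beanId) {
        Object bean = beans.get(beanId);
        if (bean == null) {
            throw new IllegalStateException("No bean bound under id '" + beanId + "'");
        }
        queue.add(bean);
    }

    List<Object> routed() {
        return queue;
    }

    public static void main(String[] args) {
        BeanContextSketch context = new BeanContextSketch();
        // The ids must match, just as in the config snippet above.
        context.bindTo("orderItem_xml", "<order-item>...</order-item>");
        context.route("orderItem_xml");
        System.out.println(context.routed());
    }
}
```

    If the two ids differ, the lookup in this sketch fails, which is why the two settings called out above need to agree.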

     

    Additional Editor Config Parameters

    As far as the output is concerned, the Editor will need to record additional information regarding the output type/format being generated by the templating configuration, e.g.:

    1. Message Type: XML, CSV, EDI etc
    2. Message Schema: Reference to XSD, EDI Mapping Model etc., depending on the message type.

     

    We can do this by simply adding <param> elements to the config to store the additional info.  These additional configs will not affect Smooks Core.

     

    Example:

    <ftl:freemarker applyOnElement="#document">
        <ftl:template>....csv...template....</ftl:template>
        <param name="messageType">CSV</param>
        <param name="csvFields">firstName,lastName,age</param>
        <param name="separatorChar">,</param>
    </ftl:freemarker>
    

     

    Basically, the additional configs added on the templating config would be the same information captured on a reader configuration for that message type.  Note the additional params for the above CSV example... same as those for a CSV Reader.
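    When the Editor reopens a config, it would need to read these params back.  The sketch below does that with the JDK's DOM parser, under a few assumptions: the class name EditorParamsSketch and the method readParams are hypothetical, the namespace URI on the sample element is a placeholder rather than the real Smooks one, and the parser is left in its default non-namespace-aware mode.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EditorParamsSketch {

    // Collect every <param name="...">value</param> into a name -> value map,
    // preserving the order in which the params appear.
    static Map<String, String> readParams(String configXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            NodeList params = doc.getElementsByTagName("param");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < params.getLength(); i++) {
                Element param = (Element) params.item(i);
                result.put(param.getAttribute("name"), param.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse config", e);
        }
    }

    public static void main(String[] args) {
        // "urn:example:ftl" is a placeholder namespace URI for this sketch.
        String config =
                "<ftl:freemarker xmlns:ftl=\"urn:example:ftl\" applyOnElement=\"#document\">"
              + "<ftl:template>...</ftl:template>"
              + "<param name=\"messageType\">CSV</param>"
              + "<param name=\"csvFields\">firstName,lastName,age</param>"
              + "<param name=\"separatorChar\">,</param>"
              + "</ftl:freemarker>";
        System.out.println(readParams(config));
    }
}
```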

     

    Message Type Catalog

    Something that has become obvious is the fact that for Message Inputs and Outputs, we will end up asking the user to specify the same message typing parameters on the <reader> and templating configurations (<ftl:freemarker>) e.g. Message Type (CSV, XML etc), Schemas (XSDs etc).

     

    So, something for us to think about for a future version of the editor is the idea of a "Message Catalog".  This would be an area of the editor in which we could define specific message types and all the attributes associated with that message type, such as schemas etc.  Then, when creating Smooks configurations, the user can simply select (by name) the Input and Output messages from the Message Catalog.  So, they might select the "UN/EDIFACT INVOIC" message as input and then on the output template select the "xCBL 4.0 Invoice (XML)" message.
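    As a very rough sketch of what such a catalog could look like as a data structure, the code below keeps named message types in a map and lets the caller select them by name.  Everything here is hypothetical: the MessageCatalogSketch and MessageType classes, their fields, and the schema reference strings are invented for illustration and do not correspond to any existing Editor code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MessageCatalogSketch {

    // One catalog entry: a display name, a format and a schema reference.
    static final class MessageType {
        final String name;
        final String format;
        final String schemaRef;

        MessageType(String name, String format, String schemaRef) {
            this.name = name;
            this.format = format;
            this.schemaRef = schemaRef;
        }
    }

    private final Map<String, MessageType> entries = new LinkedHashMap<>();

    void register(MessageType type) {
        entries.put(type.name, type);
    }

    // Select an entry by name, as the user would for an input or an output.
    MessageType select(String name) {
        MessageType type = entries.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Unknown message type: " + name);
        }
        return type;
    }

    public static void main(String[] args) {
        MessageCatalogSketch catalog = new MessageCatalogSketch();
        // The schema references below are placeholder paths.
        catalog.register(new MessageType("UN/EDIFACT INVOIC", "EDI", "models/invoic-mapping.xml"));
        catalog.register(new MessageType("xCBL 4.0 Invoice (XML)", "XML", "schemas/invoice.xsd"));

        MessageType input = catalog.select("UN/EDIFACT INVOIC");
        MessageType output = catalog.select("xCBL 4.0 Invoice (XML)");
        System.out.println(input.format + " -> " + output.format);
    }
}
```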

     

    Just something to think about for a future release, but NOT something we should look at for this release.