Wednesday, January 12, 2011

The JBossESB-Smooks Integration in the SOA Platform

One of the great strengths of the SOA Platform is its wealth of software integrations. In this post, we'll examine the JBoss ESB - Smooks integration in the SOA Platform and how it can be used to perform two of the primary operations of an ESB, message transformation and message routing:
  • Message transformation enables the JBossESB to translate message payloads (the message body plus attachments and properties) from one form to another so that the message can be processed by different services.
  • Message routing enables the JBossESB to move messages to files, to JMS assets such as queues, and over the ESB "bus" between services.
A 30-Second Introduction to Smooks

It's common to refer to Smooks as a transformation engine, but that's not the full story. Smooks is really a more general-purpose processing framework that is capable of dealing with fragments of a message. Smooks accomplishes this with "visitor logic", where a "visitor" is Java code that performs a specific action on a specific fragment of a message. This enables Smooks to perform different actions on different fragments of messages. As is stated by the Smooks project (http://www.smooks.org/mediawiki/index.php?title=Why_Smooks_was_Created):

Smooks supports these types of message fragment processing:
  • Templating: Transform message fragments with XSLT or FreeMarker
  • Java Binding: Bind message fragment data into Java objects
  • Splitting: Split message fragments and route the split fragments over multiple transports and destinations
  • Enrichment: "Enrich" message fragments with data from databases
  • Persistence: Persist message fragment data to databases
  • Validation: Perform basic or complex validation on message fragment data
The SOA Platform SmooksAction out-of-the-box action provides you access to all these Smooks capabilities.
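
To make the "visitor" idea a little more concrete, here is a minimal sketch that uses only the JDK's SAX parser. It is not Smooks code and does not use the Smooks API. The class name (VisitorDispatchSketch), the filter method, and the idea of keying visitors by element name are all assumptions made for illustration. It only shows the general shape: one pass over the message, with a different piece of Java logic fired for each kind of fragment.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: this is NOT the Smooks visitor API.
final class VisitorDispatchSketch {

    // Streams through the XML once and calls the visitor registered for an
    // element name (if any) with that element's trimmed text content.
    // Only meaningful for leaf elements, which is enough for this sketch.
    static void filter(String xml, Map<String, Consumer<String>> visitors) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                Consumer<String> visitor = visitors.get(qName);
                if (visitor != null) {
                    visitor.accept(text.toString().trim());
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
    }

    public static void main(String[] args) throws Exception {
        List<String> seen = new ArrayList<>();
        Map<String, Consumer<String>> visitors = new HashMap<>();
        visitors.put("product", t -> seen.add("product=" + t)); // one action per fragment type
        visitors.put("price", t -> seen.add("price=" + t));
        filter("<order-item id='1'><product>7</product><quantity>2</quantity>"
                + "<price>8.80</price></order-item>", visitors);
        System.out.println(seen);
    }
}
```

In this sketch the quantity element is simply ignored because no visitor is registered for it, which is the point: each fragment gets only the logic that was targeted at it.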

The JBossESB implements several out of the box actions to support message transformation and routing. SmooksAction (org.jboss.soa.esb.smooks.SmooksAction) enables you to use a powerful set of Smooks operations within the SOA Platform. Transformations are probably the first type of operation that you think of with Smooks, but with the SmooksAction you can also make use of Smooks operations such as splitting and routing message payloads.

A Simple Transformation with Smooks

Let's look at a very simple example, the aptly named "transform_XML2XML_simple" quickstart. This quickstart performs a message transformation by applying an XSLT (Extensible Stylesheet Language Transformations) stylesheet to an XML message. The message is transformed into XML in a different form. The interesting parts of the quickstart's jboss-esb.xml file are:
1:  <actions mep="OneWay">  
2:      <action class="org.jboss.soa.esb.actions.SystemPrintln" name="print-before">  
3:          <property name="message" value="[transform_XML2XML_simple] Message before transformation"/>  
4:      </action>  
5:      <action class="org.jboss.soa.esb.smooks.SmooksAction" name="simple-transform">  
6:          <property name="smooksConfig" value="/smooks-res.xml"/>  
7:          <property name="reportPath" value="/tmp/smooks_report.html"/>  
8:      </action>  
9:      <action class="org.jboss.soa.esb.actions.SystemPrintln" name="print-after">  
10:         <property name="message" value="[transform_XML2XML_simple] Message after transformation"/>  
11:     </action>  
12: </actions>  
  • Line 1: This line is not specific to transformations, but it's worth mentioning. "mep" stands for "message exchange pattern." The pattern used by this quickstart is "one-way" in that the requester invokes a service (by sending it a message) and then does not wait for a response.
  • Lines 2-4, 9-11: These lines simply cause the message to be written to the server log before and after its transformation. (Incidentally, org.jboss.soa.esb.actions.SystemPrintln is the only out-of-the-box action in the Miscellaneous action group.)
  • Line 5: Here's where we specify that we want to invoke a SmooksAction
  • Line 6: And here is the Smooks configuration file that contains the XSLT that will be executed. We'll examine this file in a moment.
  • Line 7: This line is actually not in the quickstart. I've added the reportPath property so that we can review the resulting report. Note that generating this report does require some processing resources, so it should not be used in production environments. See below for screen-shots of this report.


Now, let's look at smooks-res.xml:
1:  <?xml version='1.0' encoding='UTF-8'?>  
2:  <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.0.xsd">  
3:    
4:  <resource-config selector="OrderLine">  
5:  <resource type="xsl">  
6:  <![CDATA[<line-item>  
7:  <product><xsl:value-of select="./Product/@productId" /></product>  
8:  <price><xsl:value-of select="./Product/@price" /></price>  
9:  <quantity><xsl:value-of select="@quantity" /></quantity>  
10:  </line-item>]]>  
11:  </resource>  
12:  <param name="is-xslt-templatelet">true</param>  
13:  </resource-config>  
14:  </smooks-resource-list>  

  • Line 2: The namespace referenced here is the Smooks XML Schema Definition
  • Line 4: The resource-config element corresponds to an org.milyn.cdr.SmooksResourceConfiguration object[11]
  • Lines 7-9: These XPath (XML Path Language) statements locate the Product element's productId and price attributes and the OrderLine element's quantity attribute. XPath is used by XSLT to find or reference data in XML documents.
When you run this quickstart, you'll see the XML message as defined in SampleOrder.xml displayed before and after it undergoes the XSLT transformation.
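
If you'd like to see what that templatelet does to a single OrderLine without deploying anything, here is a rough standalone sketch using only the JDK's built-in XSLT support (javax.xml.transform). It does not use Smooks or the ESB. The class name, the sample OrderLine values, and the surrounding xsl:stylesheet / xsl:template wrapper are my own assumptions. As I understand it, the "is-xslt-templatelet" parameter asks Smooks to supply an equivalent wrapper, so here I have to write it out by hand.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical standalone illustration; the quickstart itself runs this via SmooksAction.
final class OrderLineXsltSketch {

    // The body of the template is the templatelet from smooks-res.xml;
    // the stylesheet and template wrapper are added by hand for this sketch.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='OrderLine'>"
        + "<line-item>"
        + "<product><xsl:value-of select='./Product/@productId'/></product>"
        + "<price><xsl:value-of select='./Product/@price'/></price>"
        + "<quantity><xsl:value-of select='@quantity'/></quantity>"
        + "</line-item>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data in the shape the XPath expressions expect.
        String orderLine =
            "<OrderLine quantity='2'><Product productId='364' price='29.98'/></OrderLine>";
        System.out.println(transform(orderLine));
    }
}
```

The attributes on the input become element content on the output, which is all this particular transformation does.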

Content Based Routing with Smooks

The routing of data from one place to another is one of the most basic, and common, problems facing any networked software application. This routing can take many forms, such as email being sent to the correct recipient or network traffic being routed around the globe based on system names defined in DNS. In the context of an Enterprise Service Bus such as the JBossESB in the SOA Platform, where everything is either a message or a service, routing means getting messages delivered to the correct services.

JBossESB Routing Choices

The JBossESB supports multiple actions to route messages to services right out of the box (http://jboss-soa-p.blogspot.com/2009/09/works-great-right-out-of-box.html) such as:
  • HttpRouter (org.jboss.soa.esb.actions.routing.HttpRouter) - The HttpRouter routes the incoming message to a URL that you specify in the action definition.
  • JMSRouter (org.jboss.soa.esb.actions.routing.JMSRouter) - This action routes the incoming message to JMS. For example, to a JMS queue where a service can then retrieve the message asynchronously. In order to find the correct JMS queue or topic, you specify values for properties such as jndi-name, initial-context-factory, and jndi-URL in the action definition.
  • StaticRouter (org.jboss.soa.esb.actions.StaticRouter) - As its name implies, this router establishes static routes that do not change based on the content of the messages or a set of routing rules.
  • StaticWiretap (org.jboss.soa.esb.actions.StaticWiretap) - Maybe it's the comic book fan in me, but this is my favorite name for an action. There's something film noir-ish about a "wiretap." You can almost imagine Humphrey Bogart sitting in the back room of a bar listening in on a wiretapped SOA action. (On black and white film, of course.) In practice, it's not all that exciting. This action implements the Enterprise Integration Pattern for a wiretap (http://www.eaipatterns.com/WireTap.html). The goal of a wiretap is to inspect each message without affecting the operation of the action chain. This can be a useful action for debugging a service.
JBossESB and Content Based Routing

These are all very useful, but they are also all static in nature in that once you define the path for the messages, that's the path that the messages always take. For example, a static path could be used so that messages from the sales department service are always sent to the warehouse inventory control service. But, what if you had several warehouses each of which stored a different set of products? And what if you wanted to be able to vary the route of the messages dynamically? For example, what if you want the route the message takes to be based on the actual content in the message? Well, the JBossESB also supports multiple types of content based routing (CBR).

Two relatively lightweight approaches (http://jbossesb.blogspot.com/2009/10/content-based-routing-in-jbossesb-just.html) for content based routing are supported:
  • XPath Content Based Routing - Note that this is completely defined in the jboss-esb.xml file. No additional configuration files are needed.
  • Regex Content Based Routing - An external configuration file is used to define the regular expressions that govern the routing.
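
To picture what "XPath content based routing" boils down to, here is a small sketch using only the JDK's javax.xml.xpath package. This is not the JBossESB ContentBasedRouter and does not reflect its configuration syntax. The class name, the rule expressions, and the destination names (WarehouseA, BulkWarehouse, DefaultWarehouse) are all made up for illustration. The idea is simply: evaluate each rule against the message, and the first one that matches decides where the message goes.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration of the idea; not the JBossESB router implementation.
final class XPathRoutingSketch {

    // Hypothetical rules: XPath expression -> destination name, checked in order.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("/order/header/customer[@number='123']", "WarehouseA");
        RULES.put("/order[count(order-items/order-item) > 5]", "BulkWarehouse");
    }

    // Returns the destination of the first rule whose expression matches the message.
    static String route(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            // Note: the message is re-parsed for every rule in this naive version.
            Boolean matched = (Boolean) xpath.evaluate(
                    rule.getKey(), new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
            if (matched) {
                return rule.getValue();
            }
        }
        return "DefaultWarehouse";
    }

    public static void main(String[] args) throws Exception {
        String msg = "<order id='332'><header><customer number='123'>Joe</customer></header>"
                   + "<order-items/></order>";
        System.out.println(route(msg));
    }
}
```

Notice that this naive version walks the message once per rule. That cost is worth keeping in mind when comparing approaches for large messages.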
Now, if you are dealing with a more complex set of routing "rules," the JBossESB supports using JBoss Rules (http://jboss-soa-p.blogspot.com/2009/07/when-content-knows-way-content-based.html) to define the routing rules. Rules provides you with the rich feature set of JBoss Drools to control the dynamic routing of messages based on their content. JBoss Drools is a complete enterprise platform for rules-based application development, workflow, administration, and event processing. It also provides an integration with JBossESB to support content based routing. You define the content based routing algorithm in a set of rules.

But wait - there's more.

Content Based Routing with Smooks

You might think of JBoss Smooks (http://www.smooks.org/) as primarily a tool for performing XML transformations (for example, for transforming data from CSV to XML), but it can also be used for content based routing on the JBossESB in the SOA Platform. Some of the things that Smooks allows you to do with content based routing are:
  • Splitting of messages - Don't just route the whole message, but split out the parts of the message, say sales order items and route them to separate services.
  • Even more complex splitting of messages - For example, split out those sales order items and then combine each of them with data from other parts of the message (say, customer information) before performing the message routing. This is more than just basic message fragment extraction, as the extracted data can be combined or otherwise processed.
  • Routing split message fragments in multiple formats - For example, routing XML to one service, Java to another, CSV data to still another.
  • Fast performance - Smooks is able to perform all the message splitting and routing (even for multiple destinations and multiple formats) in a single filtering pass of the message. This makes for fast performance as there is no need to evaluate multiple XPaths multiple times on the same message. Smooks is also able to handle big (make that REALLY BIG, as in > 50MB) messages efficiently.
  • Complex conditionals - Smooks is not limited to just the conditionals that XPath supports.
The best way to understand and appreciate the advantages and flexibility of content based routing on the JBossESB with Smooks is to see it in action. Note that we'll be following a programming tradition: we'll take a quickstart as a starting point and expand it to fulfill our requirements.

The quickstart is named: smooks_file_splitter_router

Now, as the quickstart's name indicates, its objective is to demonstrate both splitting files and routing messages. The quickstart's "Splitter" service makes use of a File Gateway. Its file system listener ("fs-listener") uses the Smooks org.jboss.soa.esb.smooks.splitting.FileStreamSplitter to split an incoming XML message and route message fragments to the "Receiver" service.

Let's take a closer look at how this works, and then we'll expand on the quickstart to perform some additional content based routing.

The quickstart is initiated by the "runtest" ant target. This invokes the org.jboss.soa.esb.sample.quickstart.smooksfilesplitterrouter.InputOrderGenerator class which creates the "SampleOrder.xml" file for the File Gateway. SampleOrder.xml contains an order with multiple order items and follows this form:
 <order id="332">
     <header>
         <customer number="123">Joe</customer>
     </header>
     <order-items>
         <order-item id="1">
             <product>1</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="2">
             <product>2</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="3">
             <product>3</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="4">
             <product>4</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="5">
             <product>5</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="6">
             <product>6</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="7">
             <product>7</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="8">
             <product>8</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
         <order-item id="9">
             <product>9</product>
             <quantity>2</quantity>
             <price>8.80</price>
         </order-item>
     </order-items>
 </order>

The best way to understand the operation of the quickstart is to take a closer look at the providers and services defined in the jboss-esb.xml file. Note that in the case of this quickstart, jboss-esb.xml is actually generated from jboss-esb-unfiltered.xml at run time to include environment specific information such as directory names. Here's jboss-esb-unfiltered.xml:
1:  <?xml version = "1.0" encoding = "UTF-8"?>  
2:  <jbossesb xmlns="http://anonsvn.labs.jboss.com/labs/jbossesb/trunk/product/etc/schemas/xml/jbossesb-1.1.0.xsd"  
3:    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
4:    xsi:schemaLocation="http://anonsvn.labs.jboss.com/labs/jbossesb/trunk/product/etc/schemas/xml/jbossesb-1.1.0.xsd http://anonsvn.jboss.org/repos/labs/labs/jbossesb/trunk/product/etc/schemas/xml/jbossesb-1.1.0.xsd"  
5:    parameterReloadSecs="5">  
6:     
7:    <providers>  
8:        <fs-provider name="FSprovider1">  
9:            <fs-bus busid="smooksFileChannel">  
10:                <fs-message-filter  
11:                    directory="@INPUTDIR@"  
12:                    input-suffix=".xml"  
13:                    work-suffix=".esbWorking"  
14:                    post-delete="true"  
15:                    post-directory="@OUTPUTDIR@"  
16:                    post-suffix=".sentToEsb"  
17:                    error-delete="false"  
18:                    error-directory="@ERRORDIR@"  
19:                    error-suffix=".IN_ERROR"  
20:                  />  
21:            </fs-bus>  
22:        </fs-provider>  
23:    
24:    </providers>  
25:    
26:    <services>  
27:    
28:      <!--  
29:          Splitter Service...  
30:      -->  
31:      <service category="QS" name="Splitter" description="Splitter Service" invmScope="GLOBAL">  
32:          <listeners>  
33:          <!-- Splitting the message at the gateway via the FileStreamSplitter composer class allows us to  
34:                   handle huge messages... -->  
35:              <fs-listener name="FileGateway" busidref="smooksFileChannel" is-gateway="true" schedule-frequency="2">  
36:                  <property name="composer-class" value="org.jboss.soa.esb.smooks.splitting.FileStreamSplitter"/>  
37:                  <property name="splitterConfig" value="/smooks-config.xml"/>  
38:              </fs-listener>  
39:          </listeners>  
40:          <actions mep="OneWay">  
41:              <action name="print" class="org.jboss.soa.esb.actions.SystemPrintln">  
42:                  <property name="message" value="[Splitter] Message Split complete"/>  
43:              </action>  
44:       
45:              <!-- The next action is for Continuous Integration testing -->  
46:              <action name="testStore" class="org.jboss.soa.esb.actions.TestMessageStore"/>  
47:          </actions>  
48:      </service>  
49:        
50:      <!--  
51:          Receiver Service...  
52:      -->  
53:      <service category="QS" name="Receiver" description="Receiver Service" invmScope="GLOBAL">  
54:          <actions mep="OneWay">  
55:              <action name="print" class="org.jboss.soa.esb.actions.SystemPrintln">  
56:                  <property name="message" value="[Receiver] Message Fragment Received"/>  
57:              </action>  
58:          </actions>  
59:      </service>  
60:       
61:    </services>  
62:        
63:  </jbossesb>  

  • Lines 7-24 - This is the definition of the file system provider and the bus ("smooksFileChannel") that the file gateway listens on. What's interesting to note here are:
  • Lines 11-12 - Here's the input directory. Based on its configuration, the listener will listen for files with an .xml extension that are created in the INPUT directory.
  • Line 13 - While the file gateway is processing the file, it is renamed to have an extension of "esbWorking".
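  • Lines 14-16 - Because post-delete is set to "true", the file is deleted after it is processed. The post-directory and post-suffix settings, which would otherwise move the processed file to the output directory with an extension of "sentToEsb", only come into play when post-delete is "false".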
  • Lines 17-19 - And, if an error occurs, the file is renamed to have an extension of "IN_ERROR" and is saved in the error directory.
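
If it helps to see that file lifecycle as code, here is a simplified model of it written with plain java.nio.file. This is not how the file gateway is implemented and none of these names come from JBossESB. The class, the handle method, and the Processor interface are hypothetical, and I'm assuming the suffixes are appended to the original file name. It only mirrors the sequence described above: mark the file as "working", process it, then either clean up or park it in the error directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical model of the lifecycle described above; not the gateway's code.
final class FileGatewayLifecycleSketch {

    interface Processor {
        void process(Path file) throws Exception;
    }

    // Returns the final location of the file, or null if it was deleted.
    static Path handle(Path input, Path errorDir, boolean postDelete, Path postDir,
                       Processor processor) throws IOException {
        // work-suffix: rename while the file is being processed
        Path working = input.resolveSibling(input.getFileName() + ".esbWorking");
        Files.move(input, working);
        try {
            processor.process(working);
        } catch (Exception e) {
            // error-delete="false": keep the file, with error-suffix, in error-directory
            Path failed = errorDir.resolve(input.getFileName() + ".IN_ERROR");
            return Files.move(working, failed, StandardCopyOption.REPLACE_EXISTING);
        }
        if (postDelete) {
            // post-delete="true": nothing is kept after successful processing
            Files.delete(working);
            return null;
        }
        // otherwise keep it, with post-suffix, in post-directory
        Path done = postDir.resolve(input.getFileName() + ".sentToEsb");
        return Files.move(working, done, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempDirectory("input");
        Path err = Files.createTempDirectory("error");
        Path post = Files.createTempDirectory("output");
        Path file = Files.write(in.resolve("SampleOrder.xml"), "<order/>".getBytes());
        System.out.println(handle(file, err, true, post, f -> { /* pretend to split */ }));
    }
}
```

The rename to a working suffix is the interesting part: it is a cheap way to make sure the same file isn't picked up twice while it is still being processed.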
The "Splitter" service is defined in lines 31-48. The most interesting lines are:
35:  <fs-listener name="FileGateway" busidref="smooksFileChannel" is-gateway="true" schedule-frequency="2">  
36:      <property name="composer-class" value="org.jboss.soa.esb.smooks.splitting.FileStreamSplitter"/>  
37:      <property name="splitterConfig" value="/smooks-config.xml"/>  
38:  </fs-listener>  

  • Line 35 - Here's the start of the file gateway definition.
  • Line 36 - Note that when the file gateway detects the presence of a file, it invokes the Smooks org.jboss.soa.esb.smooks.splitting.FileStreamSplitter class. And what does this class do with the file?
  • Line 37 - It splits the message into fragments, based on the conditions and actions defined in the smooks-config.xml file.
The message fragments are then routed to the Receiver service, which simply writes the message to the server.log.

The smooks-config.xml file is where the file splitting and routing is defined, so let's take a closer look there:

1:  <?xml version="1.0"?>  
2:  <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"  
3:          xmlns:jb="http://www.milyn.org/xsd/smooks/javabean-1.2.xsd"  
4:          xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd"  
5:          xmlns:esbr="http://www.jboss.org/xsd/jbossesb/smooks/routing-1.0.xsd">  
6:   
7:      <params>          
8:          <param name="stream.filter.type">SAX</param>  
9:      </params>  
10:   
11:     <conditions>  
12:         <!-- route the even numbered order items -->  
13:         <condition id="routeItem"><!-- orderItem.itemId % 2 == 0 --></condition>  
14:     </conditions>  
15:   
16:     <!-- Capture some data from the message into the bean context... -->  
17:     <jb:bean beanId="header" class="java.util.Hashtable" createOnElement="order">  
18:         <jb:value property="orderId" data="order/@id"/>  
19:         <jb:value property="customerNumber" data="header/customer/@number"/>  
20:         <jb:value property="customerName" data="header/customer"/>  
21:     </jb:bean>  
22:     <jb:bean beanId="orderItem" class="java.util.Hashtable" createOnElement="order-item">  
23:         <jb:value property="itemId" data="order-item/@id"/>  
24:         <jb:value property="productId" data="order-item/product"/>  
25:         <jb:value property="quantity" data="order-item/quantity"/>  
26:         <jb:value property="price" data="order-item/price"/>  
27:     </jb:bean>  
28:   
29:     <!-- On each order-item, apply a template to the data captured into the bean context,  
30:          binding the templating result back into the bean context under the  
31:          beanId "orderItemFragment" to be routed by the following ESB Router... -->  
32:     <ftl:freemarker applyOnElement="order-item">  
33:         <condition idRef="routeItem" />  
34:         <ftl:template>/orderitem-split.ftl</ftl:template>  
35:         <ftl:use>  
36:            <ftl:bindTo id="orderItemFragment" />  
37:         </ftl:use>  
38:     </ftl:freemarker>  
39:   
40:     <!-- On each order-item, route the "orderItemFragment" bean to the  -->  
41:     <esbr:routeBean beanIdRef="orderItemFragment" toServiceCategory="QS" toServiceName="Receiver" routeOnElement="order-item">  
42:         <condition idRef="routeItem" />  
43:     </esbr:routeBean>  
44:  
45: </smooks-resource-list>  
As we described a minute ago, the two main actions being performed are the splitting of the incoming message into fragments, and then the routing of the resulting fragments to services. Let's look at how the splitting is performed. Recall that the SampleOrder.xml file included these XML elements.
 <order id='332'>  
 <header>  
 <customer number="123">Joe</customer>  
 </header>  
 <order-items>  
 <order-item id='1'>  
 <product>1</product>  
 <quantity>2</quantity>  
 <price>8.80</price>  
 </order-item>  
 (followed by more order-items)  

Remember how we talked about Smooks' ability to process large messages efficiently? At this point, it's important to note that when Smooks splits the XML message, it processes each orderItem one at a time, and it only keeps one orderItem in memory at a time. This is one way in which Smooks is able to efficiently process large messages.
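
Here is a rough sketch of that "one item in memory at a time" idea, using only the JDK's SAX parser. It is not the FileStreamSplitter and does not use any Smooks classes. The class name and the split method are hypothetical. A Map stands in for the orderItem bean. It is filled as the parser streams through an order-item, handed off when the closing tag is reached, and then dropped before the next one starts.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of streaming splitting; not the Smooks implementation.
final class StreamingSplitSketch {

    // Calls onItem once per order-item, as soon as that item's end tag is seen.
    static void split(String xml, Consumer<Map<String, String>> onItem) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private Map<String, String> current; // the only order-item held by the splitter
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0);
                if ("order-item".equals(qName)) {
                    current = new HashMap<>();
                    current.put("itemId", atts.getValue("id"));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current == null) {
                    return; // outside any order-item (for example, the header)
                }
                if ("order-item".equals(qName)) {
                    onItem.accept(current); // hand the finished fragment off...
                    current = null;         // ...and forget it before the next one
                } else {
                    current.put(qName, text.toString().trim());
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order id='332'><order-items>"
            + "<order-item id='1'><product>1</product><quantity>2</quantity><price>8.80</price></order-item>"
            + "<order-item id='2'><product>2</product><quantity>2</quantity><price>8.80</price></order-item>"
            + "</order-items></order>";
        split(xml, item -> System.out.println(item));
    }
}
```

Because the handler never builds a tree of the whole document, its memory use depends on the size of one order-item rather than on the size of the file.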
  • Lines 17-21 - As the comments indicate, this captures data from the message, and extracts the orderId, customerName and customerNumber properties (note the use of XPath to navigate the original XML message) into a "header" bean (declared in this configuration as a java.util.Hashtable).
  • Lines 22-27 - And, this does the same for each order-item, populating an "orderItem" bean.
  • Lines 32-38 - And here, Smooks takes the header and orderItem beans, and uses a FreeMarker (http://freemarker.sourceforge.net/) template to create an orderItemFragment bean that has this organization:
 <orderitem id="${orderItem.itemId}" order="${header.orderId}">
     <customer>
         <name>${header.customerName}</name>
         <number>${header.customerNumber}</number>
     </customer>
     <details>
         <productId>${orderItem.productId}</productId>
         <quantity>${orderItem.quantity}</quantity>
         <price>${orderItem.price}</price>
     </details>
 </orderitem>
Note that Smooks is able to combine elements from both the header and orderItem beans into an orderItemFragment bean.

So much for the splitting. Now let's take a look at how Smooks routes the message fragments to services.
  • Lines 11-14 - First, Smooks establishes a condition ("routeItem") that must be met in order for the message to be routed. In this example, the value of the orderItem bean's itemId property must be an even number.
  • Line 41 - And here's where the routing actually starts. This line defines the attributes for the JBoss ESB router ("routeBean"). The attributes defined for this quickstart are:
  • beanIdRef - The ID of the bean that will be routed to the intended service
  • toServiceCategory - The category of that service, as defined in the quickstart's jboss-esb.xml file
  • toServiceName - And the name of that service ("Receiver"), again, as defined in the quickstart's jboss-esb.xml file
  • routeOnElement - And, finally the content element used to determine the route for messages to follow - in the case of the quickstart, this is the order-item
A good way to look at content based routing with Smooks is that new messages are built from data in fragments in the original source message. In the case of the quickstart, we want to route data from each order-item in an order message. Since the routeOnElement is the order-item, when the Smooks processing filter reaches the end of every order-item fragment, it will route a message to "somewhere". In other words, the end of the fragment is used to trigger the routing. (Many thanks to Tom F. for this summary!)
  • Line 42 - The ID of the condition that must match in order for the routing to occur. This ID ("routeItem") maps to the routing condition defined in line 13.
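
Putting the condition, the template, and the router together, the logic amounts to something like the following plain-Java sketch. To be clear, this is not what Smooks generates or runs. The class and method names are hypothetical, Maps stand in for the header and orderItem beans, a hand-built String stands in for the FreeMarker template (with no XML escaping), and a List stands in for the "Receiver" service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "condition + template + route"; not Smooks code.
final class ConditionalRoutingSketch {

    // Mirrors the routeItem condition: orderItem.itemId % 2 == 0
    static boolean routeItem(Map<String, String> orderItem) {
        return Integer.parseInt(orderItem.get("itemId")) % 2 == 0;
    }

    // Stands in for the template: combines data from both beans into one fragment.
    static String render(Map<String, String> header, Map<String, String> orderItem) {
        return "<orderitem id=\"" + orderItem.get("itemId")
             + "\" order=\"" + header.get("orderId") + "\">"
             + "<customer><name>" + header.get("customerName") + "</name>"
             + "<number>" + header.get("customerNumber") + "</number></customer>"
             + "<details><productId>" + orderItem.get("productId") + "</productId>"
             + "<quantity>" + orderItem.get("quantity") + "</quantity>"
             + "<price>" + orderItem.get("price") + "</price></details>"
             + "</orderitem>";
    }

    // "Routes" each matching fragment; the returned list stands in for the Receiver service.
    static List<String> routeAll(Map<String, String> header, List<Map<String, String>> orderItems) {
        List<String> delivered = new ArrayList<>();
        for (Map<String, String> orderItem : orderItems) {
            if (routeItem(orderItem)) {
                delivered.add(render(header, orderItem));
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Map<String, String> header =
            Map.of("orderId", "332", "customerName", "Joe", "customerNumber", "123");
        List<Map<String, String>> items = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            items.add(Map.of("itemId", String.valueOf(i), "productId", String.valueOf(i),
                             "quantity", "2", "price", "8.80"));
        }
        routeAll(header, items).forEach(System.out::println);
    }
}
```

With order items 1 through 4 as input, only the fragments for items 2 and 4 are delivered, and each one carries the customer data from the header along with it.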
One attribute that is not explicitly defined in this quickstart is "routeBefore." This attribute enables you to control when the routing occurs. This can be either before (at the start) a specific message fragment is processed by Smooks, or after (at the end) it has processed the fragment. The default is to route after, which is why the quickstart does not explicitly define the attribute.

Note that the full set of attributes that control routing to JBossESB services with Smooks are defined in the routing-1.0.xsd file that is deployed to a SOA Platform server in: smooks.esb/META-INF/xsd/jbossesb/smooks/routing-1.0.xsd

Closing Thoughts

OK, let's review what happened. The quickstart demonstrated two features of the integration between Smooks and the JBossESB in the SOA Platform: file splitting and routing. First, the quickstart took a single, and potentially very large, XML file and split its content into two separate beans, then combined elements of those two beans into a new bean. When Smooks performed this splitting and transformation of data from one form to another, it did so in a serial fashion, where only one instance was in memory at a time, so that it could efficiently handle even large numbers of instances. Then, Smooks evaluated a condition against the content of the bean data, and based on the result, routed the newly created bean in a message to an ESB service.

And - note that in order for this quickstart to accomplish these tasks, it was not necessary to write large amounts of new custom code. A Smooks configuration file and a FreeMarker template file, coupled with a call to the org.jboss.soa.esb.smooks.splitting.FileStreamSplitter class was all that was needed in the quickstart. Smooks and the JBossESB in the SOA Platform did the rest!

Acknowledgements

As always, I want to thank the JBoss SOA Platform team and community (especially Tom Fennelly and Kevin Conner) for their timely review input for this blog post!

References