Normalizing Data for Joins, Grouping or Filtering
Results from different services often contain similar data that is expressed in different forms. This makes the comparisons used in joins, grouping or filtering difficult or impossible until you normalize the data to a single representation.
When you need to normalize data for comparisons, the best method is to create a custom XPath function that you add to the EMML Reference Runtime Engine. This allows you to:
Use the custom function directly in the XPath expression that is comparing the disparate data. Thus the comparison is handled properly, but service results are not altered.
Reuse data normalization logic in any mashup that you publish to the EMML Engine where the custom function is deployed.
Note: If the data normalization logic is specific to one mashup and you do not need to reuse it, you can also normalize data using scripting and the <script> statement.
See Defining Custom XPath Functions for complete instructions on how to write custom XPath functions and deploy them. Once you have your custom function deployed, you simply declare a namespace for the function in any mashup script or macro and use the function in the appropriate mashup statement.
An Example Data Normalization Custom XPath Function
This example shows a very simple data normalization function and the mashup script that uses the custom function to join the results of two services. The example joins mortgage rate information from two web sites. One site uses standard product names, but the second uses custom terms specific to its mortgage products. To combine the results for comparison, the custom terminology must be normalized.
First, you create the custom XPath function logic in a Java class that extends org.oma.emml.client.EMMLUserFunction. This class looks something like this example:
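package com.mycompany.mashups;

import java.util.HashSet;
import java.util.Set;
import org.oma.emml.client.EMMLUserFunction;
...
public class MyFinanceFunctions extends EMMLUserFunction {
  // Site-specific product names that all refer to a 5-year ARM.
  static Set<String> mortgageAliases = new HashSet<String>();
  static { mortgageAliases.add("5/1 Orange Mortgage"); }

  // Returns the standard product name for a known alias;
  // any other value is returned unchanged.
  public static String mortgage(String data) {
    if (mortgageAliases.contains(data))
      return "5-Year ARM";
    return data;
  }
}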
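To try the normalization logic before deploying it, a standalone variation along the following lines may help. It is only a sketch: the class name MortgageNormalizer is hypothetical, the EMMLUserFunction superclass is left out so the code runs without emml.jar, and the whitespace trimming and case-insensitive matching are additions that the class above does not perform.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical standalone class; not part of the EMML API.
public class MortgageNormalizer {
    // Maps a lowercased alias to its standard product name.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("5/1 orange mortgage", "5-Year ARM");
    }

    // Returns the standard name for a known alias (ignoring case and
    // surrounding whitespace); any other value is returned unchanged.
    public static String mortgage(String data) {
        if (data == null) {
            return null;
        }
        String key = data.trim().toLowerCase(Locale.ROOT);
        String canonical = ALIASES.get(key);
        return canonical != null ? canonical : data;
    }

    public static void main(String[] args) {
        System.out.println(mortgage("5/1 Orange Mortgage"));
        System.out.println(mortgage("5-Year ARM"));
    }
}
```

Using a map rather than a set lets one function normalize several products, since each alias carries its own standard name.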
Compile and Deploy the Custom XPath Function Class
Compile this class, being sure to add web-apps-home/emml/WEB-INF/lib/emml.jar to the classpath.
Deploy the compiled class to web-apps-home/emml/WEB-INF/classes for the EMML Engine that will host mashups that need to use this function.
Use the Custom XPath Function in Your Mashup or Macro
To use the function, you must add a namespace for the class that contains this function. Add an xmlns attribute with a unique namespace prefix to the <mashup> or <macro> tag. For example:
<mashup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openemml.org/2009-04-15/EMMLSchema
../schema/EMMLSpec.xsd"
xmlns="http://www.openemml.org/2009-04-15/EMMLSchema"
xmlns:macro="http://www.openemml.org/2009-04-15/EMMLMacro"
xmlns:finance="java:com.mycompany.mashups.MyFinanceFunctions"
name="MortgageComparisons">
...
</mashup>
Then simply use the custom function in the XPath expressions where you need data to be normalized. In this example, the custom XPath function is used in the join condition of a <join> statement:
...
<input name="feed1" type="document">
<feed>
<Product>5-Year ARM</Product>
<Rate>5.250%</Rate>
<APR>5.388%</APR>
</feed>
</input>
<input name="feed2" type="document">
<feed>
<Product>5/1 Orange Mortgage</Product>
<Rate>5.500%</Rate>
<APR>4.872%</APR>
</feed>
</input>
<output name="result" type="document"/>
<join outputvariable="$result"
joincondition="$feed1/feed/finance:mortgage(Product) =
$feed2/feed/finance:mortgage(Product)"/>
<display message="result = " expr="$result"/>
...
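The effect of the join condition can be illustrated outside the EMML Engine with a plain Java sketch. This is not how the engine implements <join>, and it does not reproduce the structure of the join output; the class name JoinSketch and the row layout are assumptions made for illustration. It only shows that the two sample rows match once both product names pass through the same normalization rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the EMML API.
public class JoinSketch {
    // Same rule as the mortgage() function in the example class.
    static String mortgage(String product) {
        return "5/1 Orange Mortgage".equals(product) ? "5-Year ARM" : product;
    }

    // Each row is {Product, Rate, APR}. Pairs every row of feed1 with
    // every row of feed2 whose normalized product name is equal.
    static List<String> join(String[][] feed1, String[][] feed2) {
        List<String> matches = new ArrayList<>();
        for (String[] a : feed1) {
            for (String[] b : feed2) {
                if (mortgage(a[0]).equals(mortgage(b[0]))) {
                    matches.add(mortgage(a[0]) + ": APR " + a[2] + " vs " + b[2]);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[][] feed1 = {{"5-Year ARM", "5.250%", "5.388%"}};
        String[][] feed2 = {{"5/1 Orange Mortgage", "5.500%", "4.872%"}};
        System.out.println(join(feed1, feed2));
    }
}
```

Without the call to mortgage() on both sides of the comparison, the two product names would differ and the sample rows would not be paired.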
Enterprise Mashup Markup Language (EMML) Documentation is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.
