
Cajoniz: A Data Transformation Use Case

© 2008 Informatica Corporation

Introduction
This article illustrates the ability of Informatica PowerCenter and B2B Data Transformation to integrate high-volume transaction data into a database environment. Our goal is to show how PowerCenter and Data Transformation provide a cost-effective solution for transforming large volumes of business transactions in various formats into a single XML framework. The article describes a real scenario implemented by the Informatica Professional Services Department. Some details have been altered to protect the customer's confidential information.

Overview
Cajoniz (fictional name) is a financial services company that sells investment products through a network of independent sales agents located across the United States. Cajoniz processes 10,000 sales transactions a day. The company must record these transactions, report the sales to the investment managers, and reconcile the commissions due to the company and the agents.

Sales transactions are reported to Cajoniz in a variety of message formats, including:
- Financial Information eXchange (FIX), an early financial industry standard format
- FIXML, the XML version of FIX
- Microsoft Excel workbooks
- Proprietary formats

Cajoniz uses several third-party application programs to process sales transactions, but they do not currently support automatic commission reconciliation, which must be done manually. Several problems arise from the current situation, including:
- The error rate in commission reconciliation is high, usually resulting in the investment managers underpaying Cajoniz. The company's best estimate is that the investment managers owe it several hundred thousand dollars due to commission underpayment.
- Due to commission errors, agents may not be totally loyal to Cajoniz, resulting in a potential loss of business to the company.
- A significant number of clerical personnel are dedicated to reconciling commissions.

Goals
To deal with these issues, Cajoniz decided to launch a new, strategic IT project, the goals of which are to:
- Eliminate commission underpayments.
- Reduce the manpower expended in processing commission payments by at least 30%.

The timeframe for meeting these goals was set at three months from the date the project goes live.

High Level Design


The project team's first priority was to design a system capable of analyzing sales transactions and commission files. Emphasis was placed on producing managerial reports, especially exception reports concerning payment errors. The idea is to identify errors quickly and to obtain correct payments from the investment managers. To achieve this design goal, the application needed a repository of all transaction data for reconciliation processing. Cajoniz chose PowerCenter and Data Transformation to import the raw transaction files, convert them to a standard format, and record the output in a database. PowerCenter handles all the file and database I/O. Data Transformation was chosen for its powerful transformation capabilities.

Project Implementation
The project is being implemented in several phases. The first phase was to process ten transaction types in FIX format for three investment management companies. A cancellation transaction, uniform across the three companies, was included in the milestone. One transaction in Microsoft Excel format was also chosen for inclusion.

PowerCenter Workflow
A PowerCenter workflow and Data Transformation services were constructed to convert the incoming FIX and XLS formatted transactions to FIXML. The FIXML is further converted to a simplified format that PowerCenter stores in the database.

[Figure: PowerCenter and Data Transformation Workflow. Input data from the file system (FIX or XLS) passes through the DT FIX-FIXML or DT XLS-FIXML transformation to produce FIXML. The DT FIXML to database XML transformation then produces DB XML, which is used for the database update.]

All file and database I/O operations are handled by PowerCenter. Data is passed between PowerCenter and Data Transformation via memory buffers to minimize file I/O.

Transformation Design
The workflow uses the PowerCenter Unstructured Data Transformation to run three Data Transformation services:
- The first service transforms incoming FIX data to FIXML. This service consists of two components: a pre-defined FIX-to-XML parser derived from the Data Transformation FIX Library, and a mapper that converts the XML to FIXML.
- The second service converts Microsoft Excel formatted transactions to FIXML. This service combines two components: a pre-defined document processor component that converts the data from Excel format to XML, and a mapper component that converts the XML to FIXML.
- The third service uses a mapper to transform the FIXML to the XML format that is used to update the Cajoniz database. This mapper also generates primary and foreign keys for all rows being added to the database.
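The key generation performed by the third service can be pictured with a minimal Python sketch. It is not the Data Transformation mapper itself; the function name assign_keys, the row layout, and the column names txn_id and leg_id are hypothetical, and Python's uuid module stands in for the key generator.

```python
import uuid


def assign_keys(transaction, legs):
    """Return a parent row and child rows with generated keys.

    `transaction` holds the parent-row fields and `legs` is a list of
    child-row fields. Both shapes are hypothetical, for illustration only.
    """
    txn_id = str(uuid.uuid4())  # primary key of the parent row
    parent_row = {"txn_id": txn_id, **transaction}
    child_rows = [
        # Each child gets its own primary key plus the parent's key as a foreign key.
        {"leg_id": str(uuid.uuid4()), "txn_id": txn_id, **leg}
        for leg in legs
    ]
    return parent_row, child_rows
```

Generating the keys during the transformation means that related rows already reference each other when they reach the database, so no second pass is needed to link them.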

Converting Code Values Using an XML Conversion Table


One of the issues in the FIX-to-FIXML design was converting FIX code values to the corresponding FIXML values. We handled this issue in the following way:
1. The corresponding FIX and FIXML values were entered in an Excel worksheet.
2. We configured a Data Transformation parser that transforms the Excel worksheet to an XML table. The parser was inserted as a preliminary step in the FIX-to-FIXML transformation.
3. In the mapper, we used a built-in component to look up FIX input values in the XML table and replace them with FIXML values.

In the future, if the codes are revised or new codes are added, Cajoniz can simply update the Excel worksheet. The Data Transformation service does not require any revision for this purpose.
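The table-driven conversion described above can be sketched in Python as follows. This is an illustration of the idea only, not the Data Transformation configuration: the worksheet rows, the field name SampleField, the code values, and the XML element and attribute names are all hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the Excel worksheet:
# (field name, FIX code, FIXML code). These are placeholders, not real codes.
WORKSHEET_ROWS = [
    ("SampleField", "A", "Alpha"),
    ("SampleField", "B", "Beta"),
]


def build_xml_table(rows):
    """Build an XML lookup table with one <entry> element per worksheet row."""
    table = ET.Element("codes")
    for field, fix_code, fixml_code in rows:
        ET.SubElement(table, "entry", field=field, fix=fix_code, fixml=fixml_code)
    return table


def load_lookup(table):
    """Index the XML table by (field, FIX code) for fast lookups."""
    return {
        (entry.get("field"), entry.get("fix")): entry.get("fixml")
        for entry in table.iter("entry")
    }


def convert_code(lookup, field, fix_code):
    """Return the mapped FIXML code, or the input unchanged if no mapping exists."""
    return lookup.get((field, fix_code), fix_code)
```

For example, convert_code(load_lookup(build_xml_table(WORKSHEET_ROWS)), "SampleField", "A") returns "Alpha". Passing unmapped codes through unchanged is a choice made for this sketch; a production service might prefer to flag them as errors. Because the mapping lives in data rather than in the conversion logic, revising codes means editing only the rows.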

Details of the Workflow


1. PowerCenter detects incoming transactions by polling the file system and inputs the data.
2. PowerCenter invokes Data Transformation, passing the contents of one input file (one or more transactions) in a buffer.
3. Data Transformation performs the transformations from FIX and XLS formats to FIXML and passes the result back to PowerCenter in the same buffer. Data Transformation handles all the different transaction types in one transformation. PowerCenter does not have to determine which transactions are present in the input data before passing it to Data Transformation.
4. PowerCenter stores the FIXML data in the file system and passes it back to Data Transformation for the second transformation.
5. Data Transformation transforms the FIXML data into the database XML format and returns this to PowerCenter.
6. PowerCenter stores the XML data in the target database.
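The two-stage, buffer-to-buffer flow can be sketched in Python under simplifying assumptions. The sketch does not use the PowerCenter or Data Transformation APIs. It assumes one FIX message per line with fields in tag=value form separated by the SOH character (0x01); the intermediate XML is a generic stand-in rather than real FIXML; and the function names and the t-prefixed attribute names are hypothetical. It also omits the step in which the intermediate FIXML is stored in the file system.

```python
import xml.etree.ElementTree as ET

SOH = "\x01"  # FIX field delimiter


def fix_to_xml(buffer: bytes) -> bytes:
    """Stage 1 stand-in: convert tag=value FIX messages into generic XML."""
    root = ET.Element("messages")
    for line in buffer.decode("ascii").split("\n"):
        if not line:
            continue  # skip blank lines, including the one after a trailing newline
        msg = ET.SubElement(root, "message")
        for pair in line.strip(SOH).split(SOH):
            tag, _, value = pair.partition("=")
            ET.SubElement(msg, "field", tag=tag).text = value
    return ET.tostring(root)


def xml_to_db_xml(buffer: bytes) -> bytes:
    """Stage 2 stand-in: flatten each message into one <row> element."""
    rows = ET.Element("rows")
    for msg in ET.fromstring(buffer).iter("message"):
        # Prefix each FIX tag number with "t" to form a valid XML attribute name.
        attrs = {f"t{f.get('tag')}": f.text or "" for f in msg.iter("field")}
        ET.SubElement(rows, "row", attrs)
    return ET.tostring(rows)


def run_pipeline(buffer: bytes) -> bytes:
    """Chain both stages, handing the output of one to the next in memory."""
    return xml_to_db_xml(fix_to_xml(buffer))
```

Each stage takes a byte buffer and returns a byte buffer, which mirrors the way the caller hands over the contents of a whole input file without first inspecting which transactions it contains.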

How Data Transformation Features were Used in the Implementation


Here is a partial list of the Data Transformation components that we used in the project:
- We used pre-defined Parser components from the Data Transformation FIX library to transform FIX data to XML.
- We configured Mapper components to transform among the XML formats.
- We used the built-in CreateGuid component to generate database identifiers.
- We configured Parser components to transform Excel worksheets to XML. The Excel parsers incorporated a built-in ExcelToDataXml document processor to preprocess the Excel worksheets, making the data amenable to parsing.
- We used the built-in LookupTransformer component to convert FIX to FIXML codes.

We configured the above transformation components using reusable subcomponents wherever possible. The capability of Data Transformation to reuse components ensures consistent logic throughout the project and streamlines the configuration procedure.

Summary
In this article we have shown how PowerCenter and Data Transformation were used to create an effective solution for processing data from many sources containing both structured and unstructured data. This use case illustrates how standard industry transaction formats such as FIX and FIXML can be processed using pre-defined components from the Data Transformation libraries. Finally, we have shown how diverse input data can be converted into a standard format for further processing.

Cajoniz uses the PowerCenter and Data Transformation workflow to automate transaction processing. Commission reconciliation was successfully integrated into Cajoniz's operating procedures, resulting in:
- Significant reductions in commission underpayments
- Significant clerical manpower savings

The next phase of the project will expand the application to cover all sales transactions and all investment managers.

Author
Michael Lahav, Senior Documentation Specialist, AlmondWeb Ltd.
Mike, a long-time project manager and writer, is the author of much of the Informatica B2B Data Transformation documentation.

Acknowledgements
The author thanks Douglas Barch, Ronen Schwartz, and Baruch Katzir of Informatica for providing the information on which this article is based.
