
Flat file validation process

Applies to:
Informatica PowerCenter 8.6.0

Summary
This paper outlines how to validate a flat file source using information in the file trailer. The trailer
contains the total record count, which is verified against the actual number of records received in the file
before loading to the target. If the file is valid, it is loaded; otherwise a record is inserted into a
status table indicating that the file to be processed is invalid.

Author Bio
Author(s): Avneesh Kumar Rathor
Company: Steria India Ltd.
Created on: Jul 02, 2009

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved.

Table of Contents
1 Objective
2 Source definition
3 Target definition
3.1 Target T_FILE_PROCESS_STATUS
3.2 Target EMPLOYEE
4 Mappings
4.1 Mapping to validate the source file: m_ValidateSourceFile
4.1.1 Transformations used
4.2 Mapping to load source file: m_LoadFile
4.2.1 Transformations used
5 Sessions
6 Workflow


1 Objective
This paper outlines how to validate a flat file source using information in the file trailer. The trailer
contains the total record count, which is verified against the actual number of records received in the file
before loading to the target. If the file is valid, it is loaded; otherwise a record is inserted into a
status table indicating that the file to be processed is invalid.

2 Source definition
The sample flat file source is a comma-separated values file containing employee records. In addition, the
file has a header record (starting with H) and a trailer record (starting with T). The header record contains
the file sequence number, and the trailer record contains the total number of records in the file (including
the header and trailer records).

The sample file contains 16 records in total (including the header and trailer), so the file trailer is "T16".

Data records have the columns empno, ename, job, mgr, hiredate, sal, comm, and deptno.
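To make the record layout concrete, here is a minimal Python sketch that classifies each line by its first character. The function name classify_record and the four-line miniature file are hypothetical illustrations, not the paper's sample file; only the H/T prefixes and the column order come from the description above.

```python
from typing import List


def classify_record(line: str) -> str:
    """Return 'header', 'trailer', or 'data' based on the first character."""
    first = line[:1]
    if first == "H":
        return "header"
    if first == "T":
        return "trailer"
    return "data"


# Hypothetical miniature file: header, two data rows, trailer.
# The trailer count (4) includes the header and trailer records themselves.
sample_lines: List[str] = [
    "H1",
    "1001,ALICE,CLERK,1002,17-DEC-80,800,,20",
    "1002,BOB,MANAGER,,02-APR-81,2975,,20",
    "T4",
]

kinds = [classify_record(line) for line in sample_lines]
print(kinds)
```

Because empno is numeric in the data records, the first character alone is enough to tell the three record types apart in this layout.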

3 Target definition

3.1 Target T_FILE_PROCESS_STATUS


The first target is the relational (Oracle) table T_FILE_PROCESS_STATUS, which has the structure shown below.

This table is used to record file information when the file is not valid. For a valid file, nothing is
loaded into this table. Example data in this table for an invalid file is shown below.


3.2 Target EMPLOYEE


The second target is the EMPLOYEE table, with the structure shown below.

4 Mappings

4.1 Mapping to validate the source file: m_ValidateSourceFile


The mapping shown below checks whether the total record count of the file matches the record count specified
in the trailer. If there is a mismatch, a status record is inserted into T_FILE_PROCESS_STATUS. If the trailer
record count is valid, no record is inserted into the target table.

4.1.1 Transformations used


EXP_EXTRACT_HEADER_TRAILOR
This expression transformation is used after the Source Qualifier to identify the file header and the file
trailer. It also calculates the total number of records in the file.


The variable port v_HeaderSeqNo extracts the header sequence number, v_TrailorRecCnt extracts the trailer
record count, and v_FileRecCnt counts the total number of records in the file.

Port Name          Expression

v_HeaderSeqNo      IIF (SUBSTR (Empno, 1, 1) = 'H', SUBSTR (Empno, 2), NULL)

v_TrailorRecCnt    IIF (SUBSTR (Empno, 1, 1) = 'T', SUBSTR (Empno, 2), NULL)

v_FileRecCnt       v_FileRecCnt + 1

o_ProcessDate      TRUNC (SYSDATE)
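The row-by-row behaviour of these ports can be sketched in Python. This is an illustration only, not PowerCenter code: the generator name extract_header_trailer is hypothetical, the input is assumed to be the first field (Empno) of each record, and the running counter is assumed to start at 0 in the same way a numeric variable port does.

```python
from typing import Iterable, Iterator, Optional, Tuple

Row = Tuple[Optional[str], Optional[str], int]


def extract_header_trailer(first_fields: Iterable[str]) -> Iterator[Row]:
    """Yield (header_seq_no, trailer_rec_cnt, file_rec_cnt) for every record."""
    file_rec_cnt = 0  # plays the role of v_FileRecCnt
    for empno in first_fields:
        # SUBSTR(Empno, 1, 1) is the first character; SUBSTR(Empno, 2) is the rest.
        header_seq_no = empno[1:] if empno[:1] == "H" else None
        trailer_rec_cnt = empno[1:] if empno[:1] == "T" else None
        file_rec_cnt += 1  # v_FileRecCnt + 1, evaluated once per row
        yield header_seq_no, trailer_rec_cnt, file_rec_cnt


# Hypothetical four-record file: header, two data rows, trailer.
rows = list(extract_header_trailer(["H1", "7369", "7499", "T4"]))
for row in rows:
    print(row)
```

Every record, including the header and trailer, increments the counter, which is why the trailer count in the source file has to include those two records as well.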

AGG_HEADER_TRAILOR_ROWCOUNT
An Aggregator transformation is used to produce a single row containing the header sequence number, trailer
record count, and file record count.


Port o_ProcessDate is specified as the Group By port.
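The effect of the aggregation can be sketched in Python as follows. The aggregate functions used on each port are not stated in the text, so this sketch assumes the non-null header and trailer values are kept and the largest running count is taken; the function name aggregate_rows is hypothetical. Since every row carries the same process date, grouping by it puts all rows in one group, so the sketch simply folds the whole input into one tuple.

```python
from typing import Iterable, Optional, Tuple

Row = Tuple[Optional[str], Optional[str], int]


def aggregate_rows(rows: Iterable[Row]) -> Row:
    """Collapse per-record rows into one (header, trailer, record count) row."""
    header: Optional[str] = None
    trailer: Optional[str] = None
    rec_count = 0
    for header_seq_no, trailer_rec_cnt, file_rec_cnt in rows:
        if header_seq_no is not None:
            header = header_seq_no      # assumed: keep the non-null header value
        if trailer_rec_cnt is not None:
            trailer = trailer_rec_cnt   # assumed: keep the non-null trailer value
        rec_count = max(rec_count, file_rec_cnt)  # final value of the running count
    return header, trailer, rec_count


# Hypothetical per-record rows for a four-record file.
per_record = [("1", None, 1), (None, None, 2), (None, None, 3), (None, "4", 4)]
summary = aggregate_rows(per_record)
print(summary)
```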


EXP_VALIDATE_ROWCOUNT
This expression transformation compares the trailer record count with the file record count.


Output port o_IsFileValid uses the expression below to check whether the trailer record count matches the
file record count. If there is a mismatch, the o_IsFileValid flag is set to 'N'.

Port Name        Expression

o_IsFileValid    IIF (o_FileTrailor != o_FileRecCount, 'N')

FIL_INVALID_FILE
This filter passes the row only when the o_IsFileValid flag is 'N', i.e. when the trailer record count does
not match the file record count.
The filter condition is o_IsFileValid = 'N'.
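The comparison and the filter together can be sketched in Python. The function names flag_file and filter_invalid are hypothetical, and returning an empty string when the counts match is an assumption about how the two-argument IIF behaves for a string port.

```python
from typing import Dict, List


def flag_file(file_trailer: str, file_rec_count: int) -> str:
    """Mirror o_IsFileValid: 'N' on a mismatch, empty string otherwise (assumed)."""
    return "N" if int(file_trailer) != file_rec_count else ""


def filter_invalid(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Mirror FIL_INVALID_FILE: pass only rows flagged 'N'."""
    return [row for row in rows if row["o_IsFileValid"] == "N"]


# A trailer of 16 against 16 counted records is valid; against 15 it is not.
good = {"o_IsFileValid": flag_file("16", 16)}
bad = {"o_IsFileValid": flag_file("16", 15)}

status_rows = filter_invalid([good, bad])
print(len(status_rows))
```

Only the mismatching file produces a row for the status table, which is what lets the workflow later treat "zero rows written" as "file is valid".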

4.2 Mapping to load source file: m_LoadFile


This is a simple source-to-target mapping with a filter in place to skip the file header and trailer.


4.2.1 Transformations used


FIL_HEADER_TRAILOR
A Filter transformation is used to skip the file header and trailer. Only data records are passed through
and loaded to the target EMPLOYEE.
The filter condition used is SUBSTR (Empno, 1, 1) != 'H' AND SUBSTR (Empno, 1, 1) != 'T'
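The same filter condition can be expressed as a short Python sketch. The function name keep_data_records and the sample lines are hypothetical; the condition itself follows the expression above.

```python
from typing import List


def keep_data_records(lines: List[str]) -> List[str]:
    """Drop records whose first character is 'H' (header) or 'T' (trailer)."""
    return [line for line in lines if line[:1] != "H" and line[:1] != "T"]


# Hypothetical four-record file: header, two data rows, trailer.
file_lines = [
    "H1",
    "1001,ALICE,CLERK,1002,17-DEC-80,800,,20",
    "1002,BOB,MANAGER,,02-APR-81,2975,,20",
    "T4",
]

data_lines = keep_data_records(file_lines)
print(len(data_lines))
```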

5 Sessions
A session task is created for each mapping. Session s_ValidateSourceFile is for mapping
m_ValidateSourceFile.
Session s_LoadFile is for mapping m_LoadFile. Because dates in the source file are in DD-MON-YY format, the
DateTime Format String is changed from the default to DD-MON-YY. All other settings are left at their defaults.
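For readers checking input data outside PowerCenter, a rough Python equivalent of the DD-MON-YY pattern is "%d-%b-%y". The sketch below is an illustration only: the function name parse_hiredate and the date value are hypothetical, and the century chosen for a two-digit year may differ between Python's %y and PowerCenter's YY, so the resulting year should be checked against the session's actual behaviour.

```python
from datetime import date, datetime


def parse_hiredate(text: str) -> date:
    """Parse a DD-MON-YY string such as '17-DEC-80' into a date."""
    # %d = day of month, %b = abbreviated month name, %y = two-digit year.
    return datetime.strptime(text, "%d-%b-%y").date()


# Hypothetical hiredate value in DD-MON-YY form.
parsed = parse_hiredate("17-DEC-80")
print(parsed.isoformat())
```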

6 Workflow

A simple workflow, shown below, is created using a Start task and two session tasks.


The condition to execute session s_LoadFile is defined by the expression below.

IIF ($s_ValidateSourceFile.TgtSuccessRows = 0,
$s_ValidateSourceFile.Status = succeeded,
$s_ValidateSourceFile.Status = failed)

With this condition in place, session s_LoadFile executes only when no row is inserted into the target by
session s_ValidateSourceFile. As per the mapping logic, m_ValidateSourceFile inserts a row into the target
only when the file fails validation. So no row inserted into the target means the file is valid and should be
loaded to the EMPLOYEE table using the m_LoadFile mapping.
Executing this workflow with the sample flat file inserts no row into T_FILE_PROCESS_STATUS; 14 rows are
inserted into the EMPLOYEE table.
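The link condition and the expected row count can be sketched in Python. This mirrors the IIF expression literally; the function names should_run_load and expected_data_rows are hypothetical, and the status values are modelled as plain strings for illustration.

```python
def should_run_load(validate_status: str, tgt_success_rows: int) -> bool:
    """Literal translation of the link condition on s_LoadFile."""
    if tgt_success_rows == 0:
        return validate_status == "succeeded"
    return validate_status == "failed"


def expected_data_rows(trailer_count: int) -> int:
    """Rows loaded to EMPLOYEE: the trailer count minus header and trailer."""
    return trailer_count - 2


# Valid sample file: the validation session succeeds and writes no status row.
print(should_run_load("succeeded", 0))
# Invalid file: the validation session succeeds but writes one status row.
print(should_run_load("succeeded", 1))
# The sample file's trailer is T16, so 16 - 2 data records are expected.
print(expected_data_rows(16))
```

In the normal case where the validation session succeeds, the condition reduces to "run the load only when zero rows were written to T_FILE_PROCESS_STATUS", which matches the behaviour described above.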


Disclaimer and Liability notice


Informatica offers no guarantees and assumes no responsibility or liability of any type with respect to the content of this software asset,
including any liability resulting from incompatibility between the content within this asset and the materials and services offered by
Informatica. You agree that you will not hold, or seek to hold, Informatica responsible or liable with respect to the content of this software
asset.
