Académique Documents
Professionnel Documents
Culture Documents
- By Naveen Marupudi
27 February 2013
Informatica Data Quality
27 February 2013
Informatica Data Quality
27 February 2013
Frequent Requirements
27 February 2013
Data Analysis and Discovery
27 February 2013
Web Based Analyst Tools to Profile Your Data
Data
Steward
27 February 2013 7
Parsing & Standardization
27 February 2013 8
Parsing & Standardization: All Data
27 February 2013 9
Address Validation
27 February 2013 10
Address Validation
27 February 2013 11
Match and De-Duplicate All Data Types
27 February 2013 12
Match and De-Duplicate All Data Types
27 February 2013 13
Monitoring Quality
27 February 2013 14
Monitoring the quality
Easy-to-share browser-based scorecards for line of business managers
• Browser-based
scorecards
enabling
you to:
• View and share
data quality
scorecards
• Drilldown to the
actual records
• Take action to
Line reduce the
business impact
manager
Zero learning curve for business users to review and track data quality metrics,
enabling data quality “for the masses”.
27 February 2013 15
Business Empowerment
Simple-to-use browser-based tools
27 February 2013 16
Release 9.1 Architecture Overview
IDQ V9.1
27 February 2013 17
Release 9 DQ Architecture
27 February 2013 18
Model Repository Service (MRS)
27 February 2013 19
Data Integration Service (DIS)
27 February 2013 21
Content Management Service (CMS)
27 February 2013 22
Informatica Services Platform (ISP)
27 February 2013 23
Informatica Analyst
27 February 2013 24
Profiling
View
• Profile
Statistics
• Value / Pattern
Frequency
Analysis
• Drilldown
Analysis
27 February 2013 25
Profiling – Apply Rules
• Apply
existing
rule or create
new one.
• View Profile
results for
Rule
output
27 February 2013 26
Scorecards
• Add Profile
Columns to a
Scorecard
• Scorecard
Name, Location
Description.
• Set
Thresholds
27 February 2013 27
Scorecards
• Run the
Scorecard
• View Trend
Chart
27 February 2013 28
Address Validation Overview
27 February 2013 29
Step 1 - Transliteration
27 February 2013 30
Step 2 – Parsing
27 February 2013 31
Step 3 – Correction
27 February 2013 32
Step 4 - Formatting
27 February 2013 33
Step 5 – Enrichment
27 February 2013 34
Data Quality Matching
Matching – Classic Data Matching
Match Strategies:
• Jaro Distance
• Greatest emphasis on initial characters
• Bigram Distance
• Longer multi token text strings
• Hamming Distance
• Numeric, date & code
• Edit Distance
• Strings of arbitrary length
• Reverse Hamming Distance
27 February 2013 35
Matching – Identity Matching
• Identity Matching delivers next generation linguistic and
statistical matching algorithms to ensure highly accurate
matching out of the box for over 60 countries
Sample Rules
27 February 2013 36
Classic Data Matching – product data
27 February 2013 37
Matching – Classic vs Identity
To the human eye certain
names would be an obvious
potential match.
27 February 2013 38
Sample matching output
27 February 2013 39
PowerCenter Integration
Why Integrate with PowerCenter?
• Deployment to PC for
• Performance
• Reliability
• PowerCenter HA
• Scalability
• PowerCenter Grid
• Connectivity
• Batch access
• Web Services
• DQ as part of ETL process
27 February 2013 40
IDQ/PC Integration Scenarios
New Install
27 February 2013 41
DQ v9.x in PC Upgrade
27 February 2013 42
IDQ v8.6.2 in PC 8.6 -> 9.1 Upgrade
There are 2 Options
1. Upgrade PowerCenter and apply CRS Upgrade patch.
• Run standard PowerCenter 8.6.x to PowerCenter 9.1 upgrade.
• Apply CRS Upgrade which is required to continue running mappings
containing IDQ 8.6.x mapplets post upgrade to PowerCenter 9.x
• Requires 8.6.x Integration libs remain available.
2. Migrate IDQ Plans from 8.6.2 to 9.0.1
• Migration only from IDQ 8.6.2 to PowerCenter 9.0.1. No direct migration to
9.1
• Requires a number of steps
• Export Data Quality 8.6.2 repository objects and reference data to the file
system in XML format.
• Convert the 8.6.2 repository objects to 9.0.1 format.
• Import the reference data as reference tables to the 9.0.1 Model repository
and staging area
27 February 2013 43
Thank You
27 February 2013 44