Quality Technical Design Document
CHRISTUS Health
Table of contents
DOCUMENT CONTROL
INTRODUCTION
OUT OF SCOPE
ENVIRONMENT DETAILS
HIGH LEVEL DATA FLOW DIAGRAM
DATA QUALITY MONITORING
PROFILING
SCORECARDS
REFERENCE MANAGEMENT
DATA QUALITY RULES
WEB SERVICE
Document Control
Version History
Version | Date | Author | Comments
1.0 | 05/15/2016 | Bharat Sain | Initial Draft
Reviewer
The following are the reviewers for this document.
Name | Review Date | Notes
May-Law, Michelle | |
Williams, Shirly | |
Introduction
This document supports the data quality requirements at CHRISTUS Health: to measure
and improve the quality of the source data on an ongoing basis for all three sources
(Meditech, Cerner, and Athena), and to ensure that data-dependent business processes
and applications deliver the expected results.
Use the Developer tool and the Analyst tool to design and run processes that achieve
the following objectives:
Profile data. Profiling reveals the content and structure of each source system.
Parse records. Parse data records to improve record structure and derive additional
information from your data. For example, parsing can split a single field of freeform
data, such as a phone number or SSN column value, into fields that contain different
information types.
Validate postal addresses. Address validation evaluates and enhances the accuracy
and deliverability of your postal address data. Address validation corrects errors in
addresses and completes partial addresses by comparing address records against
reference data from national postal carriers.
Create data quality rules. Informatica provides many pre-built rules that you can run
or edit to suit your project objectives. You can create rules in the Developer tool.
Collaborate with Informatica users. The rules and reference data tables you add to
the Model repository are available to users in the Developer tool and the Analyst tool.
Users can collaborate on projects, and different users can take ownership of objects
at different stages of a project.
Export objects into PowerCenter. You can export objects into PowerCenter to reuse
the metadata for physical data integration or to create web services.
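The "Parse records" objective above can be illustrated with a minimal Python sketch. It is not the Informatica Parser transformation; the function name `parse_phone` and the output field names are hypothetical, and the sketch assumes 10-digit US phone numbers with an optional leading country code 1.

```python
import re

# Hypothetical helper, not an Informatica API: it only illustrates how a
# freeform field can be split into fields holding different information types.
def parse_phone(raw):
    """Split a freeform 10-digit US phone number into labeled parts.

    Returns None when the value does not contain exactly 10 digits
    (after dropping an optional leading country code 1).
    """
    digits = re.sub(r"\D", "", raw or "")  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return {
        "area_code": digits[0:3],
        "exchange": digits[3:6],
        "line_number": digits[6:10],
    }

print(parse_phone("(903) 555-0142"))
print(parse_phone("555-0142"))  # too short to parse, returns None
```

Returning None for unparseable values, rather than raising an error, lets the caller decide how to score or route the record.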
Out of Scope
DQ workflow.
Matching with the IDQ tool (identifying duplicate records in your data using a
variety of matching techniques). There is no match requirement.
Environment Details
Environment | DEV | TEST | PROD
MRS | IDQ_MRS_CHR_DEV961 | IDQ_MRS_CHR_TEST961 | IDQ_MRS_CHR_PROD961
Domain | DOM_CHR_DEV961 | DOM_CHR_TEST961 | DOM_CHR_PROD961
Host Name | ICASM544 | icasm545 | icasm543
Port | 6005 | 6005 | 6005
Profiling
A profile is a repository object that finds and analyzes all data irregularities across data
sources in the enterprise and hidden data problems that put data projects at risk. Running a
profile on any data source in the enterprise gives you a good understanding of the strengths
and weaknesses of its data and metadata.
Create and run a profile to find the content, quality, and structure of data sources of an
application, schema, or enterprise. The data source content includes value frequencies and
datatypes. The data source structure includes keys and functional dependencies.
You can use the Analyst tool and Developer tool to analyze the source data and metadata.
Analysts and developers can use these tools to collaborate, identify data quality issues, and
analyze data relationships. Based on your job role, you can use the capabilities of either the
Analyst tool or Developer tool. The degree of profiling that you can perform differs based on
which tool you use.
You can perform the following tasks in the Developer tool and Analyst tool:
Perform column profiling. The process includes discovering the number of unique
values, null values, and data patterns in a column.
Add rules to column profiles.
Curate the inferred datatypes in the profile results.
Use scorecards to monitor data quality.
Generate a mapping from a profile.
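To make the column profiling task concrete, here is a minimal Python sketch that computes the same kinds of statistics (unique values, null values, and data patterns) for one column. It does not reflect how the Informatica profiling engine works internally. The names `profile_column` and `pattern_of` are hypothetical, the sample values are made up, and treating empty strings as nulls is an assumption.

```python
import re
from collections import Counter

def pattern_of(value):
    """Map every digit to 9 and every letter to X; keep other characters."""
    out = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "X", out)

def profile_column(values):
    """Return basic column statistics for a list of string values."""
    # Assumption: None and empty strings both count as null.
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "unique_count": len(set(non_null)),
        "value_frequencies": Counter(non_null),
        "patterns": Counter(pattern_of(v) for v in non_null),
    }

sample = ["M", "F", "F", None, "T", ""]  # made-up gender column
result = profile_column(sample)
print(result["null_count"], result["unique_count"])
print(result["value_frequencies"].most_common())
```

A value such as T appearing in the frequencies is the kind of irregularity a profile is meant to surface before rules are written.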
MEDITECH
CERNER
ATHENA
Scorecards
A scorecard is the graphical representation of the valid values for a column or output of a
rule in profile results. Use scorecards to measure data quality progress. You can create a
scorecard from a profile and monitor the progress of data quality over time.
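As a rough illustration of what a scorecard measures, the sketch below computes the percentage of valid values in a column for two runs. It assumes the score is simply valid rows divided by total rows. The function name `scorecard_score`, the set of valid values, and the sample data are all hypothetical.

```python
def scorecard_score(values, valid_values):
    """Return the percentage of rows whose value is in valid_values."""
    if not values:
        return 0.0
    valid = sum(1 for v in values if v in valid_values)
    return round(100.0 * valid / len(values), 2)

# Made-up data for two profile runs of the same column.
runs = {
    "run 1": ["M", "F", "T", None],
    "run 2": ["M", "F", "F", "M"],
}
scores = {run: scorecard_score(vals, {"M", "F"}) for run, vals in runs.items()}
for run, score in scores.items():
    print(run, score)
```

Comparing the score between runs is what allows data quality progress to be tracked over time.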
Reference Management
Create a reference table in the Design workspace of the Informatica Data Analyst (IDA) tool.
In version 10, the reference table is created as an unmanaged table with the editable option.
Note: The managed reference table option did not work at CHRISTUS Health. IDA reference
management generates its own reference table names in case of any deletion, and this was an
issue for any ETL process that uses the object as a source for extracting reference data.
Database tables are created first; each reference table is then created from a database
table by creating a metadata object in the Model repository. The DDL scripts are listed below.
DDL scripts:
PROD | TEST | DEV
PROD_ddl_referenceTables.sql | TEST_ddl_referenceTables.sql | DEV_ddl_referenceTables.sql
Once a reference table is created, you can edit its data and view an audit trail of the
changes that users made to it. Use the Audit Trail view on the reference table to view
the audit trail events.
Below is the list of all reference tables:
Master_List_of_DQ_Rules.xlsx
DQ_SCORE_IND
DQ Score | Description
0 | Good records.
-26 | Source has a value, but the target does not have a corresponding DWID in the lookup table (e.g. Gender is T but the lookup table only has M and F).
-25 | Source is missing a value (e.g. Gender is missing altogether).
-99 | Source is missing a value, and the record is rejected and sent to the exception route (e.g. both ARGO_EID and URN are null in the MT message).
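The DQ_SCORE_IND values above can be expressed as a small decision function. This is a minimal sketch, not the deployed rule logic. It assumes the scores pair with the descriptions in the order listed (0, -26, -25, -99). The function name `dq_score` and the `reject_if_missing` flag are hypothetical, and blank strings are treated as missing values.

```python
GOOD = 0                # good record
NO_LOOKUP_MATCH = -26   # source has a value, but no DWID in the lookup table
MISSING_VALUE = -25     # source is missing a value
REJECTED = -99          # missing value; record sent to the exception route

def dq_score(source_value, lookup_values, reject_if_missing=False):
    """Return a DQ_SCORE_IND value for one source field.

    reject_if_missing is a hypothetical flag marking fields whose absence
    rejects the whole record.
    """
    if source_value is None or str(source_value).strip() == "":
        return REJECTED if reject_if_missing else MISSING_VALUE
    if source_value not in lookup_values:
        return NO_LOOKUP_MATCH
    return GOOD

gender_lookup = {"M", "F"}
print(dq_score("F", gender_lookup))   # value found in the lookup
print(dq_score("T", gender_lookup))   # value present but not in the lookup
print(dq_score(None, gender_lookup))  # value missing
```

Keeping the missing-value check first means a blank field is never reported as a lookup mismatch.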
Web service
Informatica Data Services provides data integration functionality through a web service. You
can create a web service in the Developer tool. A web service client can connect to a web
service to access, transform, or deliver data. An external application or a Web Service
Consumer transformation can connect to a web service as a web service client.
A web service integrates applications using open standards, such as SOAP, WSDL, and XML.
SOAP is the communications protocol for web services. The web service client request and
the web service response are SOAP messages. A WSDL is an XML document that describes the
protocols, formats, and signatures of the web service operations.
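For orientation, the sketch below builds a SOAP request message of the kind a web service client would send. It assumes SOAP 1.1. The service namespace, the operation name `validateGender`, and the `gender` element are hypothetical placeholders; the real names come from the WSDL of the deployed service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
# Hypothetical values: take the real namespace and operation from the WSDL.
SERVICE_NS = "http://example.com/dqws_Gender"
OPERATION = "validateGender"

def build_request(gender):
    """Return a SOAP envelope (as text) carrying one input value."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}{OPERATION}")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}gender").text = gender
    return ET.tostring(envelope, encoding="unicode")

xml_text = build_request("F")
print(xml_text)
```

The response from the service is wrapped in the same Envelope and Body structure, so a client can parse it with the same library.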
The WSDL links below are for PROD. For another environment, replace icasm543 with the host
name of the target environment once the web service application is deployed there:
WSDLs
Address Doctor | http://icasm543:8095/DataIntegrationService/WebService/dqws_AddressDoctor/
AllergySeverity | http://icasm543:8095/DataIntegrationService/WebService/dqws_AllergySeverity/
AllergyType | http://icasm543:8095/DataIntegrationService/WebService/dqws_AllergyType/
ARGO EID & URN Null Check | http://icasm543:8095/DataIntegrationService/WebService/dqws_ArgoEID_URN/
Date | http://icasm543:8095/DataIntegrationService/WebService/dqws_Date/
Email | http://icasm543:8095/DataIntegrationService/WebService/dqws_Email/
Ethnicity | http://icasm543:8095/DataIntegrationService/WebService/dqws_Ethnicity/
Ethnicity Sub Group | http://icasm543:8095/DataIntegrationService/WebService/dqws_EthnicitySubGroup/
Gender | http://icasm543:8095/DataIntegrationService/WebService/dqws_Gender/
Language | http://icasm543:8095/DataIntegrationService/WebService/dqws_Language/
Marital Status | http://icasm543:8095/DataIntegrationService/WebService/dqws_MaritalStatus/
Name | http://icasm543:8095/DataIntegrationService/WebService/dqws_Name/
Phone | http://icasm543:8095/DataIntegrationService/WebService/dqws_Phone/
Prefix | http://icasm543:8095/DataIntegrationService/WebService/dqws_Prefix/
Race | http://icasm543:8095/DataIntegrationService/WebService/dqws_Race/
Religion | http://icasm543:8095/DataIntegrationService/WebService/dqws_Religion/
SSN | http://icasm543:8095/DataIntegrationService/WebService/dqws_SSN/
String with Number check | http://icasm543:8095/DataIntegrationService/WebService/dqws_StringNumberCheck/
Suffix | http://icasm543:8095/DataIntegrationService/WebService/dqws_Suffix/
Title | http://icasm543:8095/DataIntegrationService/WebService/dqws_Title/
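The host substitution described for other environments can be sketched as follows. The host names come from the Environment Details table (lowercased here for consistency). The function name `url_for_env` is hypothetical, and the sketch assumes that the port and path stay the same in every environment, which should be confirmed after deployment.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names from the Environment Details table, lowercased for consistency.
HOSTS = {"DEV": "icasm544", "TEST": "icasm545", "PROD": "icasm543"}

def url_for_env(prod_url, env):
    """Return prod_url with its host replaced by the host of env.

    Assumption: port and path are identical across environments.
    """
    parts = urlsplit(prod_url)
    host = HOSTS[env]
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

prod = "http://icasm543:8095/DataIntegrationService/WebService/dqws_Gender/"
print(url_for_env(prod, "TEST"))
print(url_for_env(prod, "DEV"))
```

Parsing the URL instead of using plain string replacement avoids accidentally changing a path segment that happens to contain the host name.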