
Enriching Data Quality in your Organisation: Minimising Duplication & Ensuring Data is Reused by Different Parts of the Business.

Michael Mc Morrow, Head of Data Management Services, Information Management, AIB Bank. michael.mcmorrow@aib.ie

Multiple Perspectives

DQ Data Lifecycle: Points of Focus
DQ Governance: Top-Down, Bottom-Up
DQ Governance: Front to Back
DQ Information Infrastructure: Data Warehouse
DQ Inventory: Central Log
DQ Identification: Organisation Culture
DQ Understanding: Metadata
DQ Stakeholders: Varied Expectations
DQ Levels: Perfect / Indicative
DQ Assessment: Hard / Soft
DQ Prioritisation: Target Your Critical Data

DQ Data Lifecycle: Points Of Focus

DQ Data Lifecycle: Points Of Focus: Gather

Do the people capturing data really know what they should enter (e.g. are categorisations ambiguous?) and the data quality levels required by all subsequent users / usages (i.e. not just by this data capture application)?

DQ Data Lifecycle: Points Of Focus: Manipulate

Are all processes which transfer data reliable, and rules which transform data accurate and consistent?

DQ Data Lifecycle: Points Of Focus: Deliver

Are target outputs correctly understood, and is data mapping to those targets correctly performed (e.g. external regulatory reports)?

DQ Governance: Top-Down, Bottom-Up

Top-Down Governance Strategy


Hierarchy of Governance Forums from C-Suite down
Align to the reality of Organisation Structure

Bottom-Up
Each data item governed by a named (Business) Data Steward
Range of Practical Responsibilities, e.g.
  DQ Assessment
  DQ Remediation Co-ordination
  Metadata Provision

DQ Governance: Top-Down, Bottom-Up

Simple Structure

Machine Bureaucracy

Professional Bureaucracy

Divisionalised Form

Adhocracy
Ref: Henry Mintzberg, The Structuring of Organizations: A Synthesis of the Research

DQ Governance: Front to Back

Data Steward Responsibility Scope


Cradle to Grave?
  DQ of assigned data items from point of capture to all uses

Staged?
  DQ of assigned data within a specific application / system layer

DQ Information Infrastructure : Data Warehouse


Concept of Single-Version-Of-The-Truth Information Environment

Highly Secure
Certified Quality
Cross-Functional
Right-Time Data
Consistent Performance
Optimised Reporting
User-Friendly Access
Consistent Reporting
Peer-Beating Analytics
Limitless Scalability
Operationally Aligned
Data Content Rich
Data History Rich
Resilient Availability
Cost Effective

DQ Inventory: Central Log

Single Inventory of all known DQ Issues


Expose scale of DQ issues
Opportunity to log issues which people have just grown used to
Facilitate risk/value-based prioritisation
Identify opportunities to group DQ initiatives
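One way to picture the central log is as a single list of issue records that can be ranked and grouped. The sketch below is illustrative only: the `DQIssue` fields, the 1 to 5 `risk` and `value` scales, and the `risk * value` score are assumptions made for the example, not part of the original deck.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DQIssue:
    issue_id: int
    system: str      # system where the issue was observed
    data_item: str   # affected data item
    risk: int        # hypothetical scale: 1 (low) to 5 (high)
    value: int       # hypothetical scale: 1 (low) to 5 (high) value of fixing


def prioritise(issues):
    """Order issues by a simple risk x value score, highest first."""
    return sorted(issues, key=lambda i: i.risk * i.value, reverse=True)


def group_by_system(issues):
    """Group issue ids by system to spot candidates for a joint initiative."""
    groups = defaultdict(list)
    for issue in issues:
        groups[issue.system].append(issue.issue_id)
    return dict(groups)


# Illustrative entries only.
issue_log = [
    DQIssue(1, "Lending", "date_of_birth", risk=4, value=3),
    DQIssue(2, "CRM", "customer_name", risk=2, value=2),
    DQIssue(3, "Lending", "employer_size", risk=5, value=4),
]

ranked = [i.issue_id for i in prioritise(issue_log)]
grouped = group_by_system(issue_log)
print(ranked)
print(grouped)
```

Grouping by system is only one possible cut; grouping by data item or by Data Steward would follow the same pattern.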


DQ Identification : Organisation Culture


Responsibility of Everyone
Opportunities Everywhere
  e.g. Physical Data Model as DQ Tool
    Compare Data Model (Expectation) with Data (Reality)
    Anomalies: either Wrong Data Model or Wrong Data
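The comparison of expectation with reality can be sketched as a check of each record against the rules the data model declares. This is a minimal illustration: the `DATA_MODEL` dictionary, its column names, and the sample rows are hypothetical, and only character columns with a length and nullability rule are covered.

```python
# Expectation: a simplified, hypothetical physical data model
# (character columns only, each with a maximum length and a nullability rule).
DATA_MODEL = {
    "customer_id": {"max_length": 8, "nullable": False},
    "branch_code": {"max_length": 4, "nullable": True},
}


def find_anomalies(rows, model):
    """Return (row_number, column, reason) for each breach of the model."""
    anomalies = []
    for row_number, row in enumerate(rows):
        for column, rule in model.items():
            value = row.get(column)
            if value is None:
                if not rule["nullable"]:
                    anomalies.append((row_number, column, "null not allowed"))
                continue
            if not isinstance(value, str):
                anomalies.append((row_number, column, "not character data"))
            elif len(value) > rule["max_length"]:
                anomalies.append((row_number, column, "too long"))
    return anomalies


# Reality: illustrative data as it might actually be held.
rows = [
    {"customer_id": "C0000001", "branch_code": "D001"},
    {"customer_id": None, "branch_code": "D002"},
    {"customer_id": "C000000002", "branch_code": None},
]

anomalies = find_anomalies(rows, DATA_MODEL)
print(anomalies)
```

Each anomaly still needs a human decision: either the data is wrong, or the model no longer describes how the data is legitimately used.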


DQ Understanding: Metadata

Support Safe Reuse


Technical Static Definition Metadata
Definition of table/column within RDBMS, e.g. Character(8), Not Null

Business Static Definition Metadata


Additional Internal/Industry definitions about the table/column, e.g. Data Steward Id, Business Text Description

Business Static Quality-Status Metadata


Documentation of data quality level: once-off or general quality nuances, e.g. DQ Issue Log

Link & Publish

Business Dynamic Quality-Status Metadata


Data quality metrics over time, e.g. DQ Scorecard Results

Technical Life-Cycle Metadata


Data flows and transformations en route from source to target, e.g. ETL graphs

Technical Relational Metadata


How one item of data relates to other items of data, e.g. Physical Data Model

DQ Stakeholders: Varied Expectations


Fit for Immediate Purpose
Narrow needs of the Data Capture application

Fit for Enterprise Purpose


Wide reuse needs of other stakeholders (e.g. BI/Reporting, Predictive Analytics)

Death Indicator
Option within a Data Capture system
What if some staff add "deceased" to the customer name instead?

DQ essential to business function owning that system?
DQ essential to some other Regulatory Reporting system?
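The Death Indicator case can be checked mechanically by looking for free-text markers in the name field where the structured indicator is not set. The sketch below is an assumption-laden illustration: the field names `name` and `death_indicator`, the sample customers, and the single keyword pattern are all hypothetical, and real name fields would need a wider set of variants.

```python
import re

# Hypothetical marker; real data would likely need more variants.
DECEASED_PATTERN = re.compile(r"\bdeceased\b", re.IGNORECASE)


def suspected_unflagged_deaths(customers):
    """Return ids where the name mentions 'deceased' but the indicator is unset."""
    return [
        c["customer_id"]
        for c in customers
        if not c["death_indicator"] and DECEASED_PATTERN.search(c["name"])
    ]


# Illustrative records only.
customers = [
    {"customer_id": 1, "name": "Mary Byrne", "death_indicator": False},
    {"customer_id": 2, "name": "John Murphy (Deceased)", "death_indicator": False},
    {"customer_id": 3, "name": "Ann Kelly DECEASED", "death_indicator": True},
]

suspects = suspected_unflagged_deaths(customers)
print(suspects)
```

A check like this serves the enterprise purpose (a downstream system relying on the indicator) even when the capture application itself works acceptably with the free-text workaround.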

DQ Levels: Perfect / Indicative


Impractical / Impossible for all data to be perfect
  Financial Balances should be perfect
  Number of Cattle will only ever be indicative

Define & Assess Appropriate DQ Level per data item
Consider inheriting DQ from external certified sources
  Number of Employees of a Client Company
    Ask client, store internally, re-ask periodically?
    Access from some external certified source such as the Companies Registration Office?

DQ Assessment: Hard / Soft


Hard Data Scorecards (Right/Wrong):
Possible to accurately measure breaches of technical rules
  Data Format, Data Optionality, Data Relationships
Possible to accurately measure breaches of business rules
  Mortgage Holder (fact) who is one year old (wrong)
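A hard scorecard can be sketched as a set of right/wrong rules, each reporting its pass rate. Everything specific here is an assumption made for illustration: the rule names, the 8-character `customer_id` format, the minimum age of 18 for a mortgage holder, the fixed `AS_OF` date, and the sample records.

```python
from datetime import date

AS_OF = date(2024, 1, 1)  # fixed so the example is reproducible


def age_in_years(born, as_of):
    """Whole years between a date of birth and the as-of date."""
    not_yet_had_birthday = (as_of.month, as_of.day) < (born.month, born.day)
    return as_of.year - born.year - not_yet_had_birthday


# Each rule returns True when the record passes.
RULES = {
    # Technical rule (format): id is exactly 8 characters.
    "customer_id_format": lambda r: isinstance(r["customer_id"], str)
    and len(r["customer_id"]) == 8,
    # Technical rule (optionality): date of birth is mandatory.
    "date_of_birth_present": lambda r: r["date_of_birth"] is not None,
    # Business rule: a mortgage holder must be an adult (assumed 18+).
    "mortgage_holder_is_adult": lambda r: (
        not r["has_mortgage"]
        or r["date_of_birth"] is None
        or age_in_years(r["date_of_birth"], AS_OF) >= 18
    ),
}


def scorecard(records, rules):
    """Return the pass rate (0.0 to 1.0) for each rule."""
    return {
        name: sum(1 for r in records if check(r)) / len(records)
        for name, check in rules.items()
    }


# Illustrative records only.
records = [
    {"customer_id": "C0000001", "date_of_birth": date(1980, 5, 17), "has_mortgage": True},
    {"customer_id": "C0000002", "date_of_birth": date(2023, 1, 1), "has_mortgage": True},
    {"customer_id": "C0000003", "date_of_birth": None, "has_mortgage": False},
    {"customer_id": "C4", "date_of_birth": date(1975, 11, 2), "has_mortgage": False},
]

results = scorecard(records, RULES)
print(results)
```

The second record is the one-year-old mortgage holder from the slide: each field is individually valid, and only the business rule exposes the breach.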

Soft Data Profiles (Suspicious):


Individual versus Set
  Valid for an individual to be born on 01/01/2001
  Implausible for half of your customers to be born on 01/01/2001
Trend
  Non-intuitive pattern over time
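The individual-versus-set idea can be sketched by measuring how much of a population shares the single most common value. The 20% `threshold` and the sample dates below are assumptions for illustration; a suitable threshold depends on the data item being profiled.

```python
from collections import Counter
from datetime import date


def dominant_value_share(values):
    """Return the most common value and the fraction of the set that holds it."""
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def is_suspicious(values, threshold=0.2):
    """Flag the set when one value accounts for at least `threshold` of it."""
    _, share = dominant_value_share(values)
    return share >= threshold


# Illustrative dates of birth: half the set shares a likely default value.
dates_of_birth = [date(2001, 1, 1)] * 5 + [
    date(1980, 5, 17),
    date(1992, 3, 9),
    date(1975, 11, 2),
    date(1968, 7, 30),
    date(1999, 12, 24),
]

value, share = dominant_value_share(dates_of_birth)
print(value, share, is_suspicious(dates_of_birth))
```

No single record here is wrong, so a hard rule would pass every row; the profile only raises a suspicion that someone then has to investigate.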

DQ Prioritisation: Target Your Critical Data


Priority Data: Identify Top 100 data items
Apply most complex data scorecarding / profiling effort
Maintain richest metadata
C-Suite visibility of Data Quality issues

All Data: Apply Data Quality Tax to all Change Programs


If opening System X, and there are any logged DQ deficiencies within System X, then add remediation to program scope
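The Data Quality Tax rule reads naturally as a lookup against the central issue log. The sketch below assumes a log of dictionaries with hypothetical `issue_id`, `system`, and `status` fields, and assumes that only issues still marked open attract the tax.

```python
def apply_dq_tax(program_scope, systems_opened, issue_log):
    """Return the program scope extended with remediation of open logged issues."""
    remediation = [
        f"Remediate DQ issue {issue['issue_id']} in {issue['system']}"
        for issue in issue_log
        if issue["system"] in systems_opened and issue["status"] == "open"
    ]
    return program_scope + remediation


# Illustrative log entries only.
issue_log = [
    {"issue_id": 1, "system": "System X", "status": "open"},
    {"issue_id": 2, "system": "System Y", "status": "open"},
    {"issue_id": 3, "system": "System X", "status": "closed"},
]

scope = apply_dq_tax(["Add new product field"], {"System X"}, issue_log)
print(scope)
```

The issue on System Y stays out of scope because the program does not open that system, which keeps the tax proportionate to the change being made.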

Summary

Make DQ part of your Organisational DNA

