A Walk Through the Kimball ETL Subsystems with Oracle Data Integration

Michael Rainey | Collaborate 16


info@rittmanmead.com www.rittmanmead.com @rittmanmead 1
Introduction

Michael Rainey - Data Integration Practice Lead - America

- Oracle Data Integration expertise


- Blog: http://ritt.md/mRainey
- Oracle ACE
@mRainey



About Rittman Mead

Unlock the potential of your organization's data

- World's leading specialist partner for technical excellence, solutions delivery and innovation in Oracle Data Integration, Business Intelligence, Analytics and Big Data
- Providing our customers targeted expertise; we are a company that doesn't try to do everything, only what we excel at
- 70+ consultants worldwide including 1 Oracle ACE Director and 3 Oracle ACEs, offering training courses, global services, and consulting
- Founded on the values of collaboration, learning, integrity and getting things done
- Comprehensive service portfolio designed to support the full lifecycle of any analytics solution



Average user adoption for BI
platforms is below 25%

Rittman Mead's User Engagement Service can help

- Visual Redesign
- Business User Training
- Engagement Toolkit
- Ongoing Support

More info: http://ritt.md/ue


What's Most Important for YOU in Data Integration?

Big data?
Cloud?

Financial Reporting on one version of the truth?



Let's take a walk and talk about ETL

Wait! What are Kimball ETL Subsystems?
Do you all know of Ralph Kimball?

Ralph Kimball founded the Kimball Group. Since the mid-1980s, he has
been the DW/BI industry's thought leader on the dimensional approach
and trained more than 20,000 students. Prior to working at Metaphor and
founding Red Brick Systems, Ralph co-invented the first commercially
available workstation with a graphical user interface at Xerox's Palo Alto
Research Center (PARC). Ralph has his Ph.D. in Electrical Engineering
from Stanford University.

www.kimballgroup.com


The Kimball Group

The Kimball 34 Subsystems of ETL

Extracting Data
- Data Profiling
- Change Data Capture System
- Extract System



The Kimball 34 Subsystems of ETL

Cleaning and Conforming Data


- Data Cleansing System
- Error Event Schema
- Audit Dimension Assembler
- Deduplication System
- Conforming System



The Kimball 34 Subsystems of ETL
Delivering Data for Presentation
- Slowly Changing Dimension Manager
- Surrogate Key Generator
- Hierarchy Manager
- Special Dimensions Manager
- Fact Table Builders
- Surrogate Key Pipeline
- Late Arriving Data Handler
- Multi-Valued Dimension Bridge Table Builder
- Dimension Manager System
- Fact Provider System
- Aggregate Builder
- OLAP Cube Builder
- Data Propagation Manager



The Kimball 34 Subsystems of ETL

Managing the ETL Environment


- Job Scheduler
- Backup System
- Recovery and Restart System
- Version Control System
- Version Migration System
- Workflow Monitor
- Sorting System
- Lineage & Dependency Analyzer
- Problem Escalation System
- Parallelizing / Pipelining System
- Security System
- Compliance Manager
- Metadata Repository Manager



Oracle Data Integration Solutions

- Data Integrator: No ETL engine, 100% native data transformation
- GoldenGate: Non-invasive CDC, real-time streaming data delivery
- Data Quality: Profile, cleanse, match, and remediate data
- Big Data Preparation: Prepare, secure, enrich and publish unstructured data
- Metadata Management: Catalog, trace and view models across the enterprise
- Data Service Integrator: Federate data across DBs, services and applications

Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Oracle Open World 2015



Now let's take a walk through the ETL Subsystems



Who's coming with us?



Data model - where we're going


Extracting Data

Data Profiling

- Oracle Enterprise Data Quality


Change Data Capture System

Extract System

- Oracle Data Integrator


- Oracle GoldenGate


Extracting Data - Data Profiling with EDQ

- Small dataset due to sampling percentage
- _projectid looks like a primary key
- Investigate school_district blanks
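
The kind of checks behind these findings can be sketched in plain Python. This is only an illustration of column profiling, not how EDQ works internally; the `profile_column` helper and the sample rows are made up for the example, and only the column names `_projectid` and `school_district` come from the slide.

```python
def is_blank(value):
    """Treat None and whitespace-only strings as blanks."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Return simple profile metrics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    blanks = sum(1 for v in values if is_blank(v))
    distinct = len({v for v in values if not is_blank(v)})
    return {
        "rows": len(values),
        "blanks": blanks,
        "distinct": distinct,
        # Primary-key candidate: no blanks and every value is unique.
        "key_candidate": len(values) > 0 and blanks == 0 and distinct == len(values),
    }


# Hypothetical sample rows, standing in for a sampled source table.
sample = [
    {"_projectid": "p1", "school_district": "District A"},
    {"_projectid": "p2", "school_district": ""},
    {"_projectid": "p3", "school_district": None},
]

profile = {col: profile_column(sample, col) for col in ("_projectid", "school_district")}
```

A small sample can make a column look unique when it is not, so a key candidate found this way still needs confirming against the full dataset.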



Extracting Data - Oracle Data Integrator

Extract from many different systems? Yes!

- Multiple technologies OOTB
- Custom technologies can be added

Data Server - connection to the data source

- Physical Schema
- Logical Schema

Extracting Data - Oracle Data Integrator

Models

- Based on a single data source

Datastores

- Logically represent a table, file, XML, etc.
- Reverse engineer or build manually

[Diagram labels: City, State, Zip Code]


Extracting Data - Changed Data Only

Change Data Capture

- Extract only the changed data since the last ETL extract
Methods

- Audit columns
- Timed extract
- Full diff compare
- Database log scraping
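
As an illustration of the first method, audit columns, here is a minimal sketch using Python's built-in sqlite3 module. The table `src_customer`, its `last_updated` column, and the sample rows are assumptions made for the example; a real source would supply its own audit column and the high-water mark would be stored between runs.

```python
import sqlite3


def extract_changes(conn, last_extract_ts):
    """Return rows changed after the previous extract, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, last_updated FROM src_customer "
        "WHERE last_updated > ? ORDER BY last_updated",
        (last_extract_ts,),
    ).fetchall()
    # If nothing changed, keep the old mark so the next run starts from the same point.
    new_mark = rows[-1][2] if rows else last_extract_ts
    return rows, new_mark


# Hypothetical source table with an audit column (ISO timestamps sort as text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customer (id INTEGER, name TEXT, last_updated TEXT)")
conn.executemany(
    "INSERT INTO src_customer VALUES (?, ?, ?)",
    [
        (1, "Ann", "2016-04-01 08:00:00"),
        (2, "Bob", "2016-04-02 09:30:00"),
        (3, "Cy", "2016-04-03 10:15:00"),
    ],
)

changed, mark = extract_changes(conn, "2016-04-01 12:00:00")
```

Audit columns only work when every update reliably maintains the timestamp, and they do not see deletes; log-based capture avoids both limits.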


Extracting Data - CDC with Oracle GoldenGate



Cleaning and Conforming Data

Data Cleansing System

- ODI & EDQ


Error Event Schema

- Built on ODI E$ tables


Audit Dimension Assembler

Deduplication System

- EDQ
Conforming System



Cleaning and Conforming - Data Cleansing System

ODI - Check Knowledge Module

- Check logical constraints


- Bad data moves to error table
EDQ

- Data cleansing audit processors
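
The check-then-divert pattern can be sketched in a few lines of Python. This is not the ODI Check Knowledge Module, which generates SQL against the database; the `check_rows` helper, the constraint names, and the sample rows are all hypothetical and only show the idea of routing failing rows to an error table with the reason attached.

```python
def check_rows(rows, constraints):
    """Split rows into (valid, errors) using named predicate constraints."""
    valid, errors = [], []
    for row in rows:
        failed = [name for name, rule in constraints.items() if not rule(row)]
        if failed:
            # Keep the rejected row and the checks it failed, like an error table row.
            errors.append({**row, "failed_checks": failed})
        else:
            valid.append(row)
    return valid, errors


# Hypothetical logical constraints on a staged row.
constraints = {
    "amount_not_negative": lambda r: r["amount"] >= 0,
    "state_present": lambda r: bool(r["state"]),
}

staged = [
    {"id": 1, "amount": 100, "state": "TX"},
    {"id": 2, "amount": -5, "state": "CO"},
    {"id": 3, "amount": 20, "state": ""},
]

valid_rows, error_rows = check_rows(staged, constraints)
```

Recording which check failed alongside the row is what makes the error table usable later for audit and remediation.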



Cleaning and Conforming - ODI Constraints


Cleaning and Conforming - EDQ Data Cleansing


Cleaning and Conforming - Error Event Schema
E$ Tables

SNP_COND
SNP_KEY
SNP_JOIN

SNP_LPI_RUN

Image from: Data Warehouse Lifecycle Toolkit (Wiley Publishing, Inc., 2008).



Cleaning and Conforming - Deduplication System
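
A deduplication step usually has two parts: matching records that describe the same entity, and choosing a survivor. The sketch below shows a deliberately simple version in Python. It is not EDQ's matching, which is configured through its own processors; the `match_key` rule, the "latest update wins" survivorship, and the sample records are assumptions for illustration.

```python
def match_key(record):
    """Normalize name and zip so trivial variants produce the same key."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    return (name, record["zip"].strip())


def deduplicate(records):
    """Keep one survivor per match key: the most recently updated record."""
    survivors = {}
    for rec in records:
        key = match_key(rec)
        current = survivors.get(key)
        if current is None or rec["updated"] > current["updated"]:
            survivors[key] = rec
    return list(survivors.values())


# Hypothetical customer records; the first two differ only in punctuation and case.
records = [
    {"name": "Acme Corp.", "zip": "78701", "updated": "2016-01-05"},
    {"name": "ACME Corp", "zip": "78701 ", "updated": "2016-03-01"},
    {"name": "Globex", "zip": "10001", "updated": "2016-02-01"},
]

deduped = deduplicate(records)
```

Exact matching on a normalized key misses misspellings and transpositions, which is where a dedicated data quality tool with fuzzy matching earns its place.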


Delivering Data

Slowly Changing Dimension Manager
Surrogate Key Generator
Hierarchy Manager
Special Dimensions Manager
Fact Table Builders
Surrogate Key Pipeline
Late Arriving Data Handler
Multi-Valued Dimension Bridge Table Builder
Dimension Manager System
Fact Provider System
Aggregate Builder
OLAP Cube Builder
Data Propagation Manager



Delivering Data

Slowly Changing Dimension Manager

- ODI Integration Knowledge Module
- Set SCD behavior type for each target column

Surrogate Key Generator

- Database Sequence objects and ODI Sequences

Fact Table Builder

- Lookups in ODI



Delivering Data - Slowly Changing Dimension in ODI
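
For readers who have not seen Type 2 logic spelled out, the sketch below shows it in Python on an in-memory list. It is not the code an ODI Integration Knowledge Module generates; the `apply_scd2` function, the column names, and the single tracked attribute `city` are assumptions chosen to keep the example small.

```python
from datetime import date


def apply_scd2(dimension, incoming, today, next_key):
    """Apply one source row to a Type 2 dimension; return the next free surrogate key."""
    current = next(
        (r for r in dimension
         if r["natural_key"] == incoming["natural_key"] and r["current_flag"]),
        None,
    )
    if current is not None and current["city"] == incoming["city"]:
        return next_key  # Tracked attribute unchanged: nothing to do.
    if current is not None:
        # Expire the existing version instead of overwriting it.
        current["current_flag"] = False
        current["end_date"] = today
    dimension.append({
        "surrogate_key": next_key,
        "natural_key": incoming["natural_key"],
        "city": incoming["city"],
        "start_date": today,
        "end_date": None,
        "current_flag": True,
    })
    return next_key + 1


dim = []
key = apply_scd2(dim, {"natural_key": "C1", "city": "Austin"}, date(2016, 4, 1), 1)
key = apply_scd2(dim, {"natural_key": "C1", "city": "Denver"}, date(2016, 4, 10), key)
key = apply_scd2(dim, {"natural_key": "C1", "city": "Denver"}, date(2016, 4, 11), key)
```

After the three calls the dimension holds two versions of customer C1: the expired Austin row and the current Denver row, each with its own surrogate key.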

Delivering Data - SCD in ODI - Surrogate Keys

Additional audit columns



Delivering Data - Fact Table Builder
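
The core of a fact table build is swapping each natural key in the source row for the matching dimension surrogate key. A minimal Python sketch of that lookup follows. The names `build_fact_rows`, `customer_lookup`, and the `-1` placeholder for unmatched members are assumptions for illustration, not ODI behavior.

```python
UNKNOWN_KEY = -1  # Assumed placeholder key for members missing from the dimension.


def build_fact_rows(source_rows, customer_lookup):
    """Replace the natural customer id on each row with its dimension surrogate key."""
    facts = []
    for row in source_rows:
        facts.append({
            "customer_key": customer_lookup.get(row["customer_id"], UNKNOWN_KEY),
            "amount": row["amount"],
        })
    return facts


# Hypothetical lookup built from the current rows of a customer dimension.
customer_lookup = {"C1": 2, "C2": 3}

source_rows = [
    {"customer_id": "C1", "amount": 50.0},
    {"customer_id": "C9", "amount": 12.5},
]

fact_rows = build_fact_rows(source_rows, customer_lookup)
```

Routing unmatched rows to a placeholder key keeps the load running; the alternative is to reject them or create the dimension row on the fly, which is the job of the late arriving data handler.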



Managing the ETL Environment

Job Scheduler
Backup System
Recovery and Restart System
Version Control System
Version Migration System
Workflow Monitor
Sorting System
Lineage & Dependency Analyzer
Problem Escalation System
Parallelizing / Pipelining System
Security System
Compliance Manager
Metadata Repository Manager



Managing the ETL Environment - Job Scheduler

Create ODI schedule on execution object

- Tied to an agent and context

Limited flexibility

- Custom Fiscal Month end, for example



Managing the ETL Environment - Job Scheduler

Alternative to ODI scheduler - external scheduling tool

- ODI Scenarios and Load Plans can be executed via command line script or web service

./startloadplan.sh LOAD_EDW GLOBAL 6 -AGENT_URL=http://localhost:20910/oraclediagent
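
An external scheduler that can run Python could wrap the same command as sketched below. The argument order simply mirrors the command on this slide; other ODI versions or agent types may need additional parameters, so treat the helper names and defaults here as assumptions to check against your installation.

```python
import subprocess


def build_startloadplan_command(load_plan, context, log_level, agent_url,
                                script="./startloadplan.sh"):
    """Assemble the argument list in the same order as the slide's command."""
    return [script, load_plan, context, str(log_level), f"-AGENT_URL={agent_url}"]


def run_load_plan(command):
    """Run the command and return its exit code (needs an ODI agent installation)."""
    return subprocess.run(command, check=False).returncode


command = build_startloadplan_command(
    "LOAD_EDW", "GLOBAL", 6, "http://localhost:20910/oraclediagent"
)
# exit_code = run_load_plan(command)  # Uncomment where the ODI script is available.
```

Passing the arguments as a list rather than one shell string avoids quoting problems when a load plan or context name contains spaces.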



Managing the ETL Environment - Version Control/Migration

ODI 12.2.1 Lifecycle Management

- Integrated with Subversion
- Deployment Archives for code migration between environments

Managing the ETL Environment - Workflow Monitor

Drilldown from ODI Session to SQL detailed activity report
Obtain real-time and historical agent statistics


Where did we end up?

The Kimball ETL Subsystems will guide your data warehouse program

Oracle Data Integration can help you fully implement the ETL Subsystems

- Extract, Load, Transform with ODI and GoldenGate
- Profile and cleanse data with Enterprise Data Quality



Where did we end up? One version of the truth

Questions?
Websites

- kimballgroup.com
- rittmanmead.com/blog
Contact

- info@rittmanmead.com
- michael.rainey@rittmanmead.com
Twitter

- @rittmanmead
- @mRainey
