Agenda
Introduction
What is a Data Warehouse?
Dimensional Modeling
Overview
Chicago
Minneapolis
Raleigh
Bangalore, India
Practice Areas
Application Development
Business Intelligence
Packaged Solutions
End-to-End BI
Legal Dashboard
Data Warehouse
Visible Visitors
Portals
Dashboards
Web Design
DI + EIM/Quality
E-Commerce
Training
Open-Enrollment
Map Intelligence
On-Site + Custom
Managed Services
Jumpstart/Mentoring
Predictive Analytics
Selected Clients
City of Chicago
Partnerships
Keep it fundamental
Kimball point of view
What, Why and How
Lack history
Complex data structure
Moving target
What is a Data Warehouse?
Ralph Kimball
Bill Inmon
Jose
A data warehouse is NOT:
A product
A language
A project
A data model
A copy of your transactional systems
*Note: There are bundled products that come close to covering many aspects of a data warehouse!
The BI Stack
Back Room
Source Systems: legacy mainframe systems, production databases, transactional systems, subscription data
ETL System: Extract, Clean, Conform, Deliver; ETL Management Services; ETL Data Stores
Front Room
Presentation Server: Data Marts (Stars & Snowflakes), Conformed Dimensions, Conformed Facts
BI Applications: reporting systems, ad hoc systems, dashboards, analytics systems
Metadata
Infrastructure and Security
Dimensional Modeling
Dimensional modeling is a technique that allows you to design a database that meets the goals of a data warehouse.
Steps
Identify Business Process
Inventory
Student Registration
Identify Dimensions
Selection Criteria (where Gender=Female)
Row Headers (College Name, Region, ...)
How do you want to slice the data?
What are the artifacts of your business?
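The two roles of a dimension named above (selection criteria and row headers) can be sketched as a query against a small star schema. This is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made up for the Student Registration example, not part of the original slides.

```python
import sqlite3

# Hypothetical star schema for the "Student Registration" business process.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY, gender TEXT);
CREATE TABLE dim_college (college_key INTEGER PRIMARY KEY, college_name TEXT, region TEXT);
CREATE TABLE fact_registration (
    student_key INTEGER REFERENCES dim_student(student_key),
    college_key INTEGER REFERENCES dim_college(college_key),
    credit_hours INTEGER
);
INSERT INTO dim_student VALUES (1, 'Female'), (2, 'Male'), (3, 'Female');
INSERT INTO dim_college VALUES (10, 'Engineering', 'North'), (20, 'Business', 'South');
INSERT INTO fact_registration VALUES (1, 10, 12), (2, 10, 9), (3, 20, 15), (1, 20, 3);
""")

# Dimension attributes show up in two places:
#   WHERE    -> selection criteria (Gender = Female)
#   GROUP BY -> row headers (College Name, Region)
rows = conn.execute("""
    SELECT c.college_name, c.region, SUM(f.credit_hours) AS total_credit_hours
    FROM fact_registration AS f
    JOIN dim_student AS s ON s.student_key = f.student_key
    JOIN dim_college AS c ON c.college_key = f.college_key
    WHERE s.gender = 'Female'
    GROUP BY c.college_name, c.region
    ORDER BY c.college_name
""").fetchall()

for row in rows:
    print(row)
```

Every filter and every row header comes from a dimension table, while the only thing read from the fact table is the numeric measure being summed.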
Transaction Grain
Accumulating Snapshot Grain
Additive
Semi-Additive
Non-Additive
Fact-less Facts
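The difference between additive and semi-additive facts is easiest to see with numbers. The sketch below assumes a hypothetical daily inventory snapshot; the products, dates, and quantities are invented for illustration.

```python
from collections import defaultdict

# Hypothetical daily inventory snapshot rows:
# (date, product, quantity_on_hand, units_sold)
snapshots = [
    ("2024-01-01", "widget", 100, 5),
    ("2024-01-01", "gadget", 40, 2),
    ("2024-01-02", "widget", 95, 7),
    ("2024-01-02", "gadget", 38, 1),
]

# Additive fact: units_sold can be summed across every dimension, including date.
total_units_sold = sum(row[3] for row in snapshots)

# Semi-additive fact: quantity_on_hand can be summed across products for one date...
on_hand_by_date = defaultdict(int)
for snapshot_date, _product, on_hand, _sold in snapshots:
    on_hand_by_date[snapshot_date] += on_hand

# ...but across dates it has to be averaged (or read at period end), not summed.
average_on_hand = sum(on_hand_by_date.values()) / len(on_hand_by_date)

print(total_units_sold, dict(on_hand_by_date), average_on_hand)
```

A non-additive fact, such as a ratio or a unit price, cannot be summed along any dimension; it is usually recomputed from additive components instead.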
Date Dimension
Special Date Dimension Attributes
In another language
Semester (First Semester, Second Semester, ...)
Canadian Holiday
And so many more!
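A date dimension is normally generated once, with one row per calendar day, rather than loaded from a source system. The sketch below shows how the special attributes listed above could be derived. The semester rule (August to December is the first semester) and the single holiday are assumptions chosen only to keep the example short.

```python
from datetime import date, timedelta

# Weekday names indexed by date.weekday() (Monday == 0).
DAY_NAMES_EN = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
DAY_NAMES_FR = ["lundi", "mardi", "mercredi", "jeudi", "vendredi", "samedi", "dimanche"]


def semester_for(day):
    # Assumed academic calendar: August to December is the first semester.
    return "First Semester" if day.month >= 8 else "Second Semester"


def canadian_holiday_for(day):
    # Only Canada Day (July 1) is flagged in this sketch.
    return "Canada Day" if (day.month, day.day) == (7, 1) else "Not a holiday"


def build_date_dimension(start, end):
    """Return one dimension row (a dict) per calendar day from start to end inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),
            "full_date": current.isoformat(),
            "day_name": DAY_NAMES_EN[current.weekday()],
            "day_name_fr": DAY_NAMES_FR[current.weekday()],  # "in another language"
            "semester": semester_for(current),
            "canadian_holiday": canadian_holiday_for(current),
        })
        current += timedelta(days=1)
    return rows


date_dim = build_date_dimension(date(2024, 6, 30), date(2024, 7, 2))
for row in date_dim:
    print(row)
```

Storing descriptive text such as "Not a holiday" instead of a null keeps the attribute usable as a row header in reports.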
Hmmm... these are very descriptive names.
Type 2:
Type 3:
Preserve a point-in-time history
Type 2
Mini-dimensions
Technique for Rapidly Changing Monster Dimension
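A Type 2 change preserves history by adding a new dimension row with a new surrogate key instead of overwriting the old one. The sketch below is one possible way to apply such a change in memory. The column names (row_start_date, row_end_date, is_current), the far-future end date, and the helper function are assumptions for illustration.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed marker for "still current"


def apply_type2_change(dimension_rows, customer_id, changes, change_date):
    """Expire the current row for customer_id and append a new version of it."""
    current = next(
        (r for r in dimension_rows if r["customer_id"] == customer_id and r["is_current"]),
        None,
    )
    if current is None:
        raise KeyError(f"no current row for customer {customer_id}")

    # The new version copies the old attributes, then applies the changed ones.
    new_row = {**current, **changes}
    new_row["customer_key"] = max(r["customer_key"] for r in dimension_rows) + 1
    new_row["row_start_date"] = change_date
    new_row["row_end_date"] = HIGH_DATE
    new_row["is_current"] = True

    # The old version stays in the table so earlier facts keep their context.
    current["row_end_date"] = change_date
    current["is_current"] = False

    dimension_rows.append(new_row)
    return new_row


customers = [{
    "customer_key": 1, "customer_id": "C-100", "address": "Chicago",
    "row_start_date": date(2020, 1, 1), "row_end_date": HIGH_DATE, "is_current": True,
}]
new_version = apply_type2_change(customers, "C-100", {"address": "Raleigh"}, date(2024, 3, 15))
print(customers)
```

Facts loaded before the change keep pointing at surrogate key 1 (Chicago), and facts loaded afterwards point at key 2 (Raleigh), which is what gives the point-in-time history.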
Customer Dimension: Customer ID (PK), Name, Address, DoB, Date of First Order, Age, Gender, Annual Income, Number of Children, Marital Status
Use mini-dimensions
Customer Dimension: Customer Key (PK), Customer ID, Name, Address, DoB, Date of First Order
Fact Table: Customer Key (FK1), More Foreign Keys, Facts...
Fact Table 2: Customer Key (FK2), Customer Demo Key (FK3), More Foreign Keys, Facts...
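A mini-dimension moves the frequently changing demographic attributes out of the customer dimension into a small table of banded value combinations, and the fact row carries a second key (the Customer Demo Key above) that points at the matching combination. The sketch below assumes hypothetical bands and uses only three of the demographic attributes to stay short.

```python
from itertools import product

# Hypothetical bands as (low inclusive, high exclusive, label).
AGE_BANDS = [(0, 21, "under 21"), (21, 41, "21-40"), (41, 65, "41-64"), (65, float("inf"), "65+")]
INCOME_BANDS = [(0, 50_000, "under 50k"), (50_000, 100_000, "50k-99k"), (100_000, float("inf"), "100k+")]
GENDERS = ["Female", "Male", "Unknown"]


def band(value, bands):
    """Return the label of the band that contains value."""
    for low, high, label in bands:
        if low <= value < high:
            return label
    raise ValueError(f"{value} is outside every band")


# The mini-dimension holds one row per combination of banded values,
# so it stays small no matter how many customers exist.
demo_dimension = {
    combo: key
    for key, combo in enumerate(
        product([b[2] for b in AGE_BANDS], GENDERS, [b[2] for b in INCOME_BANDS]),
        start=1,
    )
}


def customer_demo_key(age, gender, annual_income):
    """Look up the mini-dimension key to store on a fact row."""
    return demo_dimension[(band(age, AGE_BANDS), gender, band(annual_income, INCOME_BANDS))]


print(len(demo_dimension), customer_demo_key(34, "Female", 72_000))
```

When a customer's age or income moves into a different band, new fact rows simply carry a different demo key, and the large customer dimension does not gain a new row.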
Other Dimensions
Rapidly Changing
Dimensions
Mini-dimensions
Degenerate Dimension
Junk Dimension
Outrigger
Degenerate dimension examples:
Transaction Number
Invoice Number
Line Item Number
Ticket Number
HR Fact Table: Employee Key, More FK, HR Fact 1, HR Fact 2
Employee Dimension: Employee Key (PK), Employee Attributes, ......, Emp Skill Key (FK1)
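Fact table grain comparison
Time period represented
Transaction Grain: point in time
Periodic Snapshot Grain: regular, predictable intervals
Accumulating Snapshot Grain: indeterminate time span, typically short-lived
Fact table loading
Transaction Grain: insert
Periodic Snapshot Grain: insert
Fact row updates
Transaction Grain: not revisited
Periodic Snapshot Grain: not revisited
Accumulating Snapshot Grain: revisited whenever activity
Date dimensions
Transaction Grain: transaction date
Facts
Accumulating Snapshot Grain: performance over finite lifetime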
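The contrast between a transaction grain (insert, not revisited) and an accumulating snapshot grain (revisited whenever activity occurs) can be sketched with an order moving through its lifetime. The milestone names, column names, and the record_event helper below are assumptions for illustration.

```python
from datetime import date

MILESTONES = ("ordered", "shipped", "delivered")  # hypothetical order pipeline

transaction_facts = []      # transaction grain: one row per event, insert only
accumulating_snapshot = {}  # accumulating snapshot grain: one row per order


def record_event(order_id, milestone, event_date):
    if milestone not in MILESTONES:
        raise ValueError(f"unknown milestone: {milestone}")

    # Transaction grain: append a new row; existing rows are never revisited.
    transaction_facts.append(
        {"order_id": order_id, "milestone": milestone, "event_date": event_date}
    )

    # Accumulating snapshot grain: revisit the single row for this order.
    row = accumulating_snapshot.setdefault(
        order_id,
        {"ordered_date": None, "shipped_date": None, "delivered_date": None,
         "days_to_deliver": None},
    )
    row[f"{milestone}_date"] = event_date
    if row["ordered_date"] and row["delivered_date"]:
        # Fact measuring performance over the order's finite lifetime.
        row["days_to_deliver"] = (row["delivered_date"] - row["ordered_date"]).days


record_event("SO-1", "ordered", date(2024, 5, 1))
record_event("SO-1", "shipped", date(2024, 5, 3))
record_event("SO-1", "delivered", date(2024, 5, 8))
print(len(transaction_facts), accumulating_snapshot["SO-1"])
```

Three events produce three transaction rows but only one accumulating snapshot row, whose date columns fill in as each milestone is reached.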
Normalizing a dimension table: OLTP modeler tendency
Outriggers
A dimension table is referenced in another dimension (e.g. hire date example)
Bridges
Many-to-many relationships not resolved in fact tables
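A bridge table handles a many-to-many relationship that the fact table itself does not resolve, for example an employee who has several skills. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_employee (employee_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_skill (skill_key INTEGER PRIMARY KEY, skill_name TEXT);
-- Bridge: one row per (employee, skill) pair.
CREATE TABLE bridge_employee_skill (
    employee_key INTEGER REFERENCES dim_employee(employee_key),
    skill_key INTEGER REFERENCES dim_skill(skill_key),
    PRIMARY KEY (employee_key, skill_key)
);
CREATE TABLE fact_hr (employee_key INTEGER, hours_worked INTEGER);
INSERT INTO dim_employee VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO dim_skill VALUES (1, 'SQL'), (2, 'ETL');
INSERT INTO bridge_employee_skill VALUES (1, 1), (1, 2), (2, 1);
INSERT INTO fact_hr VALUES (1, 40), (2, 30);
""")

# The fact table stays at one row per employee; skills are reached through the bridge.
hours_by_skill = conn.execute("""
    SELECT s.skill_name, SUM(f.hours_worked) AS hours_worked
    FROM fact_hr AS f
    JOIN bridge_employee_skill AS b ON b.employee_key = f.employee_key
    JOIN dim_skill AS s ON s.skill_key = b.skill_key
    GROUP BY s.skill_name
    ORDER BY s.skill_name
""").fetchall()

print(hours_by_skill)
```

Because Ana has two skills, her 40 hours appear under both of them, so the per-skill totals add up to more than the 70 hours actually worked. Reports that go through a bridge need to account for that double counting.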
Snowflaking
What is Snowflaking?
Normalizing in a star schema
Should be avoided
Adds complexity to presentation layer
Common myths
Dimensional models presuppose the business questions and therefore are inflexible
Dimensional models are departmental
Bringing a new data source into a dimensional data warehouse breaks existing schemas and requires new fact tables
Continuously balance requirements and realities to deliver a DW/BI solution that's accepted by business users and that supports their decision making
Thank You
Future Webinars
The ETL Process
Stars in Motion
Columnar and In-memory databases
Modeling Business Process
Retail Sales
Inventory
CRM
HR