
DATA WAREHOUSE

DEFINITION
Data Warehouse: A collection of corporate information derived directly from operational systems and some external data sources. Its specific purpose is to support business decisions, not business operations.

THE PURPOSE OF DATA WAREHOUSING


Realize the value of data
    Data / information is an asset
    Methods to realize the value (Reporting, Analysis, etc.)

Make better decisions
    Turn data into information
    Create competitive advantage
    Methods to support the decision-making process (EIS, DSS, etc.)

DATA WAREHOUSE COMPONENTS


Staging Area
    A preparatory repository where transaction data can be transformed for use in the data warehouse

Data Mart
    Traditional dimensionally modeled set of dimension and fact tables
    Per Kimball, a data warehouse is the union of a set of data marts

Operational Data Store (ODS)
    Modeled to support near real-time reporting needs
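The relationship between a staging area and a dimensionally modeled data mart can be sketched in a few lines. This is only an illustration under stated assumptions: the table names (staging_sales, dim_product, fact_sales), the columns, and the helper functions are hypothetical and do not come from the slides. It uses Python's built-in sqlite3 module with an in-memory database.

```python
import sqlite3

def build_sales_mart(rows):
    """Load raw (product, sale_date, amount) rows through a staging table
    into a tiny star schema and return the open connection.
    All table and column names here are hypothetical."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE staging_sales (product TEXT, sale_date TEXT, amount REAL);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                                  product_name TEXT UNIQUE);
        CREATE TABLE fact_sales (product_key INTEGER REFERENCES dim_product(product_key),
                                 sale_date TEXT, amount REAL);
    """)
    # Staging area: land the raw rows, then transform them (trim, normalize case).
    con.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", rows)
    con.execute("UPDATE staging_sales SET product = upper(trim(product))")
    # Dimension table: one row per distinct product.
    con.execute("INSERT INTO dim_product (product_name) "
                "SELECT DISTINCT product FROM staging_sales")
    # Fact table: measures keyed by the dimension's surrogate key.
    con.execute("""
        INSERT INTO fact_sales
        SELECT d.product_key, s.sale_date, s.amount
        FROM staging_sales s JOIN dim_product d ON d.product_name = s.product
    """)
    con.commit()
    return con

def sales_by_product(con):
    """Typical data mart query: total the fact measure per dimension member."""
    return con.execute("""
        SELECT d.product_name, SUM(f.amount)
        FROM fact_sales f JOIN dim_product d USING (product_key)
        GROUP BY d.product_name ORDER BY d.product_name
    """).fetchall()
```

Calling build_sales_mart with a handful of raw rows and then sales_by_product on the result returns one total per product, with differently spelled source values such as " widget " and "Widget" merged by the staging transformation.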

DATA WAREHOUSE FUNCTIONALITY

[Diagram: data warehouse functionality]
    Sources: Relational Databases, ERP Systems, Purchased Data, Legacy Data
    Processing: Extraction, Cleansing, Optimized Loader
    Warehouse: Data Warehouse Engine, Metadata Repository
    Access: Analyze, Query

EVOLUTION OF DATA WAREHOUSE ARCHITECTURE

Top-Down Architecture
Bottom-Up Architecture
Enterprise Data Mart Architecture
Data Stage/Data Mart Architecture

VERY LARGE DATABASES


WAREHOUSES ARE VERY LARGE DATABASES

Terabytes -- 10^12 bytes: Wal-Mart -- 24 Terabytes
Petabytes -- 10^15 bytes: Geographic Information Systems
Exabytes -- 10^18 bytes: National Medical Records
Zettabytes -- 10^21 bytes: Weather images
Yottabytes -- 10^24 bytes: Intelligence Agency Videos
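The powers of ten in this scale can be checked with a little arithmetic. The sketch below assumes decimal (power-of-1000) units and the standard SI prefix names, in which 10^24 bytes is a yottabyte; the function names are hypothetical.

```python
# Decimal storage units, each 1000 times the previous one.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes", "yottabytes"]

def to_bytes(value, unit):
    """Convert a value expressed in a decimal unit to a number of bytes."""
    return value * 1000 ** UNITS.index(unit)

def largest_unit(num_bytes):
    """Return (value, unit) using the largest unit that keeps value >= 1."""
    index = 0
    value = num_bytes
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return value, UNITS[index]
```

For example, to_bytes(24, "terabytes") gives 24 * 10**12, the size quoted for the Wal-Mart warehouse on the slide.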

COMPLEXITIES OF CREATING A DATA WAREHOUSE


Incomplete errors
    Missing Fields
    Records or Fields That, by Design, Are Not Being Recorded

Incorrect errors
    Wrong Calculations, Aggregations
    Duplicate Records
    Wrong Information Entered into Source System
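The two error classes above can be detected mechanically for the simpler cases. The following is a minimal sketch, assuming source records arrive as Python dictionaries with numeric quantity, unit_price, and total fields; those field names and the function name are hypothetical. Wrong information typed into a source system generally cannot be caught this way and is not attempted here.

```python
from collections import Counter

def find_quality_errors(records, required_fields):
    """Classify simple source-data problems as incomplete or incorrect."""
    errors = {"incomplete": [], "incorrect": []}
    # Incomplete: a required field is absent or empty.
    for i, rec in enumerate(records):
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            errors["incomplete"].append((i, "missing fields", missing))
    # Incorrect: the same record appears more than once.
    counts = Counter(tuple(sorted(rec.items())) for rec in records)
    for key, n in counts.items():
        if n > 1:
            errors["incorrect"].append((dict(key), "duplicate record", n))
    # Incorrect: a stored total disagrees with quantity * unit_price.
    for i, rec in enumerate(records):
        qty, price, total = rec.get("quantity"), rec.get("unit_price"), rec.get("total")
        if not all(isinstance(v, (int, float)) for v in (qty, price, total)):
            continue  # cannot recompute without all three numbers
        if abs(qty * price - total) > 1e-9:
            errors["incorrect"].append((i, "wrong calculation", qty * price))
    return errors
```

Each reported item carries either the record's position in the input or, for duplicates, the record itself, so the result can be fed back to the owners of the source system.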

SUCCESS & FUTURE OF DATA WAREHOUSE


The Data Warehouse has successfully supported the increased needs of the State over the past eight years. The need for growth continues, however, as the desire for more integrated data increases.

The Data Warehouse has software and tools in place to provide the functionality needed to support new enterprise Data Warehouse projects.

The future capabilities of the Data Warehouse can be expanded to include other programs and agencies.

DATA WAREHOUSE PITFALLS


You are going to spend much time extracting, cleaning, and loading data
You are going to find problems with systems feeding the data warehouse
You will find the need to store/validate data not being captured/validated by any existing system
Large-scale data warehousing can become an exercise in data homogenizing

DATA WAREHOUSE PITFALLS


The time it takes to load the warehouse will expand to fill the time in the available window... and then some
You are building a HIGH maintenance system
You will fail if you concentrate on resource optimization to the neglect of project, data, and customer management issues and an understanding of what adds value to the customer

BEST PRACTICES

Complete requirements and design
Prototyping is key to business understanding
Utilize proper aggregations and detailed data
Training is an on-going process
Build data integrity checks into your system
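One common form of built-in integrity check is a reconciliation between what was extracted and what was loaded. The sketch below is one possible shape for such a check, not a prescribed method: the function name, and the idea of passing a key field and a measure field by name, are assumptions made for illustration.

```python
def reconcile_load(source_rows, warehouse_rows, key, measure):
    """Compare a source extract with the loaded rows; return a list of problems.

    Rows are dictionaries; `key` names the identifying field and
    `measure` names a numeric field used as a control total.
    """
    problems = []
    # Check 1: row counts must match.
    if len(source_rows) != len(warehouse_rows):
        problems.append(
            f"row count: source={len(source_rows)} warehouse={len(warehouse_rows)}")
    # Check 2: control totals of the measure must match.
    src_total = sum(r[measure] for r in source_rows)
    wh_total = sum(r[measure] for r in warehouse_rows)
    if abs(src_total - wh_total) > 1e-9:
        problems.append(f"control total: source={src_total} warehouse={wh_total}")
    # Check 3: every source key must be present in the warehouse.
    missing = sorted({r[key] for r in source_rows} - {r[key] for r in warehouse_rows})
    if missing:
        problems.append(f"missing keys: {missing}")
    return problems
```

An empty list means the load reconciled; running the check at the end of every load keeps integrity problems from surfacing first in a report.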

Top-Down Architecture (diagram)

Bottom-Up Architecture (diagram)

Enterprise Data Mart Architecture (diagram)

Data Stage/Data Mart Architecture (diagram)