Anwendungssoftware
Overview
Data Warehouse Architecture Anwendersoftware
Architecture
[Figure: data warehouse system architecture. Components, from bottom to top: data sources, monitor, extraction, data staging area, load, data warehouse, data access, end user. The data warehouse manager and the metadata manager with its metadata repository sit alongside. Arrows distinguish data flow from control flow.]
Data Quality
• consistency: Are there contradictions in data and/or metadata?
• correctness: Do data and metadata provide an exact picture of reality?
• completeness: Are there missing attributes or values?
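Two of these criteria can be checked mechanically. The following is a minimal sketch, assuming rows are plain dictionaries; the function names and the sample customer data are hypothetical and not part of the slides. Correctness is left out because it needs a comparison against reality, which no check on the data alone can provide.

```python
from collections import defaultdict

def check_completeness(rows, required):
    """Return (row index, attribute) pairs where a required value is missing."""
    return [(i, attr) for i, row in enumerate(rows)
            for attr in required if row.get(attr) in (None, "")]

def check_consistency(rows, key, attr):
    """Return keys that map to more than one distinct value of attr."""
    seen = defaultdict(set)
    for row in rows:
        if row.get(attr) not in (None, ""):
            seen[row[key]].add(row[attr])
    return sorted(k for k, values in seen.items() if len(values) > 1)

# Hypothetical sample data: customer 1 appears with two different cities,
# and customer 2 has no city at all.
customers = [
    {"id": 1, "name": "Meier", "city": "Stuttgart"},
    {"id": 2, "name": "Schmidt", "city": None},
    {"id": 1, "name": "Meier", "city": "Ulm"},
]

missing = check_completeness(customers, ["name", "city"])
conflicts = check_consistency(customers, "id", "city")
print(missing)    # rows with missing required attributes
print(conflicts)  # keys with contradictory values
```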
Monitoring
Extraction Monitor
Extraction
• Transfer data from data source into the data staging area.
• The extracted subset of the data sources and the extraction schedule depend on the kind of analysis to be supported.
• The method depends on the monitoring strategy used:
  - Read data from a file written by triggers.
  - Read data from replication tables.
  - Select data based on the timestamp.
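The timestamp strategy can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: it uses in-memory SQLite databases in place of a real source system and staging area, and the table names `orders` and `stg_orders`, the column `modified_at`, and the function `extract_since` are all hypothetical.

```python
import sqlite3

def extract_since(source, staging, last_run):
    """Copy rows modified after last_run into staging; return the new watermark."""
    rows = source.execute(
        "SELECT id, amount, modified_at FROM orders "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_run,),
    ).fetchall()
    staging.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)
    staging.commit()
    # Keep the old watermark when nothing changed since the last run.
    return max((r[2] for r in rows), default=last_run)

# Hypothetical source system with ISO-8601 timestamps (these sort as strings).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL, modified_at TEXT)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 10.0, "2024-01-01T08:00:00"),
    (2, 25.0, "2024-01-02T12:00:00"),
    (3, 40.0, "2024-01-03T09:30:00"),
])

# Hypothetical staging area.
staging = sqlite3.connect(":memory:")
staging.execute("CREATE TABLE stg_orders (id INTEGER, amount REAL, modified_at TEXT)")

watermark = extract_since(source, staging, "2024-01-02T00:00:00")
staged = staging.execute("SELECT id FROM stg_orders ORDER BY id").fetchall()
```

The returned watermark would be stored and passed as `last_run` to the next extraction, so each run only picks up rows changed since the previous one.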
Transformation
Load
• Transfer data from the data staging area into the data warehouse.
• Data in the warehouse is rarely replaced; the history of values/changes is stored instead.
• Mainly based on the bulk load tools of the DBMS.
• Offline vs. online load.
• Parallel load may be required.
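The point about storing history instead of replacing values can be illustrated with an append-only load. This is a minimal sketch, assuming an in-memory SQLite database as the warehouse; the table `customer_history`, its columns, and the function `load_batch` are hypothetical, and `executemany` merely stands in for a DBMS bulk load tool.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.execute(
    "CREATE TABLE customer_history (customer_id INTEGER, city TEXT, valid_from TEXT)"
)

def load_batch(conn, staged_rows, load_date):
    """Append changed rows as new versions; existing rows are never overwritten."""
    new_versions = []
    for customer_id, city in staged_rows:
        current = conn.execute(
            "SELECT city FROM customer_history WHERE customer_id = ? "
            "ORDER BY valid_from DESC LIMIT 1",
            (customer_id,),
        ).fetchone()
        # Only store a new version if the value is new or has changed.
        if current is None or current[0] != city:
            new_versions.append((customer_id, city, load_date))
    conn.executemany("INSERT INTO customer_history VALUES (?, ?, ?)", new_versions)
    conn.commit()
    return len(new_versions)

first = load_batch(dw, [(1, "Stuttgart"), (2, "Ulm")], "2024-01-01")
second = load_batch(dw, [(1, "Berlin"), (2, "Ulm")], "2024-01-02")
history = dw.execute(
    "SELECT city FROM customer_history WHERE customer_id = 1 ORDER BY valid_from"
).fetchall()
```

After the second load, customer 1 has two rows, one per city, so earlier values remain available for analysis.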
[Figure: basic elements of the data warehouse, shown as four stages connected by labeled flows]
Source Systems
  flow to the next stage: extract
Data Staging Area
• Storage: flat files; RDBMS; other
• Processing: clean; prune; combine; remove duplicates; household; standardize; conform dimensions; store awaiting replication; archive; export to data marts
• No user query services
  flow to the next stage: populate, replicate, recover
"The Data Warehouse": Presentation Servers
• Data Mart #1: OLAP query services; dimensional; subject oriented; locally implemented; user group driven; may store atomic data; may be frequently refreshed; conforms to DW Bus
• Data Mart #2
• Data Mart #3
• DW Bus between the data marts: conformed dimensions, conformed facts
  flow to the next stage: feed
End User Data Access
• Ad hoc query tools
• Report writers
• End user applications
• Models: forecasting; scoring; allocating; data mining; other downstream systems; other parameters; special UI
[Figure: architecture with data marts. Labels, from top to bottom: clients (end users with data access), data marts, transformation, data warehouse.]
Data Marts
dependent data marts (tiered architecture)
• Central data warehouse (DW) is built first
• Extracts of the data warehouse are provided as data marts (materialized views)
• Establish ETL process for DW only
• Consistent analysis on DW and DM
independent data marts (federated architecture)
• Several data marts (DM) are built first
• Data marts are integrated by means of a second transformation step
• Establish ETL process for each DM and the central DW
• Inconsistent analysis is possible
• Virtual data warehouse possible (federated architecture)
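A dependent data mart, provided as an extract of the central warehouse, might look like the sketch below. SQLite has no materialized views, so the example emulates one by rebuilding a table from a query; the table names `sales` and `mart_sales_by_region` and the function `refresh_mart` are hypothetical.

```python
import sqlite3

# Hypothetical central data warehouse with one fact table.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
dw.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("south", "A", 10.0),
    ("south", "B", 5.0),
    ("north", "A", 7.0),
])

def refresh_mart(conn):
    """Rebuild the data mart as an aggregated extract of the warehouse."""
    conn.execute("DROP TABLE IF EXISTS mart_sales_by_region")
    conn.execute(
        "CREATE TABLE mart_sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    conn.commit()

refresh_mart(dw)
mart = dw.execute(
    "SELECT region, total FROM mart_sales_by_region ORDER BY region"
).fetchall()
```

Because the mart is derived only from warehouse data, analyses on the mart and on the warehouse stay consistent as long as the mart is refreshed after each warehouse load.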
Federation layer: e.g. uniform access language, uniform access schema, uniform metadata set
Foundation layer: data sources, each with its local applications
[Figure: federated database server. A client connects through the SQL API to the federated database server, which holds catalog data and reaches each back-end data source and its data through a wrapper.]
Federated DB (query issued by the client; result: Ename & Dname)
    SELECT Ename, Dname
    FROM EMP E, DEPT D
    WHERE E.Dno = D.Dno
    AND E.floor = 2
    AND D.Mgr = 'Cooke'
Oracle (subquery sent to the source holding EMP)
    SELECT Ename, Dno
    FROM EMP
    WHERE Floor = 2
    ORDER BY Dno
DB2 (subquery sent to the source holding Dept)
    SELECT Dname, Dno
    FROM Dept
    WHERE Mgr = 'Cooke'
    ORDER BY Dno
• Knowing what the data source can do is a good idea!
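Both subqueries return their rows ordered by Dno, which suggests that the federated server can combine them with a merge join. The sketch below shows that step under this assumption; the slides do not state which join method is used, and the function `merge_join` and the sample rows are hypothetical.

```python
def merge_join(emp_rows, dept_rows):
    """Join (ename, dno) rows with (dname, dno) rows; both inputs sorted by dno."""
    result = []
    i = j = 0
    while i < len(emp_rows) and j < len(dept_rows):
        e_dno = emp_rows[i][1]
        d_dno = dept_rows[j][1]
        if e_dno < d_dno:
            i += 1
        elif e_dno > d_dno:
            j += 1
        else:
            # Find the end of the run of department rows with this dno.
            j_end = j
            while j_end < len(dept_rows) and dept_rows[j_end][1] == e_dno:
                j_end += 1
            # Pair every employee of this dno with every matching department row.
            while i < len(emp_rows) and emp_rows[i][1] == e_dno:
                for k in range(j, j_end):
                    result.append((emp_rows[i][0], dept_rows[k][0]))
                i += 1
            j = j_end
    return result

# Hypothetical results of the two pushed-down subqueries, already sorted by Dno.
from_oracle = [("Ann", 1), ("Bob", 2), ("Cy", 2), ("Di", 4)]   # Ename, Dno
from_db2 = [("Sales", 2), ("Ops", 3), ("R&D", 4)]              # Dname, Dno

joined = merge_join(from_oracle, from_db2)
```

If a source could not sort or filter, the federated server would have to fetch more rows and do that work itself, which is why knowing the capabilities of each data source matters.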
[Figure: classes of operational data store (ODS), class 0 to class 4, each relating applications, the ODS, and the DWH. Surviving annotations: "... operational environment"; "... in an immediate manner (range of one to two seconds)"; "... environment are stored, integrated, and forwarded to the ODS"; "ODS is fed aggregated analytical data from the data warehouse".]
DW Design Costs
• disk storage: 30%
• processor costs: 20%
• integration and transformation: 15%
• DBMS: 10%
• network costs: 10%
• access/analysis tools: 6%
• metadata design: 5%
• activity monitor: 2%
• data monitor: 2%
Recurring DW costs
• DW refreshment: 55%
• servicing data mart requests for data: 21%
• monitoring of activity and data: 7%
• end-user training: 6%
• metadata management: 3%
• periodic verification of the conformance to the enterprise data model: 2%
• summary table usage analysis: 2%
• occasional reorganization of data: 1%
• data archiving: 1%
• capacity planning: 1%
• security administration: 1%
Metadata Repository
Metadata Management
centralized - decentralized - federated
[Figure: metadata managers and local tools, connected by data flow and control flow.]
Summary
• Basic components:
  - Data Staging Area: Extraction, Transformation, Load
  - Data Warehouse Database
  - Data Warehouse Manager
  - Metadata Repositories and Metadata Manager
• Data Marts: Distributed Data Warehouse
• Data Warehouse vs. Federated Information Systems
• Metadata is important to:
  - Support development and operation of a data warehouse
  - Provide information for data warehouse users
• Metadata standards are important for interchanging metadata between warehouse tools, warehouse platforms, and warehouse metadata repositories.