
Data Warehouse

Database, Data Warehouse & Data Mining

Database systems: OLTP systems, efficient for transaction processing
Not so efficient for ad hoc (summarized) queries

Data Warehouse (summarized data)

Data Mining of transactions & summarized data
BDM 2017 2
Data Warehouse
Data collected from multiple sources

Summarized, integrated around major subjects
- Customer, Item, Supplier

Historical perspective

Stored under a single schema

Residing in a single site: DATA WAREHOUSE

Multidimensional

Refreshing

ETL: Extract, Transform & Load

Data Warehouse
A data warehouse is a
subject-oriented,
integrated,
time-variant and
nonvolatile collection of data
in support of management's decision-making
process. - W. H. Inmon
DW supports decision making, ad hoc queries,
analytical reporting (OLAP); kept separate from the
database (OLTP)
Data Warehouse
Summarized (ad hoc) queries

Itemwise sales

Itemwise Regionwise sales

Branchwise sales for Kolkata region

..
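
The summarized queries listed above can be sketched as group-by aggregations over transaction rows. This is only a minimal Python illustration: the rows, the item and branch names, the amounts, and the `summarize` helper are all hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical transaction rows: (item, region, branch, amount).
SALES = [
    ("TV", "Kolkata", "Salt Lake", 120),
    ("TV", "Kolkata", "Park Street", 80),
    ("Phone", "Kolkata", "Salt Lake", 50),
    ("TV", "Delhi", "Connaught Place", 200),
    ("Phone", "Delhi", "Connaught Place", 70),
]
FIELDS = ("item", "region", "branch", "amount")


def summarize(rows, group_by, where=None):
    """Sum `amount` for each combination of the `group_by` fields."""
    totals = defaultdict(int)
    for row in rows:
        record = dict(zip(FIELDS, row))
        if where is not None and not where(record):
            continue  # row filtered out by the optional predicate
        key = tuple(record[field] for field in group_by)
        totals[key] += record["amount"]
    return dict(totals)


# Itemwise sales
itemwise = summarize(SALES, ["item"])
# Itemwise, regionwise sales
item_region = summarize(SALES, ["item", "region"])
# Branchwise sales for the Kolkata region
kolkata_branches = summarize(
    SALES, ["branch"], where=lambda r: r["region"] == "Kolkata"
)
```

Each query scans the detailed rows again, which is why a warehouse typically stores such summaries rather than recomputing them against the OLTP database.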
Multidimensional
Multidimensional data model
Dimensions: perspectives or entities
SONY World: time, item, branch, location
Dimensions (time, item, branch, location) and
fact tables (sales figures)
Dimension tables contain data on dimensions
Fact tables contain data on facts (numerical)
Viewed as anything from spreadsheets to data cubes
Data Mart: Departmental DW
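
To make the split between dimension tables and fact tables concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names (`dim_time`, `dim_item`, `fact_sales`), the key columns, and the inserted figures are assumptions for illustration; only the measure name `dollars_sold` and the dimension names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive data about each dimension.
    CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT);
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT);
    -- Fact table: a numerical measure plus one key per dimension.
    CREATE TABLE fact_sales (
        time_key INTEGER REFERENCES dim_time(time_key),
        item_key INTEGER REFERENCES dim_item(item_key),
        dollars_sold INTEGER
    );
    INSERT INTO dim_time VALUES (1, 'Q1'), (2, 'Q2');
    INSERT INTO dim_item VALUES (1, 'computer'), (2, 'phone');
    -- Made-up facts, one row per (quarter, item) combination.
    INSERT INTO fact_sales VALUES (1, 1, 100), (1, 2, 10), (2, 1, 120), (2, 2, 15);
""")

# Join the fact table to its dimensions to get a readable 2-D view.
rows = conn.execute("""
    SELECT t.quarter, i.item_name, SUM(f.dollars_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    JOIN dim_item AS i ON i.item_key = f.item_key
    GROUP BY t.quarter, i.item_name
    ORDER BY t.quarter, i.item_name
""").fetchall()
```

The fact table stays narrow (keys and numbers only), while the descriptive attributes live once in each dimension table.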
Sales data: Quarterwise Sales

SONY World   All products
Q1            9455
Q2           10287
Q3           10820
Q4           11593
Sales data: Quarterwise Itemwise Sales

SONY World   Home ent.   comp.   phone   sec.
Q1           3364        3421    184     2486
Q2           3647        3635    188     2817
Q3           3818        3790    192     3020
Q4           4176        3985    214     3218
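
The two tables are consistent with each other: summing the four item columns of each quarter in the itemwise table gives the all-products quarterly figures shown before it. The short Python check below uses the numbers from the slides; the variable names are illustrative only.

```python
# Quarterwise, itemwise sales as given in the table above.
ITEMS = ("home ent.", "comp.", "phone", "sec.")
SALES = {
    "Q1": (3364, 3421, 184, 2486),
    "Q2": (3647, 3635, 188, 2817),
    "Q3": (3818, 3790, 192, 3020),
    "Q4": (4176, 3985, 214, 3218),
}

# Aggregate away the item dimension: one total per quarter.
quarter_totals = {quarter: sum(values) for quarter, values in SALES.items()}

# Aggregate away the time dimension instead: one total per item.
item_totals = {
    item: sum(column) for item, column in zip(ITEMS, zip(*SALES.values()))
}
```

Here `quarter_totals` comes out as 9455, 10287, 10820 and 11593 for Q1 to Q4, matching the quarterwise table.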
Quarterwise Itemwise Locationwise Sales

Location = Vancouver
Quarterwise Itemwise Locationwise Sales

Sales data according to dimensions time, item, and location. Measure displayed is dollars_sold (in thousands).
Quarterwise Itemwise Locationwise Sales

A multi-dimensional (3-D Data Cube) View
Quarterwise Itemwise Locationwise Supplierwise Sales

A multi-dimensional (4-D Data Cube) View
Features          Database (OLTP)                                 Data Warehouse (OLAP)
Characteristics   Day-to-day operations, transaction processing   Informational analysis, decision support
Users             Clerk, database professionals, DBA              Knowledge workers (manager, executive, analyst)
DB design         ER-based, application-oriented                  Star/Snowflake, subject-oriented
Data              Current, up-to-date, read/write,                Historical, mostly read,
                  100 MB to GB, normalized                        100 GB to TB, denormalized
Summarization     Primitive, detailed                             Highly summarized, consolidated
View              Detailed, tables                                Multidimensional
No. of records    Tens                                            Millions
No. of users      Thousands                                       Hundreds
Priority          Availability                                    Flexibility
OLAP

Online Analytical Processing (OLAP) tools in a data warehouse play a significant role: summarization, consolidation and aggregation, as well as viewing data from different angles

OLAP operations

Programmed functionalities for:

Roll-up: performs aggregation on a data cube
Drill-down: details out the data
Rotate: rotates the data axes for visualization
Slice: performs selection on one dimension
Dice: performs selection on two or more dimensions
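
The operations above can be sketched on a tiny cube held as a Python dictionary. This is a hedged illustration, not how an OLAP server is implemented: the cell values, the city and item names, and the helper functions (`roll_up`, `slice_`, `dice`, `rotate`) are all hypothetical. Drill-down is the reverse of roll-up and simply means going back to the more detailed cells, so it needs the finer-grained cube to have been kept.

```python
from collections import defaultdict

DIMS = ("time", "item", "location")
# Hypothetical cube cells: (quarter, item, city) -> sales.
CUBE = {
    ("Q1", "phone", "Vancouver"): 14,
    ("Q1", "phone", "Toronto"): 20,
    ("Q1", "computer", "Vancouver"): 80,
    ("Q2", "phone", "Vancouver"): 16,
    ("Q2", "computer", "Toronto"): 90,
}


def roll_up(cube, dims, drop):
    """Aggregate away the `drop` dimension by summing over it."""
    idx = dims.index(drop)
    out = defaultdict(int)
    for key, value in cube.items():
        out[key[:idx] + key[idx + 1:]] += value
    return dict(out)


def dice(cube, dims, allowed):
    """Keep cells whose coordinates lie in the allowed set of each named dimension."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[dims.index(d)] in values for d, values in allowed.items())
    }


def slice_(cube, dims, dim, value):
    """Select a single value on one dimension."""
    return dice(cube, dims, {dim: {value}})


def rotate(cube, dims, new_order):
    """Reorder the axes; the cell values themselves do not change."""
    perm = [dims.index(d) for d in new_order]
    return {tuple(key[i] for i in perm): value for key, value in cube.items()}


by_time_item = roll_up(CUBE, DIMS, "location")
vancouver = slice_(CUBE, DIMS, "location", "Vancouver")
q1_phones = dice(CUBE, DIMS, {"time": {"Q1"}, "item": {"phone"}})
item_first = rotate(CUBE, DIMS, ("item", "time", "location"))
```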
OLAP operations: Roll-up, Drill-down

OLAP operations: Slice, Rotate

OLAP operations: Dice

STAR Schema
SNOWFLAKE Schema
FACT CONSTELLATION (GALAXY) Schema
PanaSony
Identify the entity tables and relationship tables. Is there any possibility to reduce the number of tables?
What levels of granularity do you think will be essential for building the data warehouse? Justify your answer.
What will be the dimensions and facts for your base cuboid?
Based on this base cuboid, compare the total sales in January '17 and February '17. What OLAP operations are required on the base cuboid to produce this report?
Do you observe any variation in sales between these two months?
Given the current level of data, can you further study the source of such a drop? Please provide your analysis and any OLAP operation that you might have to perform.
Identify any concept hierarchy that exists.

Concept Hierarchies

From a set of low-level concepts to higher-level, more general concepts

Location: city < province_or_state < country < all

Partial or total order
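
A concept hierarchy for location can be sketched as a child-to-parent mapping that is climbed level by level. In the Python example below, only the level names (city, province_or_state, country, all) come from the slide; the specific cities, the sales figures, and the `generalize` and `roll_up` helpers are assumptions made for illustration.

```python
# Hypothetical fragment of the location hierarchy: child -> parent.
PARENT = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "Chicago": "Illinois",
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "Illinois": "USA",
    "Canada": "all",
    "USA": "all",
}
LEVELS = ("city", "province_or_state", "country", "all")


def generalize(city, level):
    """Climb from a city to its ancestor at the requested level."""
    value = city
    for _ in range(LEVELS.index(level)):
        value = PARENT[value]
    return value


def roll_up(sales_by_city, level):
    """Aggregate city-level sales to a higher level of the hierarchy."""
    totals = {}
    for city, amount in sales_by_city.items():
        key = generalize(city, level)
        totals[key] = totals.get(key, 0) + amount
    return totals


city_sales = {"Vancouver": 10, "Toronto": 20, "Chicago": 5}  # made-up figures
by_country = roll_up(city_sales, "country")
grand_total = roll_up(city_sales, "all")
```

Because every city has exactly one parent at each level, this hierarchy is a total order, which is what makes a simple level-by-level climb sufficient.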

Dimension location

location: total order; time: partial order

Concept Hierarchies

Many concept hierarchies are implicit within the database schema: street < city < state < country

DW systems allow the user to define the concept hierarchy

Time concept hierarchy may be predefined

Three-tier Architecture

BI software

Implemented using
Enhanced spreadsheet functionality

ROLAP: Relational OLAP, extended relational DBMS (SQL extensions, DSS server of MicroStrategy)

MOLAP: Multidimensional OLAP (array-based views of data, COGNOS)

HOLAP: Hybrid OLAP (SQL 2005)


TYPES OF DW
Enterprise DW: enterprise-wide

Data mart: departmental

Virtual warehouse: a set of views on a relational database
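
The idea of a virtual warehouse, a set of views over an operational relational database, can be illustrated with Python's built-in `sqlite3` module. The `orders` table, its columns, the `itemwise_sales` view, and the data are all hypothetical; the point is only that the summary is computed from the live tables on demand and nothing is copied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical operational (OLTP) table.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        item TEXT,
        region TEXT,
        amount INTEGER
    );
    INSERT INTO orders (item, region, amount) VALUES
        ('phone', 'East', 10), ('phone', 'West', 15), ('computer', 'East', 100);
    -- The virtual warehouse: a summary view evaluated at query time.
    CREATE VIEW itemwise_sales AS
        SELECT item, SUM(amount) AS total_sales FROM orders GROUP BY item;
""")

before = conn.execute("SELECT * FROM itemwise_sales ORDER BY item").fetchall()

# A new transaction is visible through the view immediately.
conn.execute("INSERT INTO orders (item, region, amount) VALUES ('phone', 'East', 5)")
after = conn.execute("SELECT * FROM itemwise_sales ORDER BY item").fetchall()
```

The trade-off is that every query against the view re-runs the aggregation on the operational database, which is the load a physically separate warehouse is meant to avoid.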

Data warehousing Applications

Continental Airlines
FA bank
Whirlpool
Verizon
Walmart
3M

Managerial Considerations
Issues in DW implementations

Cost & Challenge


Intangible benefit
Success rate
General-purpose or application-specific
Opposition to such projects
Training of employees
