
Introduction to BI, Data warehouse

Day 1

2008 MindTree Consulting

Introduction to BI, Data warehouse

BI concepts
Data warehouse concepts
Introduction to BIW
Advantages of BIW over other data warehouse tools
Concept of star schema architecture
Introduction to Administrator Workbench (all buttons in AWB)

Introduction
Success in a competitive business environment needs more than just good information. The ability to derive meaningful, timely and readily accessible insights from that information is the need of the hour.

Insights into the business are the key to defining an effective strategy, aligning business operations to that strategy, and improving the efficiency and effectiveness of execution.

Is your enterprise set up to win?


Business needs
The ability to take action based on complete, timely, relevant insights
A fast, accurate way to pinpoint root causes
The ability to track and manage the alignment of strategic objectives and business activities
Easy access to information
Support for legal compliance

Why are today's insights not enough?

75% of business users do not use analytic applications.
Analytics are disconnected from business processes.
Business processes are disconnected from corporate strategy.

90% of organisations fail to execute their strategies (Fortune magazine)

Introducing SAP BI


SAP BI's Approach

1. Establish one foundation to run your business - providing integrated, consistent data and metrics

Today


With SAP BI - Establish a Foundation

SAP BI's Approach

1. Establish one foundation to run your business - providing integrated, consistent data and metrics
2. Bring decision making to the business process

Bring decision making to the business process


SAP BI's Approach

1. Establish one foundation to run your business - providing integrated, consistent data and metrics
2. Bring decision making to the business process
3. Align execution with strategy across organizations to achieve corporate goals

Align execution to strategy


Deliver actionable insights


SAP BI's Approach

1. Establish one foundation to run your business - providing integrated, consistent data and metrics
2. Bring decision making to the business process
3. Align execution with strategy across organizations to achieve corporate goals
4. Profit from immediate action on insights within the business process, with clear options and an explanation of potential results

Profit from timely action


Improve business process


What does this mean for the future?

Business benefits
Better-informed decisions with faster corrective actions
Better business performance as a result of strategy-guided actions
Faster innovation
Faster response to changing business conditions
Increased competitive advantage

Business Intelligence
Defined as: Business Intelligence is a technology based on customer- and profit-oriented models that reduces operating costs and increases profitability by improving productivity, sales and service, and that helps decisions to be made quickly. Business Intelligence models are based on multidimensional analysis capabilities.

BI solutions differ from, and add value to, standard operational systems (OLTP systems: Online Transaction Processing systems) in three ways:

By providing the ability to extract, cleanse and aggregate data from multiple operational systems into a separate data mart or data warehouse

By storing data, often in a star or multidimensional cube format, to enable rapid delivery of summarized information and drill-down to detail

By delivering personalized, relevant informational views and querying, reporting and analysis capabilities for gaining deeper business understanding and making better decisions faster

To implement BI, the following technologies are used:

Data Marts / Data Warehouses - A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process. To facilitate data retrieval for multidimensional analytical processing, a star schema is very often used.

Extraction, Transformation and Loading (ETL) - Data is extracted from multiple source systems. The data is cleansed and transformed into a consistent format and structure. The cleansed data is loaded into the data warehouse.

On-Line Analytical Processing (OLAP) and Data Mining - Analysis tools are applied against the data warehouse to analyze and mine the data.
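
The extract, cleanse/transform and load steps named above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any particular ETL tool: the two source systems (`crm_rows`, `pos_rows`), their field names and the in-memory list standing in for the warehouse are all hypothetical.

```python
from datetime import datetime

# Hypothetical extracts from two operational systems with different conventions.
crm_rows = [{"cust": " Alice ", "amount": "120.50", "date": "2008-01-15"}]
pos_rows = [{"customer_name": "BOB", "amt": 80, "sold_on": "15/01/2008"}]


def extract():
    """Extract: pull raw rows from every source, tagging each with its origin."""
    for row in crm_rows:
        yield "crm", row
    for row in pos_rows:
        yield "pos", row


def transform(source, row):
    """Transform: cleanse values and map both layouts onto one target structure."""
    if source == "crm":
        name, amount = row["cust"], row["amount"]
        parsed = datetime.strptime(row["date"], "%Y-%m-%d")
    else:
        name, amount = row["customer_name"], row["amt"]
        parsed = datetime.strptime(row["sold_on"], "%d/%m/%Y")
    return {
        "customer": name.strip().title(),
        "amount": float(amount),
        "sale_date": parsed.date().isoformat(),
    }


def load(rows, warehouse):
    """Load: append the cleansed rows to the (in-memory) warehouse table."""
    warehouse.extend(rows)
    return warehouse


warehouse_table = load([transform(s, r) for s, r in extract()], [])
for record in warehouse_table:
    print(record)
```

Both sources end up in the same structure with the same date format, which is the "consistent format and structure" the ETL step is meant to guarantee.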


Key differentiators

SAP BI supports key business processes.
SAP BI reflects SAP's industry-leading business process expertise.
SAP BI provides complete visibility across the entire value chain.
SAP BI is delivered on the most robust and scalable technology platform.
SAP BI delivers the most relevant set of predefined content.
SAP BI is easy to deploy and extend.

Key features of SAP BI


Introduction to Data Warehouse


What is a data warehouse?

The term "Data Warehouse" was coined by Bill Inmon in 1990.

Bill Inmon's definition:
A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.

What is a data warehouse?

Subject-Oriented
Data that gives information about a particular subject instead of about a company's ongoing operations.

Operational: Plan, Despatch, Invoices, Orders
Data Warehouse: Customers, Products, Regions, Time

What is a data warehouse?

Integrated
Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.

Attribute | Appl A | Appl B | Appl C | Data Warehouse
Gender encoding | m,f | 1,0 | male,female | m,f
Balance format | balance dec fixed (13,2) | balance pic 9(9)V99 | balance pic S9(7)V99 comp-3 | balance dec fixed (13,2)
Field name | bal-on-hand | current-balance | cash-on-hand | Current balance
Date format | date (julian) | date (yymmdd) | date (absolute) | date (julian)
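
The gender-encoding case on this slide can be illustrated with a small code-normalization sketch. It assumes the warehouse keeps the `m`/`f` convention, as the slide suggests; which of `1`/`0` stands for male in Appl B is not stated in the slide, so that mapping is a labelled assumption, and the function name is hypothetical.

```python
# Each source application encodes gender differently; the warehouse keeps one
# convention ("m"/"f"). Which of 1/0 means male in Appl B is an assumption here.
GENDER_MAPS = {
    "A": {"m": "m", "f": "f"},
    "B": {"1": "m", "0": "f"},
    "C": {"male": "m", "female": "f"},
}


def integrate_gender(application, raw_value):
    """Translate a source-specific gender code into the warehouse encoding."""
    key = str(raw_value).strip().lower()
    try:
        return GENDER_MAPS[application][key]
    except KeyError:
        raise ValueError(f"unknown gender code {raw_value!r} from Appl {application}")


records = [("A", "f"), ("B", 1), ("C", "Male")]
integrated = [integrate_gender(app, value) for app, value in records]
print(integrated)
```

Unknown codes raise an error rather than passing through, so inconsistent source values are caught before they reach the warehouse.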

What is a data warehouse?

Time Variant
All data in the data warehouse is identified with a particular time period.

Operational: current-value data; time horizon: 60-90 days; key may not have an element of time
Data Warehouse: snapshot data; time horizon: 5-10 years; key has an element of time; the data warehouse stores historical data

What is a data warehouse?


Ralph Kimball's definition:

A copy of transaction data specifically structured for query and analysis.

Basically - snapshots of business events at regular intervals

How is a DW different from OLTP?


Aspect | OLTP | DW
Orientation | Business event / transaction oriented | Decision oriented
Supports | Operations ("making the wheels turn") | Decision support ("watching the wheels turn")
View | Narrow, looking within | Broad, looking across
Usage patterns | Stable, predictable | Variable, unpredictable
Time | Limited time frame | Historical data
Data | Detailed only | Detailed / summarized and derived

How is a DW different from OLTP?


Aspect | OLTP | DW
Typical operation | Insert/update intensive (a "twinkling database") | Read intensive (a "quiet data store")
Age of data | Current | Historical
Data required / queried | Minimal | Extensive
Table structure | Normalized, minimum redundancy | De-normalized, controlled redundancy
Scope of data | Internal | Internal + external
Data | Reacts to events | Can anticipate events

How is a DW different from OLTP?


To summarize:

OLTP systems are used to run a business and are based on the ER model.

The data warehouse helps to optimize the business and is based on OLAP (the dimensional model).

What is OLAP
OLAP stands for Online Analytical Processing.
OLAP tools help users perform quick and easy multidimensional analysis to get insights into what's happening.

Supports the following features:
Slice and dice along the dimensions
Drill up and drill down through hierarchies

Types of OLAP:

ROLAP (Relational OLAP)
Data always comes from relational tables

MOLAP (Multidimensional OLAP)
Data always comes from multidimensional cubes

HOLAP (Hybrid OLAP)
Data comes from both relational tables and multidimensional cubes
Aggregated data comes from multidimensional cubes
Detailed data comes from relational tables

What is OLAP
Slice and Dice

Relational Model:
Record | Product | Region | Month | Sales
#001 | Film | East | Dec | 240
#002 | Lenses | West | Jan | 250
#003 | Cameras | Central | Feb | 690
#004 | Film | West | Mar | 425

Multidimensional Model:
[Figure: sales cube with a Region axis, sliced into a Product Manager View, a Regional Manager View, a Financial Manager View and an Ad Hoc View]
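
The different manager views can be reproduced from the four sales records on this slide. The following is a minimal sketch in plain Python rather than a real OLAP engine; the helper names `slice_by` and `total_by` are hypothetical.

```python
from collections import defaultdict

# The four sales records from the relational model on this slide.
records = [
    {"product": "Film", "region": "East", "month": "Dec", "sales": 240},
    {"product": "Lenses", "region": "West", "month": "Jan", "sales": 250},
    {"product": "Cameras", "region": "Central", "month": "Feb", "sales": 690},
    {"product": "Film", "region": "West", "month": "Mar", "sales": 425},
]


def slice_by(rows, **fixed):
    """Fixing one dimension is a slice; fixing several at once is a dice."""
    return [r for r in rows if all(r[dim] == val for dim, val in fixed.items())]


def total_by(rows, dimension):
    """Total the sales measure along one dimension."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[dimension]] += r["sales"]
    return dict(totals)


product_view = total_by(records, "product")   # what a product manager might see
region_view = total_by(records, "region")     # what a regional manager might see
west_by_product = total_by(slice_by(records, region="West"), "product")
print(product_view)
print(region_view)
print(west_by_product)
```

The same records answer every question; only the dimension that is fixed or totalled changes from view to view.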


EDW Dimensional Model

Originated in the mid-seventies by A.C. Nielsen
Made popular by Ralph Kimball
The dimensional model divides the world into:
Measurements: Sales, Cost, Stock, Yield
Context (dimensions) surrounding these measurements: Customer, Time, Service, Region

Two variants of the dimensional model:
Star Schema
Snowflake Schema

EDW Dimensional Model

Typical OLTP Model

[Figure: entity-relationship diagram with the entities Payment, Payment Denial, Product, Mode, Location, Property, Agent, Product Line, Booking, Business Unit, Product Group, Contract, Franchisee, Customer, Contact, Sales rep and Division]

Data is SCATTERED across all of these entities!

EDW Dimensional Model - Star Schema

Fact table: Bookings
Keys: Site, Rate Plan, Channel, Date
Measures: # of new bookings, # of booking nights, # of rooms for bookings, # of guaranteed bookings

Dimension: Channel
Channel key, Channel desc, Original channel, Source id, Source desc, Source type

Dimension: Site
Site key, Site desc, Chain code, Site status, Rooms available, Mgmt company, Marketing area

Dimension: Time
Date key, Week, Month, Quarter, Year, Weekend flag

Dimension: Rate Plan
Rate plan key, Rate plan desc, Rate plan type, Brand

Site QA Score
Lodge Score, Rest Score

The figure labels its parts as Dimension, Fact, Hierarchy and Measure.

Dimensions - Definition
Contain the descriptors of the business by which analysts view data.
Dimensions set the context for asking questions about the facts in the fact table.
Dimensions SPEAK BUSINESS LANGUAGE!
Dimensions have multiple levels.
A combination of levels participates in a hierarchy.
Hierarchies are logical structures that use ordered levels as a means of organizing data.
A hierarchy can be used to define data aggregation.
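
How a hierarchy defines aggregation can be shown with a small roll-up sketch. The day, month and quarter levels, the sample dates and the sales figures below are all hypothetical; the point is only that each level's totals are derived from the level beneath it.

```python
from collections import defaultdict

# Hypothetical time hierarchy: each day rolls up to a month, each month to a quarter.
DAY_TO_MONTH = {
    "2008-01-15": "2008-01",
    "2008-01-20": "2008-01",
    "2008-02-03": "2008-02",
    "2008-04-09": "2008-04",
}
MONTH_TO_QUARTER = {"2008-01": "2008-Q1", "2008-02": "2008-Q1", "2008-04": "2008-Q2"}

daily_sales = {"2008-01-15": 100, "2008-01-20": 50, "2008-02-03": 70, "2008-04-09": 30}


def roll_up(values, parent_of):
    """Aggregate one level up the hierarchy by summing the children of each parent."""
    totals = defaultdict(int)
    for member, value in values.items():
        totals[parent_of[member]] += value
    return dict(totals)


monthly_sales = roll_up(daily_sales, DAY_TO_MONTH)
quarterly_sales = roll_up(monthly_sales, MONTH_TO_QUARTER)
print(monthly_sales)
print(quarterly_sales)
```

Drilling down is the reverse navigation: moving from a quarter's total back to the months and days that make it up.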



Dimension - Characteristics

The tables contain all the textual descriptors of the business.
Dimensions supply the context in which a measurement was made.
They correspond to the entities by which you want to analyze the business.
Many columns
Fewer rows
Are linked to a fact table through a foreign key reference to their primary key

Dimensions - Examples
Franchisee
Consumer
Property
Car
Channel
Channel-Travel Agent
Site
Rate plans
Brand
Business unit
Entity
Entity group

Fact - Definition

Fact tables contain the measures related to a process or event.
Measures are analyzed by the various dimensions contained in the dimension tables.
Each row in a fact table corresponds to a measurement.
Fact tables have a few columns and lots of rows.

Fact - Characteristics

They:
Are usually the largest tables
Are usually appended to
Can grow quickly
Can contain either detail or summarized data
Are joined to dimension tables through foreign keys
Are always sparse: no rows are stored to represent that nothing happened

Fact - Examples

Sum insured
Amount Approved
Claims ratio (derived fact)
Premium Paid

EDW Dimensional Model - Advantages

PRODUCT: All Products > Category > Brand > Product
DEPOT: All Depots > Region > State > Depot
CUSTOMER: All Customers > Region > Area > Territory > Customer
PERIOD: All Periods > Year > Quarter > Month > Day > Time point

Measures: Actual sales, Sales forecast, Returns, Complaints

4 * 4 * 5 * 6 = 480 reports
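
The figure of 480 is the product of the number of levels in each of the four hierarchies: one report per combination of levels. A short check, with the assignment of the intermediate levels to each dimension being a reading of the slide's diagram rather than something it states outright:

```python
from math import prod

# Levels per dimension as read from the slide; the exact assignment of the
# intermediate levels (e.g. State under Depot) is an interpretation.
HIERARCHIES = {
    "Product": ["All Products", "Category", "Brand", "Product"],
    "Depot": ["All Depots", "Region", "State", "Depot"],
    "Customer": ["All Customers", "Region", "Area", "Territory", "Customer"],
    "Period": ["All Periods", "Year", "Quarter", "Month", "Day", "Time point"],
}

# One report per choice of level in every dimension.
report_count = prod(len(levels) for levels in HIERARCHIES.values())
print(report_count)  # 480
```

Only the level counts (4, 4, 5 and 6) matter for the total, so the result holds even if individual levels belong to a different dimension than assumed here.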


EDW Dimensional Model - Advantages

Using OLTP Database (5 joins, intensive computation)

    SELECT Channel_Desc,
           Year = DATEPART(year, oht.book_Date),
           Month = DATEPART(month, oht.book_Date),
           TotRevenue = sum((1 + Tax_Rate) * (days_booked * olt.rate_per_night))
    FROM   book_Header_Table oht, book_Line_Table olt, Property_Table st,
           Product_Table pt, SubChannel_Table sct, Channel_Table ct
    WHERE  oht.book_Number = olt.book_Number
    and    oht.Property_Number = st.Property_Number
    and    olt.product_code = pt.product_code
    and    pt.product_code = sct.product_code
    and    sct.subChannel_code = ct.subChannel_code
    and    Channel_Desc = 'Agent'
    and    DATEPART(year, oht.book_Date) IN (1992, 1994)
    and    DATEPART(month, oht.book_Date) = 2
    GROUP BY Channel_Desc, DATEPART(year, oht.book_Date), DATEPART(month, oht.book_Date)

Using Star Schema (2 joins, less intensive)

    SELECT Channel_Desc, year, month, 'TotSales' = sum(Total_Sale)
    FROM   Arrivals st, Channel_Dimension cd, Time_Dimension td
    WHERE  Channel_Desc = 'Agent'
    and    month = 2
    and    year IN (1992, 1994)
    and    st.Channel_Key = cd.Channel_Key
    and    st.Time_Key = td.Time_Key
    GROUP BY Channel_Desc, year, month
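
The star-schema query can be tried out end to end with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the table names come from the slide, but the column layout, the `Channel_Key` join column and every sample row are made up for illustration, and the query is written in SQLite's dialect rather than the dialect used on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Channel_Dimension (Channel_Key INTEGER PRIMARY KEY, Channel_Desc TEXT);
CREATE TABLE Time_Dimension (Time_Key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE Arrivals (Channel_Key INTEGER, Time_Key INTEGER, Total_Sale REAL);

INSERT INTO Channel_Dimension VALUES (1, 'Agent'), (2, 'Web');
INSERT INTO Time_Dimension VALUES (10, 1992, 2), (11, 1993, 2), (12, 1994, 2), (13, 1994, 3);
INSERT INTO Arrivals VALUES
    (1, 10, 100.0), (1, 10, 50.0),   -- Agent, Feb 1992
    (1, 11, 70.0),                   -- Agent, Feb 1993 (filtered out)
    (1, 12, 30.0),                   -- Agent, Feb 1994
    (1, 13, 999.0),                  -- Agent, Mar 1994 (filtered out)
    (2, 10, 500.0);                  -- Web, Feb 1992 (filtered out)
""")

# Two joins: the fact table to each of the two dimensions it is filtered by.
query = """
SELECT cd.Channel_Desc, td.year, td.month, SUM(f.Total_Sale) AS TotSales
FROM Arrivals f
JOIN Channel_Dimension cd ON f.Channel_Key = cd.Channel_Key
JOIN Time_Dimension td ON f.Time_Key = td.Time_Key
WHERE cd.Channel_Desc = 'Agent' AND td.month = 2 AND td.year IN (1992, 1994)
GROUP BY cd.Channel_Desc, td.year, td.month
ORDER BY td.year
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Because year and month are stored as plain columns in the time dimension, the query needs no date functions, which is part of why the slide calls it less intensive.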


Typical Stages in the evolution of a DW


Stage 1: Reporting
Biggest challenge: data integration + data quality

Example
Retail: What products does the customer buy?
HealthCare: Which areas contribute the most claims?

Stage 2: Analysis
Less focus on what happened
More focus on why it happened
Iterative refinement of questions (Q&A map) to support chain-of-thought analysis and questions

Example
Why did expenses increase by 10% compared to last quarter?

Typical Stages in the evolution of a DW


Stage 3: Prediction
The organization is now well entrenched in the whys
Build predictive models: regression (linear / non-linear), decision trees, neural networks

Stage 4: Operational Insight
Stages 1-3 focus on strategic decision making
Process reengineering

Example
Retail: Inventory management with JIT
HealthCare: Generating preventive campaigns well ahead of time

Stage 5: Activate
A sense-and-respond layer sits on top of BI

Example
Order raw material if inventory falls below a threshold value
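
The Stage 5 example amounts to a simple rule evaluated against current stock. A minimal sketch, assuming a hypothetical `StockLevel` record with a per-material threshold and reorder quantity (none of these names come from the slides):

```python
from dataclasses import dataclass


@dataclass
class StockLevel:
    material: str
    on_hand: int
    reorder_threshold: int
    reorder_quantity: int


def sense_and_respond(stock_levels):
    """Return an order line (material, quantity) for every material below its threshold."""
    orders = []
    for item in stock_levels:
        if item.on_hand < item.reorder_threshold:
            orders.append((item.material, item.reorder_quantity))
    return orders


stock = [
    StockLevel("steel", on_hand=40, reorder_threshold=100, reorder_quantity=500),
    StockLevel("paint", on_hand=300, reorder_threshold=50, reorder_quantity=200),
]
orders = sense_and_respond(stock)
print(orders)
```

In a real deployment the rule would be triggered by the warehouse or BI layer and would hand the order to an operational system; here it only returns the order lines.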

Typical architecture of a DW


Building Blocks - Component 1: Source Systems

Operational systems that run the business:
SAP
Siebel
JD Edwards
BAAN
Point-of-sale applications
Oracle applications
Home-grown systems
Excel spreadsheets

Optimized for inserts and updates
Very little data redundancy, by design

Building Blocks - Component 2: ETL

ETL stands for Extract, Transform, Load.

The action of:
Extracting information from one or more sources
Transforming it midstream: aggregation, business rules, code normalization / cleansing
Loading it into a central database

Building Blocks - Component 3: Staging Area


Refers to both the storage area and a set of ETL processes
Raw data is massaged and made ready for loading using ETL tools, scripts, SQL, etc.: rules checking, reformatting / restructuring, etc.
Should NOT be exposed to business users
May use flat files, relational tables or both
Is the black box that converts raw input data into finished data for the presentation layer

Building Blocks - Component 4: Storage Layer


This is where the data is organized, stored and made available for querying by users and tools
This is the data warehouse for the business users
Usually based on the dimensional model

Building Blocks - Component 5: Reporting Layer


Comprises tools and applications that present the data to end users for decision making.

Could consist of:
Pre-canned reports (optionally web-enabled)
Ad-hoc query tools
Data mining applications
Budget planning and forecasting applications, etc.

Building Blocks - Component 6: The Metadata Layer

The glue that binds the data warehouse components
An encyclopedia of the data warehouse
Crucial for maintaining the warehouse
One of the hardest things to manage in a warehouse!

Other Building Blocks


Calculation Engines
For deriving new measures from the base measures
The DW is the ideal place for calculating Key Performance Indicators

Extractors
For distributing data to other applications

MRDR (Master Reference Data Repository)
Also termed MDM (Master Data Management)
Manages master data in the enterprise
Best place to implement conformed dimensions

Tools required for a DW solution?

To extract & transform data: Base SAS, DataStage, Informatica, DTS, PL/SQL, Pro*C

To present data: Business Objects, Cognos, MicroStrategy, OLAP Services, Express, Hyperion, Brio, SAS-EIS

To store data: Oracle, SQL Server, DB2

Data cleansing: Trillium, I-Spheres based solution

SAP Business Information Warehouse - BW


What is SAP BW?

A data warehouse system with optimized structures for reporting and analysis
OLAP engine and tools
Integrated metadata repository
Data extraction and staging
Preconfigured support for data sources from the R/3 system
Business Application Programming Interfaces (BAPIs) for non-SAP systems
Automated data warehouse management
Administrator Workbench for controlling and managing

[Figure: layered view, top to bottom: Reporting and Analysis (Reports), Data Access, Data Warehouse, Data Extraction and Transformation, Data Sources]

SAP BW Architecture


SAP BW Components
InfoObjects
DataSources
Persistent Staging Area (PSA)
ODS objects
InfoCubes
Master data
InfoProviders
Query and query views
InfoSpokes and open hub destinations
Business Content

Star schema
The star schema is easy for software to interpret. It is the most popular way of implementing a multidimensional model in a relational database.

Star schema
The key elements of a star schema are:
A central fact table with dimension tables shooting off from it
Fact tables typically store atomic and aggregate transaction information, such as quantitative amounts of goods sold. These values are called facts.
Facts are numeric values, normally of an additive nature.
Fact tables contain foreign keys to the most atomic dimension attribute of each dimension table.
Foreign keys tie the fact table rows to specific rows in each of the associated dimension tables.
The points of the star are the dimension tables.
Dimension tables store both attributes about the data stored in the fact table and textual data.
Dimension tables are de-normalized.
The most atomic dimension attributes in the dimensions define the granularity of the information, i.e. the number of records in the fact table.

Extended Star Schema


Attributes of the dimension tables are called characteristics. The metadata objects for these are InfoObjects.

Hierarchies of characteristics or attributes may be stored in separate hierarchy tables. These hierarchies are therefore called external hierarchies.

Textual descriptions of a characteristic are stored in a separate text table. This allows the system to run in several languages at the same time.

Dependent attributes of a characteristic can be stored in a separate table, called the master data table of the characteristic.

Extended Star Schema - Continued


Solution-Independent Schema
The shared master data tables are valid for use with any InfoCube or ODS object.
These master tables are the glue that binds the data warehouse.

Solution-Dependent Schema
The InfoCube describes the process-oriented part of the solution.
An InfoCube consists of one fact table and several dimension tables.

Pointer or translation tables called SID (Surrogate ID) tables are used in the BW schema to link the solution-independent master tables of the BW schema to InfoCubes.
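
The chain from fact table to shared master data can be pictured with plain dictionaries. This is a heavily simplified, hypothetical model and not the actual BW table layout: every table name, key and value below is invented for illustration.

```python
# Hypothetical, heavily simplified model of the extended star schema.
# Master data and texts live outside the InfoCube and are shared by all InfoCubes.
customer_master = {"C100": {"city": "Bangalore"}, "C200": {"city": "Chennai"}}
customer_texts = {("C100", "EN"): "Acme Ltd", ("C200", "EN"): "Globex"}

# SID table: characteristic value -> surrogate ID (plus the reverse lookup).
customer_sid = {"C100": 1, "C200": 2}
sid_to_customer = {sid: key for key, sid in customer_sid.items()}

# InfoCube: the dimension table stores SIDs, the fact table stores dimension IDs.
customer_dimension = {10: {"customer_sid": 1}, 11: {"customer_sid": 2}}
fact_table = [
    {"dim_customer": 10, "revenue": 500},
    {"dim_customer": 11, "revenue": 300},
    {"dim_customer": 10, "revenue": 200},
]


def revenue_by_city(facts):
    """Follow fact -> dimension -> SID -> master data to report on an attribute."""
    totals = {}
    for row in facts:
        sid = customer_dimension[row["dim_customer"]]["customer_sid"]
        customer = sid_to_customer[sid]
        city = customer_master[customer]["city"]
        totals[city] = totals.get(city, 0) + row["revenue"]
    return totals


result = revenue_by_city(fact_table)
first_customer_name = customer_texts[(sid_to_customer[1], "EN")]
print(result)
print(first_customer_name)
```

Because the InfoCube holds only SIDs, the master data and text tables can be reused by any other InfoCube that contains the same characteristic.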

Comparison


Extended Star Schema - Key Elements

Attributes located in the dimensions are called characteristics.
Attributes located in a master data table of a characteristic are called attributes of the characteristic.
SID tables (pointer tables) provide the technical link to the master data (attribute, text and hierarchy) tables that are outside the dimensions of a star schema.
Dimension tables are built using the combination of numeric SID values of each characteristic in the dimension.
External information (attributes of the characteristics, text descriptions and external hierarchies) is stored separately (shared) and linked to the InfoCubes.
Historical relationships as well as the current state of the data can be maintained and reported on.
Multiple languages are supported for texts / descriptions.

Administrator Workbench
The Data Warehousing Workbench (DWB) is the central tool for performing the tasks in the data warehousing process.

It provides data modeling functions as well as functions for control, monitoring and maintenance of all processes in SAP NetWeaver BI having to do with data procurement, data retention and data processing.

Functional areas of the Data Warehousing Workbench:
Modeling
Administration
Transport Connection
Documents
Business Content
Translation
Metadata Repository

Modeling

Used to create and maintain (meta) objects relevant to the data staging process in SAP BW.
Objects are displayed in a tree structure, in which the objects are ordered according to hierarchical criteria.
To access the Modeling function area, choose transaction RSA1.

Administration

This functional area is used to display the navigation area and, if applicable, the corresponding object tree in the left-hand area of the screen when applications are called. This means that you can use the tree to start a new application from the one you are in.

Transport Connection

Used to collect newly created or changed objects in the SAP BW system. You can use the Change and Transport Organizer (CTO) to transport them into other SAP BW systems.

Documents

The Documents function area enables you to insert, search in, and create links for one or more documents in various formats, versions and languages for SAP BW objects.

BI Content

BI Content provides pre-configured information models based on metadata.
It provides users in an enterprise with a selection of information they can use to fulfill their tasks.
To access the BI Content function area, choose transaction RSORBCT.

Translation

In the Translation function area, you can translate short and long texts belonging to SAP BW objects.

Metadata Repository

All SAP BW meta objects and the corresponding links to each other are managed centrally.
In addition, metadata can be exchanged between different systems, HTML pages can be exported, and graphics for the objects can be displayed.
To access the Metadata Repository function area, choose transaction RSOR.

Thank You


Imagination Action Joy

2008 MindTree Limited