
Webcast - searchsap.com
September 10, 2002

ERP Centric Data Mining and Knowledge Discovery
Naeem Hashmi
Chief Technology Officer
Information Frameworks
e-mail: nhashmi@infoframeworks.com
Web: http://infoframeworks.com

Copyrights 2002 ERP Data Mining & Knowledge Discovery webcast searchsap.com Sept 10, 2002
About the Speaker
Naeem Hashmi
• Founder and CTO of Information Frameworks; an author, speaker and world-renowned expert on emerging Information Architectures, Integration and Business Intelligence Technologies.
• Author of the best-selling book
– SAP Business Information Warehouse for SAP, 2000
• Technical Editor
– SAP BW Certification Guide, authored by Catherine Roze, 2002
• Contributing Author, SAP BW Handbook, 2002
• Member of Intelligent ERP magazine's board of editors and a frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute.
• 25+ years of experience in emerging Information Technology research, development, and management; Information Architectures; Enterprise Application Integration; e-business; ERP applications; Data Warehousing; Data Mining; CRM; Internet, Object and Client/Server Technologies; and Strategic Consulting.
• Email: nhashmi@infoframeworks.com URL: http://infoframeworks.com Tel: 603-432-4550

Agenda

• Data Mining and Knowledge Discovery Basics
• ERP Vendors and Data Mining Solutions
• Data Mining in SAP Business Information Warehouse
• Pros and Cons of ERP-centric Data Mining
• Q&A

What is Data Mining and Knowledge Discovery?

• Data Mining is a tactical process that uses mathematical algorithms to sift through large data stores to extract data patterns, models and rules.
• Knowledge Discovery is the process of identifying and understanding potentially useful hidden anomalies, trends and patterns. Data mining is an integral part of the knowledge discovery process.

Data Mining and Statistics?

• Data mining sounds very similar to regression analysis, but its approach and purpose are quite different:
– Statistical methods test a hypothesis against a data set
– Data mining starts from the data set to construct a hypothesis

Data Mining - Present State

Application Domains
Business 317 73%
Life Sciences 85 20%
Other 31 7%

Source: http://www.kdnuggets.com/polls/

Data Mining Methodologies

CRISP-DM: CRoss Industry Standard Process for Data Mining
http://www.crisp-dm.org/

SIX STEPS PROCESS
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Sources: http://www.kdnuggets.com/polls/, http://www.crisp-dm.org/
Data Mining Process
http://www.crisp-dm.org/
CRoss Industry Standard Process (CRISP) for Data Mining

Data Warehouse

Data Understanding and Data Preparation: initially will take about 60% to 80% of the data mining project time

1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Source: http://www.crisp-dm.org/
Data Mining - Tools and Data Formats
Domains
Business 317 73%
Life Sciences 85 20%
Other 31 7%

Data formats
57% Flat files
37% Proprietary
27% DBMS

Source: http://www.kdnuggets.com/polls/

Data Mining Technology
TECHNIQUES
• Visualization: Use human pattern recognition capabilities
• Statistics: Applying statistical techniques to predict
• Decision Trees: Building scripts based on historic data
• Association Rules (Rule Induction): Reasoning from specific facts to reach a hypothesis
• Clustering: Refers to finding and visualizing groups of facts that were not previously known
• Neural Networks: Learning how to solve problems based on examples
• K-Nearest Neighbor: Classification by looking at similar data
• Genetic Algorithms: Survival of the fittest …

USAGE
• Discover
• Understand
• Predict
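
To make one of the techniques above concrete, here is a minimal sketch of K-Nearest Neighbor classification ("classification by looking at similar data"). The customer attributes, the sample values and the function name `knn_classify` are assumptions made up for illustration; they do not come from the webcast or from any vendor tool.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the training rows by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical customers: (tenure in years, orders per year) -> outcome.
customers = [
    ((1.0, 2.0), "left"),
    ((1.5, 1.0), "left"),
    ((2.0, 3.0), "left"),
    ((6.0, 9.0), "stayed"),
    ((7.0, 8.0), "stayed"),
    ((8.0, 10.0), "stayed"),
]

prediction = knn_classify(customers, (1.8, 2.5), k=3)
print(prediction)
```

In practice the attributes would need to be scaled to comparable ranges before distances are computed; the sketch skips that step to stay short.
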
Data Mining Models

Two Types of Data Mining Models

Prediction Models: Prediction and Classification
• Regression algorithms
– Neural Networks, Rule Induction
– Predict Numerical Outcome
• Classification algorithm
– CHAID, discriminant analysis
– Predict Symbolic Outcome

Descriptive Models: Grouping & Associations
• Clustering/Grouping algorithms
– K-means, Kohonen, Factor Analysis
• Association algorithms
– Apriori, Sequence

Traditional DM vendors

• SPSS Clementine
• SAS Enterprise Miner
• IBM Intelligent Miner
• Salford CART/MARTS
• …more

Database Vendors – DM within the Products

• Data Mining Engine in Oracle 9i
– Oracle 9i consists of key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
• IBM Intelligent Miner into DB2
• TeraMiner into Teradata
• Microsoft – SQL Server 2000

• When you implement DM functionality in a DBMS, you are tied to a specific database engine, which limits flexibility in a typical enterprise application landscape: a heterogeneous environment.

Data Mining Standards

• PMML - Predictive Model Markup Language
• OLE DB for Data Mining
• Java Data Mining API
• Other data exchange standards for analytics that need Data Mining extensions
– CWM: Common Warehouse Metamodel
– XML/A: XML for Analytics
– CPEX: Customer Profile EXchange
– xCIL: Extensible Customer Information Language

Enterprise Applications Landscape

• ERP Solutions
– Oracle
– PeopleSoft
– SAP
• Siebel
• ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
– Customer Relationship Management
– Business Intelligence
– Enterprise Portals

• Oracle Business Intelligence Solution
• PeopleSoft Enterprise Performance Management
• SAP Business Information Warehouse

Oracle Business Intelligence Solution
Business Processes (Pre-Built Portlets)
• Response to Lead (27)
• Lead to Quote (56)
• Quote to Order (15)
• Order to Cash (34)
• Demand to Build (40)
• Procure to Pay (28)
• Revenue to Compensation (29)
• Expiration to Renewal (33)
• Issue to Resolution (51)
• HR Family (43)

Oracle 9i DM Integration
• Oracle Marketing Online for Campaign Management
• Oracle9iAS Personalization
• iStore
• more to come…

Oracle 9i Business Intelligence
• Oracle9iDS Warehouse Builder
• Oracle9iAS Discoverer
• Oracle9iDS Reports
• Oracle9iAS Portal
• Oracle9iAS Clickstream Intelligence
• Oracle9iAS Personalization
• Oracle9i Data Mining
• Oracle9iDS Business Intelligence Beans

Source: Oracle

PeopleSoft Business Intelligence Solution
Enterprise Performance Management (EPM)
• Customer Profitability
• Finance
• Workforce Analytics
• Supply Chain Management Process
• Workforce Rewards
• Enrollment Management
• Retail Merchandise
• Project Analysis
• Student Administration
• Balanced Scorecard
• Employee Scorecard
• Customer Scorecard
• Vendor Scorecard
• CRM Prospect Analysis
• CRM Marketing Analysis
• CRM Sales Effectiveness
• CRM Service Effectiveness

Data mining capabilities: no word on PeopleSoft data mining tools or technologies for predictive analytics, whether home grown, acquired or third-party products. No response from PeopleSoft contacts.

Courtesy: eBusiness Advantage Inc. (www.ebizadvan.com)
SAP Business Intelligence Solution
Business Information Warehouse
SAP CRM
• Campaign management
• Opportunity analytics
• Customer behavior modeling

SAP Markets, Procurement
• Bidding, pattern-based offering
• Activity reporting, service analytics

SAP SCM, SAP Portals
• Demand planning
• E-commerce analysis
• Spend optimization
• SCOR KPIs

SAP Financials, Human Capital Management
• SEM
• Balanced scorecard
• Planning
• Economic profit
• Benchmarking
• Employee turnover & retention
• Corporate investment management

Closed loop platform capabilities
• Drill-through (report-report i/f)
• Remote cubes (read through)
• Real-time data warehousing
• Data mining
• Write back to operational system

Content: +1700 Queries, +420 InfoCubes, 90 ODS Objects

Source: SAP

CRM Vendors – Data Mining Integration

• Oracle CRM
– Pre 9i: Darwin
– Post 9i: ODM
• RightPoint and E.piphany
• SPSS and Siebel
• SAP CRM
– Native Data Mining built into SAP BW - Database Independent
– Interface between IBM Intelligent Miner and SAP BW
• PeopleSoft CRM
– No official data mining product or vendor solution
– Awaiting their response on what they offer

SAP BW 3.0b Data Mining Implementation

• Currently for Customer Subject Area
• Algorithms Supported
– Decision Trees
– Scoring
– Clustering/Segmentation
– Association
• Data Mining process (no extensive data staging)
– Model definition
– Training the model
– Performing prediction using the training results
– Uploading the results back into BW
– Utilizing the mining results (on the operational side)
• SAPGUI is the interface to the Data Mining modeling and analysis

Modeling a Decision Tree

Create a mining model: specifying the column parameters
• Model columns
• Data type of the column
• The nature of the column content
• Indicating the key column
• Indicating the prediction column
• Specifying the values in case the original values in the column are to be treated differently

Source: SAP

Modeling a Decision Tree
Specify Model Parameters
• Use a portion (%) of the data for training, or the whole data set
• Size of the window (such as 10%)
• The number of repeats with different samples
• Stop training when the number of cases under the given node is less than or equal to the specified value
• Stop training when the accuracy is greater than or equal to the expected accuracy
• Use the information gain threshold to check the relevance
• If the tree is too big, prune the tree without violating the expected accuracy

Source: SAP
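
The information gain threshold mentioned among the model parameters can be illustrated with the textbook entropy-based definition. This is only a sketch: the slides do not say how SAP BW computes the measure internally, and the sample rows, the `MIN_GAIN` value and the function names below are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    parent = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - remainder


# Hypothetical training cases; "status" is the prediction column.
rows = [
    {"profession": "LABOURER", "region": "north", "status": "left"},
    {"profession": "LABOURER", "region": "south", "status": "left"},
    {"profession": "CLERK", "region": "north", "status": "stayed"},
    {"profession": "CLERK", "region": "south", "status": "stayed"},
]

MIN_GAIN = 0.1  # hypothetical relevance threshold
gains = {a: information_gain(rows, a, "status") for a in ("profession", "region")}
relevant = [a for a, g in gains.items() if g >= MIN_GAIN]
print(gains, relevant)
```

Here `profession` separates the two outcomes perfectly and passes the threshold, while `region` carries no information about the outcome and would be ignored as a split candidate.
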
Modeling a Decision Tree

Create a training source and map the model columns
• BW Query
• Runtime parameters for query
• Model columns
• Selected source columns
• Mapping between model column and source column

Source: SAP
SAP BW Data Mining – Process Steps

• Create a mining model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query

Source: SAP

Viewing Decision Tree Training Results

• This decision tree predicts whether the customer has left or is still "on board"
• Chances of a customer leaving is 70.7% if the profession is "LABOURER"
• Out of a total of 705 cases, 41 cases are covered under this node
• Chart shows the distribution at the selected node
• 28/41 customers are likely to leave
• 13/41 customers are likely to stay

Source: SAP

Data Mining – Decision Trees

Uploaded in BW, then BEX for further analysis

Source: SAP


Data Mining – Association

• Create an Association model
• Define Model Columns
• Train the model
• Predictions using training results
• Using the data mining results against BW Query

Source: SAP
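
What an association model measures can be sketched with the two standard rule metrics, support and confidence. The baskets, item names and the helper `rule_metrics` below are hypothetical and are not taken from SAP BW; the slides do not describe which metrics or thresholds the product exposes.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    total = len(transactions)
    # Transactions that contain every item on both sides of the rule.
    both = sum(1 for t in transactions if antecedent | consequent <= t)
    # Transactions that contain the left-hand side only.
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    support = both / total
    confidence = both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical order baskets.
baskets = [
    {"printer", "paper", "toner"},
    {"printer", "toner"},
    {"paper", "pens"},
    {"printer", "paper", "toner", "pens"},
]

support, confidence = rule_metrics(baskets, {"printer"}, {"toner"})
print(support, confidence)
```

Algorithms such as Apriori, named earlier in the deck, search for all rules whose support and confidence exceed chosen minimums; this sketch only scores a single, given rule.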

Data Mining – Association

Source: SAP
Data Mining – Cluster Analysis

• Create a Cluster model
• Train the model
• Predictions using training results
• Using the data mining results against BW Query

Source: SAP
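
As an illustration of what training a cluster model involves, the following is a minimal k-means sketch (k-means is one of the clustering algorithms named earlier in the deck). The slides do not state which clustering algorithm SAP BW uses, so this is an assumption chosen for illustration; the sample points and starting centroids are made up.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the centroid closest to this point.
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of its cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers described by two numeric attributes.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

The resulting cluster assignments are the kind of result that could be uploaded back as a customer attribute and then sliced in a query.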

Viewing Cluster Analysis Results

Source: SAP

Viewing Cluster Analysis results

Uploaded in BW, then BEX for further analysis

Source: SAP

SAP Data Mining

• Good attempt to implement a few Data Mining algorithms
• Very traditional Data Mining approach
• Requires a well-versed statistician or Data Mining expert to model and interpret the results
• Data source is a BEX Query, a big limitation for DM
• Weak visualization
• BEX for additional discovery: slicing and dicing

SAP BW - IBM Intelligent Miner

The IBM Intelligent Miner interface is designed to:

• Copy data from SAP BW to IBM Intelligent Miner
– Results of reports in BW, modeled in Business Explorer Analyzer
– Data direct from InfoCubes (for cross-selling analysis)
– Descriptions, hierarchies
• Load results data from IBM Intelligent Miner back into SAP BW
– Results of segmentation can be loaded as master data or hierarchies
• Transport data through Wizards in SAP BW
– Possible to get a good view of Intelligent Miner results from SAP BW

ERPs and Data Mining: Good and the Bad News
• Good News
– Known Business Processes
– Few data Sources
– Improved Data Quality
– Metadata Integration
– Near real-time data mining
– Closed-loop Knowledge Discovery
– Consistent Infrastructure
• Bad News
– Complex Data Structures
– Performance
– Availability
– Very few Data Mining algorithms - Today

CRISP-DM
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment

Data Mining Process and ERP Data Mining

Diagram labels: Business Understanding, Data Understanding, Data Preparation, Deployment

Will reduce data mining project time up to 50%

• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Evaluation
• Deployment

Source: http://www.crisp-dm.org/

Good News for Future Business Applications

INFORMATION FRAMEWORKS
http://infoframeworks.com

KNOWLEDGE TRANSFER
• Seminars
• Webinars
• Keynotes
• Panel Moderator
• Publications
• Hands-on training
• Conferences

INFORMATION TECHNOLOGY INVESTORS
• Market Research
• Market Assessment
• Competitive Analysis
• Technology due
• Technology/Solution Assessment

SOFTWARE AND SOLUTION VENDORS
• Product Strategy
• Solution Strategy
• Product Positioning
• Competitive Analysis
• Software product architecture
• Marketing Strategy
• Product Performance and Benchmarking
• Consulting
• Hardware Configuration

INFORMATION TECHNOLOGY ORGANIZATION
• Executive and Senior IT Management Consulting
• Enterprise Information Architectures (EIA)
– Business Case Development
– Information Architecture
– Application Deployment Architectures implementation
– Legacy Application Migration Strategies
– ERP Application deployment strategies
• Enterprise Applications Integration (EAI)
– Architectures, Service Modeling and design, EAI technology assessment
– Tools and Technology Assessment
– Vendor Selection and Assessment
– Conference Room Pilot implementation
• Business Intelligence and Portals
– Architectures, Methodologies
– Tool/technology/Vendor assessment and selection
– Data Warehouse, Data Marts, Analytics, Information Delivery
– Deployment Architectures
– Business Intelligence and eBusiness Integration architectures
– Portals Strategies, Business case, Assessment, Architectures, Modeling, Planning and knowledge Transfer
Questions

Naeem Hashmi
Chief Technology Officer
September 10, 2002
Email: nhashmi@infoframeworks.com
Web Site: http://infoframeworks.com
Tel: 603-432-4550
