Making Sense of Data I

Notes
Chapter 5 Statistics
Population
All possible outcomes, measurements or values
Parameters characterise populations
Sample
A portion (subset) of the population
Statistics characterise samples
Great care needs to be taken to ensure the sample
is representative and unbiased
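As a small illustration of the distinction above, the following Python sketch computes a parameter (the mean of a whole population) and the matching statistic (the mean of a random sample). The population of 1,000 values, the sample size of 100, and the seed are assumptions made for the example and do not come from the notes.

```python
import random
import statistics

# Hypothetical population: every value is known, which is rarely true in practice.
population = list(range(1, 1001))

# A parameter characterises the population.
population_mean = statistics.mean(population)

# Simple random sampling gives each member the same chance of selection,
# which is one common way to reduce selection bias.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = rng.sample(population, 100)

# A statistic characterises the sample and serves as an estimate of the parameter.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean}")
print(f"sample mean (statistic):     {sample_mean}")
```

The sample mean will usually differ somewhat from the population mean; how far it can stray depends on the sample size and on how the sample was drawn.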
Chapter 1 Introduction
Steps in a Data Mining project
Definition
Objectives
Deliverables
Roles and responsibilities
Current situation
Timeline
Costs and benefits
Data preparation
Prepare and become familiar with the data:
Pull together data table
Categorize the data
Clean the data
Remove unnecessary data
Transform the data
Partition the data
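Four of the preparation steps listed above (clean, remove, transform, partition) can be sketched in plain Python. This is only a minimal sketch: the records, the column names, the choice of min-max scaling, and the 25% test fraction are all hypothetical, and a real project would more likely use a data-frame library.

```python
import random

# Hypothetical raw records; None marks a missing value.
raw = [
    {"id": 1, "age": 25, "income": 30000, "note": "a"},
    {"id": 2, "age": None, "income": 42000, "note": "b"},
    {"id": 3, "age": 40, "income": 60000, "note": "c"},
    {"id": 4, "age": 55, "income": 90000, "note": "d"},
    {"id": 5, "age": 31, "income": 48000, "note": "e"},
]

def clean(rows):
    """Clean: drop records that contain a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def remove_columns(rows, unwanted):
    """Remove unnecessary data: discard columns not needed for the analysis."""
    return [{k: v for k, v in r.items() if k not in unwanted} for r in rows]

def min_max_scale(rows, column):
    """Transform: rescale one numeric column to the range 0 to 1."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero for a constant column
    return [{**r, column: (r[column] - low) / span} for r in rows]

def partition(rows, test_fraction, seed=0):
    """Partition: shuffle, then split into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

prepared = clean(raw)
prepared = remove_columns(prepared, {"note"})
prepared = min_max_scale(prepared, "income")
train, test = partition(prepared, test_fraction=0.25)
print(len(train), "training rows,", len(test), "test rows")
```

Dropping incomplete records is only one way to clean data; imputing the missing values is a common alternative when records are scarce.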
Implementation of the analysis
Summarizing the data
Finding hidden relationships
Making predictions
Methods include
Summary tables
Graphs
Descriptive statistics
Inferential statistics
Correlation statistics
Searching
Grouping
Mathematical models
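Three of the methods listed above (a summary table, descriptive statistics, and a correlation statistic) can be illustrated with the Python standard library. The observations and the group labels below are invented for the example, and Pearson's r is used as one common correlation statistic; the notes do not say which one is intended.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical observations: (group, x, y).
records = [
    ("A", 1.0, 2.1),
    ("A", 2.0, 3.9),
    ("B", 3.0, 6.2),
    ("B", 4.0, 7.8),
    ("B", 5.0, 10.1),
]

# Summary table: count and mean of y for each group.
by_group = defaultdict(list)
for group, _, y in records:
    by_group[group].append(y)
summary = {
    g: {"count": len(vals), "mean_y": statistics.mean(vals)}
    for g, vals in by_group.items()
}

# Descriptive statistics for a single variable.
xs = [x for _, x, _ in records]
ys = [y for _, _, y in records]
description = {
    "mean": statistics.mean(ys),
    "median": statistics.median(ys),
    "stdev": statistics.stdev(ys),  # sample standard deviation
}

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

r = pearson_r(xs, ys)

print(summary)
print(description)
print(f"Pearson r between x and y: {r:.4f}")
```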
Deployment
Plan and execute deployment based on the definition in step 1
Measure and monitor performance
Review the project
Data analysis tasks
[Figure: the three data analysis tasks (summarising the data, finding hidden relationships, making predictions) and the methods that support them: summary tables, graphs, descriptive statistics, inferential statistics, correlation statistics, grouping, mathematical methods]
CRISP-DM process
Business understanding
Problem definition
Data understanding
Data quality problems
Data insights
Useful data subsets
Data preparation
Attribute and records selection
Sampling
Modelling
Techniques to apply
Evaluation
Scoring of different approaches
Deployment
Roll out model
Chapter 2 Definition
Project definition summary
Define objectives
Define the business objectives
Define specific and measurable success criteria
Broadly describe the problem
Divide the problem into sub-problems that are unambiguous and that can be solved using the available data
Define the target population
If the available data does not reflect a sample of the target population, generate a plan to acquire additional data
Define deliverables
Define the deliverables (e.g., a report, new software, or business processes)
Understand any accuracy requirements
Define any time-to-compute issues
Define any window-of-opportunity considerations
Detail if and how explanations should be presented
Understand any deployment issues
Define timetable
Set aside time for education upfront
Estimate time for the data preparation, implementation, and deployment steps
Set aside time for reviews
Understand risks in the timeline and develop contingency plans
Analyse cost/benefit
Generate a budget for the project
List the benefits to the business of a successful project
Compare costs and benefits
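The comparison of costs and benefits can be reduced to simple arithmetic once both sides have been estimated. In the sketch below every figure and category name is a hypothetical placeholder, and net benefit and the benefit-cost ratio are just two common ways to summarise the comparison; the notes do not prescribe a particular measure.

```python
# All figures are hypothetical placeholders, in a single currency unit.
costs = {"personnel": 80_000, "software": 15_000, "hardware": 5_000}
benefits = {"reduced churn": 90_000, "faster reporting": 40_000}

# Generate a budget: total the estimated costs.
total_cost = sum(costs.values())

# List the benefits: total the estimated value of a successful project.
total_benefit = sum(benefits.values())

# Compare costs and benefits.
net_benefit = total_benefit - total_cost
benefit_cost_ratio = total_benefit / total_cost

print(f"total cost:         {total_cost}")
print(f"total benefit:      {total_benefit}")
print(f"net benefit:        {net_benefit}")
print(f"benefit/cost ratio: {benefit_cost_ratio:.2f}")
```

A ratio above 1 means the estimated benefits exceed the estimated costs, though both estimates carry uncertainty that the single number hides.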
Define roles and responsibilities
Project leader
Subject matter expert/business analyst
Data analysis/data mining expert
IT expert
End user
Assess current situation
Define data sources and locations
List assumptions about the data
Understand project constraints (e.g., hardware, software, personnel)
Assess any legal, privacy or other issues relating to the presentation of the results
