Copyright 2012, SAS Institute Inc. All rights reserved.
AGENDA
Copyright 2013, SAS Institute Inc. All rights reserved.
DATA WAREHOUSE
ANALYTICAL DATAMARTS
The same information is replicated across several data structures, which
slows both the update process and data refresh.
NEEDS:
Big data refers to datasets whose size is beyond the ability of typical
database software tools to capture, store, manage, and analyze.
The ability to store, aggregate, and combine data, and then use the results to
analyze data in motion, has become ever more accessible.
NEW QUESTIONS
New ways are needed to manage data that is distributed and not structured in
the classical way:
We need a different paradigm to organize data and, above all, to query it.
Collecting several sources and managing them together opens several new problems:
Relational data (GRAPH DATA) can help explain how events spread
through a population.
Data in motion coming from several tools in the field (sensor
devices, smartphones) provides dynamic patterns, often without a
history of their form.
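As a rough illustration of the graph-data point, the sketch below treats a population as nodes joined by links and counts how many steps an event needs to reach each member from a starting node. The function name `spread_steps`, the sample names, and the one-hop-per-step spreading rule are all assumptions made for this example; nothing here comes from the slides or from any SAS product.

```python
from collections import deque


def spread_steps(links, source):
    """Return {node: steps} for every node an event can reach from source.

    Assumes (for illustration only) that links are undirected and that the
    event moves one link per step, i.e. a breadth-first traversal.
    """
    # Build an undirected adjacency map from the list of links.
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    steps = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in steps:  # first visit gives the shortest path
                steps[neighbor] = steps[node] + 1
                queue.append(neighbor)
    return steps


# Hypothetical population: two separate groups of contacts.
links = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("eve", "fay")]
reach = spread_steps(links, "ann")
```

Members with no path to the starting node (here `eve` and `fay`) never appear in the result. That absence comes from the link structure alone, not from any attribute of the individual records.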
ANALYSIS
Merging is driven by fuzzy keys, where group information can be
assigned according to statistical relationships.
An event can be driven by relationships with other data
rather than by a specific behavior.
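One way to picture merging on fuzzy keys is to pair records whose keys are similar rather than identical. The sketch below uses the similarity ratio from Python's standard `difflib` module as a stand-in for a statistical relationship. The function `fuzzy_merge`, the 0.8 threshold, and the sample records are assumptions made for this example, not the method described in the slides.

```python
from difflib import SequenceMatcher


def fuzzy_merge(left, right, key, threshold=0.8):
    """Pair each left record with its most similar right record on `key`.

    Similarity is difflib's ratio in [0, 1]. If the best score is below
    `threshold` (an assumed cut-off), the left record is paired with None.
    """
    merged = []
    for l_rec in left:
        best, best_score = None, 0.0
        for r_rec in right:
            score = SequenceMatcher(
                None, l_rec[key].lower(), r_rec[key].lower()
            ).ratio()
            if score > best_score:
                best, best_score = r_rec, score
        match = best if best_score >= threshold else None
        merged.append((l_rec, match))
    return merged


# Hypothetical records: the key is spelled differently in each source.
customers = [{"name": "Jon Smith"}, {"name": "Maria Rossi"}]
accounts = [
    {"name": "John Smith", "segment": "A"},
    {"name": "Luca Bianchi", "segment": "B"},
]
merged = fuzzy_merge(customers, accounts, key="name")
```

In this sample, "Jon Smith" inherits the group information (`segment`) of "John Smith", while "Maria Rossi" has no sufficiently similar counterpart and stays unmatched. The threshold sets the trade-off between missed matches and false matches.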
BIG DATA
What types?
AGENDA
ANALYTICS
REFERENCE ARCHITECTURE
EXAMPLE SAS-RACK IMPLEMENTATION
Diagram labels: CLIENT; GREENPLUM; HADOOP; TERADATA; ORACLE; Input; Hadoop; Output; Visual Analytics; Metadata; High Performance Analytics
Diagram labels: In memory; GRID COMPUTING; In Database; Input; Output; Visual Analytics; Metadata; Analytical Tool; High Performance Analytics
AGENDA
SAS HIGH-PERFORMANCE ANALYTICS
Why Now?
Customer needs
Blade systems have proven viable platforms for high-performance
computing
New computing paradigms
Partnerships with MPP database vendors
SAS PROCEDURES
Single-threaded, not aware of distributed computing environment
Multi-threaded, aware of distributed computing environment
Runs on client
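The contrast between a single-threaded procedure and one that is aware of partitioned data can be sketched with a mean computed from per-partition partial results. Everything in the block is an assumption made for illustration: the function names, the three sample partitions, and the use of Python threads as a stand-in for worker nodes. It does not reproduce how SAS procedures are implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Work done next to one data partition: return (sum, count)."""
    return sum(partition), len(partition)


def distributed_mean(partitions, workers=4):
    """Combine per-partition partial results into one global mean.

    Each partition is summarized independently, so only two numbers per
    partition travel back to the caller instead of the raw rows.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Hypothetical data already split across three "nodes".
node_data = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean = distributed_mean(node_data)
```

Threads in standard Python do not speed up CPU-bound work. They are used here only to show the shape of the pattern: summarize each partition where it lives, then combine the small summaries.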
Diagram labels: OPERATING SYSTEM; SAS Process; Disks/filesys; Temp/Utility files to support SAS; SAS Datasets
Diagram labels: HADOOP NAMENODE; OPERATING SYSTEM; SAS Process; NODE 1, NODE 2, ..., NODE N, each holding Data
Graph analysis
AGENDA
Statistics: Binary target & continuous no. predictions; Linear, Nonlinear, & Mixed Linear modeling
Data Mining: Complex relationships; Tree-based Classification; Variable Selection
Text Mining: Parsing large-scale text collections; Extract entities; Auto. Stemming & synonym detection
Forecasting: Large-scale, multiple hierarchy problems
Econometrics: Probability of events; Severity of random events
Optimization: Local search optimization; Large-scale linear & mixed integer problems; Graph theory
GRAPH ANALYSIS
Network
GRAPH ANALYSIS
Diagram labels: Link; Node
AGENDA
A single solution for statistical visualization and reporting
SAS VISUAL ANALYTICS
BUSINESS VISUALIZATION DRIVEN BY ANALYTICS
EXPLORATION AND VISUALIZATION
POWER OF ANALYTICS
RAPID DELIVERY OF MOBILE INSIGHTS
DATA VISUALIZATION
ANALYTIC VISUALIZATION
EXPLORATION
DISCOVERY
Self-service
Easy to use Analytics
Work with more data
SAS VISUAL ANALYTICS
MEETING YOUR BUSINESS NEEDS THROUGH FLEXIBILITY
Traditional on-premise deployments
Public
Private
Hybrid
SAS Cloud & SAS Solutions on Demand