
12122016

Big Data Analytics
Session 2

Types of Analytics

Question                         Type
What should the system learn?    Cognitive Analytics
What should happen?              Prescriptive Analytics
Why?
What could happen?               Predictive Analytics
What has happened?               Descriptive Analytics


3 Vs of Big Data

Some Make It 4 Vs


Big Data

Size          Unit          Example
10^3 bytes    1 kilobyte    One paragraph of a page
10^6 bytes    1 megabyte    One small book
10^9 bytes    1 gigabyte    A pickup truck filled with books
10^12 bytes   1 terabyte    300 hours of good-quality video; 1,500 CDs stacked (the height of a 6 ft man standing)
10^15 bytes   1 petabyte    All US academic research libraries = 2 petabytes
10^18 bytes   1 exabyte     All the words spoken by all human beings to date = 5 exabytes
10^21 bytes   1 zettabyte
10^24 bytes   1 yottabyte
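
The unit ladder above lends itself to a small conversion helper. What follows is a minimal Python sketch, not part of the original slides: the names `UNITS` and `humanize_bytes` are hypothetical, and it assumes the decimal definitions used in the table (each unit is 1,000 times the previous one), not the binary 1,024-based variants.

```python
# Decimal byte units as listed in the table: each step is a factor of 10**3.
# UNITS and humanize_bytes are illustrative names, not from the slides.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes", "yottabytes"]


def humanize_bytes(n):
    """Return (value, unit) using the largest unit for which value >= 1."""
    index = 0
    # Compare against exact integer powers of 1000 to avoid float drift.
    while index < len(UNITS) - 1 and n >= 1000 ** (index + 1):
        index += 1
    return n / 1000 ** index, UNITS[index]


if __name__ == "__main__":
    # The "all words ever spoken" example from the table: 5 * 10^18 bytes.
    value, unit = humanize_bytes(5 * 10 ** 18)
    print(f"{value:g} {unit}")
```

Comparing against integer powers keeps the unit boundaries exact even for very large counts such as 10^24.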


Big Data Analytics

Big Data Analytics is the use of:
data,
information technology,
statistical analysis,
quantitative methods, and
mathematical or computer-based models

to help managers gain improved insight about their
business operations and make better, fact-based decisions.


Evolution of the term "big data"

Figure: Frequency distribution of documents containing the term "big data" in the ProQuest Research Library.

Source: Amir Gandomi, Murtaza Haider, "Beyond the hype: Big data concepts, methods, and analytics," International Journal of Information Management, Volume 35, Issue 2, 2015, pp. 137-144,
http://dx.doi.org/10.1016/j.ijinfomgt.2014.10.007


Facts about big data

(adapted from <https://www.linkedin.com/pulse/bigdatabrokerslostprivacysaranyaanandh>)

Google manages > 1 million PB and processes > 24 PB of data every day (a lot more than all printed material in the U.S. Library of Congress).
> 1 billion Google searches are conducted every day.
> 250 billion email communications happen every day.
YouTube has > 1 billion unique visitors per month.
> 6 billion hours of video are watched per month on YouTube (~1 hour for every person on Earth and 50% more than in 2014).
90% of the data in the world today has been created in the past 2 years.
Data are forecast to double every 2 years until 2020.
In 2020, the amount of digital data produced will exceed 40 zettabytes (5,200 GB for every Homo sapiens on Earth).
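
The per-person figure and the doubling claim can be sanity-checked with simple arithmetic. The sketch below is illustrative only and not from the slides: the function names are hypothetical, and it assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes). It derives the population implied by the slide's own numbers and shows how fast "doubling every 2 years" compounds.

```python
ZETTABYTE = 10 ** 21  # decimal definition, assumed
GIGABYTE = 10 ** 9    # decimal definition, assumed


def implied_population(total_bytes, bytes_per_person):
    """Number of people implied by a total volume and a per-person share."""
    return total_bytes / bytes_per_person


def growth_factor(years, doubling_period_years=2):
    """Multiplicative growth after `years` if volume doubles every period."""
    return 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    people = implied_population(40 * ZETTABYTE, 5200 * GIGABYTE)
    print(f"Implied population: {people / 1e9:.2f} billion")
    print(f"Growth over 6 years: x{growth_factor(6):g}")
```

Under these assumptions, 40 ZB at 5,200 GB per person implies roughly 7.7 billion people, which is in line with common world population projections for 2020, so the two numbers on the slide are mutually consistent.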

30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
12+ TBs of tweet data every day
? TBs of data every day
100s of millions of GPS-enabled devices sold annually
25+ TBs of log data every day
2+ billion people on the Web by end 2011
76 million smart meters in 2009; 200M by 2014


Key Sectors for Big Data
Financial         Manufacturing
Healthcare        Travel
Communications    Retailing
Digital Media     Government
Real Estate       Energy

Who's Generating Big Data

Mobile devices
(tracking all objects all the time)

Social media and networks
(all of us are generating data)

Scientific instruments
(collecting all sorts of data)

Sensor technology and networks
(measuring all kinds of data)

Progress and innovation are no longer hindered by the ability to collect data,
but by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely manner and in a scalable fashion.


Harnessing Big Data

OLTP: Online Transaction Processing (DBMSs)
OLAP: Online Analytical Processing (Data Warehousing)
RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
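
The difference between the first two access patterns can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not something from the slides: it uses Python's built-in `sqlite3` module with an in-memory database, and the table `sales` and the functions `make_db`, `record_sale`, and `sales_by_region` are hypothetical names. SQLite is neither a production OLTP system nor a data warehouse; the point is only the contrast between many small writes and one aggregate read. RTAP is not shown.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a toy sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    return conn


def record_sale(conn, region, amount):
    """OLTP-style access: one small write, committed as its own transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount)
        )


def sales_by_region(conn):
    """OLAP-style access: a read-only aggregate over all rows."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    db = make_db()
    record_sale(db, "north", 10.0)
    record_sale(db, "north", 20.0)
    record_sale(db, "south", 5.0)
    print(sales_by_region(db))
```

In practice the two workloads usually run on separate systems, because the storage layout that suits frequent small writes differs from the one that suits large scans.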


The Model Has Changed
The Model of Generating/Consuming Data Has Changed

Old Model: Few companies are generating data; all others are consuming data

New Model: All of us are generating data, and all of us are consuming data


Complementary Approaches for Different Use Cases

Traditional Approach: structured, analytical, logical
Platform: Data Warehouse
Characteristics: structured, repeatable, linear
Example uses: monthly sales reports, profitability analysis, customer surveys
Traditional sources: transaction data, internal app data, mainframe data, OLTP system data, ERP data

New Approach: creative, holistic thought, intuition
Platform: Hadoop, Streams
Characteristics: unstructured, exploratory, iterative
Example uses: brand sentiment, product strategy, maximum asset utilization
New sources: web logs, social data, text data (emails), sensor data (images), RFID

Linking the two: Enterprise Integration


Challenges in Handling Big Data

The bottleneck is in technology: new architecture, algorithms, and techniques are needed
Also in technical skills: experts in using the new technology and dealing with big data


The 5 Key Big Data Use Cases

Big Data Exploration: Find, visualize, and understand all big data to improve decision making
Enhanced 360° View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
Security/Intelligence Extension: Lower risk, detect fraud, and monitor cyber security in real time
Operations Analysis: Analyze a variety of machine data for improved business results
Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency

© 2013 IBM Corporation


Big Data Exploration: Customer Example

Airline Manufacturer

Exploring 4 TB to drive point business solutions (supplier portal, call center, etc.)
Single point of data fusion for all employees to use
Reduced costs & improved operational performance for the business

Key Questions to Ask
Can you separate the noise from useful content?
Can you perform data exploration on large and complex data?
Can you find insights in new or unstructured data types (e.g. social media and email)?
Can you navigate and explore all enterprise and external content in a single user interface?
Can you quickly identify areas of data risk?
Do you have a logical starting point for your big data initiatives?

Product Starting Point: InfoSphere Data Explorer

Enhanced 360° Customer View: Customer Example
Confidential, Internal Use Only

Create Facebook
Identify 200+ different customer profiles to help in fulfillment & marketing efforts
Leverage new data types in customer analysis

Key Questions to Ask
Can you identify and deliver all data as it relates to a customer, product, or competitor to those who need it?
Can you gather insights about your customers from social data, surveys, support emails, etc.?
Can you combine your structured and unstructured data to run analytics?
How are you driving consistency across your information assets when representing your customers, clients, partners, etc.?
How can a complete view of the customer enhance your line-of-business users and result in better business outcomes?

Product Starting Point: InfoSphere MDM Server, Data Explorer, BigInsights


Asian Telco reduces billing costs and improves customer satisfaction.

Capabilities:
Stream Computing
Analytic Accelerators

Real-time mediation and analysis of 6 billion CDRs per day
Data processing time reduced from 12 hrs to 1 sec
Hardware cost reduced to 1/8th
Proactively address issues (e.g. dropped calls) impacting customer satisfaction.
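
To give a feel for the scale of these figures, the arithmetic below converts the daily CDR (call detail record) volume into an average per-second rate and computes the speedup implied by going from 12 hours to 1 second. This is a rough sketch with hypothetical function names; it assumes the load is spread evenly over the day, which real call traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def records_per_second(records_per_day):
    """Average sustained rate, assuming an even spread over 24 hours."""
    return records_per_day / SECONDS_PER_DAY


def speedup(before_seconds, after_seconds):
    """How many times faster the new processing time is."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    rate = records_per_second(6_000_000_000)
    print(f"Average rate: {rate:,.0f} CDRs per second")
    print(f"Speedup: {speedup(12 * 60 * 60, 1):,.0f}x")
```

On these assumptions the platform has to sustain roughly 69,000 records per second on average, and the quoted latency reduction corresponds to a factor of 43,200.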

Asian Government Agency

National Intelligence Platform
Capabilities:
Stream Computing

Analyze all Internet traffic (social media, email, etc.)
Track persons of interest (drug/sex traffickers, terrorists, illegal refugees/immigrants) and civil/border activity


OPERATIONAL - ANALYSIS
Capabilities:
Hadoop & Stream Computing

Intelligent Infrastructure Management: log analytics, energy bill forecasting, energy consumption optimization, anomalous energy usage detection, presence-aware energy management
Optimized building energy consumption with centralized monitoring; automated preventive and corrective maintenance

Data Warehouse Augmentation: Customer Example
Internal Use Only

Creates pre-processing hub and performs ad hoc analysis
Hadoop-based landing zone used to store, manage, and analyze structured, semi-structured, and multi-structured data before moving to the warehouse
Benefits: Data warehouse optimized for workload and performance
Utilized InfoSphere BigInsights, InfoSphere DataStage

Key Questions to Ask
Are you drowning in very large data sets (TBs to PBs) that are difficult and costly to store?
Are you able to utilize and store new data types?
Are you facing rising maintenance/licensing costs?
Do you use your warehouse environment as a repository for all data?
Do you have a lot of cold, or low-touch, data driving up costs or slowing performance?
Do you want to perform analysis of data in motion to determine what should be stored in the warehouse?
Do you want to perform data exploration on all data?
Are you using your data for new types of analytics?

Product Starting Point: BigInsights, Streams


Big Data Technology


The Myth About Big Data

Big Data Is New
Big Data Is Only About Massive Data Volume
Big Data Means Hadoop
Big Data Needs a Data Warehouse
Big Data Means Unstructured Data
Big Data Is for Social Media & Sentiment Analysis


Thank you
