
Big Data Analytics

Outline
Big Data Analytics
Use Cases

Sentiment Analysis
Customer Next Best Action
Fraud Detection
Subrogation
Visualization

Appendix

| Financial Services
The information contained in this presentation is proprietary.
Copyright 2012 Capgemini. All rights reserved.

Big Data: The Three Dimensions

Volume: Big Data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.

Velocity: Often time-sensitive, Big Data must be used as it is streaming into the enterprise in order to maximize its value to the business.

Variety: Big Data extends beyond structured data to include unstructured data of all varieties: text, audio, video, click streams, log files and more.

Source: IBM

Big Data: Client Financial Data


External Execution Costs
Client Profiles
Suitability/Portfolio
Billing and Cost Attribution
Account and Statement History
Counterparty Hierarchies
Compliance Reporting
Internal Customer Support Activity
Reference and Market Data
Investment Data
Auction Prices
Adjuster Notes
Policy Contracts
Real Estate Data
Credit Ratings and Risk
Last Customer Action
Partnerships / Sharing Agreements
External Social Media and Public Information
Claims Data
Medical Records


Analytics to Big Data Analytics

5 billion mobile phones in use in 2010
30 billion pieces of content shared on Facebook every month
40% projected growth in global data generated per year

Cheaper and faster processes
New business models (mass customization of products and services)
New and/or improved metrics (scorecards, influence scores, ...)

Source: McKinsey Reports on Big Data, June 2011

Big Data has substantially lowered the barriers to Analytics and is transforming business models and pro…

Capgemini View on Big Data Methodology

Acquisition
Collection of data from diverse sources, including:
Traditional ETL and real-time constant acquisition to address volume & velocity requirements
External data security, trust, licensing and privacy issue resolution
Open Data source deployment (e.g., publicly available sources like http://data.gov.uk)

Marshalling
Organization (and storing) of data, including:
Large volumes/constant feeds
Consumption options (real-time, ASAP, history) and filtering
Alternative data formats: structured, semi-structured and unstructured
Diverse modelling strategies from raw form to highly structured depending on source and use
Data Governance
Deletion requirements

Analysis
Finding of insight / predictive modelling
Forward (prediction-based) rather than historic data perspective
Behavior-based modelling (e.g., How will customers react? When is the optimum time to replace parts, etc.)
Probabilistic rather than definitive assessments
Text, voice and video analysis methodologies

Action
Use of insights to change business outcomes with outputs including:
Human, e.g., reports and analysis that people act on
Machine (more common with Big Data), e.g., automatic assessment of customer to adjust offer, à la Amazon proposal of products based on customer profiling
BPM technology/real-time decisioning
Partners Information System

Master Data Management, Data Quality, Data Lifecycle, Legal constraints, etc.

Overview of Big Data Solutions


Category: High Performance Structured Data
Context: Need to know the exact answer
Target Usage: Event-driven processing ("When X and Y happen, do Z")
Tools / Products: SAP HANA, Oracle Exadata/ExaAnalytics, Netezza
Major Challenge: Data Size / Scalability

Category: Massively Parallel Captured Data
Context: Need to understand trends and direction
Target Usage: Ad-hoc analytics ("We don't know what we don't know")
Tools / Products: Hadoop, etc.
Major Challenge: Association Accuracy

Major Questions
Do we have the right model for the data?
Do we know the quality of the data?
Do we understand the data we are capturing, and can we relate it to other content?
Can we scale the performance?

Data can be divided into two major categories, based on the tools used to approach the situation: High Performance Structured Data and Massively Parallel Captured Data.


Use Cases

Potential Use Cases for Financial Institutions

Customer
Risk Assessment, especially Operational Risk
Self-service portal
Fraud Detection
Customer Next Best Action
Needs Extraction based on segmentation and other similarities

Data Marshalling
Drive analyses and MI Reporting by combining structured and unstructured data
BI Search on collated data
Self-service analytics portal

Claims and Payments
Subrogation
Rapid aggregation of payments across various dimensions
Identification of cost efficiencies

Visualization
Geo-layering and chronological variation of payments/fund flows along several dimensions
Calendar heat maps and word clouds
Visualization for mobile devices


Current Accelerators vis-à-vis Use Cases

Other than Subrogation, which is insurance-specific, all other accelerators are applicable to all types of financial institutions (Banks, Capital Markets, Wealth Management).

Customer
Sentiment
NBA
Elastic Search (Fraud)

Data Marshalling
Elastic Search (BI Search)
Informatica/Pentaho Data Integration

Payments
Subrogation (Insurance specific)
Elastic Search (rapid aggregation, slicing)

Visualization
Mobile Dashboard
Tableau, Microstrategy, Qlikview
Heatmaps, Geo and Chronology layering


Sentiment Analysis

Industry View: Key FS Applications for Sentiment Analysis

Customer Relationship Management
A bank can use sentiment analysis to better understand the Voice of Customer based on available data residing in forums, blogs and other social media.
By analyzing unstructured customer sentiment data, banks can better determine where to invest funds targeted at the acquisition and/or retention of customer assets.

Employee Satisfaction
A routine review of corporate system emails might reveal the sentiments of employees regarding individual projects or the institution at large.
Analysis of unstructured comments on the employment satisfaction survey can result in valuable insights for staff retention, recruitment and other HR objectives.

Fraud Detection
A combination of predictive analysis, speech analytics and social media analytics can be used to detect and prevent fraud. For example, in an insurance setting, analysis of a claimant's speech and social media communications can be used to generate a risk score for the claimant, which can then be used to detect fraud before issuing a policy.

Risk (especially Operational)
Words contained in financial news publications, shareholder reports, etc., provide textual context and clarification when combined with traditional transactional data to improve Risk Assessments and Mitigation.

Sentiment Analysis can play a significant role in Banking Domain applications such as CRM, Fraud and Risk.


Why R and Hadoop?

R
Huge functionality in packages (over 2500)
Favorite at schools
Excellent integration with the entire enterprise IT landscape

Hadoop
Robust Big Data platform that can store and analyze massive and complex data
Highly scalable with commoditized hardware
Map-reduce

R and Hadoop form a robust Big Data platform that can handle massive and complex data.

Capgemini Taxonomy-Based Big Data Solutions

Sample Taxonomies
Credit Card
Mortgage
Regulatory (CCAR)
Fraud
AML
None
Employee morale
Trades
Sales Leads

CDH 4, ~730 GB

Analytics Engine: Run Analytics Algorithms in R
Sentiment @

Web Service and/or Interfaces for Dashboard-based Outputs, e.g. Qlikview

A taxonomy-based Big Data solution can analyze sentiment in real time, which enables assessing the real-time response to marketing campaigns.

Taxonomy-Driven Highlighting/Annotating and Extraction Add Value

Highlighting keywords based on the taxonomy makes the text easier to scan and supports quick decisions.
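As a rough illustration of taxonomy-driven highlighting and extraction, the sketch below marks taxonomy keywords in a piece of text and then pulls them back out by category. The taxonomy contents, the function names `highlight` and `extract`, and the `[[Category:keyword]]` marker style are all assumptions made for this example; they are not taken from the Capgemini solution.

```python
import re

# Hypothetical taxonomy: category -> keywords (illustrative values only).
TAXONOMY = {
    "Fraud": ["fraud", "chargeback", "stolen card"],
    "Mortgage": ["mortgage", "refinance"],
}


def highlight(text, taxonomy=TAXONOMY):
    """Wrap each taxonomy keyword found in text as [[Category:keyword]]."""
    # Map every lower-cased keyword to its category.
    lookup = {kw.lower(): cat for cat, kws in taxonomy.items() for kw in kws}
    # Try longer keywords first so multi-word terms win over shorter ones.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(kw) for kw in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: f"[[{lookup[m.group(0).lower()]}:{m.group(0)}]]", text
    )


def extract(text, taxonomy=TAXONOMY):
    """Return (category, matched keyword) pairs in order of appearance."""
    return re.findall(r"\[\[([^:\]]+):([^\]]+)\]\]", highlight(text, taxonomy))
```

In a real deployment the marker would more likely be an HTML span or a dashboard annotation; the plain-text marker is used here only to keep the sketch self-contained.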

Tweet Annotations are also Insightful

Tweet annotations make sentiment easy to capture and to understand at the macro level.

Key Differentiators of the Capgemini Sentiment Analysis Solution

3 levers for customization
1. Data source
2. Recency
3. Intensity of sentiment expression

Focus on monetization
Taxonomy driven to generate specific actionable messages for quick monetization

Workflow in R
Greatly facilitates integration on both appliance and Hadoop-derived platforms

Not limited to social media
Looks at all means of electronic communication, from email to PDF to Word to social media

Capgemini's solution stands out with its workflow in R, focus on monetization, and customization.

Customer Next Best Action (NBA)

www.capgemini.com

NBA: Right Product to Right Person at Right Time

Source: http://www.youtube.com/watch?v=kg_NaZhwvKQ

Traditional marketing may not offer relevant products to the customer, which reduces ROI for campaigns and promotions.

Why Big Data is needed for Next Best Action

Data Streams
Customer Profile
Transactions Data
Historical Sales Data
Location Data through smartphones
Customer Feedback
Complaints

Right product or offer at right time & right place

Product Offerings
Current Offers
Inventory Position

Overview of Capgemini's Customer Next Best Action

Customer query/complaint
Processing
Decision Hub Components
Voice of Customer & Sentiment Analysis
CG PPI*/Predictive Models
Customer Satisfied?
Next Best Action
Outbound: Call to the Customer
NBA Effectiveness Metrics

* PPI (Product Propensity Index) is a set of approximately 14 quantitative models that segment customers on behaviors and identify each customer's current propensity for the FI's product set.

Next Best Action: Sample Offering

Table columns: PPI, Likelihood to Attrite, CLTV, Offer
Ratings shown (H/L): H, H, L, H, L, L
Offer values by row: Premium, Standard, Premium, N/A, Standard, Standard, Standard, Standard

NBA offers the right product to the right customer at the right time, which improves Customer Lifetime Value.


Quantification of Improvement in Customer Experience & Value

MAXIMIZE
Directly monetizable: repeat purchases; renewal rates; up-sell & cross-sell rates; incremental yield in sales, revenues, and profits per customer, channel, touch point, interaction, and agent
Indirectly monetizable: customer satisfaction scores, customer response and acceptance rates, customer sentiment trends, customer recommendation propensities, Voice of the Customer (VoC) scores

MINIMIZE
Directly monetizable: IT and staffing costs associated with CRM processes (marketing, sales, customer service), and with associated apps, models, data, rules, etc.
Indirectly monetizable: online shopping-cart abandonment rates; time to resolve customer issues; incidence of customer complaints; lost opportunities due to misdirected, redundant, or repetitive offers; incidence of wrong, incomplete, or inconsistent info provided to customers across various CRM channels

Measuring the effectiveness of NBA improves the predictive models, which in turn yields better next actions.

Fraud Prevention


Industry View: Proliferation of Fraud in the FS Sector

Incidence of credit card fraud has increased 87% since 2010 in the US*
Aggregate fraud loss of $6 billion*
(* Source: Javelin research, 8th Annual Card Issuers' Safety Scorecard, 2012)

Incidence rate rising while average detection times continue to fall
Social behaviors in market continuing to put consumers at risk
Consumers are still sharing a significant amount of personal information on public social media

Due to the rising rate of incidents, focus is shifting towards preventive actions against fraud.


Fraud Detection Powered By Near Real Time Event Processing Framework

Near Real Time Event Processing Framework
Elastic Search Indexer
Percolator Query
Listeners
Transactions Database
Flagging of Suspicious activities
Capgemini Fraud Detection Algorithm

The innovative near-real-time event processing framework can be reused in multiple applications and improves processing speed.
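The percolator idea (store the queries first, then match each incoming document against them) can be sketched in plain Python. This is only a conceptual illustration: it does not use the Elasticsearch client, and the `Rule` and `RuleRegistry` names, the transaction fields, and the example rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A stored query: a name plus a predicate over one transaction."""
    name: str
    matches: Callable[[Dict], bool]


class RuleRegistry:
    """Holds registered rules and reports which ones an incoming event matches."""

    def __init__(self) -> None:
        self._rules: List[Rule] = []
        self._listeners: List[Callable[[Dict, List[str]], None]] = []

    def register(self, rule: Rule) -> None:
        # Rules are stored up front, before any transaction arrives.
        self._rules.append(rule)

    def add_listener(self, listener: Callable[[Dict, List[str]], None]) -> None:
        # Listeners are notified whenever a transaction matches at least one rule.
        self._listeners.append(listener)

    def percolate(self, txn: Dict) -> List[str]:
        """Return the names of all stored rules that the transaction matches."""
        hits = [rule.name for rule in self._rules if rule.matches(txn)]
        if hits:
            for listener in self._listeners:
                listener(txn, hits)
        return hits


# Example wiring with two hypothetical rules and one listener that flags matches.
registry = RuleRegistry()
registry.register(Rule("large_amount", lambda t: t["amount"] > 5000))
registry.register(
    Rule("country_mismatch", lambda t: t["card_country"] != t["ip_country"])
)
flagged = []
registry.add_listener(lambda txn, hits: flagged.append((txn["id"], hits)))
```

Because the rules are evaluated as each event arrives, suspicious transactions can be flagged without re-querying the full transactions database, which is the property the slide attributes to the near-real-time framework.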

Key Features of the Elastic Search Based Approach


Subrogation


Benefits of Big Data Analytics in Subrogation

Current challenges
Benefits of Big Data Analytics


Capgemini's Solution (Fault Measure)

Inputs: Expert Driven Taxonomy (Hit, Crash, Collision, Dashed); Sentences
Parts of Speech Tagging
Triplet Extraction Algorithm (Subject, Predicate, Object)
Hits of Words from Taxonomy (Phrase / Hits): crash, dash, hit, Total

Fault Measure = (Word hit count with respect to party 2 from triplet) / (Total no. of word hit count)

Input: new variable for Predictive Model
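A minimal sketch of the Fault Measure ratio, assuming the triplets have already been extracted as (subject, predicate, object) tuples. The word list, the function names, and the reading of "with respect to party 2" as "party 2 is the subject of a taxonomy verb" are assumptions made for illustration; the slide does not spell out these details.

```python
# Hypothetical word forms derived from the slide's taxonomy
# (Hit, Crash, Collision, Dashed); a real system would use a stemmer.
TAXONOMY_WORDS = {
    "hit", "hits",
    "crash", "crashed",
    "collided", "collision",
    "dash", "dashed",
}


def is_taxonomy_hit(predicate):
    """True if the predicate is one of the taxonomy word forms."""
    return predicate.lower() in TAXONOMY_WORDS


def fault_measure(triplets, party):
    """Share of taxonomy hits whose subject is the given party.

    triplets: iterable of (subject, predicate, object) tuples.
    Returns 0.0 when no taxonomy word is hit at all.
    """
    total_hits = 0
    party_hits = 0
    for subject, predicate, _obj in triplets:
        if not is_taxonomy_hit(predicate):
            continue
        total_hits += 1
        if subject == party:
            party_hits += 1
    return party_hits / total_hits if total_hits else 0.0
```

The resulting fraction can then be passed to a downstream model as one more input variable, as the slide indicates.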


Schematic of Capgemini's Subrogation Analysis

Flowchart (decision nodes, each with "If Yes" / "If No" branches):
Input: Claim Data
Identification for Claim authentication
Total Compensation > Threshold2
Fault Measure > Threshold1 (Percentage)
Predict subrogation Opportunity > Threshold3
Outcomes: Close claim; Recovery from subrogation = 0%; Predict settlement amount; Estimate percentage recovery; Report of final outputs

Regressor variables:
Claim Number, Jurisdiction Local, Claim Local
Closed Outcome, Gross incurred loss,
Total reserve amount, Exposure Type,
Claim Age etc.
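A threshold cascade of this kind might look like the sketch below. The order of the checks, which "No" branch leads to which outcome, the threshold values, and the `Claim` fields are all assumptions: the flowchart wiring cannot be read reliably from the slide text, so this shows only the general shape of such a triage, not Capgemini's actual logic.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    authenticated: bool             # result of claim authentication
    total_compensation: float       # total compensation paid on the claim
    fault_measure: float            # fraction in [0, 1]
    subrogation_probability: float  # output of a predictive model, in [0, 1]


def triage(claim, threshold1=0.5, threshold2=1000.0, threshold3=0.5):
    """Walk a claim through the three threshold checks (assumed order)."""
    if not claim.authenticated:
        return "close claim"
    if claim.total_compensation <= threshold2:
        return "close claim"
    if claim.fault_measure <= threshold1:
        return "recovery from subrogation = 0%"
    if claim.subrogation_probability <= threshold3:
        return "recovery from subrogation = 0%"
    # Only claims that pass every check go on to the regression models.
    return "predict settlement amount and percentage recovery"
```

Filtering claims this way keeps the settlement and recovery models focused on the cases where subrogation is plausible.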


Further Enhancements


Solution Deployment

Unstructured Data
Speed up by appliances like SAP HANA, Vertica, Greenplum, SAS High Performance Analytics
Capgemini Accelerator: Fault Measure
SAS Translation of Capgemini Accelerator
Subrogation Recovery Amount Prediction
Management Recommendation Dashboard

Closing the loop for Continuous Improvement
Actual Results
Management Action


Visualization


Capgemini View: The Value of Visualization

Visualization is key to conveying the results of analysis so that informed decisions can be made.

Visualization Use Case: Word Cloud and Sentiment Scoring Comparisons Provide Valuable Insights on HNIs

Panels: Bank A vs. Bank B; Comparative Phrase Cloud Analysis

Word clouds provide insights into the behavioral and attitudinal aspects of High Net Worth Individuals (HNIs), e.g., some of the top things on the minds of HNIs included net worth, estate wealth, etc.

Use Case: Complement quantitative research by mining the social media (Twitter) sentiment on two prospective investments

Providing valuable insights regarding HNIs



Visualization Use Case: Qlikview Analysis Helps Predict Stock Value

Twitter Sentiment Score vs. Following Day Opening Stock Price for an international bank

Qlikview analysis provides excellent visualization



Visualization Use Case: Calendar Heat Maps Identify Time Variations

Calendar heat map showing the sentiment displayed in a Tweet
Calendar heat map showing stock movement

Heat maps help to visualize variations in sentiment over time.



Visualization Use Case: Network Analysis Infers Critical Household Data Insights and Geo-Layering Plots Show Geographical Dispersion

The human mind understands relationships better through graphs.



Appendix

Taxonomy Search Model

Client Name, e.g. Morgan
Morgan Wealth
Wealth Mgmt
Net Worth
Morgan Wealth Net Worth

Sentiment Analysis Model


Score Calculation Model


Capgemini Big Data Fraud Detection Solution Uses Multiple Checks

Phone numbers spam list
Email ID validation
Free or corporate mail

Multiple checks are used to find suspicious activities.

Identification of Fraudulent Activities Using Checks

Columns: Check ID, Description, Score If True, Score If False

1. Cardholder's country is in sync with country given by IP address for transaction
2. Cardholder's state is in sync with state given by IP address for transaction
3. Cardholder's country is in sync with country tracked using customer's phone number
4. Customer's phone is not in spam list (a genuine number)
5. Verification against server which performs reverse DNS lookup on the mail server to check if it is a valid email domain
6. Check if it's a free email address or corporate. Score: 1 (corp), 0.5 (free)
7. Any unusual/infrequent date/time for transaction, such as late Friday or weekend purchases. Score If True: 0, Score If False: 1

Cumulative score is used to decide if it is a fraud.

The weights of the checks can be varied as per business requirements. New checks can also be added to meet business requirements.
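The cumulative scoring described above could be sketched as follows. Only the scores for checks 6 and 7 appear on the slide; the 1/0 scores for checks 1 to 5, the default weight of 1.0, the threshold of 5.0, and the convention that a lower total means more suspicious are assumptions made for this example.

```python
# check id -> (description, score if true, score if false).
# Scores for checks 1-5 are assumed; checks 6 and 7 follow the slide.
CHECKS = {
    1: ("cardholder country matches IP country", 1.0, 0.0),
    2: ("cardholder state matches IP state", 1.0, 0.0),
    3: ("cardholder country matches phone country", 1.0, 0.0),
    4: ("phone number is not in spam list", 1.0, 0.0),
    5: ("email domain passes reverse DNS validation", 1.0, 0.0),
    6: ("email address is corporate (false means free)", 1.0, 0.5),
    7: ("transaction at an unusual date/time", 0.0, 1.0),
}


def cumulative_score(results, checks=CHECKS, weights=None):
    """Sum the weighted score of every check.

    results: dict mapping check id -> bool outcome of that check.
    weights: optional dict mapping check id -> weight (default 1.0).
    """
    weights = weights or {}
    total = 0.0
    for check_id, (_desc, if_true, if_false) in checks.items():
        base = if_true if results[check_id] else if_false
        total += weights.get(check_id, 1.0) * base
    return total


def is_suspicious(results, threshold=5.0, checks=CHECKS, weights=None):
    """Flag the transaction when its cumulative score is below the threshold."""
    return cumulative_score(results, checks, weights) < threshold
```

Adding a new check means adding one entry to `CHECKS`, and business-specific weighting is supplied through `weights`, which mirrors the flexibility the slide describes.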

Subrogation Output Snippets: Logistic Regression

Fault Measure (excluded from the Model)
Fault Measure (included in the Model)


Subrogation Output Snippets:


Subrogation Output Snippets: ROC Curve

ROC Curve (Without Fault Measure)
ROC Curve (With Fault Measure)


Subrogation Screen Snippets: Fault Measure

Claim Details with Fault Measure
Fault Measure Justification


Subrogation Screen Snippets: Subrogation Prediction
