A recently released mobile app targeted at phone users in Tier-1/Tier-2 cities in India has acquired over half a million customers in a couple of months. Now its makers want to know: who are their customers?
Helping a mobile application to make a disruptive change to the monetization and feature enhancement of their product
During the first month of setting up our analytics division, we landed a very interesting client: a mobile application with over 500,000 downloads in the couple of months since its launch. The product is a utility app that helps users monitor their spending across their mobile operators. The app was predominantly focused on pay-as-you-go mobile users in India, a segment that makes up almost 95% of all mobile users in the country. Our objective was to use intelligent analytics to support the client across different parts of the business.
For this white paper we will focus on two key questions, which were addressed with different solution approaches:
Defining the product journey / product road map
Identifying and isolating reasons for user dissatisfaction / drop-off
As the client was itself a startup, there were the typical challenges of personnel not always being available to drive the analytics forward. We were often producing outputs independent of industry context. In hindsight, this seemed to be the biggest value we were bringing to the table.
Solution
The data set we worked on covered a wide range of parameters for the active user base of the client's product. It ranged from usage parameters of the core product and behavior around ancillary features to the user's interactions with other products / features on their particular device.
Methodology used with process diagrams
There were three key parts to helping the client leverage the power of the data they were collecting:
Personifying the customer base of the app
Mapping customer usage of the product to different customer types
Overlapping patterns across dissatisfied / dropped customers
The basic analytics methodology that we followed is shown in Fig. 1.
1. Exploratory Analytics of Data: The data comprised voice calling usage, internet data usage, app usage and recharge behavior of the customer. We initially looked at each of the data sets independently to extract emergent behaviors. Here the choice of inputs was critical, since subtle behavior patterns such as Recharge Quantum (the size of an individual recharge) define a user better than the total amount recharged over a month.
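To illustrate why Recharge Quantum can separate users that a monthly total cannot, here is a minimal Python sketch (the original analysis was done in R, so this is not the authors' code). The function name `recharge_profile` and the sample amounts are hypothetical, and the quantum is taken here as the median size of an individual recharge, which is only one possible reading of the term.

```python
from statistics import median

def recharge_profile(recharges):
    """Summarise one user's recharges for a month.

    `recharges` is a list of individual recharge amounts (hypothetical data).
    """
    if not recharges:
        return {"total": 0, "count": 0, "quantum": 0}
    return {
        "total": sum(recharges),        # what a monthly aggregate would show
        "count": len(recharges),        # how often the user tops up
        "quantum": median(recharges),   # typical size of a single top-up
    }

# Two hypothetical users with the same monthly total but different habits.
user_a = recharge_profile([10] * 10)   # many small top-ups
user_b = recharge_profile([100])       # one large top-up
```

Both users have the same `total`, so a monthly aggregate alone would treat them as identical; the `quantum` and `count` fields are what tell them apart.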
[Figure: User Profiling after Outlier Removal and Normalization by Looking at Various Raw Metrics]
2. Outlier Removal / Normalization: The nature of the data collection allows a few outliers to creep in. Using the data, we designed multiple outlier detection methods to remove them right at the beginning. The data in each of the sources was in different units; to avoid unnecessary confusion, we normalized every data set independently to its own maximum.
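The outlier removal and normalization step can be sketched as follows. The source does not name the outlier detection methods that were designed, so the interquartile-range (Tukey) rule below is an assumed stand-in, and the sample values and function names are hypothetical. Each data set is scaled by its own maximum, which matches the per-data-set normalization described above and assumes non-negative usage values.

```python
from statistics import quantiles

def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule, an assumed method)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def normalize_to_max(values):
    """Scale a data set by its own maximum so its largest value becomes 1.0."""
    peak = max(values)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]

# Hypothetical raw metrics in different units (minutes vs. megabytes).
call_minutes = [12, 15, 9, 20, 18, 14, 900]   # 900 stands in for a logging glitch
data_mb = [50, 120, 80, 200, 150, 100, 90]

clean_minutes = remove_outliers_iqr(call_minutes)
scaled_minutes = normalize_to_max(clean_minutes)
scaled_mb = normalize_to_max(remove_outliers_iqr(data_mb))
```

Removing outliers before scaling matters here: if the glitch value were kept, it would become the maximum and squash every genuine value towards zero.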
3. Data Management: The next iteration of analytics drives the simplification and reorganization of the data set into more easily analyzable components (e.g. getting data from multiple MongoDB collections into smaller, easily analyzable data sets to answer the specific focus question for that iteration). We created ~200 metrics for each user from their usage behavior. The metrics were designed so that they could be easily appended to as data accumulated. Many of these were derived metrics, such as the Incoming Call to Outgoing Call ratio. Such derived metrics help capture the non-linear nature of user behavior and interaction.
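A derived metric of this kind can be sketched in a few lines of Python. The field names, the `update_counts` helper and the sample numbers are hypothetical; the idea shown is to store raw running counts, so that new data can simply be added on, and to re-derive the ratio from them each time.

```python
def call_ratio(incoming, outgoing):
    """Incoming-to-outgoing call ratio; None when there are no outgoing calls."""
    if outgoing == 0:
        return None
    return incoming / outgoing

def update_counts(counts, new_incoming, new_outgoing):
    """Add newly accumulated activity to the running counts, then re-derive the ratio."""
    counts["incoming"] += new_incoming
    counts["outgoing"] += new_outgoing
    counts["in_out_ratio"] = call_ratio(counts["incoming"], counts["outgoing"])
    return counts

# Hypothetical user: 30 incoming and 10 outgoing calls so far.
user = {"incoming": 30, "outgoing": 10, "in_out_ratio": None}
# A new batch of data arrives: 10 more incoming and 10 more outgoing calls.
user = update_counts(user, new_incoming=10, new_outgoing=10)
```

Keeping the raw counts alongside the ratio avoids averaging ratios across batches, which would give a different and misleading number.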
4. Clustering Methods: With the clean, metricized data, we ran several unsupervised clustering algorithms, such as Random Forests, Hierarchical Clustering and multi-step k-means. The clusters were then corroborated with methods such as Silhouette Analysis and training/testing data sets. We ran these methods iteratively until some clearly distinct user groups emerged.
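The loop of clustering and then checking the clusters with silhouette analysis can be sketched as below. This is plain single-pass k-means with a deterministic farthest-first seeding, not the multi-step variant or the R implementation used in the project, and the nine two-dimensional users (normalized voice and data usage) are hypothetical.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then keep adding the
    point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iterations=20):
    """Lloyd's k-means; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; values near 1 mean tight, well-separated clusters."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: score is 0 by convention
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical users as (normalized voice usage, normalized data usage).
users = [(0.10, 0.10), (0.12, 0.08), (0.08, 0.12),   # light on both
         (0.90, 0.10), (0.88, 0.12), (0.92, 0.08),   # voice heavy
         (0.10, 0.90), (0.12, 0.88), (0.08, 0.92)]   # data heavy

labels3 = kmeans(users, 3)
scores = {k: silhouette(users, kmeans(users, k)) for k in (2, 3)}
best_k = max(scores, key=scores.get)
```

Comparing the mean silhouette across candidate values of `k` is one common way to decide how many groups the data supports; on real data with around 200 metrics per user the separation would be far less clean than in this toy example.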
The results were quite interesting. Most app users were light users of data and voice calling. The app was largely adopted by users who were keen to control their usage.
Detailing the customer base:
Metrics co-occurrence to remove similar variables
Clustering into 3 distinct groups based on usage patterns
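One way to remove similar variables before clustering is sketched below. The source does not say how metric co-occurrence was measured, so Pearson correlation with a 0.95 cut-off is an assumption here, and the metric names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant metric has no co-variation to measure
    return cov / (sx * sy)

def prune_similar(metrics, threshold=0.95):
    """Keep a metric only if it is not strongly correlated with one already kept."""
    kept = []
    for name in metrics:
        if all(abs(pearson(metrics[name], metrics[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-user metrics (one value per user).
metrics = {
    "call_minutes": [10, 20, 30, 40, 50],
    "call_seconds": [600, 1200, 1800, 2400, 3000],   # same signal, different unit
    "data_mb": [300, 50, 400, 20, 250],
}
kept = prune_similar(metrics)
```

Dropping near-duplicate metrics keeps one signal from being counted several times in the distance calculations that clustering relies on.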
Technology used
All the analytics discussed above was performed in R, the most common multi-faceted language among data scientists. Visualisations were predominantly built with d3.js.
Solution core - How our methodology and technology formed the solution that addressed the problem
Typically, an iterative methodology such as the one we implemented is the best practice for deriving high-quality insights from any data set. That doesn't mean losing a constant focus on outcomes: outcome focus always drives evolution of the answer. Sometimes the evolution of the answer is essentially an improved understanding of the data set on the part of the data scientist and the business leader. This played out a few times across all the pieces of work we executed.
The Benefits -
The biggest value of this project is measured over a longer duration, when this approach is used repeatedly to identify the customer groups and hence define which features are most critical for development.
Specific impact on minimized development costs
A simplistic way to quantify the impact was to count the features that were dropped and the engineering time (and hence money) saved by not developing redundant features. In this context, this could be valued at savings of almost 70% on the total development costs.
Secondary impact on user churn
Based on simple feature tweaks and improved solutions aimed at the biggest chunks of drop-offs, an immediate improvement of almost 5% was seen in reducing the number of customers uninstalling the app. Over time, as more corrective features are implemented, and assuming the appropriate groups are completely addressed, there is an opportunity to improve customer retention by almost 30%. The impact of this on marketing spend is significant. In this case, this single change would increase the runway of the startup on current funding by almost 5 months.
Additional benefits that can be seen but not measured
Customer acquisition costs can be rationalized by focusing on specific customer segments
Additional monetization avenues were emerging based on the customer segments, through advertising and focused targeting options for advertisers
Alternate industries could be challenged by precise recommendations of products to each customer segment within the population
Conclusion
While our analytics work focused on answering a few very targeted questions, the outcomes were fascinating for both the client and us. This piece of work was hence extended into a recurring relationship in which building a deeper understanding and awareness of the data was the clear definition of success.
In the current context of products (web, apps, IoT devices) generating large data sets, sophisticated focus is required to draw out insights from any data set that can materially transform a business. Our unique combination of business acumen, high-quality data scientists and a well-proven execution process for any analytics problem helps us quickly get to the essence of any problem with a client.