
Clustering

Basic ideas

QRM: Prasenjit Chakrabarti
Clustering

• Clustering is an ill-defined problem

• Loosely: there is no unique solution to a clustering problem

Why is it so?
Desirable properties of a clustering algorithm:
1. Scale invariance: f(D) = f(αD), where α > 0 is a scale factor. If we multiply the
distances between every pair of points by a constant α, the clustering solution
should remain the same (a small numerical check follows this list).

2. Richness: any possible partitioning should be a possible outcome. Loosely, the
algorithm should be able to produce any grouping of the same set of data.

3. Consistency: suppose we initially obtained n clusters. Loosely, if the distances between
points within each cluster are reduced and the distances between points in different
clusters are enlarged, then the number of clusters should remain n.

• We CANNOT have all three properties satisfied at the same time (this is Kleinberg's impossibility result).
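As an illustration (not part of the original slides), here is a minimal numerical sketch of property 1, using SciPy's single-linkage hierarchical clustering as the example algorithm and an arbitrary scale factor α: scaling all pairwise distances leaves the resulting partition unchanged.

```python
# A minimal sketch (not from the slides) of scale invariance, assuming SciPy's
# single-linkage hierarchical clustering as the example algorithm.
# Multiplying every pairwise distance by a constant alpha > 0 should leave
# the resulting partition unchanged.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))            # 30 points in 2 dimensions
D = pdist(X)                            # condensed vector of pairwise distances d_ij

alpha = 7.5                             # arbitrary positive scale factor
labels_orig = fcluster(linkage(D, method="single"), t=3, criterion="maxclust")
labels_scaled = fcluster(linkage(alpha * D, method="single"), t=3, criterion="maxclust")

print(np.array_equal(labels_orig, labels_scaled))   # True: the partition is unchanged
```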

K-means clustering
• Suppose we have n data points indexed by 1, 2, …, n

• Suppose we want K clusters, k ∈ {1, 2, …, K}

• We need to assign each point to exactly one cluster: C(i) = k, where C is the
assignment function (illustrated below)

• The goal is to find a grouping of the data such that the distances between points
within a cluster tend to be small and the distances between points in different
clusters tend to be large.
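As a tiny illustration (not from the slides), the assignment function C can be stored as an array in which C[i] holds the cluster label of point i:

```python
# Hypothetical example of an assignment function C for n = 10 points and K = 3 clusters.
import numpy as np

C = np.array([0, 2, 1, 1, 0, 2, 2, 0, 1, 0])   # C[i] = k: point i belongs to cluster k
points_in_cluster_2 = np.where(C == 2)[0]       # all indices i with C(i) = 2
print(points_in_cluster_2)                      # [1 5 6]
```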

K-means clustering
Consider the sum of the distances between all pairs of points, and call this total
distance T. Now consider two points i and j: they can be in the same cluster, or they
can belong to two different clusters. The total distance can therefore be written as

T = Σ_{k=1}^{K} Σ_{C(i)=k} ( Σ_{C(j)=k} d_ij + Σ_{C(j)≠k} d_ij ) = W(C) + B(C)

where d_ij = distance between points i and j
W(C) = within-cluster distance
B(C) = between-cluster distance
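The decomposition can be checked numerically. The following is a minimal sketch (not from the slides), using random data and an arbitrary assignment C; the data and distance measure are purely illustrative.

```python
# A minimal sketch verifying the decomposition T = W(C) + B(C) on random data
# with an arbitrary cluster assignment C.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))                 # 50 data points, 2 features
C = rng.integers(0, 3, size=50)              # arbitrary assignment to K = 3 clusters

# Pairwise distances d_ij (Euclidean here; any dissimilarity would do).
diff = X[:, None, :] - X[None, :, :]
d = np.sqrt((diff ** 2).sum(axis=-1))

T = d.sum()                                  # total distance over all pairs
same_cluster = (C[:, None] == C[None, :])
W = d[same_cluster].sum()                    # within-cluster distance W(C)
B = d[~same_cluster].sum()                   # between-cluster distance B(C)

print(np.isclose(T, W + B))                  # True: minimizing W(C) maximizes B(C)
```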

K-means clustering
Thus, clustering becomes an optimization problem.
Loosely:

Since T is fixed, our objective is either to minimize W(C) or, equivalently, to maximize B(C).

This is how the clustering problem is transformed into an optimization problem.

• Generally, for large data sets we use K-means clustering, where we exogenously
define the number of clusters (a usage sketch follows).
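A usage sketch, assuming scikit-learn is available (the library and parameter choices are illustrative, not part of the slides):

```python
# K-means with an exogenously chosen number of clusters K.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 2))               # some (large-ish) data set

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

print(km.labels_[:10])                       # C(i) for the first ten points
print(km.inertia_)                           # within-cluster sum of squared distances
                                             # to the centroids: what K-means minimizes
```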

Some Data

This could easily be modeled by a Gaussian mixture (with 5 components).

But let's look at a satisfying, friendly and infinitely popular alternative…

K-means
1. Ask user how many clusters they'd like. (e.g. k = 5)
2. Randomly guess k cluster center locations.
3. Each datapoint finds out which center it's closest to. (Thus each center "owns" a set of datapoints.)
4. Each center finds the centroid of the points it owns…
5. …and jumps there.
6. …Repeat until terminated! (See the sketch below.)
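Below is a from-scratch sketch of steps 1-6 (a minimal Lloyd's-algorithm implementation on synthetic data; it is not the code behind the slides' example):

```python
# A minimal K-means (Lloyd's algorithm) sketch following steps 1-6.
import numpy as np

def kmeans(X, k=5, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 2: randomly guess k cluster-center locations (here: k random data points).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 3: each datapoint finds out which center it is closest to.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        owner = dists.argmin(axis=1)
        # Steps 4-5: each center finds the centroid of the points it owns and jumps there.
        new_centers = np.array([
            X[owner == j].mean(axis=0) if np.any(owner == j) else centers[j]
            for j in range(k)
        ])
        # Step 6: repeat until the centers stop moving.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, owner

# Step 1: the user chooses the number of clusters, e.g. k = 5.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
centers, labels = kmeans(X, k=5)
print(centers)
```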
K-means
Start

Advance apologies: in black and white this example will deteriorate.

Example generated by Dan Pelleg's super-duper fast K-means system:
Dan Pelleg and Andrew Moore. Accelerating Exact k-means Algorithms with Geometric Reasoning. Proc. Conference on Knowledge Discovery in Databases 1999 (KDD99) (available on www.autonlab.org/pap.html)

K-means
continues… (the successive iterations are shown as figures in the original slides)

K-means
terminates
