Cluster Analysis – K-means

Question 5 – Suggested Solution

1. Cluster 1 has 40 cases (40/60 = 66.67% of buyers) and is characterised by
people who have run considerably fewer marathons than members of the
other cluster; people who have a slightly above average height; people
who weigh significantly less than members of the other cluster; and people
who find the shoes moderately uncomfortable.

Cluster 2 has 20 cases (20/60 = 33.33% of buyers) and is characterised by
people who have run many marathons; people who have a below average
height; people who weigh considerably more than members of the other
cluster; and people who find the shoes very comfortable.

MARATHON: p-val = 0.00 < 0.05
HEIGHT: p-val = 0.007 < 0.05
WEIGHT: p-val = 0.00 < 0.05
COMFORT: p-val = 0.00 < 0.05

Therefore, the differences between the clusters on all four variables are
significant at the 5% significance level.
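
The significance check above can be sketched in Python. This is a minimal illustration, assuming the reported p-values come from a one-way ANOVA per variable (the usual output of statistical packages for K-means, but not stated in the solution). The function name `one_way_f` and the toy values are hypothetical; the survey data itself is not reproduced here.

```python
from statistics import mean

ALPHA = 0.05  # 5% significance level used in the solution

# p-values as reported in the solution for the 2-cluster model.
reported_p = {"MARATHON": 0.00, "HEIGHT": 0.007, "WEIGHT": 0.00, "COMFORT": 0.00}
significant = {var: p < ALPHA for var, p in reported_p.items()}
print(significant)


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of clusters
    n = len(all_values)    # total number of cases
    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the cases around their own cluster mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up toy values for illustration only, NOT the data from the exercise.
toy_cluster_a = [1.0, 2.0, 3.0]
toy_cluster_b = [4.0, 5.0, 6.0]
f_stat = one_way_f([toy_cluster_a, toy_cluster_b])
print(f"F = {f_stat:.2f}")
```

A large F statistic means the cluster means lie far apart relative to the spread within each cluster, which is what a small p-value reflects.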

2. CLUSTER 1: 25 “YES”es out of 40 = 62.50%
CLUSTER 2: 7 “YES”es out of 20 = 35.00%

Therefore, Cluster 1 is more likely to contain repeat purchasers.
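
The percentages above follow directly from the counts. A minimal Python sketch of the arithmetic, where the names `repeat_counts` and `repeat_rate` are illustrative and not part of the original exercise:

```python
# (YES count, cluster size) per cluster, taken from the 2-cluster solution above.
# Names are illustrative assumptions, not from the original exercise.
repeat_counts = {"Cluster 1": (25, 40), "Cluster 2": (7, 20)}


def repeat_rate(yes, size):
    """Return the share of repeat purchasers in a cluster, as a percentage."""
    return 100.0 * yes / size


rates = {name: repeat_rate(yes, size) for name, (yes, size) in repeat_counts.items()}
best = max(rates, key=rates.get)  # cluster with the highest repeat-purchase rate

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"Highest repeat-purchase rate: {best}")
```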

3. Cluster 1 has 19 cases (19/60 = 31.67% of buyers) and is characterised by
people who have run many marathons; people who have below average
height; people who weigh considerably more than members of the other
clusters; and people who find the shoes very comfortable.

Cluster 2 has 21 cases (21/60 = 35.00% of buyers) and is characterised by
people who have run a slightly above average number of marathons;
people who are very tall; people who have marginally below average
weight; and people who find the shoes slightly uncomfortable.

Cluster 3 has 20 cases (20/60 = 33.33% of buyers) and is characterised by
people who have run very few marathons; people who have below
average height; people who weigh much less than members of the other
clusters; and people who find the shoes very uncomfortable.

MARATHON: p-val = 0.00 < 0.05
HEIGHT: p-val = 0.00 < 0.05
WEIGHT: p-val = 0.00 < 0.05
COMFORT: p-val = 0.00 < 0.05

Therefore, the differences between the clusters on all four variables are
significant at the 5% significance level.

4. CLUSTER 1: 7 “YES”es out of 19 = 36.84%
CLUSTER 2: 14 “YES”es out of 21 = 66.67%
CLUSTER 3: 11 “YES”es out of 20 = 55.00%

Therefore, Cluster 2 is more likely to contain repeat purchasers.
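
As a sanity check, the 3-cluster counts should account for the same 60 buyers and the same total number of repeat purchasers as the 2-cluster counts in part 2 (25 + 7 "YES" answers). A minimal Python sketch, with illustrative names that are not part of the original exercise:

```python
# (YES count, cluster size) per cluster, as reported in parts 2 and 4.
# Names are illustrative assumptions, not from the original exercise.
two_cluster = {"Cluster 1": (25, 40), "Cluster 2": (7, 20)}
three_cluster = {"Cluster 1": (7, 19), "Cluster 2": (14, 21), "Cluster 3": (11, 20)}


def totals(solution):
    """Return (total YES answers, total cases) across all clusters."""
    yes = sum(y for y, _ in solution.values())
    size = sum(s for _, s in solution.values())
    return yes, size


# Repeat-purchase rate per cluster in the 3-cluster solution, in percent.
rates3 = {name: 100.0 * y / s for name, (y, s) in three_cluster.items()}
best3 = max(rates3, key=rates3.get)

for name, rate in rates3.items():
    print(f"{name}: {rate:.2f}%")
print("2-cluster totals:", totals(two_cluster))
print("3-cluster totals:", totals(three_cluster))
print(f"Highest repeat-purchase rate: {best3}")
```

Both solutions total 32 repeat purchasers out of 60 buyers, so the two sets of counts are consistent with each other.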


5. Open to interpretation. Both the 2-Cluster and 3-Cluster models pick up
significant differences. The 3-Cluster model has the advantage of
identifying a cluster that finds the shoes slightly uncomfortable, whereas
the 2-Cluster model only distinguishes between members who find the
shoes moderately uncomfortable and members who find them very
comfortable. The 3-Cluster model also identifies people who have run a
slightly above average number of marathons; people who are very tall;
and people who weigh slightly below average.

Because the 3-Cluster solution picks up these important groupings, it
would seem preferable to the 2-Cluster solution.

Either answer is acceptable as long as it is substantiated.
