Bisecting k-means algorithm:

Traditional k-means provides no guarantee of producing balanced clusters. It is
therefore quite possible for a single cluster to contain a large portion of the
entire dataset, which limits the usefulness of the algorithm for improving
scalability. Bisecting k-means, a variant of k-means, can be used to enforce
balancing constraints: at each step it bisects the largest remaining cluster into
two subclusters, until the desired value of k is reached. The algorithm starts with
the whole dataset as a single cluster and typically converges to balanced clusters
of similar sizes.
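
To make "balanced" concrete, the sketch below measures how evenly a labeling spreads samples over clusters. The helper names (cluster_sizes, imbalance_ratio) and the ratio itself are illustrative assumptions, not something the text defines.

```python
from collections import Counter

def cluster_sizes(labels):
    """Return {cluster id: number of samples} for a list of cluster labels."""
    return dict(Counter(labels))

def imbalance_ratio(labels):
    """Largest cluster size divided by the smallest; 1.0 means perfectly balanced.

    Assumption: this ratio is only one possible way to quantify balance.
    """
    sizes = cluster_sizes(labels).values()
    return max(sizes) / min(sizes)

# Hypothetical labelings of 10 samples into 2 clusters.
skewed = [0] * 9 + [1]      # one cluster holds 9 of the 10 samples
even = [0] * 5 + [1] * 5    # both clusters hold 5 samples

print(imbalance_ratio(skewed))  # 9.0
print(imbalance_ratio(even))    # 1.0
```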

Algorithms: K-means and Bisect K-means


1 Basic K-means Steps

(1) Randomly select k initial “center points” from the dataset.
(2) For each sample, calculate the distance between the sample and each of the
k center points. Then assign each sample to the class whose center point is
closest.
(3) Recalculate each center point as the mean of the samples in its class.
(4) Repeat steps (2) and (3) until the center points no longer change.
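
The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation. Several details are assumptions the text does not specify: samples are tuples of numbers, distance is squared Euclidean, an empty class keeps its old center, and max_iter caps the number of iterations.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, seed=0, max_iter=100):
    """Return (centers, labels) for basic k-means on a list of tuples."""
    rng = random.Random(seed)
    # Step (1): randomly select k samples as the initial center points.
    centers = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step (2): assign each sample to the class with the closest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Step (3): recompute each center as the mean of its class.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Assumption: an empty class keeps its previous center.
            new_centers.append(mean_point(members) if members else centers[j])
        # Step (4): stop once the center points no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical 1-D data with two well-separated groups.
data = [(0,), (1,), (2,), (100,), (101,), (102,)]
centers, labels = kmeans(data, k=2)
print(centers, labels)
```

Which class receives index 0 depends on the random initial centers, so only the grouping should be relied on, not the label values.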

2 Bisect K-means

(1) Run basic K-means with k=2.
(2) Divide one class into two subclasses at a time.
(3) Apply the split repeatedly to the subclasses to obtain any number of classes.
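
A minimal sketch of the bisecting procedure, assuming (as the introduction states) that the largest remaining cluster is the one split at each step. The function names, the tuple representation of samples, the squared Euclidean distance, and the early exit when a cluster cannot be split are all illustrative assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def two_means(points, rng, max_iter=100):
    """Split points into two lists using basic k-means with k=2."""
    centers = rng.sample(points, 2)
    halves = ([], [])
    for _ in range(max_iter):
        halves = ([], [])
        for p in points:
            d0 = squared_distance(p, centers[0])
            d1 = squared_distance(p, centers[1])
            halves[0 if d0 <= d1 else 1].append(p)
        # Assumption: an empty half keeps its previous center.
        new_centers = [mean_point(h) if h else c
                       for h, c in zip(halves, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return halves

def bisecting_kmeans(points, k, seed=0):
    """Return a list of up to k clusters, each a list of points."""
    rng = random.Random(seed)
    clusters = [list(points)]  # start with the whole dataset as one cluster
    while len(clusters) < k:
        # Pick the largest remaining cluster and remove it from the list.
        i = max(range(len(clusters)), key=lambda idx: len(clusters[idx]))
        largest = clusters.pop(i)
        if len(largest) < 2:
            clusters.append(largest)  # nothing left to split
            break
        left, right = two_means(largest, rng)
        if not left or not right:
            clusters.append(largest)  # degenerate split, e.g. identical points
            break
        clusters.extend([left, right])
    return clusters

# Hypothetical 1-D data: 3 low values and 4 high values.
data = [(0,), (1,), (2,), (100,), (101,), (102,), (103,)]
for cluster in bisecting_kmeans(data, k=3):
    print(sorted(cluster))
```

Because each step splits only the largest cluster, no single cluster can keep absorbing most of the data, which is the balancing effect described in the introduction.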
