K-Means Clustering
- In Module 1 we learned that clustering means grouping similar things together without using labels.
- Now, let’s study one of the most popular clustering algorithms: K-Means.
1. What is K-Means?
- K-Means groups data into K clusters, where K is the number of groups you choose in advance.
- Each cluster has a center point (called a centroid).
- Every data point belongs to the cluster whose centroid it is closest to.
Example: Suppose you want to divide customers into 3 groups based on their income and age.
- K-Means will place 3 centers.
- Each customer will be assigned to the nearest center.
- The groups form automatically from the data; no one has to label the customers beforehand.
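To make the customer example concrete, here is a minimal sketch of the "assign to the nearest center" idea. The customer values and the three center positions are made up for illustration, and `nearest_center` is a hypothetical helper name; in real K-Means the centers are learned from the data rather than picked by hand.

```python
import math

# Hypothetical customers as (age in years, income in thousands); values are made up.
customers = [(22, 25), (25, 30), (41, 62), (45, 58), (60, 110), (63, 120)]

# Three hand-picked centers, purely for illustration (assumption).
centers = [(24, 28), (43, 60), (62, 115)]

def nearest_center(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))

# Each customer gets the index of its nearest center.
labels = [nearest_center(p, centers) for p in customers]
print(labels)  # prints [0, 0, 1, 1, 2, 2]
```

Age and income are measured on different scales, so with real data the features are usually rescaled first; otherwise the feature with the larger numbers tends to dominate the distance.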
2. How Does K-Means Work?
Step-by-step intuition:
- Choose K → Decide how many clusters (groups) you want.
- Place centers → K-Means starts with K random points as “centers.”
- Assign points → Each data point is assigned to the nearest center.
- Update centers → Move each center to the average (mean) position of its assigned points.
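The four steps above can be sketched in plain Python as follows. This is a rough illustration, not a reference implementation. The sample points, the helper names `assign_points` and `update_centers`, the choice of random data points as starting centers, and the fixed count of 10 rounds are all assumptions made for this sketch. In practice the assign and update steps are repeated until the assignments stop changing.

```python
import math
import random

def assign_points(points, centers):
    """Assign step: for every point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]

def update_centers(points, labels, centers):
    """Update step: move each center to the mean of its assigned points."""
    new_centers = []
    for j, old_center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # Assumption: a center with no points simply stays where it is.
            new_centers.append(old_center)
            continue
        mean = tuple(
            sum(m[d] for m in members) / len(members)
            for d in range(len(old_center))
        )
        new_centers.append(mean)
    return new_centers

# Made-up 2D points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

k = 2                             # Choose K
rng = random.Random(0)            # fixed seed so the run is repeatable
centers = rng.sample(points, k)   # Place centers: K random data points

for _ in range(10):               # assumption: a fixed number of rounds
    labels = assign_points(points, centers)            # Assign points
    centers = update_centers(points, labels, centers)  # Update centers

print(labels)
print(centers)
```

Which group ends up as cluster 0 and which as cluster 1 depends on the random starting centers, so only the grouping itself is meaningful, not the label numbers.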