Clustering is a broad set of techniques for finding subgroups of observations within a data set. When we cluster observations, we want observations in the same group to be similar and observations in different groups to be dissimilar. Because there isn’t a response variable, clustering is an unsupervised method: it seeks to find relationships between the observations without being trained on a response variable. Clustering allows us to identify which observations are alike and potentially to categorize them accordingly. K-means clustering is one of the simplest and most commonly used clustering methods for splitting a data set into a set of k groups.

This tutorial serves as an introduction to the k-means clustering method. It covers the following topics:

Replication Requirements: What you’ll need to reproduce the analysis in this tutorial.
Data Preparation: Preparing our data for cluster analysis.
Clustering Distance Measures: Understanding how to measure differences in observations.
K-Means Clustering: Calculations and methods for creating K subgroups of the data.
Determining Optimal Clusters: Identifying the right number of clusters to group your data.