Rousseeuw's Silhouette Quality Index
Source: R/MeasureClustSimple.R
mlr_measures_clust.silhouette.Rd
The Silhouette Width measures how well each observation fits within its assigned cluster compared to the nearest neighboring cluster. For each observation \(i\), the silhouette value is defined as \(s(i) = (b(i) - a(i)) / \max(a(i), b(i))\), where \(a(i)\) is the average distance from \(i\) to all other observations in the same cluster and \(b(i)\) is the smallest average distance from \(i\) to the observations of any other cluster. The score returned is the mean silhouette width across all observations.
Values close to 1 indicate well-clustered observations, values near 0 indicate observations on cluster boundaries, and negative values indicate observations that may have been assigned to the wrong cluster.
The score function calls cluster::silhouette() from package cluster.
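The definition above can be illustrated by calling cluster::silhouette() directly. This is a minimal sketch on made-up toy data, not the measure's actual implementation: the matrix `x`, the labels in `partition`, and the name `mean_width` are all assumptions chosen for illustration.

```r
library(cluster)

# Toy data (assumed for illustration): two well-separated groups on a line
x <- matrix(c(0, 1, 2, 10, 11, 12), ncol = 1)
partition <- c(1L, 1L, 1L, 2L, 2L, 2L)

# silhouette() takes integer cluster labels and a dissimilarity object
sil <- silhouette(partition, dist(x))

# Per-observation widths s(i) are stored in the "sil_width" column
widths <- sil[, "sil_width"]

# The reported score corresponds to the mean of the s(i)
mean_width <- mean(widths)
mean_width
```

For the second observation (value 1), \(a(i) = 1\) and \(b(i) = 10\), so \(s(i) = (10 - 1) / 10 = 0.9\), which matches `widths[2]`.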
Details
If the task contains factor or ordered features, Gower distances (cluster::daisy()) are used instead of
Euclidean distances.
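The switch between distance types might look roughly like the sketch below. It only mirrors the rule described above and is not taken from the package source: the data frame `df`, the labels in `partition`, and the variable `has_categorical` are hypothetical.

```r
library(cluster)

# Hypothetical mixed-type data: one numeric and one factor feature
df <- data.frame(
  size  = c(1.0, 1.2, 0.9, 5.0, 5.3, 4.8),
  color = factor(c("red", "red", "red", "blue", "blue", "blue"))
)
partition <- c(1L, 1L, 1L, 2L, 2L, 2L)

# is.factor() is TRUE for both factor and ordered columns
has_categorical <- any(vapply(df, is.factor, logical(1)))

# Gower distances for mixed data, Euclidean distances otherwise
d <- if (has_categorical) daisy(df, metric = "gower") else dist(df)

sil <- silhouette(partition, d)
gower_width <- mean(sil[, "sil_width"])
gower_width
```

Gower distances rescale each feature to \([0, 1]\) before averaging, so numeric and categorical features contribute on a comparable scale.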
Dictionary
This mlr3::Measure can be instantiated via the dictionary mlr3::mlr_measures or with the
associated sugar function mlr3::msr():
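A minimal sketch of both routes, assuming the dictionary key is "clust.silhouette" (inferred from the topic name mlr_measures_clust.silhouette) and that mlr3 and mlr3cluster are installed:

```r
library(mlr3)
library(mlr3cluster)

# Retrieve the measure from the dictionary ...
measure_dict <- mlr_measures$get("clust.silhouette")

# ... or use the sugar function, which performs the same lookup
measure <- msr("clust.silhouette")

measure$id
```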
Meta Information
Task type: “clust”
Range: \([-1, 1]\)
Minimize: FALSE
Average: macro
Required Prediction: “partition”
Required Packages: mlr3, mlr3cluster, cluster
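As a usage sketch under stated assumptions, the measure can be applied to a "partition" prediction as follows. The task `tsk("usarrests")`, the learner `lrn("clust.kmeans")`, and the choice of three centers are illustrative picks rather than part of this page; the k-means learner is assumed to have its own dependencies (such as clue) installed. Because the measure needs the feature data to compute distances, the task is passed to `$score()`.

```r
library(mlr3)
library(mlr3cluster)

set.seed(1)  # k-means starts from random centers

# Illustrative task and learner (assumptions, see lead-in)
task <- tsk("usarrests")
learner <- lrn("clust.kmeans", centers = 3L)

# Train, then predict a partition on the same task
learner$train(task)
prediction <- learner$predict(task)

# The silhouette measure needs the task data to compute distances
measure <- msr("clust.silhouette")
score <- prediction$score(measure, task = task)
score
```

Since Minimize is FALSE, a higher score indicates a better partition when comparing, for example, different numbers of centers.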
References
Rousseeuw, P J (1987). “Silhouettes: A graphical aid to the interpretation and validation of cluster analysis.” Journal of Computational and Applied Mathematics, 20, 53–65. doi:10.1016/0377-0427(87)90125-7.
See also
Dictionary of Measures: mlr3::mlr_measures
as.data.table(mlr_measures) for a complete table of all (also dynamically created) mlr3::Measure implementations.
Other cluster measures:
mlr_measures_clust.avg_between,
mlr_measures_clust.avg_within,
mlr_measures_clust.ch,
mlr_measures_clust.davies_bouldin,
mlr_measures_clust.dunn,
mlr_measures_clust.dunn2,
mlr_measures_clust.entropy,
mlr_measures_clust.pearsongamma,
mlr_measures_clust.wb_ratio,
mlr_measures_clust.wss