The Shannon entropy of the cluster size distribution is defined as \(H = -\sum_{k=1}^{K} p_k \log(p_k)\), where \(p_k = n_k / n\) is the proportion of observations in cluster \(k\). Lower values indicate more uneven cluster sizes (0 for a single cluster), while higher values indicate more uniform sizes, up to \(\log(K)\) when all \(K\) clusters have equal size. This measure does not evaluate cluster quality directly; it characterizes the balance of the partition.
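To make the formula concrete, the following base R sketch computes the entropy for three toy partitions. The helper `partition_entropy()` and the example vectors are hypothetical and not part of mlr3cluster, and the natural logarithm is an assumption, since the base of the logarithm is not stated here.

```r
# Hypothetical helper (not part of mlr3cluster): Shannon entropy of the
# cluster size distribution, assuming the natural logarithm.
partition_entropy = function(partition) {
  n_k = table(partition)                     # cluster sizes n_k
  p_k = as.numeric(n_k) / length(partition)  # proportions p_k = n_k / n
  p_k = p_k[p_k > 0]                         # drop empty factor levels to avoid 0 * log(0)
  -sum(p_k * log(p_k))
}

# Toy partitions of 100 observations each
balanced = rep(1:4, each = 25)               # four equal-sized clusters
skewed   = rep(1:4, times = c(97, 1, 1, 1))  # one dominant cluster
single   = rep(1, 100)                       # everything in one cluster

partition_entropy(balanced)  # equals log(4), the maximum for K = 4
partition_entropy(skewed)    # well below log(4)
partition_entropy(single)    # 0
```

The balanced partition reaches the upper bound \(\log(K)\), while the skewed partition scores much lower even though it has the same number of clusters.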
Dictionary
This mlr3::Measure can be instantiated via the dictionary mlr3::mlr_measures or with the
associated sugar function mlr3::msr():
Meta Information
Task type: “clust”
Range: \([0, \infty)\)
Minimize: NA
Average: macro
Required Prediction: “partition”
Required Packages: mlr3, mlr3cluster
See also
Dictionary of Measures: mlr3::mlr_measures
as.data.table(mlr_measures) for a complete table of all (also dynamically created) mlr3::Measure implementations.
Other cluster measures:
mlr_measures_clust.avg_between,
mlr_measures_clust.avg_within,
mlr_measures_clust.ch,
mlr_measures_clust.davies_bouldin,
mlr_measures_clust.dunn,
mlr_measures_clust.dunn2,
mlr_measures_clust.pearsongamma,
mlr_measures_clust.silhouette,
mlr_measures_clust.wb_ratio,
mlr_measures_clust.wss