The Shannon entropy of the cluster size distribution, defined as \(H = -\sum_{k=1}^{K} p_k \log(p_k)\) where \(p_k = n_k / n\) is the proportion of observations in cluster \(k\). Values near 0 indicate uneven cluster sizes, with exactly 0 when all observations fall into a single cluster. Higher values indicate more uniform sizes, with the maximum \(\log(K)\) attained when all \(K\) clusters are equally sized. This measure does not evaluate cluster quality directly; it only characterizes the balance of the partition.
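
To make the formula concrete, the following is a minimal base R sketch of the computation. The helper `partition_entropy()` is hypothetical and not part of mlr3cluster; it assumes the partition is a plain integer vector of cluster labels and that the logarithm is the natural one.

```r
# Hypothetical helper (not part of mlr3cluster): Shannon entropy of the
# cluster size distribution, assuming the natural logarithm.
partition_entropy = function(partition) {
  n_k = table(partition)            # cluster sizes n_k (non-empty clusters only)
  p_k = as.numeric(n_k) / sum(n_k)  # proportions p_k = n_k / n
  -sum(p_k * log(p_k))
}

balanced = c(1, 1, 2, 2, 3, 3)  # three clusters of equal size
uneven   = c(1, 1, 1, 1, 2, 3)  # one dominant cluster
single   = rep(1, 6)            # all observations in one cluster

partition_entropy(balanced)  # equals log(3), the maximum for K = 3
partition_entropy(uneven)    # smaller than log(3)
partition_entropy(single)    # 0
```

The three partitions illustrate the interpretation above: the balanced partition reaches the upper bound for three clusters, the uneven one falls below it, and the single-cluster partition yields 0.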

Dictionary

This mlr3::Measure can be instantiated via the dictionary mlr3::mlr_measures or with the associated sugar function mlr3::msr():

mlr_measures$get("clust.entropy")
msr("clust.entropy")
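
As an illustration of how the measure might be used after instantiation, the sketch below scores a k-means partition. The task `"usarrests"`, the learner `"clust.kmeans"`, and the choice of `centers = 3` are assumptions made for this example only; any clustering learner that returns a partition should work the same way.

```r
library(mlr3)
library(mlr3cluster)

# Example task and learner chosen for illustration only.
task = tsk("usarrests")
learner = lrn("clust.kmeans", centers = 3)

# Fit the clustering and obtain the partition for the same observations.
learner$train(task)
prediction = learner$predict(task)

# Score the partition; the task is passed along for consistency with
# other clustering measures, although entropy only uses the partition.
score = prediction$score(msr("clust.entropy"), task = task)
score
```

Because k-means starts from random centers, the exact score can vary between runs unless a seed is set beforehand.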

Meta Information

  • Task type: “clust”

  • Range: \([0, \infty)\)

  • Minimize: NA

  • Average: macro

  • Required Prediction: “partition”

  • Required Packages: mlr3, mlr3cluster