Genie is a fast, robust, outlier-resistant hierarchical clustering algorithm that applies the Gini inequality measure to cluster sizes during the linkage process. The learner calls genieclust::gclust() from package genieclust.

There is no predict method for genieclust::gclust(), so the learner's predict method returns the cluster labels of the training data, obtained via stats::cutree() at the requested k.
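
The two steps named above (fitting the hierarchy, then cutting it at k) can be sketched by calling genieclust directly. This is only an illustration, assuming the genieclust package is installed; the built-in USArrests data and the variable names x, h, and labels are choices made for this sketch, and the learner's internal code may differ.

```r
library(genieclust)

# Numeric feature matrix; USArrests ships with R in the datasets package
x = as.matrix(datasets::USArrests)

# Fit the Genie hierarchy; the result is an object of class "hclust"
h = gclust(x, gini_threshold = 0.3, distance = "euclidean")

# Cut the hierarchy into k = 2 clusters: one label per training row
labels = stats::cutree(h, k = 2)
table(labels)
```

Because the labels come from cutting a tree built on the training rows, there is one label per training observation and nothing that could be applied to unseen rows.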

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("clust.genie")
lrn("clust.genie")
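
Hyperparameters can also be passed when the learner is constructed. A minimal sketch, assuming mlr3 and mlr3cluster are attached; the values k = 3 and gini_threshold = 0.5 are arbitrary and chosen only for illustration.

```r
library(mlr3)
library(mlr3cluster)

# Construct the learner and override selected hyperparameters in one call
learner = lrn("clust.genie", k = 3, gini_threshold = 0.5)

# Inspect the values that are currently set
learner$param_set$values
```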

Meta Information

  • Task type: “clust”

  • Predict Types: “partition”

  • Feature Types: “logical”, “integer”, “numeric”

  • Required Packages: mlr3, mlr3cluster, genieclust

Parameters

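| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| gini_threshold | numeric | 0.3 | | \([0, 1]\) |
| M | integer | 0 | | \([0, \infty)\) |
| distance | character | euclidean | euclidean, l2, manhattan, cityblock, l1, cosine | - |
| verbose | logical | FALSE | TRUE, FALSE | - |
| k | integer | 2 | | \([1, \infty)\) |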
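
The parameters listed above can be changed on an existing learner through its parameter set. The following is a sketch under the assumption that mlr3 and mlr3cluster are attached and that the installed paradox version provides set_values(); the chosen values and the usarrests task are illustrative only.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.genie")

# Change hyperparameters after construction
learner$param_set$set_values(k = 3, distance = "manhattan")

# Train on a clustering task shipped with mlr3cluster
task = tsk("usarrests")
learner$train(task)

# Predicting on the training task returns the training labels; the learner
# warns that it does not predict on new data
prediction = learner$predict(task)
table(prediction$partition)
```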

References

Gagolewski, Marek, Bartoszuk, Maciej, Cena, Anna (2016). “Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm.” Information Sciences, 363, 8–23. doi:10.1016/j.ins.2016.05.003.

Gagolewski, Marek (2021). “genieclust: Fast and robust hierarchical clustering.” SoftwareX, 15, 100722. doi:10.1016/j.softx.2021.100722.

See also

Other Learner: mlr_learners_clust.MBatchKMeans, mlr_learners_clust.SimpleKMeans, mlr_learners_clust.agnes, mlr_learners_clust.ap, mlr_learners_clust.bico, mlr_learners_clust.birch, mlr_learners_clust.clara, mlr_learners_clust.cmeans, mlr_learners_clust.cobweb, mlr_learners_clust.dbscan, mlr_learners_clust.dbscan_fpc, mlr_learners_clust.diana, mlr_learners_clust.em, mlr_learners_clust.fanny, mlr_learners_clust.featureless, mlr_learners_clust.ff, mlr_learners_clust.flexmix, mlr_learners_clust.hclust, mlr_learners_clust.hdbscan, mlr_learners_clust.kcca, mlr_learners_clust.kkmeans, mlr_learners_clust.kmeans, mlr_learners_clust.kproto, mlr_learners_clust.mclust, mlr_learners_clust.meanshift, mlr_learners_clust.movMF, mlr_learners_clust.optics, mlr_learners_clust.pam, mlr_learners_clust.protoclust, mlr_learners_clust.skmeans, mlr_learners_clust.som, mlr_learners_clust.specc, mlr_learners_clust.stdbscan, mlr_learners_clust.tclust, mlr_learners_clust.xmeans

Super classes

mlr3::Learner -> LearnerClust -> LearnerClustGenie

Methods

LearnerClustGenie$new()

Creates a new instance of this R6 class.

Usage

LearnerClustGenie$new()

LearnerClustGenie$clone()

The objects of this class are cloneable with this method.

Usage

LearnerClustGenie$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
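
The effect of deep = TRUE can be illustrated with a short sketch, assuming mlr3 and mlr3cluster are attached; the name copy and the value k = 4 are hypothetical and used only for this example.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.genie", k = 2)

# A deep clone also copies nested R6 objects such as the parameter set
copy = learner$clone(deep = TRUE)
copy$param_set$set_values(k = 4)

# The original learner keeps its own value of k
learner$param_set$values$k
copy$param_set$values$k
```

A deep clone is the safer choice whenever the copy will be modified, since a shallow clone may share nested objects with the original.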

Examples

# Define the Learner with its default parameter values
learner = lrn("clust.genie")
print(learner)
#> 
#> ── <LearnerClustGenie> (clust.genie): Genie Hierarchical Clustering ────────────
#> • Model: -
#> • Parameters: k=2
#> • Packages: mlr3, mlr3cluster, and genieclust
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and hierarchical
#> • Other settings: use_weights = 'error', predict_raw = 'FALSE'

# Define a Task
task = tsk("usarrests")

# Train the learner on the task
learner$train(task)

# Print the model
print(learner$model)
#> 
#> Call:
#> gclust.mst(d = tree, gini_threshold = gini_threshold, verbose = verbose)
#> 
#> Cluster method   : Genie(0.3) 
#> Distance         : euclidean 
#> Number of objects: 50 
#> 

# Make predictions for the task
prediction = learner$predict(task)
#> Warning: 
#>  Learner 'clust.genie' doesn't predict on new data and predictions may not
#>   make sense on new data.
#> → Class: Mlr3WarningInput

# Score the predictions
prediction$score(task = task)
#> clust.dunn 
#>  0.1532626