Robust trimmed clustering. Each cluster is modeled by a multivariate Gaussian; the most
outlying alpha fraction of observations is trimmed and labeled with cluster 0 in the returned partition.
Calls tclust::tclust() from package tclust.
The k parameter is set to 2 by default because tclust::tclust() has no default value for the number of
clusters. tclust::tclust() also has no predict method, so predicting with this learner returns the cluster
labels assigned to the training data.
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
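A minimal sketch of both routes, assuming mlr3cluster is installed and registers this learner under the key "clust.tclust" (the key used in the Examples section):

```r
library(mlr3)
library(mlr3cluster)  # loading the package adds the clustering learners to mlr_learners

# Route 1: retrieve the learner from the dictionary by its key
learner_a = mlr_learners$get("clust.tclust")

# Route 2: the sugar function, a shorthand for the dictionary lookup
learner_b = lrn("clust.tclust")
```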
Meta Information
Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, tclust
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| k | integer | - | | \([1, \infty)\) |
| alpha | numeric | 0.05 | | \([0, 0.5]\) |
| nstart | integer | 500 | | \([1, \infty)\) |
| niter1 | integer | 3 | | \([1, \infty)\) |
| niter2 | integer | 20 | | \([1, \infty)\) |
| nkeep | integer | 5 | | \([1, \infty)\) |
| iter.max | integer | - | | \([1, \infty)\) |
| equal.weights | logical | FALSE | TRUE, FALSE | - |
| restr | character | eigen | eigen, deter | - |
| restr.fact | numeric | 12 | | \([1, \infty)\) |
| cshape | numeric | 1e+10 | | \([1, \infty)\) |
| opt | character | HARD | HARD, MIXT | - |
| center | logical | FALSE | TRUE, FALSE | - |
| scale | logical | FALSE | TRUE, FALSE | - |
| parallel | logical | FALSE | TRUE, FALSE | - |
| n.cores | integer | -1 | | \((-\infty, \infty)\) |
| zero_tol | numeric | 1e-16 | | \([0, \infty)\) |
| drop.empty.clust | logical | TRUE | TRUE, FALSE | - |
| trace | integer | 0 | | \([0, \infty)\) |
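Hyperparameters from this table can be set when the learner is constructed or changed later through its parameter set. The following is a minimal sketch, assuming mlr3cluster is loaded; the values k = 3, alpha = 0.1, and nstart = 100 are arbitrary illustrations, not recommendations:

```r
library(mlr3)
library(mlr3cluster)

# Set hyperparameters while constructing the learner
learner = lrn("clust.tclust", k = 3L, alpha = 0.1)

# Change a value afterwards through the parameter set
learner$param_set$set_values(nstart = 100L)

# Inspect the values that are currently set
print(learner$param_set$values)
```

Values outside the ranges listed in the table, such as an alpha above 0.5, should be rejected by the parameter set when they are assigned.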
References
García-Escudero, A L, Gordaliza, Alfonso, Matrán, Carlos, Mayo-Iscar, Agustín (2008). “A general trimming approach to robust cluster analysis.” The Annals of Statistics, 36(3), 1324–1345. doi:10.1214/07-AOS515.
Fritz, Heinrich, García-Escudero, A L, Mayo-Iscar, Agustín (2012). “tclust: An R package for a trimming approach to cluster analysis.” Journal of Statistical Software, 47(12), 1–26. doi:10.18637/jss.v047.i12.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_clust.MBatchKMeans,
mlr_learners_clust.SimpleKMeans,
mlr_learners_clust.agnes,
mlr_learners_clust.ap,
mlr_learners_clust.bico,
mlr_learners_clust.birch,
mlr_learners_clust.clara,
mlr_learners_clust.cmeans,
mlr_learners_clust.cobweb,
mlr_learners_clust.dbscan,
mlr_learners_clust.dbscan_fpc,
mlr_learners_clust.diana,
mlr_learners_clust.em,
mlr_learners_clust.fanny,
mlr_learners_clust.featureless,
mlr_learners_clust.ff,
mlr_learners_clust.flexmix,
mlr_learners_clust.genie,
mlr_learners_clust.hclust,
mlr_learners_clust.hdbscan,
mlr_learners_clust.kcca,
mlr_learners_clust.kkmeans,
mlr_learners_clust.kmeans,
mlr_learners_clust.kproto,
mlr_learners_clust.mclust,
mlr_learners_clust.meanshift,
mlr_learners_clust.movMF,
mlr_learners_clust.optics,
mlr_learners_clust.pam,
mlr_learners_clust.protoclust,
mlr_learners_clust.skmeans,
mlr_learners_clust.som,
mlr_learners_clust.specc,
mlr_learners_clust.stdbscan,
mlr_learners_clust.xmeans
Super classes
mlr3::Learner -> LearnerClust -> LearnerClustTclust
Examples
# Define the Learner and set parameter values
learner = lrn("clust.tclust")
print(learner)
#>
#> ── <LearnerClustTclust> (clust.tclust): Robust Trimmed Clustering ──────────────
#> • Model: -
#> • Parameters: k=2
#> • Packages: mlr3, mlr3cluster, and tclust
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: exclusive and partitional
#> • Other settings: use_weights = 'error', predict_raw = 'FALSE'
# Define a Task
task = tsk("usarrests")
# Train the learner on the task
learner$train(task)
# Print the model
print(learner$model)
#> * Results for TCLUST algorithm: *
#> opt=HARD, trim = 0.05, k = 2
#> Restriction on: eigenvalues
#>
#> Classification (trimmed points are indicated by 0 ):
#> [1] 2 2 2 2 2 2 1 2 0 2 0 1 2 1 1 1 1 2 1 2 1 2 1 2 2 1 1 2 1 1 2 2 0 1 1 1 1 1
#> [39] 1 2 1 2 2 1 1 1 1 1 1 1
#> Means:
#> C 1 C 2
#> Assault 109.59259 243.05
#> Murder 4.67037 11.48
#> Rape 15.65926 28.53
#> UrbanPop 63.11111 68.25
#>
#> Trimmed objective function: -752.9647
#> Selected restriction factor: 12
# Make predictions for the task
prediction = learner$predict(task)
#> Warning:
#> ✖ Learner 'clust.tclust' doesn't predict on new data and predictions may not
#> make sense on new data.
#> → Class: Mlr3WarningInput
# Score the predictions
prediction$score(task = task)
#> clust.dunn
#> 0.06709199
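Because trimmed observations carry the label 0, they usually need to be separated from the regular clusters before cluster-wise summaries are computed. The following is a minimal base R sketch, assuming the partition is available as a plain vector; the vector is copied from the classification printed above, and the variable names are illustrative:

```r
# Partition copied from the printed model above (0 marks trimmed observations)
partition = c(
  2, 2, 2, 2, 2, 2, 1, 2, 0, 2, 0, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 2,
  1, 1, 2, 1, 1, 2, 2, 0, 1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1
)

# Logical mask of the trimmed observations
trimmed = partition == 0

# Row indices of the trimmed observations
trimmed_idx = which(trimmed)

# Cluster sizes with the trimmed observations excluded
cluster_sizes = table(partition[!trimmed])

print(trimmed_idx)
print(cluster_sizes)
```

For this partition, three observations (rows 9, 11, and 33) are trimmed, leaving 27 observations in cluster 1 and 20 in cluster 2. In practice the vector would typically be taken from the prediction object (for example prediction$partition) rather than typed in by hand.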