Robust Trimmed Clustering Learner

A robust, trimmed clustering learner. Each cluster is modeled by a multivariate Gaussian distribution. The alpha fraction of observations that are most outlying is trimmed and assigned the label 0 in the returned partition. Calls tclust::tclust() from package tclust.

The k parameter is set to 2 by default because tclust::tclust() has no default value for the number of clusters. tclust::tclust() also has no predict method, so predicting with this learner returns the cluster labels of the training data.
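Because k = 2 is only a placeholder, the number of clusters and the trimming fraction will usually be set explicitly. The following is a minimal sketch, assuming mlr3 and mlr3cluster are installed; the values k = 3 and alpha = 0.1 are purely illustrative and not recommendations.

```r
library(mlr3)
library(mlr3cluster)

# Illustrative values only: 3 clusters, trim the 10% most outlying observations
learner = lrn("clust.tclust", k = 3, alpha = 0.1)

# Inspect the values that will be passed on to tclust::tclust()
learner$param_set$values
```

Parameters that are not set here keep the defaults listed in the Parameters table.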

Initial parameter values

store_x:
- Actual default: TRUE.
- Adjusted default: FALSE.
- Reason for change: Avoid storing the training data in the model to save memory.
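If the training data should be kept in the fitted model after all, the adjusted default can be overridden. This is a small sketch, assuming mlr3 and mlr3cluster are installed; whether storing the data is worthwhile depends on what is done with the model afterwards.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.tclust")

# Restore the upstream tclust default and keep the training data in the model
learner$param_set$set_values(store_x = TRUE)
learner$param_set$values$store_x
```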

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("clust.tclust")
lrn("clust.tclust")

Meta Information

Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, tclust

Parameters

Id	Type	Default	Levels	Range
k	integer	-		$[1, \infty)$
alpha	numeric	0.05		$[0, 0.5]$
nstart	integer	500		$[1, \infty)$
niter1	integer	3		$[1, \infty)$
niter2	integer	20		$[1, \infty)$
nkeep	integer	5		$[1, \infty)$
equal.weights	logical	FALSE	TRUE, FALSE	-
restr	character	eigen	eigen, deter	-
restr.fact	numeric	12		$[1, \infty)$
cshape	numeric	1e+10		$[1, \infty)$
opt	character	HARD	HARD, MIXT	-
center	logical	FALSE	TRUE, FALSE	-
scale	logical	FALSE	TRUE, FALSE	-
store_x	logical	TRUE	TRUE, FALSE	-
parallel	logical	FALSE	TRUE, FALSE	-
n.cores	integer	-1		$(-\infty, \infty)$
zero_tol	numeric	1e-16		$[0, \infty)$
drop.empty.clust	logical	TRUE	TRUE, FALSE	-
trace	integer	0		$[0, \infty)$
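The same information can be queried from the learner's parameter set at runtime. The sketch below assumes mlr3 and mlr3cluster are installed. It also shows that a value outside the documented range, here alpha = 0.9 as an arbitrary example, is rejected when it is assigned.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.tclust")
ps = learner$param_set

# Parameter ids, and the default and bounds of alpha
ps$ids()
ps$default$alpha
ps$lower[["alpha"]]
ps$upper[["alpha"]]

# Assigning a value outside [0, 0.5] fails the parameter check
result = tryCatch(
  {
    ps$values$alpha = 0.9
    "accepted"
  },
  error = function(e) e
)
inherits(result, "error")
```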

References

García-Escudero, A L, Gordaliza, Alfonso, Matrán, Carlos, Mayo-Iscar, Agustín (2008). “A general trimming approach to robust cluster analysis.” The Annals of Statistics, 36(3), 1324–1345. doi:10.1214/07-AOS515.

Fritz, Heinrich, García-Escudero, A L, Mayo-Iscar, Agustín (2012). “tclust: An R package for a trimming approach to cluster analysis.” Journal of Statistical Software, 47(12), 1–26. doi:10.18637/jss.v047.i12.

See also

- Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
- Package mlr3extralearners for more learners.
- Dictionary of Learners: mlr3::mlr_learners
- as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
- mlr3pipelines to combine learners with pre- and postprocessing steps.
- Package mlr3viz for some generic visualizations.
- Extension packages for additional task types:
  - mlr3proba for probabilistic supervised regression and survival analysis.
  - mlr3cluster for unsupervised clustering.
- mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.

Super classes

mlr3::Learner -> LearnerClust -> LearnerClustTclust

Methods

Public methods

`LearnerClustTclust$new()`

Creates a new instance of this R6 class.

Usage

LearnerClustTclust$new()

`LearnerClustTclust$clone()`

The objects of this class are cloneable with this method.

Usage

LearnerClustTclust$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
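The effect of deep = TRUE can be seen on the parameter set, which is itself an R6 object. This is a minimal sketch, assuming mlr3 and mlr3cluster are installed; the value k = 4 is arbitrary.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.tclust")

# A deep clone also copies nested R6 objects such as the parameter set
copy = learner$clone(deep = TRUE)
copy$param_set$values$k = 4

learner$param_set$values$k  # still 2
copy$param_set$values$k     # 4
```

Changing the copy leaves the original learner untouched, which is useful when several configurations of the same learner are compared.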

Examples

# Load the required packages
library(mlr3)
library(mlr3cluster)

# Define the Learner (initial values: k = 2, store_x = FALSE)
learner = lrn("clust.tclust")
print(learner)
#> 
#> ── <LearnerClustTclust> (clust.tclust): Robust Trimmed Clustering ──────────────
#> • Model: -
#> • Parameters: k=2, store_x=FALSE
#> • Packages: mlr3, mlr3cluster, and tclust
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: exclusive, partial, and partitional
#> • Other settings: use_weights = 'error', predict_raw = 'FALSE'

# Define a Task
task = tsk("usarrests")

# Train the learner on the task
learner$train(task)

# Print the model
print(learner$model)
#> * Results for TCLUST algorithm: *
#> opt=HARD, trim = 0.05, k = 2
#> Restriction on: eigenvalues
#> 
#> Classification (trimmed points are indicated by 0 ):
#>  [1] 2 2 2 2 2 2 1 2 0 2 0 1 2 1 1 1 1 2 1 2 1 2 1 2 2 1 1 2 1 1 2 2 0 1 1 1 1 1
#> [39] 1 2 1 2 2 1 1 1 1 1 1 1
#> Means:
#>                C 1    C 2
#> Assault  109.59259 243.05
#> Murder     4.67037  11.48
#> Rape      15.65926  28.53
#> UrbanPop  63.11111  68.25
#> 
#> Trimmed objective function:  -752.9647 
#> Selected restriction factor: 12 

# Make predictions for the task
prediction = learner$predict(task)
#> Warning: 
#> ✖ Learner 'clust.tclust' doesn't predict on new data and predictions may not
#>   make sense on new data.
#> → Class: Mlr3WarningInput

# Score the predictions
prediction$score(task = task)
#> clust.dunn 
#> 0.06709199
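Trimmed observations carry the label 0, so they usually need to be separated from the regular clusters before further analysis. The sketch below uses only base R and hard-codes the classification vector from the printed model output above. In a live session the same vector would normally be taken from prediction$partition.

```r
# Classification copied from the printed model above (0 = trimmed)
partition = c(
  2, 2, 2, 2, 2, 2, 1, 2, 0, 2, 0, 1, 2, 1, 1, 1, 1, 2, 1, 2,
  1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 0, 1, 1, 1, 1, 1, 1, 2,
  1, 2, 2, 1, 1, 1, 1, 1, 1, 1
)

# Flag trimmed observations and count the cluster sizes
trimmed = partition == 0
sizes = table(partition)
sizes

sum(trimmed)    # 3 observations were trimmed
which(trimmed)  # row positions 9, 11 and 33

# Keep only the observations that were assigned to a cluster
assigned = partition[!trimmed]
length(assigned)  # 47
```

With 50 observations and alpha = 0.05, three observations (6%) are labeled 0 here, which is consistent with 0.05 * 50 = 2.5 being rounded up to a whole number of observations.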
