Spherical K-Means Clustering Learner

Spherical k-means clustering for data on the unit hypersphere. Calls skmeans::skmeans() from package skmeans.

The k parameter is set to 2 on construction because skmeans::skmeans() has no default for the number of clusters. Observations are partitioned by maximising cosine similarity to the cluster prototypes, and predictions on new data assign each observation to the prototype with the highest cosine similarity. skmeans::skmeans() does not allow rows with zero norm, so every observation needs at least one non-zero feature value.
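The assignment rule described above can be sketched in base R. This is a minimal illustration, not the code used by skmeans or mlr3cluster; the helpers `normalize_rows()` and `assign_by_cosine()` and the toy data are hypothetical names introduced only for this example.

```r
# Hypothetical helper: scale each row to unit Euclidean length.
# Rows with zero norm have no direction, so they are rejected.
normalize_rows = function(x) {
  norms = sqrt(rowSums(x^2))
  if (any(norms == 0)) {
    stop("rows with zero norm cannot be projected onto the unit sphere")
  }
  x / norms
}

# Hypothetical helper: assign each row of `x` to the prototype
# with the highest cosine similarity.
assign_by_cosine = function(x, prototypes) {
  # After normalisation, the inner product equals the cosine similarity.
  sims = normalize_rows(x) %*% t(normalize_rows(prototypes))
  max.col(sims, ties.method = "first")
}

# Toy data: two rows point roughly along the first axis, two along the second.
x = rbind(c(1, 0), c(2, 0.1), c(0, 3), c(0.2, 5))
prototypes = rbind(c(1, 0), c(0, 1))

assignments = assign_by_cosine(x, prototypes)
print(assignments)
```

Because only the direction of a row matters, rescaling an observation by a positive constant does not change its assignment.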

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("clust.skmeans")
lrn("clust.skmeans")

Meta Information

Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, skmeans

Parameters

Id	Type	Default	Levels	Range
k	integer	-		$[2, \infty)$
method	character	-	genetic, pclust, CLUTO, gmeans, kmndirs, LIH, LIHC	-
m	numeric	1		$[1, \infty)$
weights	untyped	1		-
maxiter	integer	-		$[1, \infty)$
nruns	integer	-		$[1, \infty)$
popsize	integer	-		$[1, \infty)$
mutations	numeric	-		$[0, 1]$
reltol	numeric	-		$[0, \infty)$
verbose	logical	-	TRUE, FALSE	-
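As a brief sketch of how the parameters listed above might be set, the following changes k at construction time and afterwards. It assumes the mlr3, mlr3cluster and skmeans packages are installed; the values 3 and 4 are arbitrary choices for illustration.

```r
library(mlr3)
library(mlr3cluster)

# Set the number of clusters when constructing the learner ...
learner = lrn("clust.skmeans", k = 3L)

# ... or change it later on the existing object.
learner$param_set$values$k = 4L

# Inspect the values that are currently set.
print(learner$param_set$values)
```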

References

Dhillon, Inderjit S, Modha, Dharmendra S (2001). “Concept decompositions for large sparse text data using clustering.” Machine Learning, 42(1), 143–175. doi:10.1023/A:1007612920971.

Hornik, Kurt, Feinerer, Ingo, Kober, Martin, Buchta, Christian (2012). “Spherical k-Means Clustering.” Journal of Statistical Software, 50(10), 1–22. doi:10.18637/jss.v050.i10.

See also

Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
Dictionary of Learners: mlr3::mlr_learners
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Package mlr3viz for some generic visualizations.
Extension packages for additional task types:
- mlr3proba for probabilistic supervised regression and survival analysis.
- mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.

Super classes

mlr3::Learner -> LearnerClust -> LearnerClustSKMeans

Methods

Public methods

`LearnerClustSKMeans$new()`

Creates a new instance of this R6 class.

Usage

LearnerClustSKMeans$new()

`LearnerClustSKMeans$clone()`

The objects of this class are cloneable with this method.

Usage

LearnerClustSKMeans$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Define the Learner and set parameter values
learner = lrn("clust.skmeans")
print(learner)
#> 
#> ── <LearnerClustSKMeans> (clust.skmeans): Spherical K-Means ────────────────────
#> • Model: -
#> • Parameters: k=2
#> • Packages: mlr3, mlr3cluster, and skmeans
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and partitional
#> • Other settings: use_weights = 'error', predict_raw = 'FALSE'

# Define a Task
task = tsk("usarrests")

# Train the learner on the task
learner$train(task)

# Print the model
print(learner$model)
#> A hard spherical k-means partition of 50 objects into 2 classes.
#> Class sizes: 17, 33
#> Call: skmeans::skmeans(x = as.matrix(task$data()), k = 2L, control = structure(list(), names = character(0)))

# Make predictions for the task
prediction = learner$predict(task)

# Score the predictions
prediction$score(task = task)
#> clust.dunn 
#> 0.03303237
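To illustrate prediction on observations that were not used for fitting, the sketch below trains on a subset of rows and predicts the remaining ones. It assumes mlr3, mlr3cluster and skmeans are installed; the 40/10 split and the variable names are arbitrary choices for this example, and the resulting cluster labels depend on the random initialisation.

```r
library(mlr3)
library(mlr3cluster)

task = tsk("usarrests")
learner = lrn("clust.skmeans", k = 2L)

# Fit the prototypes on the first 40 rows only.
learner$train(task, row_ids = 1:40)

# Assign the held-out rows to the nearest prototype by cosine similarity.
holdout_prediction = learner$predict(task, row_ids = 41:50)

# Integer cluster labels, one per held-out row.
print(holdout_prediction$partition)
```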
