A LearnerClust for kernel k-means clustering implemented in kernlab::kkmeans().
kernlab::kkmeans() has no default value for the number of clusters, so the centers parameter of this learner defaults to 2.
Kernel parameters have to be passed directly and not by using the kpar list of kkmeans().
The predict method finds the nearest center in kernel distance to
assign clusters for new data points.
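A minimal sketch of this rule (illustrative only, not the learner's internal implementation): for a kernel \(k\), the squared feature-space distance between a point \(x\) and the centroid of cluster \(C_j\) is \(d^2(x, C_j) = k(x, x) - \frac{2}{|C_j|} \sum_{i \in C_j} k(x, x_i) + \frac{1}{|C_j|^2} \sum_{i, l \in C_j} k(x_i, x_l)\), and the assigned cluster is the one minimizing this quantity. Assuming a training matrix X, a vector of training cluster labels cl, and a kernlab kernel function, this could look as follows:

# Illustrative sketch only: squared kernel distance from a new point `xnew`
# to each cluster centroid in feature space.
library(kernlab)

kernel_dist2 = function(xnew, X, cl, k) {
  sapply(sort(unique(cl)), function(j) {
    Xj = X[cl == j, , drop = FALSE]                    # points in cluster j
    Kxx = as.numeric(k(xnew, xnew))                    # k(x, x)
    Kxc = kernelMatrix(k, matrix(xnew, nrow = 1), Xj)  # k(x, x_i), i in C_j
    Kcc = kernelMatrix(k, Xj)                          # k(x_i, x_l), i, l in C_j
    Kxx - 2 * mean(Kxc) + mean(Kcc)
  })
}

# The predicted cluster is the one with minimal kernel distance, e.g.:
# which.min(kernel_dist2(xnew, X, cl, rbfdot(sigma = 0.5)))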
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
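mlr_learners$get("clust.kkmeans")
lrn("clust.kkmeans")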
Meta Information
Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, kernlab
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| centers | untyped | - | - | - |
| kernel | character | rbfdot | vanilladot, polydot, rbfdot, tanhdot, laplacedot, besseldot, anovadot, splinedot | - |
| sigma | numeric | - | - | \([0, \infty)\) |
| degree | integer | 3 | - | \([1, \infty)\) |
| scale | numeric | 1 | - | \([0, \infty)\) |
| offset | numeric | 1 | - | \((-\infty, \infty)\) |
| order | integer | 1 | - | \((-\infty, \infty)\) |
| alg | character | kkmeans | kkmeans, kerninghan | - |
| p | numeric | 1 | - | \((-\infty, \infty)\) |
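For example, the number of clusters and the kernel parameters can be set at construction; the values below are purely illustrative, not recommended defaults:

learner = lrn("clust.kkmeans", centers = 3, kernel = "rbfdot", sigma = 0.5)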
References
Karatzoglou, Alexandros, Smola, Alex, Hornik, Kurt, Zeileis, Achim (2004). “kernlab – An S4 Package for Kernel Methods in R.” Journal of Statistical Software, 11, 1–20.
Dhillon, Inderjit S, Guan, Yuqiang, Kulis, Brian (2004). A unified view of kernel k-means, spectral clustering and graph cuts. Citeseer.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_clust.MBatchKMeans,
mlr_learners_clust.SimpleKMeans,
mlr_learners_clust.agnes,
mlr_learners_clust.ap,
mlr_learners_clust.bico,
mlr_learners_clust.birch,
mlr_learners_clust.cmeans,
mlr_learners_clust.cobweb,
mlr_learners_clust.dbscan,
mlr_learners_clust.dbscan_fpc,
mlr_learners_clust.diana,
mlr_learners_clust.em,
mlr_learners_clust.fanny,
mlr_learners_clust.featureless,
mlr_learners_clust.ff,
mlr_learners_clust.hclust,
mlr_learners_clust.hdbscan,
mlr_learners_clust.kmeans,
mlr_learners_clust.mclust,
mlr_learners_clust.meanshift,
mlr_learners_clust.optics,
mlr_learners_clust.pam,
mlr_learners_clust.xmeans
Super classes
mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustKKMeans
Examples
# Define the Learner and set parameter values
learner = lrn("clust.kkmeans")
print(learner)
#>
#> ── <LearnerClustKKMeans> (clust.kkmeans): Kernel K-Means ───────────────────────
#> • Model: -
#> • Parameters: centers=2
#> • Packages: mlr3, mlr3cluster, and kernlab
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and partitional
#> • Other settings: use_weights = 'error'
# Define a Task
task = tsk("usarrests")
# Train the learner on the task
learner$train(task)
#> Using automatic sigma estimation (sigest) for RBF or laplace kernel
# Print the model
print(learner$model)
#> Spectral Clustering object of class "specc"
#>
#> Cluster memberships:
#>
#> 1 1 1 1 1 1 2 1 1 1 2 2 1 2 2 2 2 1 2 1 1 1 2 1 1 2 2 1 2 1 1 1 1 2 2 1 1 2 1 1 2 1 1 2 2 1 1 2 2 1
#>
#> Gaussian Radial Basis kernel function.
#> Hyperparameter : sigma = 0.00043279660093855
#>
#> Centers:
#> [,1] [,2] [,3] [,4]
#> [1,] 226.2333 10.13333 25.79333 69.40
#> [2,] 87.5500 4.27000 14.39000 59.75
#>
#> Cluster size:
#> [1] 30 20
#>
#> Within-cluster sum of squares:
#> [1] 1799973.6 213075.9
#>
# Make predictions for the task
prediction = learner$predict(task)
# Score the predictions
prediction$score(task = task)
#> clust.dunn
#> 0.04322918
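# Illustrative follow-up (not part of the original example): the same
# prediction could also be scored with another internal measure, e.g.
# the silhouette coefficient
prediction$score(msr("clust.silhouette"), task = task)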