A LearnerClust for Expectation-Maximization (EM) clustering, implemented in
Weka and accessed through RWeka (see RWeka::list_Weka_interfaces()).
The predict method uses RWeka::predict.Weka_clusterer() to compute the
cluster memberships for new data.
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
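A minimal sketch of both routes, assuming mlr3cluster is attached so that the key "clust.em" (the key used in the Examples section) is registered in the dictionary:

```r
library(mlr3)
library(mlr3cluster)  # registers the clustering learners in mlr_learners

# Route 1: retrieve the learner from the dictionary by its key
learner_dict = mlr_learners$get("clust.em")

# Route 2: use the sugar function, which looks up the same key
learner_sugar = lrn("clust.em")

# Both routes return an object of the same class
class(learner_dict)[1]
class(learner_sugar)[1]
```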
Meta Information
Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, RWeka
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| I | integer | 100 | | \([1, \infty)\) |
| ll_cv | numeric | 1e-06 | | \([1e-06, \infty)\) |
| ll_iter | numeric | 1e-06 | | \([1e-06, \infty)\) |
| M | numeric | 1e-06 | | \([1e-06, \infty)\) |
| max | integer | -1 | | \([-1, \infty)\) |
| N | integer | -1 | | \([-1, \infty)\) |
| num_slots | integer | 1 | | \([1, \infty)\) |
| S | integer | 100 | | \([0, \infty)\) |
| X | integer | 10 | | \([1, \infty)\) |
| K | integer | 10 | | \([1, \infty)\) |
| V | logical | FALSE | TRUE, FALSE | - |
| output_debug_info | logical | FALSE | TRUE, FALSE | - |
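The parameters in the table can be set at construction or changed afterwards through the learner's parameter set. The sketch below is illustrative only. Reading N as the number of clusters, I as the maximum number of iterations, and S as the random seed follows Weka's EM option names and is an assumption here, since the table does not describe the parameters.

```r
library(mlr3)
library(mlr3cluster)

# Set hyperparameters at construction.
# Assumption: N = number of clusters, I = maximum number of iterations.
learner = lrn("clust.em", N = 3L, I = 50L)

# Change a value later through the parameter set.
# Assumption: S = random seed.
learner$param_set$values$S = 42L

# Inspect the values that are currently set
learner$param_set$values
```

Values outside the ranges listed in the table are rejected when they are assigned.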
References
Witten, I. H., Frank, E. (2002). “Data mining: practical machine learning tools and techniques with Java implementations.” ACM SIGMOD Record, 31(1), 76–77.
Dempster, A. P., Laird, N. M., Rubin, D. B. (1977). “Maximum likelihood from incomplete data via the EM algorithm.” Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
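As a rough illustration of the as.data.table(mlr_learners) entry above, the table can be filtered to the clustering learners registered in the current session. The column names key, task_type, and predict_types are those returned by mlr3 for this table. The variable names are placeholders.

```r
library(mlr3)
library(mlr3cluster)
library(data.table)

# One row per learner registered in the running session
learners = as.data.table(mlr_learners)

# Keep only clustering learners and a few descriptive columns
clust_learners = learners[task_type == "clust", .(key, predict_types)]
head(clust_learners)
```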
Other Learner:
mlr_learners_clust.MBatchKMeans,
mlr_learners_clust.SimpleKMeans,
mlr_learners_clust.agnes,
mlr_learners_clust.ap,
mlr_learners_clust.bico,
mlr_learners_clust.birch,
mlr_learners_clust.cmeans,
mlr_learners_clust.cobweb,
mlr_learners_clust.dbscan,
mlr_learners_clust.dbscan_fpc,
mlr_learners_clust.diana,
mlr_learners_clust.fanny,
mlr_learners_clust.featureless,
mlr_learners_clust.ff,
mlr_learners_clust.hclust,
mlr_learners_clust.hdbscan,
mlr_learners_clust.kkmeans,
mlr_learners_clust.kmeans,
mlr_learners_clust.mclust,
mlr_learners_clust.meanshift,
mlr_learners_clust.optics,
mlr_learners_clust.pam,
mlr_learners_clust.xmeans
Super classes
mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustEM
Examples
# Define the Learner and set parameter values
learner = lrn("clust.em")
print(learner)
#>
#> ── <LearnerClustEM> (clust.em): Expectation-Maximization Clustering ────────────
#> • Model: -
#> • Parameters: list()
#> • Packages: mlr3, mlr3cluster, and RWeka
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and partitional
#> • Other settings: use_weights = 'error'
# Define a Task
task = tsk("usarrests")
# Train the learner on the task
learner$train(task)
# Print the model
print(learner$model)
#>
#> EM
#> ==
#>
#> Number of clusters selected by cross validation: 2
#> Number of iterations performed: 12
#>
#>
#> Cluster
#> Attribute 0 1
#> (0.42) (0.58)
#> ==============================
#> Assault
#> mean 252.4139 110.5952
#> std. dev. 43.9813 43.1685
#>
#> Murder
#> mean 11.8783 4.7742
#> std. dev. 2.8171 2.2431
#>
#> Rape
#> mean 28.5285 15.8558
#> std. dev. 8.3686 5.4395
#>
#> UrbanPop
#> mean 68.058 63.6846
#> std. dev. 14.02 14.2715
#>
#>
# Make predictions for the task
prediction = learner$predict(task)
# Score the predictions
prediction$score(task = task)
#> clust.dunn
#> 0.1220028
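The prediction object can be inspected further than the default score shown above. The following sketch repeats the training steps so that it runs on its own, then looks at the cluster assignments and scores them with the silhouette measure from mlr3cluster instead of the default Dunn index. No output is shown because the values depend on the fitted model.

```r
library(mlr3)
library(mlr3cluster)

task = tsk("usarrests")
learner = lrn("clust.em")
learner$train(task)
prediction = learner$predict(task)

# Cluster assignment for each observation in the task
head(prediction$partition)

# Number of observations per cluster
table(prediction$partition)

# Score with a different clustering measure; the task supplies the feature data
prediction$score(msr("clust.silhouette"), task = task)
```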