A LearnerClust for mini batch k-means clustering implemented in ClusterR::MiniBatchKmeans(). Because ClusterR::MiniBatchKmeans() has no default value for the number of clusters, the clusters parameter is set to 2 by default here. The predict method uses ClusterR::predict_MBatchKMeans() to compute cluster memberships for new data. The learner supports both partitional and fuzzy clustering (predict types "partition" and "prob", respectively).
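As a minimal sketch of the two prediction modes described above, the following trains the learner with predict_type = "prob" and inspects both the hard assignments and the membership matrix. The choice of three clusters and of the usarrests task is illustrative only, and the behaviour of the fuzzy prediction may depend on the installed ClusterR version.

```r
library(mlr3)
library(mlr3cluster)

# Illustrative choice: 3 clusters and fuzzy ("prob") predictions
learner = lrn("clust.MBatchKMeans", clusters = 3L, predict_type = "prob")
task = tsk("usarrests")

learner$train(task)
prediction = learner$predict(task)

# Hard cluster assignment, one value per observation
head(prediction$partition)

# Membership matrix, one row per observation and one column per cluster
head(prediction$prob)
```

With the default predict_type = "partition", only the hard assignments are computed.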

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("clust.MBatchKMeans")
lrn("clust.MBatchKMeans")

Meta Information

  • Task type: “clust”

  • Predict Types: “partition”, “prob”

  • Feature Types: “logical”, “integer”, “numeric”

  • Required Packages: mlr3, mlr3cluster, ClusterR

Parameters

| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| clusters | integer | 2 | | \([1, \infty)\) |
| batch_size | integer | 10 | | \([1, \infty)\) |
| num_init | integer | 1 | | \([1, \infty)\) |
| max_iters | integer | 100 | | \([1, \infty)\) |
| init_fraction | numeric | 1 | | \([0, 1]\) |
| initializer | character | kmeans++ | optimal_init, quantile_init, kmeans++, random | - |
| early_stop_iter | integer | 10 | | \([1, \infty)\) |
| verbose | logical | FALSE | TRUE, FALSE | - |
| CENTROIDS | untyped | NULL | | - |
| tol | numeric | 1e-04 | | \([0, \infty)\) |
| tol_optimal_init | numeric | 0.3 | | \([0, \infty)\) |
| seed | integer | 1 | | \((-\infty, \infty)\) |
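The parameters listed above can be changed after construction through the learner's parameter set. The sketch below assumes a paradox version that provides set_values(); the concrete values are arbitrary examples, not recommendations.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.MBatchKMeans")

# Arbitrary example values, chosen only to illustrate the interface
learner$param_set$set_values(
  clusters = 3L,
  batch_size = 20L,
  initializer = "random",
  max_iters = 50L,
  seed = 42L
)

# Inspect the values that are currently set on the learner
learner$param_set$values
```

Values outside the ranges or levels in the table are rejected when they are set, so typos in initializer names surface early.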

References

Sculley, David (2010). “Web-scale k-means clustering.” In Proceedings of the 19th international conference on World wide web, 1177–1178.

Super classes

mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustMiniBatchKMeans

Methods


Method new()

Creates a new instance of this R6 class.


Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerClustMiniBatchKMeans$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
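To illustrate why a deep clone can matter, the sketch below copies a learner and changes a parameter on the copy only. The variable names original and copy are hypothetical; with deep = TRUE the parameter set is expected to be copied as well, so the two learners should no longer share state.

```r
library(mlr3)
library(mlr3cluster)

original = lrn("clust.MBatchKMeans")
copy = original$clone(deep = TRUE)

# Change the number of clusters on the copy only
copy$param_set$set_values(clusters = 4L)

# Compare the two learners
original$param_set$values$clusters
copy$param_set$values$clusters
```

Under these assumptions the original keeps its default of 2 clusters while the copy reports 4.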

Examples

# Define the Learner with its default parameter values
learner = lrn("clust.MBatchKMeans")
print(learner)
#> 
#> ── <LearnerClustMiniBatchKMeans> (clust.MBatchKMeans): Mini Batch K-Means ──────
#> • Model: -
#> • Parameters: clusters=2
#> • Packages: mlr3, mlr3cluster, and ClusterR
#> • Predict Types: [partition] and prob
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, fuzzy, and partitional
#> • Other settings: use_weights = 'error'

# Define a Task
task = tsk("usarrests")

# Train the learner on the task
learner$train(task)
#> Warning: `predict_MBatchKMeans()` was deprecated in ClusterR 1.3.0.
#>  Beginning from version 1.4.0, if the fuzzy parameter is TRUE the function
#>   'predict_MBatchKMeans' will return only the probabilities, whereas currently
#>   it also returns the hard clusters
#>  The deprecated feature was likely used in the ClusterR package.
#>   Please report the issue at <https://github.com/mlampros/ClusterR/issues>.

# Print the model
print(learner$model)
#> $centroids
#>        [,1]     [,2]     [,3]     [,4]
#> [1,] 235.50 12.08333 26.23333 71.16667
#> [2,]  86.25  4.07500 14.22500 48.00000
#> 
#> $WCSS_per_cluster
#>          [,1]     [,2]
#> [1,] 7889.297 4086.948
#> 
#> $best_initialization
#> [1] 1
#> 
#> $iters_per_initialization
#>      [,1]
#> [1,]   26
#> 
#> attr(,"class")
#> [1] "MBatchKMeans"       "k-means clustering"

# Make predictions for the task
prediction = learner$predict(task)

# Score the predictions
prediction$score(task = task)
#> clust.dunn 
#> 0.06244552
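The call above scores with the default measure, clust.dunn. As a hedged sketch, other clustering measures can be passed explicitly; this assumes that the measure key "clust.silhouette" is registered by the installed mlr3cluster version and that any additional packages it requires are available.

```r
library(mlr3)
library(mlr3cluster)

task = tsk("usarrests")
learner = lrn("clust.MBatchKMeans")
learner$train(task)
prediction = learner$predict(task)

# Score with several measures at once; the task is needed because
# clustering measures are computed from the feature data
measures = msrs(c("clust.dunn", "clust.silhouette"))
scores = prediction$score(measures, task = task)
scores
```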