K-prototypes clustering for mixed-type data. Calls clustMixType::kproto() from package clustMixType.

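Because clustMixType::kproto() has no default for the number of clusters, this learner sets the k parameter to 2. The learner printout in the Examples also shows verbose = FALSE as an initial value, although the parameter table lists the default as TRUE.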

Dictionary

This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():

mlr_learners$get("clust.kproto")
lrn("clust.kproto")
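
As a minimal sketch of overriding the initial k at construction time, hyperparameters can be passed as named arguments to lrn(). The value k = 3 is chosen only for illustration.

```r
library(mlr3)
library(mlr3cluster)

# Construct the learner with a non-default number of clusters
# (k = 3 is an illustrative value, not a recommendation)
learner = lrn("clust.kproto", k = 3)

# Inspect the hyperparameter values currently set on the learner
learner$param_set$values
```

Passing values through lrn() keeps construction and configuration in a single call, which is convenient when the learner is created inside a pipeline or benchmark grid.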

Meta Information

  • Task type: “clust”

  • Predict Types: “partition”

  • Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”

  • Required Packages: mlr3, mlr3cluster, clustMixType
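
The meta information above can also be read from a learner object. The following sketch uses the standard fields of an mlr3 learner and assumes that mlr3 and mlr3cluster are installed.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.kproto")

# Each field corresponds to one bullet in the list above
learner$task_type      # task type
learner$predict_types  # supported predict types
learner$feature_types  # supported feature types
learner$packages       # packages required for training and prediction
```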

Parameters

Id          Type       Default  Levels                              Range
k           untyped    -                                            -
lambda      untyped    NULL                                         -
type        character  huang    huang, gower                        -
iter.max    integer    100                                          \([1, \infty)\)
nstart      integer    1                                            \([1, \infty)\)
na.rm       character  yes      yes, no, imp.internal, imp.onestep  -
verbose     logical    TRUE     TRUE, FALSE                         -
init        character  NULL     nbh.dens, sel.cen, nstart.m         -
p_nstart.m  numeric    0.9                                          \([0, 1]\)
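
As a sketch of how the parameters in the table might be changed on an existing learner, the example below uses the set_values() method of the learner's parameter set. The specific values are illustrative assumptions, not tuned recommendations.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.kproto")

# List the available parameter ids (these should match the Id column)
learner$param_set$ids()

# Change several parameters at once; the values are checked against
# the types, levels, and ranges declared in the parameter set
learner$param_set$set_values(
  k = 3,
  type = "gower",
  nstart = 5,
  iter.max = 50
)

learner$param_set$values
```

A value outside the declared levels or range, such as a type other than huang or gower, should be rejected when it is set rather than at training time.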

References

Huang, Zhexue (1998). “Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values.” Data Mining and Knowledge Discovery, 2(3), 283–304.

Super classes

mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustKProto

Methods

Method new()

Creates a new instance of this R6 class.

Usage

LearnerClustKProto$new()

Method clone()

The objects of this class are cloneable with this method.

Usage

LearnerClustKProto$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
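
The following sketch illustrates why a deep clone is usually what is wanted: the parameter set is itself an R6 object, so a deep clone can be modified without affecting the original learner. The value k = 4 is an arbitrary illustration.

```r
library(mlr3)
library(mlr3cluster)

learner = lrn("clust.kproto")

# Deep clone: nested R6 objects such as the parameter set are copied too
copy = learner$clone(deep = TRUE)
copy$param_set$set_values(k = 4)

# The original keeps its initial k, the copy carries the new value
learner$param_set$values$k
copy$param_set$values$k
```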

Examples

# Define the Learner with its initial parameter values
learner = lrn("clust.kproto")
print(learner)
#> 
#> ── <LearnerClustKProto> (clust.kproto): K-Prototypes ───────────────────────────
#> • Model: -
#> • Parameters: k=2, verbose=FALSE
#> • Packages: mlr3, mlr3cluster, and clustMixType
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and partitional
#> • Other settings: use_weights = 'error'

# Define a mixed-type Task (kproto requires at least one factor variable)
data = data.frame(
  x1 = c(1, 2, 10, 11, 1, 2, 10, 11),
  x2 = factor(c("a", "a", "b", "b", "a", "a", "b", "b"))
)
task = as_task_clust(data)

# Train the learner on the task
learner$train(task)

# Print the model
print(learner$model)
#> Distance type: huang 
#> 
#> Numeric predictors: 1 
#> Categorical predictors: 1 
#> Lambda: 46.85714 
#> 
#> Number of Clusters: 2 
#> Cluster sizes: 4 4 
#> Within cluster error: 1 1 
#> 
#> Cluster prototypes:
#>       x1     x2
#>    <num> <fctr>
#> 1:  10.5      b
#> 2:   1.5      a

# Make predictions for the task
prediction = learner$predict(task)

# Score the predictions
prediction$score(task = task)
#> Warning: NAs introduced by coercion
#> clust.dunn 
#>          8
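
Beyond scoring, the cluster assignments themselves can be inspected. This sketch repeats the setup from the example above so that it runs on its own. It assumes the partition field of the prediction object holds one cluster label per row, and it shows no output because the label numbering depends on the random initialization.

```r
library(mlr3)
library(mlr3cluster)

# Same mixed-type data as in the example above
data = data.frame(
  x1 = c(1, 2, 10, 11, 1, 2, 10, 11),
  x2 = factor(c("a", "a", "b", "b", "a", "a", "b", "b"))
)
task = as_task_clust(data)

learner = lrn("clust.kproto")
learner$train(task)
prediction = learner$predict(task)

# One cluster label per observation, in task row order
prediction$partition

# Cross-tabulate the assigned cluster against the factor feature
table(cluster = prediction$partition, x2 = data$x2)
```

The cross-tabulation is a quick sanity check on toy data like this, where each factor level is expected to fall mostly into a single cluster.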