K-prototypes clustering for mixed-type data.
Calls clustMixType::kproto() from package clustMixType.
The k parameter (number of clusters) defaults to 2, since clustMixType::kproto() itself provides no default for the number of clusters.
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
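A minimal sketch of both construction routes (the key "clust.kproto" matches the Examples section below; mlr3 and mlr3cluster must be loaded):

```r
library(mlr3)
library(mlr3cluster)

# Route 1: retrieve the learner from the mlr_learners dictionary
learner = mlr_learners$get("clust.kproto")

# Route 2: the equivalent sugar function
learner = lrn("clust.kproto")
```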
Meta Information
Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3cluster, clustMixType
Parameters
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| k | untyped | - | - | - |
| lambda | untyped | NULL | - | - |
| type | character | huang | huang, gower | - |
| iter.max | integer | 100 | - | \([1, \infty)\) |
| nstart | integer | 1 | - | \([1, \infty)\) |
| na.rm | character | yes | yes, no, imp.internal, imp.onestep | - |
| verbose | logical | TRUE | TRUE, FALSE | - |
| init | character | NULL | nbh.dens, sel.cen, nstart.m | - |
| p_nstart.m | numeric | 0.9 | - | \([0, 1]\) |
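Parameters from the table above can be set at construction time or changed later through the learner's parameter set; a sketch with illustrative values (k = 3, nstart = 5, and iter.max = 200 are arbitrary choices, not defaults):

```r
library(mlr3)
library(mlr3cluster)

# Set hyperparameters when constructing the learner
learner = lrn("clust.kproto", k = 3, nstart = 5, verbose = FALSE)

# ...or modify them afterwards via the parameter set
learner$param_set$values$iter.max = 200
```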
References
Huang, Zhexue (1998). “Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values.” Data Mining and Knowledge Discovery, 2(3), 283–304.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_clust.MBatchKMeans,
mlr_learners_clust.SimpleKMeans,
mlr_learners_clust.agnes,
mlr_learners_clust.ap,
mlr_learners_clust.bico,
mlr_learners_clust.birch,
mlr_learners_clust.clara,
mlr_learners_clust.cmeans,
mlr_learners_clust.cobweb,
mlr_learners_clust.dbscan,
mlr_learners_clust.dbscan_fpc,
mlr_learners_clust.diana,
mlr_learners_clust.em,
mlr_learners_clust.fanny,
mlr_learners_clust.featureless,
mlr_learners_clust.ff,
mlr_learners_clust.hclust,
mlr_learners_clust.hdbscan,
mlr_learners_clust.kkmeans,
mlr_learners_clust.kmeans,
mlr_learners_clust.mclust,
mlr_learners_clust.meanshift,
mlr_learners_clust.optics,
mlr_learners_clust.pam,
mlr_learners_clust.protoclust,
mlr_learners_clust.specc,
mlr_learners_clust.xmeans
Super classes
mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustKProto
Examples
# Define the Learner and set parameter values
learner = lrn("clust.kproto")
print(learner)
#>
#> ── <LearnerClustKProto> (clust.kproto): K-Prototypes ───────────────────────────
#> • Model: -
#> • Parameters: k=2, verbose=FALSE
#> • Packages: mlr3, mlr3cluster, and clustMixType
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, numeric, factor, and ordered
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, exclusive, and partitional
#> • Other settings: use_weights = 'error'
# Define a mixed-type Task (kproto requires at least one factor variable)
data = data.frame(
x1 = c(1, 2, 10, 11, 1, 2, 10, 11),
x2 = factor(c("a", "a", "b", "b", "a", "a", "b", "b"))
)
task = as_task_clust(data)
# Train the learner on the task
learner$train(task)
# Print the model
print(learner$model)
#> Distance type: huang
#>
#> Numeric predictors: 1
#> Categorical predictors: 1
#> Lambda: 46.85714
#>
#> Number of Clusters: 2
#> Cluster sizes: 4 4
#> Within cluster error: 1 1
#>
#> Cluster prototypes:
#> x1 x2
#> <num> <fctr>
#> 1: 10.5 b
#> 2: 1.5 a
# Make predictions for the task
prediction = learner$predict(task)
# Score the predictions
prediction$score(task = task)
#> Warning: NAs introduced by coercion
#> clust.dunn
#> 8
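The same prediction can be scored with other cluster measures shipped with mlr3cluster; a sketch using the silhouette measure (the measure key "clust.silhouette" is assumed to be available in the loaded mlr3cluster version):

```r
library(mlr3)
library(mlr3cluster)

# Rebuild the mixed-type task and learner from the example above
data = data.frame(
  x1 = c(1, 2, 10, 11, 1, 2, 10, 11),
  x2 = factor(c("a", "a", "b", "b", "a", "a", "b", "b"))
)
task = as_task_clust(data)
learner = lrn("clust.kproto")
learner$train(task)
prediction = learner$predict(task)

# Score with the average silhouette width instead of the Dunn index
score = prediction$score(msr("clust.silhouette"), task = task)
```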