Hierarchical DBSCAN (HDBSCAN) Clustering Learner
Source: R/LearnerClustHDBSCAN.R
mlr_learners_clust.hdbscan.Rd
HDBSCAN (Hierarchical DBSCAN) clustering.
Calls dbscan::hdbscan() from dbscan.
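For orientation, the underlying call can be sketched directly with the dbscan package. This is a minimal illustration, not the learner's internal code: it assumes the built-in `USArrests` data frame as input and an illustrative `minPts = 5`.

```r
library(dbscan)

# Cluster the numeric columns of USArrests directly with dbscan::hdbscan().
# minPts = 5 is an illustrative choice, not a recommendation.
res = hdbscan(as.matrix(USArrests), minPts = 5)

# One integer label per row; dbscan uses 0 for noise points
head(res$cluster)
```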
Dictionary
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
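A minimal sketch of both routes, assuming mlr3 and mlr3cluster are installed and attached so that the `clust.hdbscan` key is registered:

```r
library(mlr3)
library(mlr3cluster)

# Retrieve the learner from the dictionary
learner = mlr_learners$get("clust.hdbscan")

# Equivalent, using the sugar function
learner = lrn("clust.hdbscan")
```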
Meta Information
Task type: “clust”
Predict Types: “partition”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3cluster, dbscan
Parameters
| Id | Type | Default | Levels | Range |
| minPts | integer | - | | \([0, \infty)\) |
| cluster_selection_epsilon | numeric | 0 | | \((-\infty, \infty)\) |
| gen_hdbscan_tree | logical | FALSE | TRUE, FALSE | - |
| gen_simplified_tree | logical | FALSE | TRUE, FALSE | - |
| verbose | logical | FALSE | TRUE, FALSE | - |
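The parameters in the table can be set when the learner is constructed or changed afterwards through its parameter set. The sketch below uses arbitrary illustrative values (`minPts = 10`, `cluster_selection_epsilon = 0.5`, then `minPts = 8`), not tuned recommendations.

```r
library(mlr3)
library(mlr3cluster)

# Set hyperparameters at construction time (illustrative values only)
learner = lrn("clust.hdbscan", minPts = 10L, cluster_selection_epsilon = 0.5)

# Change a value later through the parameter set
learner$param_set$values$minPts = 8L

# Inspect the currently set values
print(learner$param_set$values)
```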
References
Hahsler M, Piekenbrock M, Doran D (2019). “dbscan: Fast Density-Based Clustering with R.” Journal of Statistical Software, 91(1), 1–30. doi:10.18637/jss.v091.i01 .
Campello RJGB, Moulavi D, Sander J (2013). “Density-based clustering based on hierarchical density estimates.” In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 160–172. Springer.
See also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_clust.MBatchKMeans,
mlr_learners_clust.SimpleKMeans,
mlr_learners_clust.agnes,
mlr_learners_clust.ap,
mlr_learners_clust.bico,
mlr_learners_clust.birch,
mlr_learners_clust.cmeans,
mlr_learners_clust.cobweb,
mlr_learners_clust.dbscan,
mlr_learners_clust.dbscan_fpc,
mlr_learners_clust.diana,
mlr_learners_clust.em,
mlr_learners_clust.fanny,
mlr_learners_clust.featureless,
mlr_learners_clust.ff,
mlr_learners_clust.hclust,
mlr_learners_clust.kkmeans,
mlr_learners_clust.kmeans,
mlr_learners_clust.mclust,
mlr_learners_clust.meanshift,
mlr_learners_clust.optics,
mlr_learners_clust.pam,
mlr_learners_clust.xmeans
Super classes
mlr3::Learner -> mlr3cluster::LearnerClust -> LearnerClustHDBSCAN
Examples
# Define the Learner and set parameter values
learner = lrn("clust.hdbscan")
print(learner)
#>
#> ── <LearnerClustHDBSCAN> (clust.hdbscan): HDBSCAN Clustering ───────────────────
#> • Model: -
#> • Parameters: minPts=5
#> • Packages: mlr3, mlr3cluster, and dbscan
#> • Predict Types: [partition]
#> • Feature Types: logical, integer, and numeric
#> • Encapsulation: none (fallback: -)
#> • Properties: complete, density, and exclusive
#> • Other settings: use_weights = 'error'
# Define a Task
task = tsk("usarrests")
# Train the learner on the task
learner$train(task)
# Print the model
print(learner$model)
#> HDBSCAN clustering for 50 objects.
#> Parameters: minPts = 5
#> The clustering contains 3 cluster(s) and 13 noise points.
#>
#> 0 1 2 3
#> 13 17 11 9
#>
#> Available fields: cluster, minPts, coredist, cluster_scores,
#> membership_prob, outlier_scores, hc, data
# Make predictions for the task
prediction = learner$predict(task)
# Score the predictions
prediction$score(task = task)
#> clust.dunn
#> 0.2423918
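Beyond the score, the cluster assignments themselves can be read from the prediction object. The following sketch repeats the steps above so that it runs on its own. It assumes the `partition` field of the prediction holds one label per observation, with 0 marking noise points, as in the model printout.

```r
library(mlr3)
library(mlr3cluster)

# Train and predict on the same task as in the example above
task = tsk("usarrests")
learner = lrn("clust.hdbscan")
learner$train(task)
prediction = learner$predict(task)

# Cluster label per observation
head(prediction$partition)

# Cluster sizes, including the noise group if present
table(prediction$partition)

# Tabular view of the prediction
as.data.table(prediction)
```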