📖 Documentation: https://zaoqu-liu.github.io/MultiK/
MultiK is an R package for objective determination of optimal cluster numbers in single-cell RNA sequencing (scRNA-seq) data. It addresses one of the most challenging questions in unsupervised clustering analysis: "How many distinct cell populations exist in my dataset?"
- 🎯 Data-driven K selection using consensus clustering
- 📊 PAC metric for quantifying clustering stability
- 📈 Statistical validation via SigClust testing
- ⚡ Parallel processing for computational efficiency
- 🔬 Seurat v4/v5 compatible
# From R-universe (recommended)
install.packages("MultiK", repos = "https://zaoqu-liu.r-universe.dev")
# From GitHub
# install.packages("remotes")
remotes::install_github("Zaoqu-Liu/MultiK")

library(MultiK)
library(Seurat)
# Load example data
data(p3cl)
# Step 1: Run MultiK algorithm
result <- MultiK(p3cl, reps = 100, cores = 4, seed = 42)
# Step 2: Visualize K selection diagnostics
DiagMultiKPlot(result$k, result$consensus)
# Step 3: Get cluster assignments at optimal K
clusters <- getClusters(p3cl, optK = 3)
# Step 4: Statistical validation
pval <- CalcSigClust(p3cl, clusters$clusters[, 1], nsim = 100)
PlotSigClust(p3cl, clusters$clusters[, 1], pval)

MultiK employs a subsampling-based consensus clustering approach:
- Subsampling: Randomly sample 80% of cells in each iteration
- Clustering: Apply Seurat clustering across resolution parameters
- Consensus Matrix: Track co-clustering frequency for each K
- PAC Calculation: Quantify clustering ambiguity
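
The steps above can be sketched in base R. This is a minimal illustration, not the package's implementation: `kmeans()` stands in for Seurat clustering, the helper names `consensus_matrix()` and `pac()` are hypothetical, and the ambiguity window of 0.1 to 0.9 is an assumed default.

```r
# Hypothetical sketch: kmeans replaces Seurat clustering; not MultiK's own code.
set.seed(42)

# Toy data: 60 "cells" x 5 features drawn from 3 well-separated groups
n_per <- 20
x <- rbind(
  matrix(rnorm(n_per * 5, mean = 0),  ncol = 5),
  matrix(rnorm(n_per * 5, mean = 6),  ncol = 5),
  matrix(rnorm(n_per * 5, mean = 12), ncol = 5)
)

# Consensus matrix: for each pair of cells, the fraction of subsamples
# containing both cells in which they were assigned to the same cluster.
consensus_matrix <- function(x, k, reps = 50, frac = 0.8) {
  n <- nrow(x)
  together <- matrix(0, n, n)  # times a pair shared a cluster
  sampled  <- matrix(0, n, n)  # times a pair was drawn together
  for (r in seq_len(reps)) {
    idx <- sample(n, size = floor(frac * n))  # 80% subsample by default
    cl  <- kmeans(x[idx, , drop = FALSE], centers = k, nstart = 5)$cluster
    together[idx, idx] <- together[idx, idx] + outer(cl, cl, "==")
    sampled[idx, idx]  <- sampled[idx, idx] + 1
  }
  together / pmax(sampled, 1)  # avoid division by zero for unsampled pairs
}

# PAC: share of cell pairs whose consensus value is neither close to 0
# nor close to 1 (assumed window: 0.1 to 0.9). Lower means more stable.
pac <- function(cons, lower = 0.1, upper = 0.9) {
  v <- cons[upper.tri(cons)]
  mean(v > lower & v < upper)
}

cons3 <- consensus_matrix(x, k = 3)
pac3  <- pac(cons3)
```

Running `pac()` on consensus matrices built for several values of `k` gives one ambiguity score per candidate K, which is the quantity the selection step then compares.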
The optimal K is selected using a Pareto optimization framework that balances:
- Frequency: How often K appears across resolutions
- Stability: Inverse of relative PAC (rPAC)
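
As an illustration of the trade-off, the sketch below marks which candidate K values are Pareto-optimal, meaning no other K is at least as good on both criteria and strictly better on one. The table values are made up, `is_pareto()` is a hypothetical helper, and treating stability as `1 - rPAC` is an assumption about what "inverse" means here.

```r
# Hypothetical summary per candidate K; the numbers are invented for illustration.
cand <- data.frame(
  k         = c(2, 3, 4, 5, 6),
  frequency = c(50, 45, 10, 8, 7),          # how often each K appeared
  rpac      = c(0.10, 0.05, 0.20, 0.30, 0.15)
)
cand$stability <- 1 - cand$rpac  # assumption: stability taken as 1 - rPAC

# A point is Pareto-optimal if no other point is >= on both axes
# and strictly > on at least one of them.
is_pareto <- function(freq, stab) {
  vapply(seq_along(freq), function(i) {
    dominated <- freq >= freq[i] & stab >= stab[i] &
      (freq > freq[i] | stab > stab[i])
    !any(dominated)
  }, logical(1))
}

cand$pareto <- is_pareto(cand$frequency, cand$stability)
pareto_k <- cand$k[cand$pareto]
```

With these invented numbers, K = 2 (most frequent) and K = 3 (most stable) are the only non-dominated candidates; the remaining values lose to K = 2 on both criteria.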
Cluster separability is validated using SigClust, which tests whether observed cluster separation exceeds what would be expected by chance under a Gaussian null hypothesis.
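
The idea behind this test can be sketched in base R. This is a simplified stand-in rather than the SigClust procedure itself: it uses the raw sample covariance eigenvalues for the Gaussian null (the real method estimates them more carefully), reports only an empirical p-value, and the helper names `cluster_index()` and `gaussian_null_pvalue()` are hypothetical.

```r
# Simplified illustration of a Gaussian-null separability test; not SigClust itself.
set.seed(1)

# 2-means cluster index: within-cluster SS / total SS.
# Smaller values indicate better-separated clusters.
cluster_index <- function(x) {
  km <- kmeans(x, centers = 2, nstart = 10)
  km$tot.withinss / km$totss
}

gaussian_null_pvalue <- function(x, nsim = 100) {
  observed <- cluster_index(x)
  # Null model: one Gaussian whose variances match the data's
  # covariance eigenvalues (assumption: no eigenvalue adjustment).
  ev <- pmax(eigen(cov(x), symmetric = TRUE, only.values = TRUE)$values, 0)
  n  <- nrow(x)
  null_ci <- replicate(nsim, {
    sim <- sapply(sqrt(ev), function(s) rnorm(n, sd = s))
    cluster_index(sim)
  })
  # Empirical p-value: share of null datasets that split at least as cleanly
  mean(null_ci <= observed)
}

# Toy data: two clearly separated groups versus a single Gaussian cloud
x_sep <- rbind(matrix(rnorm(50 * 4), ncol = 4),
               matrix(rnorm(50 * 4, mean = 5), ncol = 4))
x_one <- matrix(rnorm(100 * 4), ncol = 4)

p_sep <- gaussian_null_pvalue(x_sep)
p_one <- gaussian_null_pvalue(x_one)
```

A small `p_sep` indicates that the two-group split is cleaner than a single Gaussian with the same spread would typically produce, while `p_one` has no reason to be small.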
| Function | Description |
|---|---|
| `MultiK()` | Core consensus clustering algorithm |
| `DiagMultiKPlot()` | Diagnostic visualization for K selection |
| `getClusters()` | Extract cluster assignments at specified K |
| `CalcSigClust()` | Pairwise statistical significance testing |
| `PlotSigClust()` | Visualize cluster hierarchy and significance |
If you use MultiK in your research, please cite:
@software{multik2025,
author = {Liu, Zaoqu},
title = {MultiK: Multi-Resolution Consensus Clustering for Single-Cell RNA-seq},
year = {2025},
url = {https://github.com/Zaoqu-Liu/MultiK}
}

Zaoqu Liu, PhD
- 📧 Email: liuzaoqu@163.com
- 🔗 ORCID: 0000-0002-0452-742X
- 🐙 GitHub: Zaoqu-Liu
MIT License © 2025 Zaoqu Liu