📚 Documentation: https://zaoqu-liu.github.io/SCEVAN/
SCEVAN (Single CEll Variational Aneuploidy aNalysis) is a computational framework for automated tumor cell identification and copy number alteration (CNA) inference from single-cell RNA sequencing (scRNA-seq) data. The algorithm employs a variational segmentation approach combined with a hierarchical hidden Markov model to accurately delineate clonal copy number substructures within heterogeneous tumor populations.
- Automated Cell Classification: Distinguishes malignant cells from non-malignant tumor microenvironment (TME) cells without requiring a predefined set of normal reference cells
- Clonal Structure Inference: Identifies subclonal populations with distinct copy number architectures
- High Performance: Achieves F1 score of 0.90 across 106 samples (93,332 cells), outperforming state-of-the-art methods (F1 = 0.63)
- Computational Efficiency: Employs greedy multichannel segmentation for scalability to large datasets
- Cross-Platform Compatibility: Full support for Windows, macOS, and Linux
- Seurat Integration: Compatible with Seurat v4 and v5 workflows
De Falco, A., Caruso, F., Su, X.-D., Iavarone, A., & Ceccarelli, M. (2023). A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. Nature Communications, 14, 1074.
install.packages("SCEVAN", repos = "https://zaoqu-liu.r-universe.dev")

# Install dependencies
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
BiocManager::install(c("EnsDb.Hsapiens.v86", "EnsDb.Mmusculus.v79", "scran", "fgsea"))
# Install yaGST dependency
remotes::install_github("miccec/yaGST")
# Install SCEVAN
remotes::install_github("Zaoqu-Liu/SCEVAN")

library(SCEVAN)
# Basic analysis with default parameters
results <- pipelineCNA(count_mtx, sample = "sample_name")

The pipelineCNA() function performs the complete analysis pipeline, including cell classification and clonal structure characterization.
results <- pipelineCNA(
    count_mtx,               # Raw count matrix (genes × cells)
    sample = "sample_name",  # Sample identifier
    par_cores = 20,          # Number of parallel cores
    organism = "human",      # "human" or "mouse"
    SUBCLONES = TRUE,        # Enable subclone detection
    beta_vega = 0.5,         # Segmentation granularity parameter
    ClonalCN = TRUE,         # Infer clonal CN profile
    plotTree = FALSE         # Plot phylogenetic tree
)

| Parameter | Description | Default |
|---|---|---|
| count_mtx | Raw count matrix with genes on rows and cells on columns | Required |
| sample | Sample name for output files | "" |
| par_cores | Number of parallel cores | 20 |
| norm_cell | Known normal cells for reference | NULL |
| SUBCLONES | Enable subclonal analysis | TRUE |
| beta_vega | Segmentation parameter (higher = coarser) | 0.5 |
| ClonalCN | Infer clonal CN profile | TRUE |
| plotTree | Generate phylogenetic tree plot | FALSE |
| organism | Species ("human" or "mouse") | "human" |
| FIXED_NORMAL_CELLS | Use provided normal cells as fixed reference | FALSE |

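Before calling pipelineCNA(), it can help to confirm that the input matches the layout described in the parameter table: raw counts, genes on rows, cells on columns. The sketch below uses a simulated dense matrix and base R only. The helper name check_count_mtx and the toy gene and cell names are hypothetical and not part of SCEVAN, and the sketch assumes that norm_cell is supplied as a character vector of column names, which should be confirmed against the package documentation.

```r
# Hypothetical helper (not part of SCEVAN): basic sanity checks on the input.
check_count_mtx <- function(count_mtx, norm_cell = NULL) {
  stopifnot(
    is.matrix(count_mtx),                 # this sketch covers dense matrices only
    !is.null(rownames(count_mtx)),        # gene identifiers on rows
    !is.null(colnames(count_mtx)),        # cell barcodes on columns
    !anyDuplicated(colnames(count_mtx))   # cell names must be unique
  )
  if (any(count_mtx < 0) || any(count_mtx != round(count_mtx))) {
    stop("count_mtx should contain raw (non-negative integer) counts")
  }
  # Assumption: norm_cell is a character vector of cell (column) names.
  missing_cells <- setdiff(norm_cell, colnames(count_mtx))
  if (length(missing_cells) > 0) {
    stop("norm_cell entries not found in count_mtx: ",
         paste(missing_cells, collapse = ", "))
  }
  invisible(TRUE)
}

# Simulated input: 50 genes x 8 cells of Poisson counts.
set.seed(1)
count_mtx <- matrix(
  rpois(50 * 8, lambda = 2), nrow = 50, ncol = 8,
  dimnames = list(paste0("GENE", 1:50), paste0("cell", 1:8))
)
known_normal <- c("cell1", "cell2")

input_ok <- check_count_mtx(count_mtx, norm_cell = known_normal)

# With SCEVAN installed and a realistically sized matrix, the checked inputs
# would then be passed on, for example:
# results <- pipelineCNA(count_mtx, sample = "toy", norm_cell = known_normal)
```

The toy matrix is far too small for a meaningful analysis; it only illustrates the expected orientation and naming of the inputs.
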
For comparative analysis across multiple samples:
multiSampleComparisonClonalCN(
    listCountMtx,                 # Named list of count matrices
    listNormCells = NULL,         # List of normal cells per sample
    analysisName = "comparison",  # Analysis identifier
    organism = "human",
    par_cores = 20
)

SCEVAN integrates seamlessly with Seurat workflows:
# Run SCEVAN analysis
results <- pipelineCNA(count_mtx)
# Add to existing Seurat object
seurat_obj <- Seurat::AddMetaData(seurat_obj, metadata = results)
# Or create new Seurat object with SCEVAN annotations
seurat_obj <- Seurat::CreateSeuratObject(count_mtx, meta.data = results)

# Compatible with Seurat v4 and v5
count_mtx <- getCountMtxFromSeurat(seurat_obj, assay = "RNA")

# Load analysis outputs
load("output/sample_count_mtx_annot.RData")
load("output/sample_CNAmtx.RData")
# Extract CN ratio for specific genomic region
region_cn <- apply(
    CNA_mtx_relat[count_mtx_annot$seqnames == 3 &
                  count_mtx_annot$start >= 158644278 &
                  count_mtx_annot$end <= 194498364, , drop = FALSE],
    2, mean
)
# Visualize with Seurat
seurat_obj <- Seurat::AddMetaData(seurat_obj, region_cn, col.name = "chr3_cn")
Seurat::FeaturePlot(seurat_obj, "chr3_cn", cols = c("gray", "red"))

Generate annotated CNA heatmaps incorporating cell metadata:
plotCNA_withAnnotCells(
    SampleName = "sample_name",
    metadata = cell_annotations,                  # data.frame with cell annotations
    COLUMNS_TO_PLOT = c("cell_type", "cluster"),
    SUBCLONE = TRUE                               # Plot subclone-specific profiles
)

Detailed tutorials demonstrating SCEVAN functionality:
- Intratumoral Heterogeneity Analysis - Glioblastoma case study
- Two-Sample Comparison - Head and neck cancer
- Multi-Sample Analysis - Three-sample comparison
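
The region-level averaging used in the Seurat integration example (row subsetting by genomic coordinates, followed by a per-cell mean) can also be tried on simulated data without any SCEVAN output files. In this sketch, CNA_mtx_relat and count_mtx_annot are toy stand-ins built by hand. Only the column names seqnames, start and end are taken from that example; the real objects may differ in other respects.

```r
n_genes <- 6
n_cells <- 4

# Toy gene annotation: four genes on chromosome 3, two on chromosome 7.
count_mtx_annot <- data.frame(
  seqnames = c(3, 3, 3, 3, 7, 7),
  start    = c(100, 200, 300, 400, 100, 200),
  end      = c(150, 250, 350, 450, 150, 250)
)

# Toy relative CN matrix (genes x cells); values are arbitrary.
CNA_mtx_relat <- matrix(
  seq_len(n_genes * n_cells) / 10,
  nrow = n_genes,
  dimnames = list(NULL, paste0("cell", seq_len(n_cells)))
)

# Genes fully contained in the toy region chr3:200-450.
in_region <- count_mtx_annot$seqnames == 3 &
  count_mtx_annot$start >= 200 &
  count_mtx_annot$end <= 450

# drop = FALSE keeps a matrix even if only one gene falls in the region;
# colMeans() gives the same per-cell result as apply(..., 2, mean).
region_cn <- colMeans(CNA_mtx_relat[in_region, , drop = FALSE])
print(region_cn)
```

The result is a named numeric vector with one value per cell, which is the shape expected when adding a metadata column to a Seurat object.
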
A containerized version is available for reproducible analyses:
# Pull Docker image
docker pull anthonyphis/r_scevan
# Run analysis
docker run -v /path/to/data:/home/data -it anthonyphis/r_scevan:latest \
    Rscript /home/data/analysis_script.R

Pre-processed datasets for testing and tutorials:

| Dataset | Description | Download |
|---|---|---|
| MGH106 | Glioblastoma (GSE131928) | Download |
| listCountMtx | MGH102/104/105/106 combined | Download |
| HNSCC26 | Head & Neck Cancer (GSE103322) | Download |
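
Assuming the downloads are .RData files, like the analysis outputs loaded elsewhere in this README, loading one into a dedicated environment makes it easy to see which objects it contains before passing a matrix to pipelineCNA(). The sketch below writes a toy file to a temporary path so that it runs anywhere. The file name and the object name count_mtx inside it are placeholders, not the actual contents of the published datasets.

```r
# Create a toy .RData file standing in for a downloaded dataset.
toy_path <- file.path(tempdir(), "toy_dataset.RData")
count_mtx <- matrix(
  c(0, 3, 1, 0, 2, 5), nrow = 3,
  dimnames = list(c("GENE1", "GENE2", "GENE3"), c("cellA", "cellB"))
)
save(count_mtx, file = toy_path)
rm(count_mtx)

# Load into a separate environment so nothing in the workspace is overwritten.
dataset_env <- new.env()
loaded_names <- load(toy_path, envir = dataset_env)
print(loaded_names)  # names of the objects stored in the file

# Pick the object of interest and inspect its shape (genes x cells).
mtx <- get(loaded_names[1], envir = dataset_env)
print(dim(mtx))
```

load() returns the names of the restored objects, so the same pattern works when the object names inside a file are not known in advance.
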
If you use SCEVAN in your research, please cite:
@article{DeFalco2023,
  author  = {De Falco, Antonio and Caruso, Francesca and Su, Xiao-Dong and
             Iavarone, Antonio and Ceccarelli, Michele},
  title   = {A variational algorithm to detect the clonal copy number
             substructure of tumors from scRNA-seq data},
  journal = {Nature Communications},
  year    = {2023},
  volume  = {14},
  pages   = {1074},
  doi     = {10.1038/s41467-023-36790-9}
}

This project is licensed under the GPL-2 License - see the LICENSE file for details.
- Maintainer: Zaoqu Liu (liuzaoqu@163.com)
- Issues: GitHub Issues
- Repository: https://github.com/Zaoqu-Liu/SCEVAN