SCEVAN: Single-Cell Variational Aneuploidy Analysis

Website R-universe License: GPL-2 Nature Communications

📚 Documentation: https://zaoqu-liu.github.io/SCEVAN/

Overview

SCEVAN (Single CEll Variational Aneuploidy aNalysis) is a computational framework for automated tumor cell identification and copy number alteration (CNA) inference from single-cell RNA sequencing (scRNA-seq) data. The algorithm employs a variational segmentation approach combined with a hierarchical hidden Markov model to accurately delineate clonal copy number substructures within heterogeneous tumor populations.

[Figure: SCEVAN workflow]

Key Features

  • Automated Cell Classification: Distinguishes malignant cells from non-malignant tumor microenvironment (TME) cells without prior knowledge
  • Clonal Structure Inference: Identifies subclonal populations with distinct copy number architectures
  • High Performance: Achieves F1 score of 0.90 across 106 samples (93,332 cells), outperforming state-of-the-art methods (F1 = 0.63)
  • Computational Efficiency: Employs greedy multichannel segmentation for scalability to large datasets
  • Cross-Platform Compatibility: Full support for Windows, macOS, and Linux
  • Seurat Integration: Compatible with Seurat v4 and v5 workflows
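
The F1 score quoted above is the harmonic mean of precision and recall for the malignant class. A minimal sketch of how such a score could be computed from per-cell labels follows; the `f1_score` helper, the `truth` and `predicted` vectors, and the `"tumor"` label are illustrative assumptions, not objects produced by SCEVAN under these names.

```r
# Hypothetical helper: F1 score for one class, given two label vectors.
f1_score <- function(truth, predicted, positive = "tumor") {
  stopifnot(length(truth) == length(predicted))
  tp <- sum(predicted == positive & truth == positive)  # true positives
  fp <- sum(predicted == positive & truth != positive)  # false positives
  fn <- sum(predicted != positive & truth == positive)  # false negatives
  if (tp == 0) return(0)                                # avoid 0/0
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  2 * precision * recall / (precision + recall)
}

# Toy labels for six cells (made up for illustration).
truth     <- c("tumor", "tumor", "tumor", "normal", "normal", "normal")
predicted <- c("tumor", "tumor", "normal", "tumor", "normal", "normal")
f1 <- f1_score(truth, predicted)
```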

Publication

De Falco, A., Caruso, F., Su, X.-D., Iavarone, A., & Ceccarelli, M. (2023). A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. Nature Communications, 14, 1074.

DOI: 10.1038/s41467-023-36790-9

Installation

From R-universe (Recommended)

install.packages("SCEVAN", repos = "https://zaoqu-liu.r-universe.dev")

From GitHub

# Install the installer packages if they are missing
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")

# Install Bioconductor dependencies
BiocManager::install(c("EnsDb.Hsapiens.v86", "EnsDb.Mmusculus.v79", "scran", "fgsea"))

# Install yaGST dependency
remotes::install_github("miccec/yaGST")

# Install SCEVAN
remotes::install_github("Zaoqu-Liu/SCEVAN")

Quick Start

library(SCEVAN)

# Basic analysis with default parameters
results <- pipelineCNA(count_mtx, sample = "sample_name")
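
`count_mtx` is not defined in the snippet above. As a rough sketch of the expected shape (raw counts, genes on rows, cells on columns), a toy matrix can be built in base R. The gene symbols, the barcodes, and the `check_count_mtx` helper are illustrative assumptions, and a matrix this small is only meant to show the layout, not to be analyzed.

```r
set.seed(1)

# Toy raw count matrix: genes on rows, cells on columns.
genes <- c("TP53", "EGFR", "PTEN", "MYC")
cells <- paste0("cell_", 1:5)
toy_mtx <- matrix(
  rpois(length(genes) * length(cells), lambda = 3),
  nrow = length(genes),
  dimnames = list(genes, cells)
)

# Hypothetical sanity check (not part of SCEVAN): named, unique,
# non-negative, integer-valued counts.
check_count_mtx <- function(m) {
  stopifnot(
    is.matrix(m) || inherits(m, "Matrix"),
    !is.null(rownames(m)), !is.null(colnames(m)),
    !anyDuplicated(rownames(m)), !anyDuplicated(colnames(m)),
    all(m >= 0), all(m == round(m))
  )
  invisible(TRUE)
}

ok <- check_count_mtx(toy_mtx)
```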

Usage

Single-Sample Analysis

The pipelineCNA() function performs the complete analysis pipeline including cell classification and clonal structure characterization.

results <- pipelineCNA(
  count_mtx,                    # Raw count matrix (genes × cells)
  sample = "sample_name",       # Sample identifier
  par_cores = 20,               # Number of parallel cores
  organism = "human",           # "human" or "mouse"
  SUBCLONES = TRUE,             # Enable subclone detection
  beta_vega = 0.5,              # Segmentation granularity parameter
  ClonalCN = TRUE,              # Infer clonal CN profile
  plotTree = FALSE              # Plot phylogenetic tree
)
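
The `par_cores = 20` value above presumes a machine with at least 20 cores. One cautious way to choose a value is sketched below with the base `parallel` package; the `pick_cores` helper, its cap of 20, and the choice to leave one core free are illustrative, not part of SCEVAN.

```r
library(parallel)

# Hypothetical helper: use at most `cap` cores and keep `reserve` free.
pick_cores <- function(cap = 20L, reserve = 1L) {
  available <- detectCores()
  if (is.na(available)) available <- 1L   # detectCores() can return NA
  as.integer(max(1L, min(cap, available - reserve)))
}

n_cores <- pick_cores()
```

The result could then be passed as `par_cores = n_cores`.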

Parameters

| Parameter | Description | Default |
|---|---|---|
| count_mtx | Raw count matrix with genes on rows and cells on columns | Required |
| sample | Sample name for output files | "" |
| par_cores | Number of parallel cores | 20 |
| norm_cell | Known normal cells for reference | NULL |
| SUBCLONES | Enable subclonal analysis | TRUE |
| beta_vega | Segmentation parameter (higher = coarser) | 0.5 |
| ClonalCN | Infer clonal CN profile | TRUE |
| plotTree | Generate phylogenetic tree plot | FALSE |
| organism | Species ("human" or "mouse") | "human" |
| FIXED_NORMAL_CELLS | Use provided normal cells as fixed reference | FALSE |
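
When some normal cells are already known, `norm_cell` can be filled from an existing annotation. The sketch below assumes that `norm_cell` takes a character vector of cell names matching the columns of `count_mtx`; the `annot` data frame, its cell type labels, and the barcodes are made up for illustration.

```r
# Hypothetical per-cell annotation; row names are cell barcodes.
annot <- data.frame(
  cell_type = c("T cell", "Malignant", "Macrophage", "Malignant"),
  row.names = c("cell_1", "cell_2", "cell_3", "cell_4")
)

# Stand-in for colnames(count_mtx).
barcodes_in_matrix <- c("cell_1", "cell_2", "cell_3")

# Keep only known non-malignant cells that are present in the matrix.
known_normal <- rownames(annot)[annot$cell_type %in% c("T cell", "Macrophage")]
norm_cell <- intersect(known_normal, barcodes_in_matrix)
```

Under that assumption, `norm_cell` could be passed to `pipelineCNA()`, optionally together with `FIXED_NORMAL_CELLS = TRUE`.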

Multi-Sample Comparison

For comparative analysis across multiple samples:

multiSampleComparisonClonalCN(
  listCountMtx,                 # Named list of count matrices
  listNormCells = NULL,         # List of normal cells per sample
  analysisName = "comparison",  # Analysis identifier
  organism = "human",
  par_cores = 20
)
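
`listCountMtx` above is a named list with one count matrix per sample. A minimal sketch of assembling such a list from toy matrices follows; the sample names, the `make_toy` helper, and the check that barcodes are unique across samples are illustrative precautions rather than documented requirements.

```r
set.seed(2)

# Hypothetical helper: toy genes x cells count matrix for one sample.
make_toy <- function(prefix, n_cells) {
  genes <- c("TP53", "EGFR", "PTEN")
  matrix(
    rpois(length(genes) * n_cells, lambda = 2),
    nrow = length(genes),
    dimnames = list(genes, paste0(prefix, "_cell", seq_len(n_cells)))
  )
}

# Named list: the names identify the samples.
listCountMtx <- list(
  sampleA = make_toy("A", 4),
  sampleB = make_toy("B", 6)
)

# Precaution: sample names and cell barcodes should not collide.
all_barcodes <- unlist(lapply(listCountMtx, colnames), use.names = FALSE)
stopifnot(
  !is.null(names(listCountMtx)),
  !anyDuplicated(names(listCountMtx)),
  !anyDuplicated(all_barcodes)
)
```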

Seurat Integration

The output of pipelineCNA() can be attached to a Seurat object as per-cell metadata:

# Run SCEVAN analysis
results <- pipelineCNA(count_mtx)

# Add to existing Seurat object
seurat_obj <- Seurat::AddMetaData(seurat_obj, metadata = results)

# Or create new Seurat object with SCEVAN annotations
seurat_obj <- Seurat::CreateSeuratObject(count_mtx, meta.data = results)
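
Seurat matches a metadata data frame to cells by row name, so it can help to check the overlap before attaching the results. The sketch below uses a mock data frame in place of the `pipelineCNA()` output; the column names `class` and `subclone`, and the idea that some cells may be absent from the results, are assumptions made for illustration.

```r
# Stand-in for colnames(seurat_obj).
cell_names <- paste0("cell_", 1:4)

# Mock results; one cell ("cell_3") is deliberately missing.
mock_results <- data.frame(
  class     = c("tumor", "normal", "tumor"),
  subclone  = c(1, NA, 2),
  row.names = c("cell_1", "cell_2", "cell_4")
)

# Cells without an annotation.
missing_cells <- setdiff(cell_names, rownames(mock_results))

# Reorder to the object's cell order; missing cells become NA rows.
aligned <- mock_results[cell_names, , drop = FALSE]
rownames(aligned) <- cell_names

# Quick overview of the classification, including unannotated cells.
class_counts <- table(aligned$class, useNA = "ifany")
```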

Extract Count Matrix from Seurat Object

# Compatible with Seurat v4 and v5
count_mtx <- getCountMtxFromSeurat(seurat_obj, assay = "RNA")

Visualizing Region-Specific CNAs

# Load analysis outputs
load("output/sample_count_mtx_annot.RData")
load("output/sample_CNAmtx.RData")

# Extract CN ratio for specific genomic region
region_cn <- apply(
  CNA_mtx_relat[count_mtx_annot$seqnames == 3 & 
                count_mtx_annot$start >= 158644278 & 
                count_mtx_annot$end <= 194498364, ], 
  2, mean
)

# Visualize with Seurat (namespace-qualified so the calls work without library(Seurat))
seurat_obj <- Seurat::AddMetaData(seurat_obj, region_cn, col.name = "chr3_cn")
Seurat::FeaturePlot(seurat_obj, "chr3_cn", cols = c("gray", "red"))
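
The region filter above can be tried on mock data without the saved outputs. In this sketch the small `count_mtx_annot` and `CNA_mtx_relat` objects are invented stand-ins for the loaded ones, and their rows are assumed to be aligned gene by gene, as the indexing above implies. Using `colMeans()` with `drop = FALSE` is a substitution for `apply()` that also handles a region containing a single gene.

```r
# Mock gene annotation: three genes on chromosome 3, one on chromosome 7.
count_mtx_annot <- data.frame(
  seqnames = c(3, 3, 3, 7),
  start    = c(158700000, 160000000, 195000000, 55000000),
  end      = c(158800000, 160100000, 195100000, 55100000)
)

# Mock relative CN matrix (genes x cells), filled column by column.
CNA_mtx_relat <- matrix(
  c(0.2, 0.4, 0.0, -0.1,   # cell_1
    0.0, 0.2, 0.1,  0.3),  # cell_2
  nrow = 4,
  dimnames = list(NULL, c("cell_1", "cell_2"))
)

# Genes fully inside chr3:158644278-194498364.
in_region <- count_mtx_annot$seqnames == 3 &
  count_mtx_annot$start >= 158644278 &
  count_mtx_annot$end   <= 194498364

# Mean CN ratio per cell over the selected genes.
region_cn <- colMeans(CNA_mtx_relat[in_region, , drop = FALSE])
```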

CNA Heatmap with Annotations

Generate annotated CNA heatmaps incorporating cell metadata:

plotCNA_withAnnotCells(
  SampleName = "sample_name",
  metadata = cell_annotations,      # data.frame with cell annotations
  COLUMNS_TO_PLOT = c("cell_type", "cluster"),
  SUBCLONE = TRUE                   # Plot subclone-specific profiles
)
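
The `cell_annotations` object above is described only as a data frame with cell annotations. A minimal sketch of one plausible layout follows, assuming that row names hold the cell barcodes and that every entry of `COLUMNS_TO_PLOT` is a column of the data frame; the barcodes and labels are made up.

```r
# Hypothetical annotation table; row names are assumed to be cell barcodes.
barcodes <- paste0("cell_", 1:4)
cell_annotations <- data.frame(
  cell_type = c("Malignant", "T cell", "Malignant", "Macrophage"),
  cluster   = factor(c(0, 1, 0, 2)),
  row.names = barcodes
)

# Check that the requested columns exist before plotting.
COLUMNS_TO_PLOT <- c("cell_type", "cluster")
stopifnot(all(COLUMNS_TO_PLOT %in% colnames(cell_annotations)))
```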

[Figure: annotated CNA heatmap]

Vignettes

Detailed tutorials demonstrating SCEVAN functionality:

Docker

A containerized version is available for reproducible analyses:

# Pull Docker image
docker pull anthonyphis/r_scevan

# Run analysis
docker run -v /path/to/data:/home/data -it anthonyphis/r_scevan:latest \
    Rscript /home/data/analysis_script.R

Example Datasets

Pre-processed datasets for testing and tutorials:

| Dataset | Description | Download |
|---|---|---|
| MGH106 | Glioblastoma (GSE131928) | Download |
| listCountMtx | MGH102/104/105/106 combined | Download |
| HNSCC26 | Head & Neck Cancer (GSE103322) | Download |

Citation

If you use SCEVAN in your research, please cite:

@article{DeFalco2023,
  author  = {De Falco, Antonio and Caruso, Francesca and Su, Xiao-Dong and 
             Iavarone, Antonio and Ceccarelli, Michele},
  title   = {A variational algorithm to detect the clonal copy number 
             substructure of tumors from scRNA-seq data},
  journal = {Nature Communications},
  year    = {2023},
  volume  = {14},
  pages   = {1074},
  doi     = {10.1038/s41467-023-36790-9}
}

License

This project is licensed under the GPL-2 License - see the LICENSE file for details.

Contact
