ssKIND: An AI-Powered Single-Cell and Spatial Omics Ecosystem for Neurodegenerative Disease

This repository contains analysis code and the NeuroCLIP foundation model training pipeline associated with the ssKIND paper.

Wu et al., "An AI-powered single-cell and spatial omics ecosystem for neurodegenerative disease"

Repository Structure

ssKIND/
├── NeuroCLIP/          NeuroCLIP fine-tuning (vision-omics foundation model)
├── GWAS/               GWAS enrichment and scDRS per-cell disease scoring
└── pseudobulk_deg/     Pseudobulk differential expression (cell-type and class level)

Data Availability

All processed single-cell and spatial transcriptomic datasets, harmonized metadata, and cell-type annotations are publicly available at the ssKIND portal:

https://bmblx.bmi.osumc.edu/sskind/datasets

Download the relevant .h5ad files from the portal before running the analysis scripts. After downloading, set <BASE_DIR> in each script to your local data directory.
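Before editing the scripts, it can help to confirm that the downloaded files are where you expect them. The following is a minimal sketch, not part of the repository: the `BASE_DIR` value and the helper name `find_h5ad_files` are hypothetical, and the directory layout under your data folder is an assumption.

```python
from pathlib import Path

# Hypothetical location; replace with the directory holding your portal downloads.
BASE_DIR = Path("/path/to/sskind_data")


def find_h5ad_files(base_dir: Path) -> list[Path]:
    """Return every .h5ad file under base_dir (searched recursively), sorted."""
    if not base_dir.is_dir():
        raise FileNotFoundError(f"BASE_DIR does not exist: {base_dir}")
    return sorted(base_dir.rglob("*.h5ad"))


# Example usage (uncomment once BASE_DIR points at real data):
# for path in find_h5ad_files(BASE_DIR):
#     print(path.relative_to(BASE_DIR), f"{path.stat().st_size / 1e6:.1f} MB")
```

An empty result usually means `BASE_DIR` points at the wrong folder or the downloads are still compressed.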

NeuroCLIP

NeuroCLIP is a brain-specialized multimodal foundation model that aligns H&E histology with spatial transcriptomics. It is fine-tuned from the OmiCLIP backbone using ~0.96 million paired image patches and gene expression sequences from spatial transcriptomic data across nine neurodegenerative diseases.
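To illustrate what "aligning" two modalities means in a CLIP-style model, the sketch below computes a symmetric contrastive (InfoNCE) loss over a batch of paired embeddings. It is a dependency-free illustration of the general technique, not the NeuroCLIP training code: the function names, the temperature of 0.07, and the use of plain Python lists are all assumptions, and a real pipeline would use a tensor library.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_sum - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_emb, omics_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input is one matched pair."""
    img = [_normalize(v) for v in image_emb]
    omi = [_normalize(v) for v in omics_emb]
    # Cosine similarity of every image patch with every expression profile.
    logits = [
        [sum(a * b for a, b in zip(i_vec, o_vec)) / temperature for o_vec in omi]
        for i_vec in img
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-omics and omics-to-image directions.
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed))
```

The loss is small when each image embedding is most similar to its own paired expression embedding and grows when pairs are mismatched, which is the property that fine-tuning optimizes.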

Training data preprocessing follows the Loki pipeline for spatial transcriptomics image–expression pairing.
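This README does not spell out the pairing format. One common way to turn a spot's expression profile into a text-like sequence is to list its most highly expressed gene symbols in rank order. The sketch below assumes that representation; the function name, the `top_k` default of 50, and the tie-breaking rule are hypothetical and should be checked against NeuroCLIP/README.md and the Loki documentation.

```python
def expression_to_gene_sentence(expression, top_k=50):
    """Join the top_k expressed gene symbols, highest expression first.

    expression: dict mapping gene symbol -> expression value for one spot.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    # Sort by descending expression; break ties alphabetically for determinism.
    ranked = sorted(expressed, key=lambda item: (-item[1], item[0]))
    return " ".join(gene for gene, _ in ranked[:top_k])
```

For example, `expression_to_gene_sentence({"SNAP25": 12.0, "GFAP": 5.0, "MBP": 0.0}, top_k=2)` yields `"SNAP25 GFAP"`; genes with zero expression are dropped before ranking.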

See NeuroCLIP/README.md for training instructions and pretrained model weights.

GWAS / scDRS

Scripts for GWAS enrichment analysis and per-cell disease relevance scoring using scDRS. GWAS summary statistics were obtained for AD, PD, ALS, and HD.

| Script | Description |
| --- | --- |
| 0_setup.sh | Environment setup |
| 1_download_gwas.sh | Download GWAS summary statistics |
| 2_prep_magma_input.py | Prepare MAGMA input |
| 3_run_magma.sh | Run MAGMA gene-level statistics |
| 4a_make_cov.py | Build covariate matrix |
| 4b_prep_adata_new.py | Prepare AnnData for scDRS |
| 4c_postprocess.py | Postprocess scDRS results |
| 4_run_scdrs.py / .sh | Run scDRS scoring |
| 4_submit_scdrs_all.sh | Submit scDRS jobs (SLURM array) |
| 5_plot_summary_heatmap.R | Summary heatmap |
| 5_plot_violin_region.py | Violin plots by brain region |
| 5b_plot_umap_diseaseonly.py | UMAP colored by scDRS score |
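The idea behind per-cell disease scoring can be sketched in a few lines: score each cell by the weighted expression of GWAS-derived disease genes, then compare that score with scores from random control gene sets. The code below is a deliberately simplified illustration and not the scDRS implementation (scDRS matches control genes on expression properties and computes Monte Carlo p-values). All function names, parameters, and the toy gene weights are hypothetical.

```python
import random
import statistics


def weighted_score(cell_expr, gene_weights):
    """Weighted mean expression of the given genes in one cell."""
    total_weight = sum(gene_weights.values())
    return sum(cell_expr.get(g, 0.0) * w for g, w in gene_weights.items()) / total_weight


def disease_relevance_z(cell_expr, gene_weights, background_genes, n_controls=200, seed=0):
    """Z-score of the disease-gene score against random control gene sets."""
    rng = random.Random(seed)
    observed = weighted_score(cell_expr, gene_weights)
    weights = list(gene_weights.values())
    controls = []
    for _ in range(n_controls):
        # Random control set of the same size, reusing the disease-gene weights.
        sampled = rng.sample(background_genes, len(weights))
        controls.append(weighted_score(cell_expr, dict(zip(sampled, weights))))
    spread = statistics.stdev(controls)
    if spread == 0:
        return 0.0
    return (observed - statistics.mean(controls)) / spread
```

In the actual pipeline, the gene weights would come from the MAGMA gene-level statistics produced in step 3, and the scoring itself is delegated to scDRS in step 4.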

Pseudobulk DEG

Pseudobulk differential expression pipeline using limma-voom. Two granularities are provided: cell-type level and class level (broader cell class, e.g., Excitatory Neurons, Glia).

| Script | Description |
| --- | --- |
| generate_pseudobulk_v5.py | Build pseudobulk counts (cell-type level) |
| generate_pseudobulk_v5_class.py | Build pseudobulk counts (class level) |
| run_limma_deg_v5.R | Limma-voom DEG (cell-type level) |
| run_limma_deg_v5_class.R | Limma-voom DEG (class level) |
| submit_step1_v5.sh | SLURM submission: pseudobulk generation |
| submit_step1_v5_class.sh | SLURM submission: class-level generation |
| submit_step2_v5.sh | SLURM submission: DEG |
| submit_step2_v5_class.sh | SLURM submission: class-level DEG |
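Pseudobulking collapses single-cell counts into one profile per sample and group, so that bulk-style tools such as limma-voom can be applied. The sketch below shows the core aggregation step under stated assumptions: the grouping key `(donor, cell_type)`, the `min_cells` filter, and the function name are hypothetical and may differ from what `generate_pseudobulk_v5.py` does.

```python
from collections import defaultdict


def pseudobulk_counts(cells, min_cells=10):
    """Sum raw counts per (donor, cell_type) group.

    cells: iterable of (donor, cell_type, {gene: raw_count}) tuples.
    Groups with fewer than min_cells cells are dropped.
    """
    sums = defaultdict(lambda: defaultdict(int))
    n_cells = defaultdict(int)
    for donor, cell_type, counts in cells:
        key = (donor, cell_type)
        n_cells[key] += 1
        for gene, count in counts.items():
            sums[key][gene] += count
    return {
        key: dict(gene_sums)
        for key, gene_sums in sums.items()
        if n_cells[key] >= min_cells
    }
```

Raw counts are summed rather than averaged or normalized because voom expects count data and estimates the mean-variance relationship itself. For class-level analysis, the same aggregation would apply with the broader cell class in place of `cell_type`.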

Environment

Python (see requirements.txt):

pip install -r requirements.txt

R: Requires R 4.4.0 with limma, edgeR, and clusterProfiler. On OSC:

module load gcc/12.3.0 R/4.4.0

Related Resources

| Resource | Link |
| --- | --- |
| ssKIND web portal | https://bmblx.bmi.osumc.edu/ssKIND/ |
| ssKIND datasets | https://bmblx.bmi.osumc.edu/sskind/datasets |
| Data collection & processing agents (AI-integrated) | https://github.com/OSU-BMBL/ssKIND-collection-agents |
| NeuroCLIP pretrained weights | Google Drive |

Citation

Wu W, Kim TY, Xu J, Cheng H, Wang C, et al. An AI-powered single-cell and spatial omics ecosystem for neurodegenerative disease. [journal], 2026.

License and Terms of Use

© BMBL and Matrix Lab. This model and associated code are released under the BSD 3-Clause License and may only be used for non-commercial, academic research purposes with proper attribution.
