
ELSM is a novel multi-modal cfDNA fragmentomics fusion framework that effectively enhances the accuracy of early cancer screening and tissue-of-origin prediction through sample-level modality evaluation and a two-stage neural network fusion strategy, while maintaining good biological interpretability.

llb895/ELSM

Enhanced Early Cancer Detection via Multi-Omics cfDNA Fragmentation Integration Using an Early-Late Fusion Neural Network with Sample-Modality Evaluation

Introduction

Multi-omics cfDNA fragmentation patterns show promise as biomarkers for early cancer detection, but fusing their multimodal data faces challenges from heterogeneity and small sample sizes. We propose ELSM, a framework integrating two-stage neural network fusion with sample-modality evaluation to effectively combine 13 cfDNA fragmentomic features. Evaluated across five datasets (1,994 samples, 10 cancer types), ELSM outperforms both unimodal classifiers and state-of-the-art fusion models in cancer detection and tissue-of-origin prediction. Biological analysis of key genomic regions linked to high-contributing modalities validates biological relevance, highlighting ELSM’s clinical utility.

1 Environment

We used Python 3.7 for our experiments, and our CUDA version was 11.8. To set up the environment:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
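After installation, a quick check can confirm that PyTorch was built against CUDA and can see a GPU. The snippet below is a minimal sketch and is not part of the ELSM repository; the helper name `describe_torch_env` is hypothetical. It relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and it reports a missing PyTorch installation instead of failing.

```python
import importlib.util
import sys


def describe_torch_env():
    """Return a small report on the Python / PyTorch / CUDA setup."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch

        report["torch"] = torch.__version__
        # CUDA version the wheel was built against; None for CPU-only builds.
        report["cuda_build"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` is `False` on a machine with a GPU, the installed wheel or the driver most likely does not match the CUDA version stated above.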

2 Preparation

We demonstrate ELSM through a case study that performs independent validation and cross-validation on 13 fragmentomic features, using the independent validation dataset and the LUCAS dataset of Mathios et al. [1].

project
│   README.md
│   
└───ELSM
   │   readme.txt
   │
   └───sample_level_evaluation_strategy_result
   │   │
   │   │   ...    
   │
   └───model
   │   │
   │   │   ...
   │ 
   └───dataset
   │   │
   │   │   ...
   │ 
   └───result
   │   │
   │   │   ...

The dataset directory contains raw sample data.
The model directory stores ELSM model code and related data processing tools.
The sample_level_evaluation_strategy_result directory holds the sample-level resampled data.
The result directory contains the predicted output matrix.
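Before running the later steps, it can help to verify that the four directories described above exist. The following is a minimal sketch under the assumption that the layout matches the project tree; the helper name `missing_dirs` is hypothetical and not part of the ELSM code.

```python
from pathlib import Path

# Directory names taken from the project tree above.
EXPECTED_DIRS = (
    "dataset",
    "model",
    "sample_level_evaluation_strategy_result",
    "result",
)


def missing_dirs(elsm_root):
    """Return the expected subdirectories that are absent under elsm_root."""
    root = Path(elsm_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "ELSM" is the folder name shown in the project tree.
    missing = missing_dirs("ELSM")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

An empty list means the layout is complete; otherwise the returned names can be created by hand before running the scripts.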

3 Modality Evaluation

In cross-validation

Enter the model folder.

cd /ELSM/model/

Execute the sample_level_evaluation_strategy_cross.py file.

python sample_level_evaluation_strategy_cross.py "../dataset/10-fold-cross-validation/" "../sample_level_evaluation_strategy_result/"

The path '../dataset/10-fold-cross-validation/' is the directory containing the source data.
The path '../sample_level_evaluation_strategy_result/' is the directory where the output is written.
The processed results are then available in ELSM/sample_level_evaluation_strategy_result/.
The same procedure applies to independent validation.

4 Model Prediction

In cross-validation

For the sample-level resampled data, model predictions are performed using an early-late fusion neural network.

cd /ELSM/model/
python execution_cross.py "../sample_level_evaluation_strategy_result/" 

The path '../sample_level_evaluation_strategy_result/' is the directory containing the resampled data produced by the modality evaluation step.
The same procedure applies to independent validation.
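The two cross-validation commands from sections 3 and 4 can also be chained from one Python script. This is a hedged sketch, not a tool shipped with ELSM: the helpers `build_commands` and `run_pipeline` are hypothetical, and only the script names and arguments are taken from the commands shown above. It assumes the scripts are run from inside the model folder, as in those commands.

```python
import subprocess
import sys


def build_commands(dataset_dir, resampled_dir, python=sys.executable):
    """Return the two cross-validation commands in execution order."""
    return [
        # Step 3: sample-level modality evaluation (source dir, output dir).
        [python, "sample_level_evaluation_strategy_cross.py", dataset_dir, resampled_dir],
        # Step 4: prediction on the resampled data.
        [python, "execution_cross.py", resampled_dir],
    ]


def run_pipeline(model_dir, dataset_dir, resampled_dir):
    """Run both steps from inside model_dir, stopping at the first failure."""
    for cmd in build_commands(dataset_dir, resampled_dir):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, cwd=model_dir, check=True)


# Example call (the model_dir value is an assumption about where the
# repository was cloned):
# run_pipeline(
#     model_dir="ELSM/model",
#     dataset_dir="../dataset/10-fold-cross-validation/",
#     resampled_dir="../sample_level_evaluation_strategy_result/",
# )
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` prevents the prediction step from running on incomplete resampled data.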

5 Output Results

The prediction results are stored in the result folder.

6 Cite Us

If you use the ELSM framework in your own work, please cite it as follows:

@article{ELSM,
    title={Enhanced Early Cancer Detection via Multi-Omics cfDNA Fragmentation Integration Using an Early-Late Fusion Neural Network with Sample-Modality Evaluation},
    author={Libo Lu, ..., and Xionghui Zhou},
    year={2025},
}

7 References

Footnotes

  1. D. Mathios, J.S. Johansen, S. Cristiano, J.E. Medina, J. Phallen, K.R. Larsen, D.C. Bruhm, N. Niknafs, L. Ferreira, V. Adleff, et al., Detection and characterization of lung cancer using cell-free DNA fragmentomes, Nature Communications 12 (2021) 5060.
