VIsualizing BCR repertoires and intraCLonal Diversity
ViCloD is a web server for visualizing the clonality and intraclonal diversity of BCR repertoires. Here we make the stand-alone version of the ViCloD pipeline available. Users can run it locally and visualize their results online. The pipeline uses the following tools: MobiLLe to cluster the clonally related sequences in BCR repertoires, and ClonalTree to reconstruct the evolutionary history of a BCR lineage.
REFERENCE Lucile Jeusset, Nika Abdollahi, Anne Langlois De Septenville, Marine Armand, Thibaud Verny, Clotilde Bavetti, Frédéric Davi and Juliana S. Bernardes. ViCloD, an interactive web tool for visualizing B cell repertoires and analyzing intra-clonal diversities in B-cell tumors. To be submitted.
CONTACT E-mail: juliana.silva_bernardes@sorbonne-universite.fr
- An AIRR formatted file containing annotated IGH sequences.
- See example input file
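Before running the pipeline, it can help to check that the input looks like an AIRR rearrangement table. The sketch below is not part of ViCloD. It assumes a tab-separated file with a header row, and the columns it checks (sequence_id, v_call, j_call, junction) are standard AIRR field names chosen here for illustration; the exact columns ViCloD requires are not listed in this README.

```python
import csv

# Assumed columns: standard AIRR Rearrangement field names,
# not a list confirmed by ViCloD.
ASSUMED_COLUMNS = ("sequence_id", "v_call", "j_call", "junction")


def read_airr(path):
    """Read a tab-separated AIRR file into a list of dicts (one per row)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def check_airr_rows(rows, columns=ASSUMED_COLUMNS):
    """Split rows into (complete, incomplete).

    A row is complete when every checked column is present and non-empty.
    """
    complete, incomplete = [], []
    for row in rows:
        if all((row.get(col) or "").strip() for col in columns):
            complete.append(row)
        else:
            incomplete.append(row)
    return complete, incomplete
```

For example, `check_airr_rows(read_airr("Examples/Input/example.tsv"))` would return the rows that pass this illustrative check and the rows that do not.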
ViCloD returns:
- [repertoire_name]_visualization.zip : a compressed file containing all the files needed to view the BCR repertoire, in the format required by the ViCloD website
- [repertoire_name]_repertoire.json : the clustering of the clonally related sequences of the repertoire, in JSON format
- [repertoire_name]_tree_all.json : the reconstructed BCR lineage trees of the first five clones, in JSON format
- [repertoire_name]_tree_simplification1.json : the first level of simplification of the reconstructed BCR lineage tree, in JSON format
- [repertoire_name]_tree_simplification2.json : the second level of simplification of the reconstructed BCR lineage tree, in JSON format
- [repertoire_name]_C[clone_number]_HAUS_sequence.fasta : a FASTA file containing the HAUS sequence of the indicated clone
- [repertoire_name]_C[clone_number]_subclones_sequences.fasta : a FASTA file containing the subclone sequences of the indicated clone
- [repertoire_name]_log.txt : contains information on the reads used for the analysis
- [repertoire_name]_unannotated_seq.txt : contains all the sequences of the input AIRR file whose annotation is not sufficient for analysis by ViCloD
- [repertoire_name]_AIRR_seq_name_matching.txt : contains the correspondence between the original names of the reads (in the AIRR file) and the new ids assigned by ViCloD
- [repertoire_name]_unique_seq_id_matching.txt : lists the ids of sequences that share the same nucleotide sequence
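The naming scheme above can be turned into a quick completeness check after a run. This is a minimal sketch, not part of ViCloD: the helper names are hypothetical, and it assumes that [clone_number] is written as a plain integer directly after "C" and that all files sit side by side in one folder, neither of which is stated in this README.

```python
from pathlib import Path

# File name suffixes taken from the list of outputs above.
PER_REPERTOIRE_SUFFIXES = (
    "visualization.zip",
    "repertoire.json",
    "tree_all.json",
    "tree_simplification1.json",
    "tree_simplification2.json",
    "log.txt",
    "unannotated_seq.txt",
    "AIRR_seq_name_matching.txt",
    "unique_seq_id_matching.txt",
)
PER_CLONE_SUFFIXES = ("HAUS_sequence.fasta", "subclones_sequences.fasta")


def expected_outputs(repertoire_name, clone_numbers=()):
    """Build the expected file names for one repertoire.

    clone_numbers: clones for which FASTA files are expected (assumption:
    the caller knows which clones were analysed).
    """
    names = [f"{repertoire_name}_{suffix}" for suffix in PER_REPERTOIRE_SUFFIXES]
    for number in clone_numbers:
        for suffix in PER_CLONE_SUFFIXES:
            names.append(f"{repertoire_name}_C{number}_{suffix}")
    return names


def missing_outputs(output_dir, repertoire_name, clone_numbers=()):
    """Return the expected file names that are absent from output_dir."""
    folder = Path(output_dir)
    return [
        name
        for name in expected_outputs(repertoire_name, clone_numbers)
        if not (folder / name).exists()
    ]
```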
- We strongly recommend using an Anaconda environment.
- Python version 3 or later
- numpy : conda install numpy or pip install numpy
- matplotlib : conda install -c conda-forge matplotlib or pip install matplotlib
- Palettable : conda install -c conda-forge palettable or pip install palettable
- skbio : conda install -c anaconda scikit-bio or pip install scikit-bio
- Levenshtein : conda install -c conda-forge python-levenshtein or pip install python-Levenshtein
- Biopython : conda install -c conda-forge biopython or pip install biopython
- ete3 : conda install -c etetoolkit ete3 or pip install ete3
- python-newick : conda install -c bioconda python-newick or pip install newick
- muscle : conda install -c bioconda muscle or pip install muscle
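Once the packages are installed, a short script can confirm that they are visible from the active environment. The following is a sketch under stated assumptions rather than a ViCloD utility: the import names are the usual ones for the packages listed above (for instance, scikit-bio is imported as skbio and Biopython as Bio), and the aligner is assumed to be reachable as an executable named "muscle" on the PATH.

```python
import importlib.util
import shutil

# Import names for the Python packages listed above.
PYTHON_MODULES = (
    "numpy",
    "matplotlib",
    "palettable",
    "skbio",
    "Levenshtein",
    "Bio",
    "ete3",
    "newick",
)


def missing_modules(modules=PYTHON_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def muscle_on_path(executable="muscle"):
    """Check for the aligner binary (the executable name is an assumption)."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    absent = missing_modules()
    print("Missing Python modules:", ", ".join(absent) if absent else "none")
    print("muscle found on PATH:", muscle_on_path())
```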
The command line for launching ViCloD is:
$ run_ViCloD.sh [AIRR_file] [output_path] [threshold_value] [threshold_type]
- [AIRR_file] is the AIRR file containing the annotated sequences of the BCR repertoire
- [output_path] is the path of the output directory
- [threshold_value] is the threshold used to eliminate infrequent reads
- [threshold_type] is the unit in which the threshold value is given (percentage or number)
For instance, the following command can be run from the folder that contains src/:
$ src/run_ViCloD.sh Examples/Input/example.tsv Examples/Output 0 number
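When the pipeline is launched from another program, the four arguments can be validated before the script is called. The wrapper below is a hypothetical convenience, not something shipped with ViCloD. It assumes that the accepted literal values for the last argument are "percentage" and "number" (only "number" appears in the example above) and that the script path is src/run_ViCloD.sh relative to the working directory.

```python
import subprocess

# Assumed literal values for the threshold type argument.
THRESHOLD_TYPES = ("percentage", "number")


def build_command(airr_file, output_path, threshold_value, threshold_type,
                  script="src/run_ViCloD.sh"):
    """Return the argument list for one ViCloD run after basic checks."""
    if threshold_type not in THRESHOLD_TYPES:
        raise ValueError(f"threshold_type must be one of {THRESHOLD_TYPES}")
    if threshold_value < 0:
        raise ValueError("threshold_value must not be negative")
    return [script, str(airr_file), str(output_path),
            str(threshold_value), threshold_type]


def run_vicloD(airr_file, output_path, threshold_value, threshold_type):
    """Run the pipeline and raise CalledProcessError if the script fails."""
    command = build_command(airr_file, output_path, threshold_value, threshold_type)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.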
Output files are placed in the following folder:
[output_path]/[AIRR_file_name]_ViCloD_output
- [AIRR_file_name] is the name of the AIRR file provided as input
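The output location can be computed ahead of time, for example to collect results from several runs. This sketch assumes that [AIRR_file_name] is the input file name without its extension; the README does not say whether the extension is kept, so the hypothetical keep_extension flag covers the other reading.

```python
from pathlib import Path


def output_folder(airr_file, output_path, keep_extension=False):
    """Return [output_path]/[AIRR_file_name]_ViCloD_output as a Path.

    Assumption: the file name is used without its extension unless
    keep_extension is set.
    """
    airr = Path(airr_file)
    name = airr.name if keep_extension else airr.stem
    return Path(output_path) / f"{name}_ViCloD_output"
```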
- The program is distributed under the CeCILL licence
- Feature requests and open issues.