A Julia package for running IgBLAST (v1.22.0) analyses on immunoglobulin (Ig) and T cell receptor (TCR) sequences.
- Automatic installation and management of IgBLAST binaries
- Support for both IgBLASTn and IgBLASTp
- Easy-to-use interface with customizable parameters
- Progress monitoring for long-running analyses
using Pkg
Pkg.add("IgBLAST")

using IgBLAST
# Install IgBLAST (if not already installed)
install_igblast()

For nucleotide sequences, both the query and database files should contain nucleotide sequences:
# Run an IgBLASTn analysis with nucleotide sequences
run_igblast(
    IgBLASTn,
    "data/ERR4238106.fasta.gz",     # Nucleotide query sequences
    "data/Macaca_mulatta_V.fasta",  # Nucleotide V gene database
    "data/Macaca_mulatta_D.fasta",  # Nucleotide D gene database
    "data/Macaca_mulatta_J.fasta",  # Nucleotide J gene database
    "data/rhesus_monkey_gl.aux",    # Auxiliary file (can be "" if not needed)
    "ERR4238106.tsv",
    additional_params = Dict("organism" => "rhesus_monkey", "ig_seqtype" => "Ig")
)
# Run without auxiliary file (optional for assignments without CDR3 analysis)
run_igblast(
    IgBLASTn,
    "query_nucleotide.fasta",
    "V_nucleotide.fasta",
    "D_nucleotide.fasta",
    "J_nucleotide.fasta",
    "",  # Empty aux_file
    "output.tsv",
    additional_params = Dict("organism" => "human", "domain_system" => "imgt")
)

For protein sequences, the query file must contain protein sequences (amino acids), while the database files should contain nucleotide sequences (which will be automatically translated to protein during database preparation):
# Run an IgBLASTp analysis with protein query sequences
run_igblast(
    IgBLASTp,
    "query_protein.fasta",  # Protein query sequences (must be amino acids, not nucleotides)
    "V_nucleotide.fasta",   # Nucleotide V gene database (will be translated to protein)
    "D_nucleotide.fasta",   # Nucleotide D gene database (will be translated to protein)
    "J_nucleotide.fasta",   # Nucleotide J gene database (will be translated to protein)
    "",                     # Auxiliary file not used by IgBLASTp
    "output.tsv",
    additional_params = Dict("organism" => "human")
)

Important notes:
- For IgBLASTn: Both query and database files should contain nucleotide sequences. Returns V, D, and J assignments.
- For IgBLASTp: Query file must contain protein sequences (amino acids), database files should contain nucleotide sequences. Only returns V assignments (D and J databases are prepared but not used by IgBLASTp).
- The aux_file parameter can be an empty string ("") if not needed (useful for assignments without CDR3 analysis).
- IgBLASTp does not support the auxiliary file parameter.
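
Because IgBLASTn expects nucleotide queries and IgBLASTp expects protein queries, it can help to check a query file before choosing which one to run. The following is a minimal sketch, not part of this package: `guess_seqtype` is a hypothetical helper that guesses the sequence type from residue composition, assuming an uncompressed FASTA file (a `.fasta.gz` input would need decompressing first) and an arbitrary 90% threshold.

```julia
# Hypothetical helper (not provided by IgBLAST.jl): guess whether a FASTA
# file holds nucleotide or protein sequences from its residue composition.
function guess_seqtype(path::AbstractString; max_residues::Int = 10_000,
                       threshold::Float64 = 0.9)
    nucleotide_chars = Set("ACGTUN")  # letters counted as nucleotides
    total = 0
    nucleotide = 0
    for line in eachline(path)
        # Skip FASTA header lines and blank lines
        (isempty(line) || startswith(line, ">")) && continue
        for c in uppercase(strip(line))
            isletter(c) || continue  # ignore gaps, '*', digits
            total += 1
            c in nucleotide_chars && (nucleotide += 1)
        end
        # Stop once enough residues have been sampled
        total >= max_residues && break
    end
    total == 0 && error("no sequence residues found in $path")
    # Mostly A/C/G/T/U/N letters: treat the file as nucleotide
    return nucleotide / total >= threshold ? :nucleotide : :protein
end

# Possible use before calling run_igblast (left commented out here):
# program = guess_seqtype("query.fasta") == :nucleotide ? IgBLASTn : IgBLASTp
```

This is only a heuristic: a very short or unusual protein sequence made up mostly of A, C, G, T, and N residues would be misclassified, so treat the result as a sanity check and not as validation.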
For more detailed information, please refer to the documentation.