Skip to content

delalamo/SAbR

Repository files navigation

Structure-based Antibody Renumbering

This repo is currently in development. If you encounter any bugs, please report the issue here.

SAbR (Structure-based Antibody Renumbering) renumbers antibody PDB files using the 3D coordinate of backbone atoms. It uses custom forked versions of SoftAlign and ANARCI to align structures to SAbDaB-derived consensus embeddings and renumber to various antibody schemes, respectively.

Installation and use

  1. SAbR can be installed into a virtual environment via pip:
# Latest release
pip install sabr-kit

# Most recent version from Github
git clone --recursive https://github.com/delalamo/SAbR.git
cd SAbR/
pip install -e .

It can then be run using the sabr command (see below).

  1. Alternatively, SAbR can be directly run with the latest docker container:
docker run --rm ghcr.io/delalamo/sabr:latest -i input.pdb -o output.pdb -c CHAIN_ID

Running SAbR

usage: sabr [-h] -i INPUT_PDB -c INPUT_CHAIN -o OUTPUT_PDB [-n NUMBERING_SCHEME] [-t] [--overwrite] [-v]

Structure-based Antibody Renumbering (SAbR) renumbers antibody PDB files using the 3D coordinate of backbone atoms.

options:
  -h, --help            show this help message and exit
  -i INPUT_PDB, --input_pdb INPUT_PDB
                        Input pdb file
  -c INPUT_CHAIN, --input_chain INPUT_CHAIN
                        Input chain
  -o OUTPUT_PDB, --output_pdb OUTPUT_PDB
                        Output pdb file
  -n NUMBERING_SCHEME, --numbering_scheme NUMBERING_SCHEME
                        Numbering scheme, default is IMGT. Supports IMGT, Chothia, Kabat, Martin, AHo, and Wolfguy.
  --overwrite           Overwrite PDB
  -v, --verbose         Verbose output

Known issues

  • SAbR currently struggles with scFvs for two reasons. First, it is unclear how to assign canonical numbering to multiple domains within a single chain, unless we accept a spacer (e.g., starting chain #2 at 201 instead of 1). Second, it will sometimes align across both chains, introducing a massive insertion in between. It is unclear how to prevent this; please see issue #2 for details.
  • SAbR sometimes mistakenly includes sheets from the Fab in the VH.
  • The algorithm for renumbering CDRs, which is the same as the one for IMGT, does not account for unassigned residues. So if a residue is missing due to heterogeneity, the CDR numbering algorithm will misnumber other residues in the CDR.

About

Structure-based antibody renumbering

Resources

License

Stars

Watchers

Forks

Packages