Percent identity is computed as:
where
Percent of matched sequences is computed as:
where
-
Percent of matched sequence is also an alternative definition of percent identity used in some cases, for intance, in BLAST.
-
BAM/SAM files must contain MD tags to be able to filter by percent identity. Aligners such as BWA add MD tags to each queried sequence in a BAM file. MD tags can also be generated with samtools.
pip install filtersam
You can find a jupyter notebook with usage examples here.
If you use this software, please cite it as below:
Robaina-Estévez, S. (2022). filterSAM: filter sam/bam files by percent identity or percent of matched sequence (Version 0.0.11)[Computer software]. https://doi.org/10.5281/zenodo.7056278.