IndexError with NCBI gff #128

@MrTomRod

Hi!

I annotated a bacterium (Acidipropionibacterium acidipropionici - strain FAM19036) with NCBI PGAP.

I wanted to create SeqRecord objects from the GFF file, but parsing failed:

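import pprint

from BCBio import GFF
from BCBio.GFF import GFFExaminer

# Path to the PGAP annotation file, used for both steps below.
gff_path = 'data/FAM19036/annot.gff'

# Summarize which IDs, sources and feature types the file contains.
examiner = GFFExaminer()
with open(gff_path) as in_handle:
    pprint.pprint(examiner.available_limits(in_handle))
print("------------------------------------------------------------")
# Parse the file into SeqRecord objects; this is the step that raises IndexError.
with open(gff_path) as in_handle:
    for rec in GFF.parse(in_handle):
        print(rec)
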
{'gff_id': {('CP040634.1',): 6772},
 'gff_source': {('.',): 3361,
                ('GeneMarkS-2+',): 360,
                ('Local',): 1,
                ('Protein Homology',): 2916,
                ('cmsearch',): 24,
                ('tRNAscan-SE',): 110},
 'gff_source_type': {('.', 'exon'): 8,
                     ('.', 'gene'): 3208,
                     ('.', 'pseudogene'): 137,
                     ('.', 'rRNA'): 8,
                     ('GeneMarkS-2+', 'CDS'): 360,
                     ('Local', 'region'): 1,
                     ('Protein Homology', 'CDS'): 2916,
                     ('cmsearch', 'RNase_P_RNA'): 1,
                     ('cmsearch', 'SRP_RNA'): 1,
                     ('cmsearch', 'exon'): 7,
                     ('cmsearch', 'rRNA'): 4,
                     ('cmsearch', 'riboswitch'): 10,
                     ('cmsearch', 'tmRNA'): 1,
                     ('tRNAscan-SE', 'exon'): 55,
                     ('tRNAscan-SE', 'tRNA'): 55},
 'gff_type': {('CDS',): 3276,
              ('RNase_P_RNA',): 1,
              ('SRP_RNA',): 1,
              ('exon',): 70,
              ('gene',): 3208,
              ('pseudogene',): 137,
              ('rRNA',): 12,
              ('region',): 1,
              ('riboswitch',): 10,
              ('tRNA',): 55,
              ('tmRNA',): 1}}
------------------------------------------------------------

Error
Traceback (most recent call last):
  File "/usr/lib64/python3.7/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/lib64/python3.7/unittest/case.py", line 628, in run
    testMethod()
  File "/project/gene_loci_comparison/test_gene_loci_comparison.py", line 129, in test_recreate_gff_bug
    for rec in GFF.parse(in_handle):
  File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 746, in parse
    target_lines):
  File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 327, in parse_in_parts
    cur_dict = self._results_to_features(cur_dict, results)
  File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 369, in _results_to_features
    base = self._add_directives(base, results.get('directive', []))
  File "/project/venvs/gene_loci_comparison/lib64/python3.7/site-packages/BCBio/GFF/GFFParser.py", line 388, in _add_directives
    val = (val[0], int(val[1]) - 1, int(val[2]))
IndexError: tuple index out of range
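
The failing line in _add_directives indexes val[1] and val[2], which suggests it hit a directive with fewer fields than it expected. A guess, not confirmed against the parser source, is that this is a ##sequence-region line: GFF3 defines that directive as "##sequence-region seqid start end". Here is a minimal sketch for checking the file for such lines, using only the standard library. The function name find_malformed_sequence_regions is made up for this illustration and is not part of bcbio-gff.

```python
import sys


def find_malformed_sequence_regions(lines):
    """Return (line_number, text) for ##sequence-region directives that do
    not carry the three fields (seqid, start, end) defined by GFF3.

    Hypothetical helper for diagnosis only; not part of bcbio-gff.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.startswith("##sequence-region"):
            continue
        # Everything after the directive name should be: seqid start end
        fields = line.split()[1:]
        if len(fields) < 3:
            problems.append((lineno, line))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_gff.py path/to/annot.gff
    with open(sys.argv[1]) as handle:
        for lineno, text in find_malformed_sequence_regions(handle):
            print(f"line {lineno}: {text!r}")
```

If this reports any lines, they are candidates for what triggers the IndexError. If it reports nothing, the assumption about ##sequence-region is wrong and a different directive is involved.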

To reproduce the bug, here is the relevant GFF file.

Thanks in advance.

Edit: bcbio-gff version 0.6.6
