fix rawtext referenced before assignment error #33

jronallo · 2016-08-18T00:00:47Z

This is a potential fix for #30, where rawtext is referenced before it has been assigned. If innerword.text is None, rawtext never gets defined. This patch simply continues to the next word if innerword.text is also None. I'm not certain this is the best fix, but it works for me.

The issue arises because tesseract sometimes outputs hOCR in which an ocrx_word contains just a single space. The cases I've seen are with images of text that tesseract has a lot of trouble with.

<div class='ocr_carea' id='block_1_1' title="bbox 118 3884 122 5088">
    <p class='ocr_par' dir='ltr' id='par_1_1' title="bbox 118 3884 122 5088">
     <span class='ocr_line' id='line_1_1' title="bbox 118 3884 122 5088; baseline 0 0"><span class='ocrx_word' id='word_1_1' title='bbox 118 3884 122 5088; x_wconf 95' lang='eng' dir='ltr'><strong><em> </em></strong></span> 
     </span>
    </p>
   </div>
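
For illustration, here is a minimal sketch of the kind of guard described above. It is not the actual hocr-tools code: the function name `word_texts`, the sample markup in `HOCR_LINE`, and the use of `xml.etree.ElementTree` are all assumptions made for this example. Only the names rawtext and innerword come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample line: the first word mirrors the problematic hOCR
# above (only nested markup around a single space), the second is normal.
HOCR_LINE = (
    "<span class='ocr_line' id='line_1_1'>"
    "<span class='ocrx_word' id='word_1_1'><strong><em> </em></strong></span>"
    "<span class='ocrx_word' id='word_1_2'>hello</span>"
    "</span>"
)


def word_texts(line):
    """Collect the stripped text of each ocrx_word in an ocr_line element."""
    texts = []
    for word in line.findall("span[@class='ocrx_word']"):
        if word.text is not None:
            rawtext = word.text.strip()
        else:
            if len(word) == 0:
                continue  # no text and no child element to inspect
            innerword = word[0]
            if innerword.text is None:
                # Without this guard, rawtext would be unassigned here and
                # the use below would raise UnboundLocalError.
                continue
            rawtext = innerword.text.strip()
        if rawtext:
            texts.append(rawtext)
    return texts


print(word_texts(ET.fromstring(HOCR_LINE)))  # ['hello']
```

In this sketch the whitespace-only word is skipped rather than emitted as an empty string. Whether skipping is the right behavior for the real tool is a separate question.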

stweil · 2016-08-24T14:04:49Z

Thank you for this pull request. I can confirm that it fixes the reported bug.

fix rawtext referenced before assignment error

4588461

stweil added the bug label Aug 24, 2016

stweil self-assigned this Aug 24, 2016

stweil merged commit 50c8a9e into ocropus:master Aug 24, 2016

stweil mentioned this pull request Aug 24, 2016

UnboundLocalError: local variable 'rawtext' referenced before assignment #30

Closed

zuphilip mentioned this pull request Sep 15, 2016

Use lxml.etree, iterate ocr_line > ocr_word #57

Merged
