A repo for a Digital Edition of Amadu Kurubari's "History of Samori Toure" from Delafosse's 1901 Jula Grammar (see this blog post for more context).

- Write scripts to automatically manipulate and change the original OCR text into a critical markdown format
  - Remove the page headers and page numbers and make them into markdown headers
  - Extract footnotes from prose
  - Clean up and match the semi-automated footnote markers and text
- Figure out auto-replacements for the modern version of the text
  - <yè> for `yɛ`
  - for intervocalic `g`
  - for `ani` 'and'
  - <-ru> for plural
  - <kù-tigi> for `kuntigi`
  - for `siya` 'many'
  - for `ele`
  - for `n f`
  - for `fò`
  - for `j`
  - as part of `siyaman` (it can be `sya-ma` or `sya-mà`, etc.)
  - for `c`
  - for `ɲ`
  - <lô> for `lɔ́n`
  - <kyè> for `cɛ`
  - <-ra> for `???`
  - <yè> for
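
As a starting point for the auto-replacements, here is a minimal Python sketch that covers only the four pairs the list above states in full (`<yè>`, `<kù-tigi>`, `<lô>`, `<kyè>`). The names `REPLACEMENTS` and `modernize` are hypothetical and not part of this repo. It assumes that replacements should apply to whole words only, and that hyphenated compounds should be left alone unless the whole compound is listed; both choices may need revisiting once the remaining rules are worked out.

```python
import re
import unicodedata

# Hypothetical table: only the pairs that are stated in full in the list above.
REPLACEMENTS = {
    "kù-tigi": "kuntigi",
    "kyè": "cɛ",
    "lô": "lɔ́n",
    "yè": "yɛ",
}


def modernize(text, replacements=REPLACEMENTS):
    """Replace whole-word Delafosse spellings with their modern forms."""
    # Normalize so that precomposed and decomposed accents compare equal.
    text = unicodedata.normalize("NFC", text)
    table = {unicodedata.normalize("NFC", k): v for k, v in replacements.items()}

    # Longest keys first, so that a compound is tried before its parts.
    keys = sorted(table, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)

    # Assumption: a match must not touch another letter or a hyphen,
    # so "yè" inside "kyè" or inside an unlisted compound is left untouched.
    pattern = re.compile(rf"(?<![\w-])(?:{alternation})(?![\w-])")
    return pattern.sub(lambda m: table[m.group(0)], text)


if __name__ == "__main__":
    print(modernize("kyè yè kù-tigi"))
```

Matching is case-sensitive in this sketch, so capitalized forms would need their own entries or an extra normalization step.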

- Started with the `ocr.txt` file
- Separated the French-language introduction (`intro.txt`) from the text proper (`text.md`)
- Added markdown page number headers (e.g., `### 149`) using the script `pages.py`
- Removed original document headers and page numbers that were caught in the text (semi-manually using search and replace in an editor)
- Partially automated the conversion of Delafosse's footnotes into markdown footnotes using `footnotes.py`
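
For readers who want a feel for the page-header and footnote steps, here is a rough Python sketch. It is not the code in `pages.py` or `footnotes.py`; the function names are hypothetical, and it rests on two assumptions about the OCR output that may not hold: that a page number sits alone on its own line, and that footnote markers and footnote texts both use the form `(1)`. Because printed footnote numbers typically restart on each page, the sketch prefixes each label with the page number.

```python
import re

# Assumption: a page number appears alone on a line in the OCR text.
PAGE_LINE = re.compile(r"^\s*(\d{1,3})\s*$")
# Header format taken from the log above (e.g., "### 149").
HEADER = re.compile(r"^### (\d+)$")
# Assumption: a footnote text starts its line with "(n) ".
DEFINITION = re.compile(r"^\((\d+)\)\s+(.*)$")
# Assumption: an in-text footnote marker looks like "(n)".
MARKER = re.compile(r"\((\d+)\)")


def add_page_headers(lines):
    """Turn bare page-number lines into markdown headers."""
    out = []
    for line in lines:
        m = PAGE_LINE.match(line)
        out.append(f"### {m.group(1)}" if m else line)
    return out


def convert_footnotes(lines):
    """Rewrite (n) markers and texts as page-qualified markdown footnotes."""
    page = None
    out = []
    for line in lines:
        header = HEADER.match(line)
        if header:
            page = header.group(1)
            out.append(line)
            continue
        if page is None:
            # Nothing to qualify the label with yet; leave the line alone.
            out.append(line)
            continue
        definition = DEFINITION.match(line)
        if definition:
            number, body = definition.groups()
            out.append(f"[^{page}-{number}]: {body}")
        else:
            out.append(MARKER.sub(lambda m: f"[^{page}-{m.group(1)}]", line))
    return out


if __name__ == "__main__":
    sample = ["149", "some prose(1) here", "(1) text of the note"]
    for line in convert_footnotes(add_page_headers(sample)):
        print(line)
```

A prose line that happens to begin with a marker would be misread as a footnote text here, which is one reason a manual clean-up pass remains necessary.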