Urdatorn/grc-macronizer

A Macronizer for Ancient Greek

This is the first software to automatically mark the vowel length of alphas, iotas and ypsilons in Ancient Greek text, a crucial task for any research on Greek prosody and verse. I, Albin Thörn Cleland, developed it as part of my doctoral research at Lund University. It is geared towards batch macronizing corpora with machine-friendly markup: unless specifically asked to, it avoids combining diacritics and anything else that does not render in standard IDE and terminal fonts.

Installation:

  • Create a virtual environment with Python 3.12. Nothing will work if you don't get this step right!
  • After having initialized your venv, activate it and install the right version of spaCy, the dependency of odyCy, with pip install "spacy>=3.7.4,<3.8.0". The quotes are needed so that the shell does not treat > and < as redirections.
  • Navigate to external/grc_odycy_joint_trf and install odyCy locally with pip install grc_odycy_joint_trf, while making sure that you are still in the venv with Python 3.12 you created earlier.
  • Install the submodule grc-utils by running cd grc-utils followed by pip install . (a single dot, meaning the current directory).
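
Since the steps above depend on exact versions, a quick sanity check of the environment can save some debugging. The following is only a sketch, not part of the repository: the helper names `parse_version`, `spacy_version_ok` and `check_environment` are hypothetical, and it assumes spaCy reports an ordinary `major.minor.patch` version string.

```python
import re
import sys
from importlib import metadata


def parse_version(version: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from a version string such as '3.7.5'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return tuple(int(group) for group in match.groups())


def spacy_version_ok(version: str) -> bool:
    """True if the version satisfies the constraint spacy>=3.7.4,<3.8.0."""
    return (3, 7, 4) <= parse_version(version) < (3, 8, 0)


def check_environment() -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if sys.version_info[:2] != (3, 12):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.12 expected, found {found}")
    try:
        installed = metadata.version("spacy")
    except metadata.PackageNotFoundError:
        problems.append("spaCy is not installed in this environment")
    else:
        if not spacy_version_ok(installed):
            problems.append(f"spaCy >=3.7.4,<3.8.0 expected, found {installed}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it inside the activated venv; if nothing is printed, the Python and spaCy versions match the requirements listed above. It does not check odyCy or grc-utils.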

And that's it! Start macronizing by running the notebook here, or by modifying this minimal script:

import re

from grc_macronizer import Macronizer
from grc_utils import colour_dichrona_in_open_syllables

macronizer = Macronizer(make_prints=False)

# The text to macronize (Xenophon, Anabasis 1.1.1)
text = "Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν Ἀρταξέρξης, νεώτερος δὲ Κῦρος"

output = macronizer.macronize(text)

# Split on full stops, newlines, semicolons and the Greek question mark (U+037E)
output_split = [sentence for sentence in re.findall(r'[^.\n;\u037e]+[.\n;\u037e]?', output) if sentence]
# Print at most the first 500 sentences
for line in output_split[:500]:
    print(colour_dichrona_in_open_syllables(line))
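
The sentence-splitting regex in the script can be tried on its own, without installing the macronizer. The sketch below reuses the same pattern; the function name `split_sentences` is hypothetical, and stripping the surrounding whitespace is an addition that the script above does not perform.

```python
import re

# Same pattern as in the minimal script: a run of non-delimiter characters,
# optionally followed by one delimiter (full stop, newline, semicolon,
# or the Greek question mark U+037E).
SENTENCE_PATTERN = re.compile(r'[^.\n;\u037e]+[.\n;\u037e]?')


def split_sentences(text: str) -> list[str]:
    """Split text into sentence-like chunks, dropping whitespace-only chunks."""
    chunks = (chunk.strip() for chunk in SENTENCE_PATTERN.findall(text))
    return [chunk for chunk in chunks if chunk]


if __name__ == "__main__":
    sample = "τί ἐστιν\u037e οὐκ οἶδα. χαῖρε"
    for sentence in split_sentences(sample):
        print(sentence)
```

Each chunk keeps its trailing delimiter, so the punctuation survives the split. Runs of consecutive delimiters, such as blank lines, produce no empty chunks, because every match must begin with at least one non-delimiter character.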

Note that if you have a newer spaCy pipeline for Ancient Greek, it is easy to substitute it for odyCy. The rest of the software has no legacy dependencies and should run with the latest Python.

License

This repository is under the copyleft GNU GPL 3 license (compatible with the MIT license in the sense that MIT-licensed code may be included in GPL 3 projects), which means you are more than welcome to fork and build on this software for your own open-science research, as long as your code retains equally generous licensing. If you have found this repository useful, please cite it in the following way:

Thörn Cleland, Albin (2025). Automatic Annotation of Ancient Greek Vowel Length.
