Beodata

A Python package for processing Beowulf text data from Heorot.dk.

Features

The heorot.py module can parse the dual-language edition of Beowulf and render the complete text in several formats:

as a single combined JSON file
as a single combined CSV
as separate .ASS (Advanced SubStation Alpha subtitle format) files, one file per fitt

The module follows the Heorot.dk line numbering and fitt numbering system.
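As a rough illustration of the combined JSON and CSV outputs, the sketch below writes a list of line records to both formats using only the standard library. The field names (`line`, `fitt`, `oe`, `me`) and the `write_combined` helper are assumptions made for this example; they are not taken from heorot.py, whose actual schema may differ.

```python
import csv
import json
from pathlib import Path

# Hypothetical record layout; the real package may use different field names.
FIELDS = ["line", "fitt", "oe", "me"]


def write_combined(records, out_dir):
    """Write the same records as maintext.json and maintext.csv in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    json_path = out_dir / "maintext.json"
    csv_path = out_dir / "maintext.csv"

    # ensure_ascii=False keeps Old English characters readable in the file.
    json_path.write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)

    return json_path, csv_path
```

Note that CSV stores every value as text, so numeric fields such as a line number come back as strings when the file is read again.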

Installation

Install Python 3.13 and Poetry:

curl -sSL https://install.python-poetry.org | python3 -

Then install the project dependencies and the pre-commit hooks:

poetry install
pre-commit install

Usage

Running heorot.py

To process the Beowulf text and generate all output formats:

poetry run heorot
# or more directly:
poetry run python -m beodata.parse.heorot

This will:

Fetch the HTML content from heorot.dk (if not already cached)
Parse the dual-language text
Generate JSON and CSV files in tests/data/fitts/
Create ASS subtitle files for each fitt in tests/data/subtitles/
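The "fetch if not already cached" step can be sketched as follows. This is a minimal illustration, not the code in heorot.py: `load_cached_html` and its `fetch` parameter are hypothetical names, and the download itself is passed in as a callable so the caching logic stays independent of any particular HTTP library.

```python
from pathlib import Path
from typing import Callable


def load_cached_html(cache_path: Path, fetch: Callable[[], str]) -> str:
    """Return the cached HTML if it exists; otherwise fetch it and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: no network access needed.
        return cache_path.read_text(encoding="utf-8")

    # Cache miss: download once, then store the result for later runs.
    html = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(html, encoding="utf-8")
    return html
```

With this shape, a second run reads the cached file and never calls `fetch`, which avoids repeated requests to heorot.dk.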

Required Files

The script requires a blank ASS template file:

tests/data/blank.ass - Template file for ASS subtitle formatting
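To show how a blank template might be filled in, here is a sketch that formats ASS timestamps and appends `Dialogue:` events to template text. It assumes the template already ends with an `[Events]` section and its `Format:` line in the standard field order; the helper names, the `Default` style, and the timing values are illustrative assumptions, not details of how heorot.py uses blank.ass.

```python
def format_ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    total_cs = round(seconds * 100)
    hours, rem = divmod(total_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def dialogue_line(start: float, end: float, text: str, style: str = "Default") -> str:
    """Build one Dialogue event in the standard ASS field order."""
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return (
        f"Dialogue: 0,{format_ass_time(start)},{format_ass_time(end)},"
        f"{style},,0,0,0,,{text}"
    )


def fill_template(template: str, events: list[str]) -> str:
    """Append events to template text (assumed to end with the [Events] header)."""
    return template.rstrip("\n") + "\n" + "\n".join(events) + "\n"
```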

Output Files

After running the script, you'll find:

In tests/data/fitts/:

maintext.json - Complete text data in JSON format
maintext.csv - Complete text data in CSV format
maintext.html - Cached HTML from heorot.dk

In tests/data/subtitles/:

fitt_0.ass through fitt_43.ass - ASS subtitle files for each fitt (except fitt 24, which doesn't exist)
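A quick way to confirm that a run produced every subtitle file is to build the expected file names and compare them with the output directory. The sketch below takes the range 0 to 43 and the missing fitt 24 from the description above; the function names are hypothetical and are not part of the package.

```python
from pathlib import Path

# Fitt 24 does not exist in this numbering, so no file is expected for it.
MISSING_FITTS = {24}


def expected_subtitle_names(first: int = 0, last: int = 43) -> list[str]:
    """Return the expected subtitle file names, skipping missing fitts."""
    return [
        f"fitt_{n}.ass"
        for n in range(first, last + 1)
        if n not in MISSING_FITTS
    ]


def missing_subtitles(subtitle_dir) -> list[str]:
    """List expected subtitle files that are absent from subtitle_dir."""
    subtitle_dir = Path(subtitle_dir)
    return [
        name
        for name in expected_subtitle_names()
        if not (subtitle_dir / name).exists()
    ]
```

For example, `missing_subtitles("tests/data/subtitles")` returns an empty list when all 43 files are present.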

Project Structure

beodata/parse/heorot.py - Main parsing and processing logic for the Heorot.dk HTML text
beodata/text/numbering.py - Fitt boundaries and text structure constants
beodata/subtitle/constants.py - Subtitle generation constants
tests/data/fitts/ - Output directory for generated files
tests/data/subtitles/ - Output directory for ASS subtitle files
