Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Updated Sep 12, 2025 - Python
Document Layout Analysis resources and repos for development with PdfPig.
ParlaMint: Comparable Parliamentary Corpora
Hand-written dictionaries from the FreeDict project
Kitodo.Presentation is a feature-rich framework for building a METS- or IIIF-based digital library. It is part of the Kitodo Digital Library Suite.
The main TEI Publisher app
An experiment in collaborative TEI text encoding
A repository to help introduce and orient students to the GitHub collaboration environment, and to support DH classes.
Lili Elbe Digital Archive practicum - learning markup via an engaged markdown community. Visit our wiki!
PhiloLogic4
Manuscript Descriptions encoded according to the Text Encoding Initiative
A highly customizable plugin for setting up and activating remote-driven autocompletions of attribute values in the oXygen XML Editor.