conllu

Here are 33 public repositories matching this topic...

rmalouf / treesearch

High-performance toolkit for querying linguistic dependency parses

python nlp rust linguistics treebank computational-linguistics corpus-linguistics universal-dependencies dependency-parsing conll-u corpus-tools conllu

Updated Jan 21, 2026
Rust

pyconll / pyconll

A minimal, pure Python library to interface with CoNLL-U format files.

python annotation minimal linguistics universal-dependencies dependency-parsing conllu

Updated Dec 5, 2025
Python

gpizzorno / conllu_tools

A Python toolkit for working with CoNLL-U files, Universal Dependencies treebanks, and annotated corpora.

nlp natural-language-processing latin universal-dependencies text-annotation brat ud conllu tag-conversion conllu-validation conllu-evaluation tag-normalization

Updated Nov 29, 2025
Python

bogwi / rookeen

spaCy-based CLI for web linguistic analysis with embeddings, sentiment, POS/NER, and Unix pipeline composability. Outputs JSON, Parquet, CoNLL-U for ML workflows.

Updated Nov 2, 2025
Python

instituutnederlandsetaal / galahad

"Galahad". Goal: enable linguists to experiment with different taggers and use the result in other INT products

kotlin evaluation tagging linguistics tagger tei tei-xml evaluation-metrics folia conll-u naf conllu

Updated Feb 6, 2026
Kotlin

aspirant2018 / conllu-pos-dataset

A minimal, pure Python interface that turns CoNLL-U format files into a Hugging Face Dataset

dataset pos pos-tagging conll-u conllu huggingface-datasets

Updated Jul 8, 2025
Python

kanincityy / bert_pos

BERT Fine-Tuning for Part-of-Speech (POS) Tagging (PyTorch & Hugging Face).

nlp deep-learning pytorch computational-linguistics bert pos-tagging conllu huggingface-transformers

Updated Jun 5, 2025
Python

proycon / foliatools

A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.

nlp converters computational-linguistics folia clarin clariah conllu

Updated May 8, 2025
Python

veldhub / veld_code__udpipe

Code velds encapsulating UDPipe.

nlp tokenization udpipe conllu

Updated Jan 21, 2025
C++

veldhub / veld_data__eltec_conllu_stats

Data velds encapsulating statistics on conllu data.

nlp statistics analysis conllu

Updated Jan 20, 2025

veldhub / veld_data__demo_train_data_ts-vienna-2024

Demo training data for the CLSInfra training school 2024.

nlp training-data conllu gold-data

Updated Jan 20, 2025

veldhub / veld_code__analyse_conllu

Code velds encapsulating creation of statistical summary on conllu data.

nlp analysis conllu

Updated Jan 20, 2025
Jupyter Notebook

GiulioTaralli / Hidden-Markov-Model-NER-tagging

NER tagging with HMM and Viterbi algorithm - University Project

python viterbi-algorithm pandas hidden-markov-model conllu ner-tagging

Updated Jul 27, 2024
Jupyter Notebook

eaklykova / syntaxcomp

A Python3 package for extracting syntactic complexity measures from CoNLL-U annotations.

syntax complexity sentence-segmentation udpipe conllu text-complexity clause-segmentation syntactic-complexity

Updated Jun 18, 2024
Python

TajaKuzman / Parlamint-translation

A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.

machine-translation word-alignment conllu dataset-preparation parlamint