oroszgy

Organizations

@ec-doris @huspacy

Stars

NLP tools

174 repositories

🪼 A Python library for approximate and phonetic matching of strings.

Jupyter Notebook 2,177 163 Updated Dec 15, 2025

This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.

Python 125 11 Updated May 29, 2024

Library for clinical NLP with spaCy.

Jupyter Notebook 623 107 Updated Aug 4, 2025

Open Source Data Annotation & Labeling Tools

663 56 Updated Oct 27, 2025
C++ 857 123 Updated May 24, 2023

spacy-wordnet creates annotations that make it easy to use WordNet and WordNet Domains through the NLTK WordNet interface.

Python 261 20 Updated Aug 21, 2025

This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.

Python 244 14 Updated Jun 19, 2023

Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.

Python 30,607 3,626 Updated Dec 22, 2025

Few-shot Named Entity Recognition

Python 122 6 Updated Mar 30, 2022

Automatically detect errors in annotated corpora.

Python 48 6 Updated Sep 8, 2023

BERT for Coreference Resolution

Python 454 95 Updated Dec 8, 2022

A simple library for training named entity recognition models from partially annotated data.

Jupyter Notebook 24 2 Updated Nov 12, 2023

Community Curated NLP List

201 33 Updated Jul 25, 2022

Parse natural language time expressions in Python

Python 131 27 Updated Nov 28, 2022

A list of publications on NLP interpretability (PRs welcome)

168 6 Updated Dec 13, 2020

Python binding to Modest and Lexbor engines. Fast HTML5 parser with CSS selectors for Python.

Cython 1,494 90 Updated Dec 17, 2025

A visual labeling system implemented in Jupyter widgets.

Python 155 14 Updated Nov 13, 2024
Python 176 33 Updated Jun 19, 2024

REMERGE - Multi-Word Expression discovery algorithm

Python 14 3 Updated Oct 12, 2022

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,342 3,244 Updated Dec 22, 2025
Python 55 5 Updated Jan 9, 2024

MTEB: Massive Text Embedding Benchmark

Python 3,037 526 Updated Dec 22, 2025

Super Fast String Matching in Python

Python 371 75 Updated Mar 14, 2025

OpenRefine is a free, open source power tool for working with messy data and improving it

Java 11,662 2,112 Updated Dec 22, 2025

Unsupervised text tokenizer focused on computational efficiency

C++ 974 109 Updated Mar 29, 2024

✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3

Python 323 26 Updated Aug 9, 2023

Export Hugging Face models to Core ML and TensorFlow Lite

Python 688 52 Updated Jul 23, 2024

Zero and Few shot named entity & relationships recognition

Python 398 25 Updated Sep 17, 2025

Python Finite-State Toolkit

Python 60 11 Updated Nov 21, 2025