⏱️
Who are we?

Highlights

  • Pro

Organizations

@insight-centre @ZurichNLP @elexis-eu @CoFiF @KurdishXeLaTeX @DOLMA-NLP

Starred repositories

29 stars written in C++

A library for efficient similarity search and clustering of dense vectors.

C++ 39,013 4,220 Updated Feb 6, 2026

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,976 8,844 Updated Feb 7, 2026

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++ 26,723 4,101 Updated Jun 19, 2025

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,627 1,318 Updated Feb 7, 2026

A C++ standalone library for machine learning

C++ 5,440 502 Updated Jan 12, 2026

Fast inference engine for Transformer models

C++ 4,291 440 Updated Feb 4, 2026

KenLM: Faster and Smaller Language Model Queries

C++ 2,731 534 Updated Mar 30, 2025

The most popular spellchecking library.

C++ 2,427 265 Updated Jan 14, 2026

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.

C++ 2,256 807 Updated Dec 10, 2023

Fast Neural Machine Translation in C++

C++ 1,418 244 Updated Aug 25, 2023

pgf/TikZ diagram editor

C++ 1,224 78 Updated Apr 17, 2024

Unsupervised text tokenizer focused on computational efficiency

C++ 977 109 Updated Mar 29, 2024
C++ 870 126 Updated May 24, 2023

Simple, fast unsupervised word aligner

C++ 766 160 Updated Jul 19, 2022

MARISA: Matching Algorithm with Recursively Implemented StorAge

C++ 594 99 Updated Jan 27, 2026

Juman++ (a Morphological Analyzer Toolkit)

C++ 407 46 Updated Oct 3, 2023

UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files

C++ 393 86 Updated Jan 28, 2026

Fast and customizable text tokenization library with BPE and SentencePiece support

C++ 329 79 Updated Jan 10, 2026

GIZA++ is a statistical machine translation toolkit that is used to train IBM Models 1-5 and an HMM word alignment model. This package also contains the source for the mkcls tool which generates th…

C++ 273 83 Updated Nov 18, 2025

🖋️ Fast and safe spellchecking C++ library

C++ 261 27 Updated Nov 25, 2025

Notepad++ Spell-checking Plug-in

C++ 226 36 Updated Jan 1, 2026

(Official repo for pypi package) Python bindings for the Hunspell spellchecker engine

C++ 190 40 Updated Feb 2, 2021

Sentence aligner

C++ 124 40 Updated May 21, 2021

Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps such as changing case that you can all use…

C++ 70 14 Updated Dec 11, 2025
C++ 42 12 Updated Jul 17, 2018

Extracts highlighted text from PDF documents.

C++ 31 5 Updated Jan 14, 2018

Editor for aligned parallel texts (personal desktop application).

C++ 20 1 Updated Jan 15, 2026

Code to reproduce experiments in "A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages"

C++ 9 3 Updated Apr 10, 2019

Anchor Hidden Markov Models

C++ 8 3 Updated Aug 2, 2016