Stars
Tesseract Open Source OCR Engine (main repository)
A library for efficient similarity search and clustering of dense vectors.
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Unsupervised text tokenizer for Neural Network-based text generation.
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
Open-Source Quantum Chemistry – an electronic structure package in C++ driven by Python
Unsupervised text tokenizer focused on computational efficiency
The continuation of the venerable JA2-Stracciatella project.
Fast and customizable text tokenization library with BPE and SentencePiece support
DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library.
The PyICU project repository has moved to https://pyicu.org.
A multilingual dependency parser based on linear programming relaxations.
Chu-Liu-Edmonds decoding extracted from TurboParser
Chu-Liu-Edmonds maximum spanning tree algorithm from TurboParser for use within Python