Stars
Silero Stress — pre-trained enterprise-grade automated stress and homograph disambiguation for the Russian language
Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electro…
Chu-Liu-Edmonds decoding extracted from TurboParser
Chu-Liu-Edmonds maximum spanning tree algorithm from TurboParser for use within Python
A multilingual dependency parser based on linear programming relaxations.
aiopg is a library for accessing a PostgreSQL database from asyncio.
A fast PostgreSQL Database Client Library for Python/asyncio.
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Open Source search based on OpenStreetMap data
Foundational Model for Speech Recognition Tasks
Morphological analyzer / inflection engine for Russian and Ukrainian languages. Fork of https://github.com/pymorphy2/pymorphy2
no-plagiarism / pymorphy3
Forked from pymorphy2/pymorphy2. Morphological analyzer / inflection engine for Russian and Ukrainian languages.
DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. Fork of https://github.com/pytries/DAWG
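🧑🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).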
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
ExplainitAll is a library for interpretable AI, designed for interpreting generative (GPT-like) models and vectorizers such as Sbert.
Metric learning and retrieval pipelines, models and zoo.
"Руформеры" (Ruformers): a list of popular transformer-based base models for Russian natural language processing tasks
A fast inference library for running LLMs locally on modern consumer-class GPUs
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Official inference library for Mistral models
OpenChat: Advancing Open-source Language Models with Imperfect Data
OpenProject is the leading open source project management software.
OpenMMLab Text Detection, Recognition and Understanding Toolbox