Stars
SkyReels V1: The first and most advanced open-source human-centric video foundation model
Simultaneous speech-to-text model
Run Windows apps such as Microsoft Office/Adobe in Linux (Ubuntu/Fedora) and GNOME/KDE as if they were a part of the native OS, including Nautilus integration. Hard fork of https://github.com/Fmst…
Native BM25 Ranking Index in PostgreSQL
Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
Pelias is a modular open-source geocoder using Elasticsearch.
📄 Awesome multi-language OCR toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Tesseract Open Source OCR Engine (main repository)
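Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); the total model size is only 4.7M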
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
OCR, layout analysis, reading order, table recognition in 90+ languages
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Desktop mind map (an offline version of Baidu Mind Map) with cross-platform support for Windows/Linux/Mac OS. (A cross-platform multilingual Mind Map Tool)
GreatSQL is a MySQL branch that originated from GreatDB
MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Share interesting, entry-level open source projects on GitHub.
The ultimate LLM/AI application development framework in Golang.
qData is an open-source, all-in-one data middle platform that supports core capabilities including data infrastructure, data governance, data development, monitoring & alerting, data services, and …
Chinese speech pretrained models
Apache Pinot - A realtime distributed OLAP datastore
Apache Drill is a distributed MPP query layer for self-describing data
An advanced and mature open-source MPP (Massively Parallel Processing) database. An open-source alternative to Greenplum Database.