Starred repositories
📚 Freely available programming books
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
🦜🔗 The platform for reliable agents.
Interact with your documents using the power of GPT, 100% privately, no data leaks
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
Federated query engine for AI - The only MCP Server you'll ever need
Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Unified framework for building enterprise RAG pipelines with small, specialized models
Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
Minimal reproduction of DeepSeek R1-Zero
Fully automated homelab from empty disk to running services with a single command.
Modeling, training, eval, and inference code for OLMo
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
A social networking service scraper in Python
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
Elyra extends JupyterLab with an AI-centric approach.
A reactive Python kernel for Jupyter notebooks.
dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Specification for storing geospatial vector data (point, line, polygon) in Parquet
Streaming reactive and dataflow graphs in Python