Stars
An MCP server plus a CLI tool that indexes local code into a graph database to provide context to AI assistants.
Generate map templates for Farming Simulator from real places.
This app can now use Android, just like a human.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
[ICCV 2025] UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding.
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Generate audiobooks from e-books, voice cloning & 1158+ languages!
A consolidation of various compiled open-source AI image/video upscaling products into a working, CLI-friendly image and video upscaling program.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell. Audio foundation model.
A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.
Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
This is a cross-platform desktop application that allows you to chat with locally hosted LLMs and enjoy features like MCP support
The simplest, fastest repository for training/finetuning small-sized VLMs.
Dia-JAX: A JAX port of Dia, the text-to-speech model for generating realistic dialogue from text with emotion and tone control.
Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you c…
A powerful coding agent toolkit providing semantic retrieval and editing capabilities (MCP server & other integrations)
A vision-language model for video description generation.
This Windows batch script helps set up a Mingw-w64 compiler environment for building ffmpeg and other media tools under Windows.
Scripts to build a trimmed-down Windows 11 image.
A real-time silent speech recognition tool.