Stars
OpenAI compatible TTS for Sesame CSM:1b & dia:1.6b - Voice Cloning from File/YT
Differentiable neuron simulations with biophysical detail on CPU, GPU, or TPU.
Implementation of Hippoformer, Integrating Hippocampus-inspired Spatial Memory with Transformers
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Optimize prompts, code, and more with AI-powered Reflective Text Evolution
React app for inspecting, building and debugging with the Realtime API
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
Wan: Open and Advanced Large-Scale Video Generative Models
Unofficial WIP LoRA Finetuning repository for VibeVoice
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).
MiMo-Audio: Audio Language Models are Few-Shot Learners
AI agents can now use real Android and iOS apps, just like a human.
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
Local-first AI Notepad for Private Meetings
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Easily and securely send things from one computer to another 🐊 📦
A generative world for general-purpose robotics & embodied AI learning.
Hierarchical Reasoning Model Official Release
Minimal, lightweight JAX implementations of popular models.
Pusa: Thousands Timesteps Video Diffusion Model
Kimi K2 is the large language model series developed by Moonshot AI team
uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence