Starred repositories
OpenSkills: Run Claude Skills Locally using any LLM
Fun-ASR is an end-to-end large speech recognition model developed by Tongyi Lab.
Lightning-Fast, On-Device TTS — running natively via ONNX.
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Welcome to the e-book download treasure trove, a collection of download links for all kinds of e-books. Whether you enjoy classic literature, business and self-improvement, lifelong learning, career and entrepreneurship, technical manuals, or other genres, you will find something here. The collection covers most of the e-books from reading apps such as 帆书app (formerly 樊登读书), 微信读书, 京东读书, and 喜马拉雅.
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
Light, flexible and extensible ASGI framework | Built to scale
Research and development (R&D) is crucial for enhancing industrial productivity, especially in the AI era, where R&D centers mainly on data and models. We are commi…
A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more
A simple yet powerful agent framework that delivers with open-source models
RAG-Anything: All-in-One RAG Framework
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation
Eigent: The World's First Multi-agent Workforce to Unlock Your Exceptional Productivity.
Unified Multimodal Model for image generation/editing/understanding
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.
An open-source AI agent that lives in your terminal.