Starred repositories
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional de…
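Live-stream recording software with continuous loop monitoring and simultaneous multi-streamer recording. Supports 40+ platforms, including Douyin, TikTok, YouTube, Kuaishou, Huya, Douyu, Bilibili, Xiaohongshu, pandatv, sooplive, flextv, popkontv, twitcasting, winktv, Baidu, Weibo, Kugou, 17Live, Twitch, Acfun, CHZZK, and shopee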
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
A Model Context Protocol (MCP) server that provides structured spec-driven development workflow tools for AI-assisted software development, featuring a real-time web dashboard and VSCode extension …
Automated workflows for Claude Code. Features spec-driven development for new features (Requirements → Design → Tasks → Implementation) and streamlined bug fix workflow for quick issue resolution (…
Breakthrough Method for Agile AI-Driven Development
OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page conte…
Build Real-Time Knowledge Graphs for AI Agents
Let's make video diffusion practical!
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
Open-source Rust-based AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization. 100% local processing. No cloud required. Meetily (Me…
Turn your workflow into a Photoshop plugin.
Give Cursor Agent an AI Team and Advanced Skills
Turn any website into clean data pipelines & structured APIs in minutes!
Swei Sans (獅尾黑體): a font family derived from Noto Sans CJK, with a more concise and modern look.
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer" (ICCV 2025)
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
No fortress, purely open ground. OpenManus is Coming.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Sonic is a method from "Shifting Focus to Global Audio Perception in Portrait Animation"; you can use it in ComfyUI
Magic to turn Cursor/Windsurf into 90% of Devin