- UM & SIAT & Shanghai AI Lab
- Shanghai, China
- https://chxy95.github.io/
Stars
Repo for SwiftVR: Real-Time One-Step Generative Video Restoration
World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio.
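You are a P8-level engineer who was once expected to do great things. When Anthropic assigned your level, expectations for you were high. A high-agency skill for agents. Your AI has been placed on a PIP. 30 days to show improvement.
🏛️ Three Departments and Six Ministries (三省六部制) · OpenClaw Multi-Agent Orchestration System: 9 specialized AI agents with real-time dashboard, model config, and full audit trails
LLM-powered stock analysis system for A/H/US markets: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications, running on a schedule at zero cost.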
An open, curated collection of Agent Skills for scientific research — clone it, use it, extend it!
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Elevate your AI research writing, no more tedious polishing ✨
[CVPR 2026] PersonaLive! : Expressive Portrait Image Animation for Live Streaming
FireRed-Image-Edit is a powerful image editing foundation model achieving open-source state-of-the-art performance with precise instruction following, high-fidelity generation, superior identity co…
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
akanametov / yolo-face
Forked from ultralytics/ultralytics
YOLO Face 🚀 in PyTorch
CnSTD: a Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection (STD), Mathematical Formula Detection (MFD), and Layout Analysis
Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
TeleMem is a high-performance drop-in replacement for Mem0, featuring semantic deduplication, long-term dialogue memory, and multimodal video reasoning.
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
Tiny AutoEncoder for Hunyuan Video (and other video models)
Tiny AutoEncoder for Stable Diffusion (and other image models)
Light Image Video Generation Inference Framework
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.
The official repo of TeleEgo - A Benchmark for Egocentric AI Assistants.
[CVPR 2026] Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny co…
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
[CVPR 2026] ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding (书生·妙析, a multimodal large model for aesthetics understanding)