- SF Bay Area
- https://bryanyzhu.github.io/
Stars
A pipeline parallel training script for diffusion models.
Community trainer for Lightricks' LTX Video model 🎬 ⚡️
[SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head
[Preprint] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
Wan: Open and Advanced Large-Scale Video Generative Models
Multilingual Document Layout Parsing in a Single Vision-Language Model
We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…
[ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
Foundation Models and Data for Human-Human and Human-AI interactions.
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
React UI + elegant infrastructure for AI Copilots, AI chatbots, and in-app AI agents. The Agentic last-mile 🪁
Text-audio foundation model from Boson AI
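This repository contains complete research and analysis materials from reverse engineering Claude Code v1.0.33. It includes in-depth technical analysis of the obfuscated source code, system architecture documentation, and an implementation blueprint for reconstructing the Claude Code agent system. Key findings include the real-time Steering mechanism, multi-agent architecture, intelligent context management, and the tool execution pipeline. The project provides a technical reference for understanding the design and implementation of modern AI agent systems.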
AI overlays on top of what you are doing
Get your documents ready for gen AI
Kimi K2 is the large language model series developed by Moonshot AI team
Fast and local neural text-to-speech engine
[Up-to-date] Awesome Agentic Deep Research Resources
Anthropic's educational courses
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Latest Advances on Long Chain-of-Thought Reasoning
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus Agent Tools, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae…