- Tsinghua University
- Haidian District, Beijing, PRC
- https://jzsherlock4869.github.io/
Stars
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Curated MCP resources, MCP guide, Claude MCP, MCP Servers, MCP Clients
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Train transformer language models with reinforcement learning.
The official GitHub page for the survey paper "A Survey of Large Language Models".
An open protocol enabling communication and interoperability between opaque agentic applications.
🦜🔗 The platform for reliable agents.
Face recognition algorithms in the PyTorch framework, including ArcFace, CosFace, SphereFace and so on
[ECCV 2024 Oral 🔥] Arc2Face: A Foundation Model for ID-Consistent Human Faces | [ICCVW 2025] ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion
Open Iris Recognition Inference System (IRIS)
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Unofficial implementation of Tune-A-Video
[Lumina Embodied AI] Embodied AI technical guide (Embodied-AI-Guide)
[Siggraph Asia 2024 & IJCV 2025] Follow-Your-Emoji: This repo is the official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
Janus-Series: Unified Multimodal Understanding and Generation Models
CVPR 2025 DarkIR: Robust Low-Light Image Restoration - State of the art low light deblurring. NTIRE 2025 Best Method. [Official PyTorch Implementation]
Control Color: Multimodal Diffusion-based Interactive Image Colorization
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)