Stars
We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for complex reasoning tasks. Building on this resource, we propose Sou…
[CVPR2024] DisCo: Referring Human Dance Generation in Real World
[NeurIPS 2024] An official implementation of "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions"
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A Chinese-English bilingual multimodal large model series based on the CPM foundation models
Memory-Guided Diffusion for Expressive Talking Video Generation
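The Panda project is an open-source overseas Chinese large language model project launched in May 2023. It is dedicated to exploring the entire technology stack in the era of large models and aims to promote innovation and collaboration in Chinese natural language processing.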
A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
[CVPR 2025 Highlight] 3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
A codebase and a curated list of awesome deep long-tailed learning (TPAMI 2023).
A Light CNN for Deep Face Representation with Noisy Labels, TIFS 2018
Code implementation of "Learning Efficient Online 3D Bin Packing on Packing Configuration Trees". We propose to enhance the practical applicability of online 3D Bin Packing Problem (BPP) via learni…
Video generation from text and image, 1st-gen
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
Official code for TimeCraft: A Time Series Generation Framework for Real-World Applications
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
Next-Generation Interactive Intelligent Programming Assistant
🚀 Truly open-source, real-time, high-fidelity face-swap engine for AI avatars (digital humans).
🧠+🎧 Build your music algorithms and AI models with the next-gen DAW 🔥
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
OpenFE: automated feature generation with expert-level performance
Implementation of DynIBaR: Neural Dynamic Image-Based Rendering (CVPR 2023)
A Supervised and Semi-Supervised Object Detection Library for YOLO Series
[SIGIR'2024] "GraphGPT: Graph Instruction Tuning for Large Language Models"
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)
[AAAI 2025] Dynamic Protein Data Bank