Stars
Res-SAM Framework for GPR Underground Hazard Detection
Code for SCIS-2025 Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
PyTorch implementation of AudioLCM (ACM-MM'24): efficient, high-quality text-to-audio generation with a latent consistency model.
PantoMatrix: Generating Face and Body Animation from Speech
[ICLR 2025 Oral] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for complex reasoning tasks. Building on this resource, we propose Sou…
Neural Network Compression Framework for enhanced OpenVINO™ inference
[NeurIPS 2024] An official implementation of "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions"
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A Chinese-English bilingual multimodal large model series based on the CPM foundation model
[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
Memory-Guided Diffusion for Expressive Talking Video Generation
A unified ensemble framework for PyTorch to improve the performance and robustness of your deep learning model.
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
[CVPR 2025 Highlight] 3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
A codebase and a curated list of awesome deep long-tailed learning (TPAMI 2023).
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsens…
Code implementation of "Learning Efficient Online 3D Bin Packing on Packing Configuration Trees". We propose to enhance the practical applicability of online 3D Bin Packing Problem (BPP) via learni…
Video generation from text and image, first generation
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
A comprehensive benchmark of deepfake detection
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
Next-Generation Interactive Intelligent Programming Assistant
Official code for TimeCraft: A Time Series Generation Framework for Real-World Applications
🧠+🎧 Build your music algorithms and AI models with the next-gen DAW 🔥