- Ningbo University
- 818 Feng Hua Road, Jiangbei, Ningbo, Zhejiang, China
Stars
Elevate your AI research writing, no more tedious polishing ✨
[ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model for downstream visual perception tasks.
MIRA: Medical Time Series Foundation Model for Real-World Health Data
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
[Early Accepted at MICCAI 2023] PyTorch code of "InverseSR: 3D Brain MRI Super-Resolution Using a Latent Diffusion Model"
[NeurIPS 2024] Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer
A latent text-to-image diffusion model
An official implementation of "Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition" in PyTorch. (CVPR 2025)
[JBHI 2024] This is a code implementation of the hybrid-granularity ordinal learning proposed in the manuscript "HOPE: Hybrid-granularity Ordinal Prototype Learning for Progression Prediction of Mi…
[ECCV 2024] Official repository of Agent Attention
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer-based networks.
Three-dimensional medical image classification using Multi-plane and Multi-slice Transformer
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
A Modality-Flexible Framework for Alzheimer's Disease Diagnosis Following Clinical Routine
A PyTorch implementation of VGG16. It can be considered a variant of the original VGG16, since BN layers are added after each conv layer.
[CVPR 2024] Guided Slot Attention for Unsupervised Video Object Segmentation
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Efficient Multimodal Transformer with Dual-Level Feature Restoration for Robust Multimodal Sentiment Analysis (TAC 2023)
Software platform for clinical neuroimaging studies
Deep learning models for remote sensing applications
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".