Stars
Re-implementation of the pi0 vision-language-action (VLA) model from Physical Intelligence
Cartographer is a system that provides real-time simultaneous localization and mapping (SLAM) in 2D and 3D across multiple platforms and sensor configurations.
Heterogeneous Pre-trained Transformer (HPT) as Scalable Policy Learner.
World modeling challenge for humanoid robots
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
This repository is the ComfyUI version of FaceChain
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximu…
Transfer ControlNet to any base model in diffusers🔥
FaRL for Facial Representation Learning [Official, CVPR 2022]
Official implementation of Behavior Prior Representation learning for Offline Reinforcement Learning
Official code for "Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models" (TCSVT'2023)
Efficient Dataset Distillation by Representative Matching
Lossless Training Speed Up by Unbiased Dynamic Data Pruning
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Official repository for CVPR 2022 paper: I M Avatar: Implicit Morphable Head Avatars from Videos
Official PyTorch implementation of "Neural Head Avatars from Monocular RGB Videos"
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
Official PyTorch Implementation of TransZero (AAAI'22)
This is a collection of our zero-cost NAS and efficient vision applications.
[ICCV-2021] TransReID: Transformer-based Object Re-Identification