Stars
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
CVPR2022 - Deep Hierarchical Semantic Segmentation - A structured, pixel-wise description of visual scenes in terms of the class hierarchy.
[CVPR24] Volumetric Environment Representation for Vision-Language Navigation
👀 | MobileGaze: Real-Time Gaze Estimation models using ResNet 18/34/50, MobileNet v2 and MobileOne s0-s4 | In PyTorch >> ONNX Runtime Inference
[ICCV23] Bird’s-Eye-View Scene Graph for Vision-Language Navigation
Repository of our CVPR2023 paper "Lana: A Language-Capable Navigator for Instruction Following and Generation"
This is the official implementation of "LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels" (Accepted at CVPR 2024).
The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
[CVPR'24] Neural Clustering based Visual Representation Learning
This is the official implementation of "Clustering based Point Cloud Representation Learning for 3D Analysis" (Accepted at ICCV 2023).
(ICLR25 Oral) Do as We Do, Not as You Think: the Conformity of Large Language Models
PyTorch implementation for Contrastive Representation Learning for Gaze Estimation
(ICCV23 Oral) LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning
[NeurIPS'2023] Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
[NeurIPS2023] Neural-Logic Human-Object Interaction Detection
3QFP: Efficient neural implicit surface reconstruction using Tri-Quadtrees and Fourier feature Positional encoding
This is the official implementation of "Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds" (Accepted at AAAI 2024).
PyTorch implementation for the paper "CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection"
Official implementation of "Chemical knowledge-informed framework for privacy-aware retrosynthesis learning".