Stars
👀 | MobileGaze: Real-Time Gaze Estimation models using ResNet 18/34/50, MobileNet v2 and MobileOne s0-s4 | In PyTorch >> ONNX Runtime Inference
[NeurIPS'2023] Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
My own study materials and scripts
The official repository of SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
Enhancing {ggplot2} plots with statistical analysis 📊📣
[NeurIPS'24] Scene Graph Generation with Role-Playing Large Language Models
[CVPR'24] Neural Clustering based Visual Representation Learning
CUDA accelerated rasterization of gaussian splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
(ICLR25 Oral) Do as We Do, Not as You Think: the Conformity of Large Language Models
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
[ICCV2025] 3D Gaussian Map with Open-Set Semantic Grouping for Vision-Language Navigation
Official implementation of "Chemical knowledge-informed framework for privacy-aware retrosynthesis learning".
This is the official implementation of "LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels" (Accepted at CVPR 2024).
PyTorch implementation for paper "CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection"
PyTorch implementation for Contrastive Representation Learning for Gaze Estimation
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiase…
Code Notes (in Chinese) for 3D Gaussian Splatting
[CVPR24] Volumetric Environment Representation for Vision-Language Navigation
Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models
[NeurIPS2023] Neural-Logic Human-Object Interaction Detection