Stars
johnnynunez / decord2
Forked from dmlc/decord. An efficient video loader for deep learning with smart shuffling that's super easy to digest
VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments
See the Future: A Semantic Segmentation Network Predicting Ego-vehicle Trajectory with a Single Monocular Camera
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
RT-GENE: Real-Time Eye Gaze and Blink Estimation in Natural Environments
The official PyTorch implementation of L2CS-Net for gaze estimation and tracking
This repository contains demos I made with the Transformers library by HuggingFace.
Dual Swin Transformer for video-time-series fusion
Access large language models from the command-line
Recipes for shrinking, optimizing, and customizing cutting-edge vision models. 💜
Refine high-quality datasets and visual AI models
Official repo for our paper: "What Matters in Autonomous Driving Anomaly Detection: A Weakly Supervised Horizon"
Annotation for reproducing the result of the paper "Cross-model temporal cooperation via saliency maps for efficient recognition and classification of relevant traffic lights".
Software Development Kit for the Zenseact Open Dataset (ZOD)
A playbook for systematically maximizing the performance of deep learning models.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
A latent text-to-image diffusion model
[CVPR2023] The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
[ICCV 2021] Deep Reinforced Accident Anticipation with Visual Explanation
GLIDE: a diffusion-based text-conditional image synthesis model
[ACM MM 2020] CCD dataset for traffic accident anticipation.
Optimization Models used in my e-book with the same title
This is the repo for our Detection of Traffic Anomaly (DoTA) dataset.
[ECCV 2020] Learning stereo from single images using monocular depth estimation networks