Lists (8)
Awesome Lists
Digital Twins
GS / NeRF
Multimodal/Foundation/OpenVocab
Papers related to vision-language, multi-modal, multi-sensor, foundation models and open vocabulary models.
MVS / SfM / SLAM / Depth / Pose
Papers related to SfM, MVS, SLAM, Depth, Matching, and camera pose estimation methods.
Recognition/ Detection/ Tracking
Papers related to recognition, detection, tracking and pre-training methods.
Scene Reconstruction/ Understand
Papers related to scene reconstruction, understanding, and simulation.
Tools / Datasets
Papers and Projects related to useful tools and datasets.
Stars
All Algorithms implemented in Python
Python tool for converting files and office documents to Markdown.
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Collection of Summer 2026 tech internships!
Convert PDF to markdown + JSON quickly with high accuracy
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep lear…
Python sample codes and textbook for robotics algorithms.
Open-Sora: Democratizing Efficient Video Production for All
A generative world for general-purpose robotics & embodied AI learning.
State-of-the-art 2D and 3D Face Analysis Project
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
An open source implementation of CLIP.
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Self-hosted video downloader for YouTube and other sites (web UI for youtube-dl / yt-dlp)
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
🍀 PyTorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for understanding papers further. ⭐⭐⭐
Enjoy the magic of Diffusion models!
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A collaboration friendly studio for NeRFs
Refine high-quality datasets and visual AI models
Hydra is a framework for elegantly configuring complex applications