
Organizations

@dmlc


LLM101n: Let's build a Storyteller

37,319 2,051 Updated Aug 1, 2024

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

1,024 47 Updated Sep 27, 2025

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 33,084 4,184 Updated Jun 14, 2026

Paper reading notes on Deep Learning and Machine Learning

Jupyter Notebook 1,263 180 Updated Jun 4, 2026

[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving

3,630 333 Updated Jul 2, 2025

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook 1,841 101 Updated Feb 1, 2025

We write your reusable computer vision tools. 💜

Python 44,206 3,927 Updated Jun 15, 2026

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 2,648 206 Updated Feb 16, 2025

A curated list of foundation models for vision and language tasks

1,164 60 Updated Apr 20, 2026

Awesome papers & datasets specifically focused on long-term videos.

378 14 Updated Oct 9, 2025

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 17,633 1,593 Updated Sep 5, 2024

Active learning

Python 78 10 Updated Feb 8, 2023

Computer Vision Annotation Tool (CVAT) is a leading platform for building high-quality visual datasets for vision AI. It offers open-source, cloud, and enterprise products, as well as labeling serv…

Python 16,065 3,705 Updated Jun 12, 2026

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 24,853 2,145 Updated Jul 29, 2025

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 54,344 6,354 Updated Sep 18, 2024

An open-source framework for training large multimodal models.

Python 4,106 321 Updated Aug 31, 2024

🎢 Creating and sharing simulation environments for embodied and synthetic data research

Python 195 14 Updated May 26, 2026

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,695 446 Updated Dec 16, 2025

Visual tracking library based on PyTorch.

Python 3,504 613 Updated Aug 8, 2024

An on-going paper list on new trends in 3D vision with deep learning

331 31 Updated Jun 17, 2022

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 33,766 4,016 Updated Mar 25, 2026

Visualization tool for Graph Neural Networks

TypeScript 261 29 Updated Sep 20, 2022

The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797 .

C++ 1,283 111 Updated Jul 22, 2024

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,750 275 Updated May 21, 2026

PointTrack (ECCV2020 ORAL): Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Python 265 47 Updated Oct 3, 2023