sneakerkg

Organizations

@dmlc


LLM101n: Let's build a Storyteller

37,364 2,053 Updated Aug 1, 2024

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

1,027 47 Updated Sep 27, 2025

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 33,146 4,196 Updated Jun 22, 2026

Paper reading notes on Deep Learning and Machine Learning

Jupyter Notebook 1,265 180 Updated Jun 4, 2026

[IEEE T-PAMI 2024] All you need for End-to-end Autonomous Driving

3,637 333 Updated Jul 2, 2025

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook 1,840 101 Updated Feb 1, 2025

We write your reusable computer vision tools. 💜

Python 44,797 3,974 Updated Jun 22, 2026

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 2,649 207 Updated Feb 16, 2025

A curated list of foundation models for vision and language tasks

1,166 60 Updated Apr 20, 2026

Awesome papers & datasets specifically focused on long-term videos.

380 14 Updated Oct 9, 2025

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 17,635 1,594 Updated Sep 5, 2024

Active learning

Python 78 10 Updated Feb 8, 2023

Computer Vision Annotation Tool (CVAT) is a leading platform for building high-quality visual datasets for vision AI. It offers open-source, cloud, and enterprise products, as well as labeling serv…

Python 16,120 3,716 Updated Jun 22, 2026

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 24,875 2,152 Updated Jul 29, 2025

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 54,372 6,363 Updated Sep 18, 2024

An open-source framework for training large multimodal models.

Python 4,107 321 Updated Aug 31, 2024

🎢 Creating and sharing simulation environments for embodied and synthetic data research

Python 194 14 Updated May 26, 2026

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

Python 4,693 445 Updated Dec 16, 2025

Visual tracking library based on PyTorch.

Python 3,504 613 Updated Aug 8, 2024

An on-going paper list on new trends in 3D vision with deep learning

332 31 Updated Jun 17, 2022

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Jupyter Notebook 33,831 4,020 Updated Mar 25, 2026

Visualization tool for Graph Neural Networks

TypeScript 261 29 Updated Sep 20, 2022

The Replica Dataset v1 as published in https://arxiv.org/abs/1906.05797 .

C++ 1,288 111 Updated Jul 22, 2024

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,758 275 Updated May 21, 2026

PointTrack (ECCV2020 ORAL): Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Python 265 47 Updated Oct 3, 2023