Starred repositories
A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
An autonomous agent that conducts deep research on any data using any LLM provider
Simple retrieval from LLMs at various context lengths to measure accuracy
Paper list for Efficient Reasoning.
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces inference latency by up to 10x for pre-filli…
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
[ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
A curated publication list on open-vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
A high-throughput and memory-efficient inference and serving engine for LLMs
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
[ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
FreeDA: Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation (CVPR 2024)
Official repository of paper "Subobject-level Image Tokenization" (ICML-25)
Recent LLM-based CV and related works. Welcome to comment/contribute!
Code for the paper Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models @ CVPR 2024
A curated list of Large Language Model (LLM) Interpretability resources.
[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models
PyTorch Implementation of NACLIP in "Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"
Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"
DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an expe…
Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It is 2.8x faster than DeiT and saves 86.8% GPU memory wh…