Stars
This repository contains the official implementation of the research papers "MobileCLIP" (CVPR 2024) and "MobileCLIP2" (TMLR, August 2025)
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
Generative models for conditional audio generation
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
✨✨ Latest Advances on Multimodal Large Language Models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Official implementations for the paper "AnyDoor: Zero-shot Object-level Image Customization"
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
A playbook for systematically maximizing the performance of deep learning models.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
[CVPR 2022] Official PyTorch code for OW-DETR: Open-world Detection Transformer
Instant neural graphics primitives: lightning fast NeRF and more
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Cross-platform, customizable ML solutions for live and streaming media.
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
State-of-the-art 2D and 3D Face Analysis Project
protobuf-converter is a library for transforming your Domain Model Objects into Google Protobuf Messages and vice versa.
Magnificent app which corrects your previous console command.
Over 450 terminal color schemes/themes for iTerm/iTerm2. Includes ports to Terminal, Konsole, PuTTY, Xresources, XRDB, Remmina, Termite, XFCE, Tilda, FreeBSD VT, Terminator, Kitty, MobaXterm, LXTer…