Stars
Simple proxy worker for using ollama in cursor
[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.
[ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs
VisioFirm: Cross-Platform AI-assisted Annotation Tool for Computer Vision
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
Official implementation of "Unseen Visual Anomaly Generation" (CVPR 2025)
Segment Anything for Stable Diffusion WebUI
Yet another SAM webui + CLIP
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
You Only Look Once for Panoptic Driving Perception. (MIR 2022)
Unofficial implementation of the LaneNet model for real-time lane detection
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
GPU cluster manager for optimized AI model deployment
Next-generation AI Agent Optimization Platform: Cozeloop addresses challenges in AI agent development by providing full-lifecycle management capabilities from development, debugging, and evaluation…
Open Source HTML5 Puzzle Game Engine
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
Aligning pretrained language models with instruction data generated by themselves.
Easy Data Preparation with the latest LLM-based Operators and Pipelines.
An open source implementation of CLIP.
A PyTorch toolbox for domain generalization, domain adaptation and semi-supervised learning.