Stars
[Notice] The repo is temporarily locked during the ownership transfer. In the meantime, we maintain it here: https://github.com/ultraworkers/claw-code-parity. The fastest repo in history to surpass 100K sta…
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
🔥 🔥 🔥 Awesome MLLMs/Benchmarks for Short/Long/Streaming Video Understanding 📹
A collection of multimodal reasoning papers, code, datasets, benchmarks, and resources.
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
A Survey of Reinforcement Learning for Large Reasoning Models
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Reference PyTorch implementation and models for DINOv3
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
SGLang is a high-performance serving framework for large language models and multimodal models.
Real-time webcam demo with SmolVLM and llama.cpp server
Muon is an optimizer for hidden layers in neural networks
A high-throughput and memory-efficient inference and serving engine for LLMs
[CVPR 2023 Highlight] Perspective Fields for Single Image Camera Calibration
🚀 Train a 67M-parameter vision-language model (VLM) from scratch in just 1 hour!
🚀🚀 Train a small 64M-parameter GPT entirely from scratch in just 2 hours!
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
A Framework of Small-scale Large Multimodal Models
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Everything about the SmolLM and SmolVLM family of models
The framework to prune LLMs to any size and any config.