Starred repositories
No fortress, purely open ground. OpenManus is Coming.
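[Three Years of Interviews, Five Years of Mock Exams] An interview guide for AIGC/LLM/AI Agent algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied intelligence, the metaverse, and AGI.
This repository provides exhaustive, hands-on coverage of PyTorch alongside powerful tools to accelerate model tuning and training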
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
A curated list of reinforcement learning with human feedback resources (continually updated)
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
A list of awesome papers and resources on recommender systems based on large language models (LLMs).
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
Get up and running with Kimi-K2.6, GLM-5.1, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
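Llama Chinese Community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use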
A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
Large Scale Chinese Corpus for NLP
DeepSeek-VL: Towards Real-World Vision-Language Understanding
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
[ICLR2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant
Large World Model -- Modeling Text and Video with Millions Context