Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 70,484 9,811 Updated Feb 9, 2026

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 67,101 8,157 Updated Feb 9, 2026

meta-llama / llama

Inference code for Llama models

Python 59,131 9,827 Updated Jan 26, 2025

FoundationAgents / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 54,376 9,527 Updated Jan 5, 2026

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 52,797 8,943 Updated Nov 12, 2025

lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,400 4,779 Updated Jun 2, 2025

karpathy / LLM101n

LLM101n: Let's build a Storyteller

36,287 1,975 Updated Aug 1, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 29,234 3,515 Updated Jan 26, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 28,514 2,887 Updated Apr 30, 2025

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 28,114 2,605 Updated Feb 9, 2026

Stability-AI / generative-models

Generative Models by Stability AI

Python 26,903 3,037 Updated Dec 16, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 26,532 1,876 Updated Jan 9, 2026

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,873 2,409 Updated Nov 24, 2025

black-forest-labs / flux

Official inference repo for FLUX.1 models

Python 25,194 1,853 Updated Jul 31, 2025

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,437 2,729 Updated Aug 12, 2024

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python 22,431 2,062 Updated Jan 27, 2026

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 22,171 2,364 Updated Feb 8, 2026

PicoTrex / Awesome-Nano-Banana-images

A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

20,741 2,147 Updated Dec 12, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,746 2,033 Updated Jan 13, 2026

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,097 3,215 Updated Feb 9, 2026

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,482 2,337 Updated Dec 25, 2024

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,219 1,583 Updated Jan 30, 2026

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,202 1,406 Updated Feb 7, 2026

KlingTeam / LivePortrait

Bring portraits to life!

Python 17,763 1,841 Updated Nov 16, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,698 2,238 Updated Feb 1, 2025

yhirose / cpp-httplib

A C++ header-only HTTP/HTTPS server and client library

C++ 16,111 2,627 Updated Feb 9, 2026

QwenLM / Qwen3-Coder

Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team.

Python 15,422 1,076 Updated Feb 3, 2026

Robert Luo RobertLuo1

Lists (3)

Computer Vision

Natural Language Processing

Postgraduate Recommendation (保研)

Stars

deepseek-ai / DeepSeek-V3

deepseek-ai / DeepSeek-R1

punkpeye / awesome-mcp-servers

PaddlePaddle / PaddleOCR