Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 75,345 10,229 Updated Apr 6, 2026

mulanai / MuLan

MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)

Python 149 3 Updated Jan 24, 2025

HarborYuan / ovsam

[ECCV 2024] The official code of paper "Open-Vocabulary SAM".

Python 1,033 36 Updated Aug 4, 2025

chn-lee-yumi / MaterialSearch

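Semantic search. Search local photos and videos through natural language. AI semantic search for local media: search by image, find local media, match frames to text descriptions, search video frames, and search videos by scene description.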

HTML 1,846 202 Updated Mar 19, 2026

gomate-community / TrustRAG

TrustRAG: The RAG Framework within Reliable input, Trusted output

Python 1,250 136 Updated Jan 7, 2026

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance

Python 9,955 770 Updated Sep 22, 2025

McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,673 134 Updated Apr 4, 2026

alibaba / EasyNLP

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

Python 2,179 258 Updated Nov 27, 2024

zhaibowen / Retriever-CLIP

Python 9 2 Updated Apr 1, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 29,292 3,530 Updated Jan 26, 2025

wusize / CLIPSelf

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Python 202 10 Updated Feb 5, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,669 2,756 Updated Aug 12, 2024

JIA-Lab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,328 275 Updated May 4, 2024

lansinuote / Huggingface_Toturials

bert-base-chinese example

Jupyter Notebook 939 248 Updated Aug 7, 2023

open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark

Python 32,586 9,846 Updated Aug 21, 2024

xai-org / grok-1

Grok open release

Python 51,525 8,465 Updated Aug 30, 2024

yangjianxin1 / Firefly

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,648 585 Updated Oct 24, 2024

ksOAn6g5 / TaiSu

TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)

Python 191 13 Updated Nov 17, 2023

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

17,596 1,123 Updated Apr 9, 2026

zai-org / CogVLM

A state-of-the-art-level open visual language model | Multimodal pre-trained model

Python 6,738 454 Updated May 29, 2024

zai-org / VisualGLM-6B

Chinese-English bilingual multimodal conversational language model

Python 4,164 423 Updated Aug 23, 2024

yangjianxin1 / CLIP-Chinese

Chinese CLIP pre-trained model

Python 423 61 Updated Dec 5, 2022

qwerty6518

Stars

QwenLM / Qwen-Agent

hiyouga / LlamaFactory

deepseek-ai / DeepSeek-V3

deepseek-ai / DeepSeek-R1

aim-uofa / AdelaiDet

ultralytics / ultralytics

Dao-AILab / flash-attention

LinWeizheDragon / Retrieval-Augmented-Visual-Question-Answering

PaddlePaddle / PaddleOCR