Stars
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
Fast and memory-efficient exact attention
This is the official repository for Retrieval Augmented Visual Question Answering
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)
[ECCV 2024] The official code of the paper "Open-Vocabulary SAM".
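Semantic search. Search local photos and videos through natural language. AI-powered semantic search over local media: search by image, find local assets, match frames to text descriptions, search video frames, and find videos from scene descriptions.
TrustRAG: The RAG Framework within Reliable input, Trusted output
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.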
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
[ICLR 2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
bert-base-chinese example
OpenMMLab Detection Toolbox and Benchmark
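Firefly: a training tool for large language models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
TaiSu (太素) -- a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)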
✨✨Latest Advances on Multimodal Large Language Models
A state-of-the-art open visual language model | multimodal pre-trained model
Chinese-English bilingual multimodal conversational language model