Starred repositories
[Lumina Embodied AI] 具身智能技术指南 Embodied-AI-Guide
为GPT/GLM等LLM大语言模型提供实用化交互接口,特别优化论文阅读/润色/写作体验,模块化设计,支持自定义快捷按钮&函数插件,支持Python和C++等项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持chatglm3等本地模型。接入通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, m…
A high-throughput and memory-efficient inference and serving engine for LLMs
FlashMLA: Efficient Multi-head Latent Attention Kernels
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
《开源大模型食用指南》针对中国宝宝量身打造的基于Linux环境快速微调(全参数/Lora)、部署国内外开源大模型(LLM)/多模态大模型(MLLM)教程
《大模型白盒子构建指南》:一个全手搓的Tiny-Universe
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
A generative speech model for daily dialogue.
Effortless data labeling with AI support from Segment Anything and other awesome models.
Generate text line images for training deep learning OCR models
🔥🔥🔥 专注于YOLO11,YOLOv8、TYOLOv12、YOLOv10、RT-DETR、YOLOv7、YOLOv5改进模型,Support to improve backbone, neck, head, loss, IoU, NMS and other modules🚀
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
ModelScope: bring the notion of Model-as-a-Service to life.
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
ChatLaw:A Powerful LLM Tailored for Chinese Legal. 中文法律大模型
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
超轻量级中文ocr,支持竖排文字识别, 支持ncnn、mnn、tnn推理 ( dbnet(1.8M) + crnn(2.5M) + anglenet(378KB)) 总模型仅4.7M
✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows
这是一款提高ChatGPT的数据安全能力和效率的插件。并且免费共享大量创新功能,如:自动刷新、保持活跃、数据安全、取消审计、克隆对话、言无不尽、净化页面、展示大屏、拦截跟踪、日新月异、明察秋毫等。让我们的AI体验无比安全、顺畅、丝滑、高效、简洁。
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Tesseract Open Source OCR Engine (main repository)