Starred repositories
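[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)
Provides a practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-interpretation for Python, C++, and other languages, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…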
A high-throughput and memory-efficient inference and serving engine for LLMs
FlashMLA: Efficient Multi-head Latent Attention Kernels
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
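"Open-Source LLM Usage Guide": tutorials tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large language models (MLLMs), from China and abroad, in a Linux environment
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe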
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
A generative speech model for daily dialogue.
Effortless data labeling with AI support from Segment Anything and other awesome models.
Generate text line images for training deep learning OCR models
🔥🔥🔥 Focused on improved models for YOLO11, YOLOv8, YOLOv12, YOLOv10, RT-DETR, YOLOv7, and YOLOv5. Supports improving the backbone, neck, head, loss, IoU, NMS, and other modules 🚀
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
ModelScope: bring the notion of Model-as-a-Service to life.
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
ChatLaw: A Powerful LLM Tailored for the Chinese Legal Domain (Chinese legal large language model)
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size is only 4.7M
✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows
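A plugin that improves ChatGPT's data security and efficiency. It also shares many innovative features for free, such as auto refresh, keep alive, data security, audit cancellation, conversation cloning, unrestricted-length responses, page cleanup, large-screen display, tracking interception, continuous updates, and fine-grained insight. It aims to make the AI experience secure, smooth, efficient, and clean.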
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
UB-Mannheim / tesseract
Forked from tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)