Stars
VSCode Remote Development: Open any folder in the Windows Subsystem for Linux (WSL).
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
LiteRT, the successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms via efficient conversion, runtime, and optimization
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
An automated toolkit for analyzing and modifying the structure of PyTorch models, including a model compression algorithm library built on automatic model structure analysis
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
[TPAMI 2025] Official code of paper "ONNXPruner: ONNX-Based General Model Pruning Adapter"
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Ultra-lightweight Chinese OCR with support for vertical text recognition and ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size is only 4.7M
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Multilingual Document Layout Parsing in a Single Vision-Language Model
devgdovg / deepcompressor
Forked from nunchaku-ai/deepcompressor. Model Compression Toolbox for Large Language Models and Diffusion Models
lantudou / deepcompressor
Forked from nunchaku-ai/deepcompressor. Model Compression Toolbox for Large Language Models and Diffusion Models
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A PyTorch-native inference engine with cache, parallelism, quantization for Diffusion Transformers.
Official inference repo for FLUX.1 models
This is my CUDA optimization of OpenCV seamlessClone API at NORMAL_CLONE mode.
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. The “面壁小钢炮” series focuses on achieving exceptional performance on the edge.
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications for local knowledge bases, built on Langchain and language models such as ChatGLM, Qwen, and Llama
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Efficient vision foundation models for high-resolution generation and perception.
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining
Swin Unet V2 is a modified version of Swin Unet based on Swin Transformer V2.
Official implementation code for Attention Swin U-Net: Cross-Contextual Attention Mechanism for Skin Lesion Segmentation paper