Stars
Leverage WorldQuant API to generate alpha signals, and mine promising alpha expressions.
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
[ICLR 2026] RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
DocBank: A Benchmark Dataset for Document Layout Analysis
A GUI client for Windows, Linux, and macOS, supporting Xray, sing-box, and others
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A high-throughput and memory-efficient inference and serving engine for LLMs
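AIClient2API: Simulates Gemini CLI and Kiro client requests, compatible with the OpenAI API. Supports thousands of Gemini model requests per day and free use of Kiro's built-in Claude model. Connect any client easily through the API to make AI development more efficient!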
Simulates Gemini CLI, Antigravity, Qwen Code, and Kiro client requests, compatible with the OpenAI API. It supports thousands of Gemini model requests per day and offers free use of the built-in Cl…
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
[CVPR 2024] PEM: Prototype-based Efficient MaskFormer for Image Segmentation
A Unified Toolkit for Deep Learning Based Document Image Analysis
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection, CVPR, Oral, 2020
OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.
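A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and trained at low cost, including base models, vertical-domain fine-tuned models and applications, datasets, and tutorials.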
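This project aims to share the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).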
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.