Stars
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
verl: Volcano Engine Reinforcement Learning for LLMs
Cross-platform, customizable ML solutions for live and streaming media.
A Python module to repair invalid JSON from LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
✨✨Latest Advances on Multimodal Large Language Models
A Python-based Xiaozhi AI for users who want the full Xiaozhi experience without owning specialized hardware.
No fortress, purely open ground. OpenManus is Coming.
Multilingual Document Layout Parsing in a Single Vision-Language Model
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Rembg is a tool to remove image backgrounds
RAG-Anything: All-in-One RAG Framework
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Official inference repo for FLUX.1 models
General technology for enabling AI capabilities with LLMs and MLLMs
AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures. Output files keep the original resolution, and processing runs locally with no third-party API required.
1liuren / MonkeyOCR
Forked from Yuliang-Liu/MonkeyOCR
A lightweight LMM-based Document Parsing Model
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
A curated list of resources dedicated to table recognition
Real-time face swap for PC streaming or video calls
OpenMMLab Detection Toolbox and Benchmark