Starred repositories
Stable Diffusion web UI
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Magnificent app which corrects your previous console command.
Robust Speech Recognition via Large-Scale Weak Supervision
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
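Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in language libraries for multiple languages.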
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …
A generative speech model for daily dialogue.
😘 Makes you "love" GitHub by fixing broken images and slow loading when you access it. (No installation required)
🍰 Desktop utility to download images/videos/music/text from various websites, and more.
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source …
Fast and memory-efficient exact attention
⚡️HivisionIDPhotos: a lightweight and efficient AI tool for making ID photos.
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Provides multiple Shadowrocket rule sets with ad filtering, for selective automatic proxying past the firewall on non-jailbroken iOS devices.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
A framework to enable multimodal models to operate a computer.