Lists (32)
academic
acoustic echo cancellation
AIGC
audio codec
audio codecs
audio separation
audio tools
bandwidth extension
beamforming
computer vision
deep learning
diffusion
entertainment
hearing aid
LLM
microphone array
music tools
noise reduction
packet loss compensation
programming related
simulation tools
singing voice tools
sound source localization
spatial audio
speaker recognition
speech dereverberation
speech diarization
speech frontend
speech recognition
speech separation
speech signal processing
speech voice tools
Starred repositories
All Algorithms implemented in Python
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Stable Diffusion web UI
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Command-line program to download videos from YouTube.com and other video sites
A feature-rich command-line audio/video downloader
🦜🔗 The platform for reliable agents.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Robust Speech Recognition via Large-Scale Weak Supervision
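Chinese and English sensitive words, language detection, domestic and international mobile/phone number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, continuous English text segmentation, various Chinese word vectors, company name collection, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …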
real time face swap and one-click video deepfake with only a single image
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
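Provides a practical interactive interface for large language models such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons & function plugins, project analysis & self-explanation for Python, C++, and other projects, PDF/LaTex paper translation & summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…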
The official gpt4free repository | various collection of powerful language models | o4, o3 and deepseek r1, gpt-4.1, gemini 2.5
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
A high-throughput and memory-efficient inference and serving engine for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A natural language interface for computers
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
Interact with your documents using the power of GPT, 100% privately, no data leaks
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
CLI platform to experiment with codegen. Precursor to: https://lovable.dev