- Shanghai
- https://cronrpc.github.io
Stars
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Robust Speech Recognition via Large-Scale Weak Supervision
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
The simplest, fastest repository for training/finetuning medium-sized GPTs.
The original local LLM interface. Text, vision, tool-calling, training, and more. 100% offline.
A generative speech model for daily dialogue.
A generative world for general-purpose robotics & embodied AI learning.
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
GUI for a Vocal Remover that uses Deep Neural Networks.
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
Open-source AI hackers to find and fix your app’s vulnerabilities.
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Translate the video from one language to another and embed dubbing & subtitles.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos. Outputs the cleaned image/video files at the original resolution. Runs locally, with no third-party API required.
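Live-stream recording software that supports continuous unattended monitoring and recording of multiple streamers at once, covering 40+ platforms including Douyin, TikTok, YouTube, Kuaishou, Huya, Douyu, Bilibili, Xiaohongshu, pandatv, sooplive, flextv, popkontv, twitcasting, winktv, Baidu, Weibo, Kugou, 17Live, Twitch, Acfun, CHZZK, and shopee.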
ModelScope: bring the notion of Model-as-a-Service to life.
vits2 backbone with multilingual-bert
Text-audio foundation model from Boson AI
A state-of-the-art open visual language model | multimodal pretrained model
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box