Stars
🎯 Bilibili comment-section data visualization and analysis software. Uploaders can use it to guide their choice of topics and identify their fan base.
Bravura music font, reference font for SMuFL (Standard Music Font Layout)
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
A dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
Generative models for conditional audio generation
Web video downloader, mainly supporting Bilibili, iQIYI, Tencent Video, MGTV, WeTV, and iQIYI Taiwan.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
Plainchant Analyser tool for MEI Neumes (PAM)
Transformer: PyTorch Implementation of "Attention Is All You Need"
Deezer source separation library including pretrained models.
A list of tools, papers and code related to Fake Audio Detection.
A curated list of awesome articles, tutorials, libraries, webpages, etc.
An "awesome music theory" kinda wiki with books, resources and courses for studying everything about music and sound
Voice Recognition to Text Tool / A local audio and video to subtitle tool that runs offline and outputs JSON, SRT subtitles, and plain text
✨ AsrTools: Smart Voice-to-Text Tool | Efficient Batch Processing | User-Friendly Interface | No GPU Required | Supports SRT/TXT Output | Turn your audio into accurate text in an instant!
Using AI models to automatically provide commentary and edit videos with a single click.
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.