466 stars written in Python

Command-line program to download videos from YouTube.com and other video sites

Python 139,942 10,601 Updated Feb 19, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 98,533 27,286 Updated Mar 24, 2026

Robust Speech Recognition via Large-Scale Weak Supervision

Python 96,542 11,918 Updated Dec 15, 2025

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 41,211 5,202 Updated Jun 27, 2024

Let us control diffusion models!

Python 33,763 3,004 Updated Feb 25, 2024

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Python 25,537 11,692 Updated Jun 7, 2024

Image-to-Image Translation in PyTorch

Python 25,040 6,574 Updated Aug 6, 2025

Open-Source Frontier Voice AI

Python 24,005 2,650 Updated Mar 24, 2026

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 22,063 2,695 Updated Jan 23, 2026

Faster Whisper transcription with CTranslate2

Python 21,702 1,764 Updated Nov 19, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 20,888 2,201 Updated Mar 17, 2026

Magenta: Music and Art Generation with Machine Intelligence

Python 19,773 3,790 Updated Jan 6, 2026

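Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow: ChatGPT-based full-paper summarization, professional translation, polishing, peer review, and review responses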

Python 19,334 1,942 Updated Mar 2, 2026

State-of-the-Art Text Embeddings

Python 18,448 2,768 Updated Mar 12, 2026

Automatic headphone equalization from frequency responses

Python 15,549 2,534 Updated Jul 20, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 15,372 1,614 Updated Mar 17, 2026

100+ Pre-trained Chinese Word Vectors

Python 12,189 2,325 Updated Oct 30, 2023

Python bindings for FFmpeg - with complex filtering support

Python 10,973 939 Updated Aug 4, 2024

Simultaneous speech-to-text models

Python 9,980 1,023 Updated Mar 18, 2026

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Python 8,963 2,435 Updated Mar 24, 2026

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 8,554 744 Updated Mar 8, 2026

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

Python 8,405 797 Updated Oct 7, 2024

Python library for audio and music analysis

Python 8,282 1,040 Updated Mar 24, 2026

Multilingual Voice Understanding Model

Python 7,812 713 Updated Dec 30, 2025

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Python 7,321 1,293 Updated Mar 16, 2026

pkuseg: a toolkit for multi-domain Chinese word segmentation

Python 6,704 987 Updated Nov 5, 2022

Official repo for consistency models.

Python 6,475 433 Updated Mar 22, 2024

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 6,371 507 Updated Mar 24, 2026

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Python 6,234 1,224 Updated Aug 4, 2025