Tencent
- Shanghai
Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Robust Speech Recognition via Large-Scale Weak Supervision
Models and examples built with TensorFlow
scikit-learn: machine learning in Python
A generative speech model for daily dialogue.
PyTorch Tutorial for Deep Learning Researchers
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Deezer source separation library including pretrained models.
Generative Models by Stability AI
GUI for a Vocal Remover that uses Deep Neural Networks.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A TTS model capable of generating ultra-realistic dialogue in one pass.
End-to-End Object Detection with Transformers
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, tutorials. (迁移学习: transfer learning)
This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For HD commercial model, please try out Sync Labs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
A Python library for anomaly detection across tabular, time series, graph, text, and image data. 60+ detectors, benchmark-backed ADEngine orchestration, and an agentic workflow for AI agents.
High-Resolution 3D Human Digitization from A Single Image.
The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
Anomaly detection related books, papers, videos, and toolboxes. Last update late 2025 for LLM and VLM works!
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
vits2 backbone with multilingual-bert
Python library for audio and music analysis
PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models