Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
Command-line program to download videos from YouTube.com and other video sites
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Build and run Docker containers leveraging NVIDIA GPUs
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Hidden Markov Models in Python, with scikit-learn like API
A Python package to analyze and compare voices with deep learning
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Bioinformatics'2020: BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
This is a repository of high-quality, topic-centric public data sources for Recommender Systems (RS)
A library for soundscape synthesis and augmentation
Topic-Aware Convolutional Neural Networks for Extreme Summarization
Code for the ACL 2017 paper "An unsupervised neural attention model for aspect extraction"
ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output.
StereoSet: Measuring stereotypical bias in pretrained language models
Lightweight Python library for in-memory matrix completion.
cat🐈: the repo for the paper "Embarrassingly Simple Unsupervised Aspect Extraction"
Implementing content-based and collaborative filtering (with KNN, matrix factorization, and neural networks) in Python
DS-GA 1013 Mathematical Tools for Data Science
Fine-tune transformers with pytorch-lightning
Quantifying biases in BERT embeddings pretrained on MIMIC-III clinical notes
Python code for producing emotionality scores from Gennaro and Ash (2021).
Diarizing Legal Proceedings with d-vectors.