44 results for source starred repositories written in Python

Robust Speech Recognition via Large-Scale Weak Supervision

Python 90,565 11,346 Updated Sep 8, 2025

💫 Industrial-strength Natural Language Processing (NLP) in Python

Python 32,780 4,618 Updated Nov 6, 2025

Deezer source separation library including pretrained models.

Python 27,719 3,048 Updated Apr 2, 2025

Faker is a Python package that generates fake data for you.

Python 18,873 2,023 Updated Nov 5, 2025

Bringing Old Photo Back to Life (CVPR 2020 oral)

Python 15,635 2,083 Updated Oct 26, 2023

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,744 848 Updated Jun 1, 2025

A data augmentations library for audio, image, text, and video.

Python 5,057 309 Updated Oct 31, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,732 273 Updated Jul 18, 2025

Free Motion Capture for Everyone 💀✨

Python 3,920 306 Updated Nov 8, 2025

Beautiful visualizations of how language differs among document types.

Python 2,321 290 Updated Apr 29, 2025

Large Concept Models: Language modeling in a sentence representation space

Python 2,302 202 Updated Jan 29, 2025

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 1,186 75 Updated Oct 8, 2025

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy

Python 738 62 Updated Aug 15, 2024

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text

Python 681 196 Updated Sep 19, 2021

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

Python 640 119 Updated Mar 22, 2021

Mnemosyne: efficient learning with powerful digital flash-cards.

Python 554 82 Updated Jun 4, 2025

Concurrently detect the minimum Python versions needed to run code

Python 504 28 Updated Nov 3, 2025

An evolving list of electronic media data sets used to model mental-health status.

Python 448 78 Updated Sep 3, 2021

ACRONYM (Acronym CReatiON for You and Me)

Python 417 30 Updated Dec 1, 2022

Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities.

Python 365 51 Updated Dec 8, 2022

TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/understand tweets such as sentiment analysis, emoji prediction,…

Python 364 35 Updated Apr 2, 2025

TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24)

Python 360 61 Updated Mar 15, 2025

Analyze text with Empath

Python 338 59 Updated Apr 22, 2017

data⎰describe: Pythonic EDA Accelerator for Data Science

Python 302 18 Updated Feb 22, 2023

Code for collecting, processing, and preparing datasets for the Common Pile

Python 238 23 Updated Sep 10, 2025

Python library for Representational Similarity Analysis

Python 223 47 Updated Oct 27, 2025

The AI Knowledge Editor

Python 185 10 Updated Jul 12, 2022

Repository containing codes and dataset access instructions for the EMNLP 2020 paper on empathy in text-based mental health support

Python 179 38 Updated Jul 13, 2023

Main model and preprocessing code

Python 116 35 Updated Oct 13, 2025
Python 86 24 Updated Oct 16, 2023