Joanna Hong joannahong

Research Scientist @ Google DeepMind

45 followers · 14 following

New York, New York
https://joannahong.github.io/

Stars

64 stars written in Python

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,161 31,963 Updated Feb 5, 2026

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 94,195 11,720 Updated Dec 15, 2025

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 22,007 2,694 Updated Jan 23, 2026

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 16,165 1,280 Updated Jan 18, 2025

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Python 10,443 1,266 Updated Aug 4, 2025

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,717 2,378 Updated Feb 4, 2026

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,334 751 Updated May 31, 2024

tkipf / pygcn

Graph Convolutional Networks in PyTorch

Python 5,395 1,224 Updated Sep 20, 2020

prabhupant / python-ds

No non-sense and no BS repo for how data structure code should be in Python - simple and elegant.

Python 3,061 624 Updated Apr 6, 2024

rosinality / stylegan2-pytorch

Implementation of Analyzing and Improving the Image Quality of StyleGAN (StyleGAN 2) in PyTorch

Python 2,830 629 Updated Nov 6, 2023

taki0112 / Tensorflow-Cookbook

Simple Tensorflow Cookbook for easy-to-use

Python 2,758 465 Updated Feb 9, 2020

NVIDIA / waveglow

A Flow-based Generative Network for Speech Synthesis

Python 2,335 536 Updated Oct 19, 2023

jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 2,309 550 Updated Jul 27, 2024

ermongroup / ddim

Denoising Diffusion Implicit Models

Python 1,782 230 Updated Jul 26, 2024

genforce / interfacegan

[CVPR 2020] Interpreting the Latent Space of GANs for Semantic Face Editing

Python 1,557 280 Updated Feb 9, 2022

Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with Pytorch, the well known deep learning toolkit.

Python 1,210 315 Updated Dec 19, 2020

NVIDIA / BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 1,181 143 Updated Sep 5, 2024

rosinality / style-based-gan-pytorch

Implementation A Style-Based Generator Architecture for Generative Adversarial Networks in PyTorch

Python 1,111 229 Updated Aug 26, 2021

facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Python 970 158 Updated Dec 7, 2023

joonson / syncnet_python

Out of time: automated lip sync in the wild

Python 868 188 Updated Jan 23, 2024

Rudrabha / Lip2Wav

This is the repository containing codes for our CVPR, 2020 paper titled "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis"

Python 714 153 Updated Jul 6, 2023

soobinseo / Transformer-TTS

A Pytorch Implementation of "Neural Speech Synthesis with Transformer Network"

Python 691 140 Updated Nov 8, 2023

seungwonpark / melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)

Python 650 114 Updated Oct 3, 2020

sooftware / kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Python 635 192 Updated May 27, 2023

mpc001 / Visual_Speech_Recognition_for_Multiple_Languages

Visual Speech Recognition for Multiple Languages

Python 459 72 Updated Aug 17, 2023

mpc001 / Lipreading_using_Temporal_Convolutional_Networks

ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks

Python 432 102 Updated May 18, 2023

facebookresearch / muavic

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 401 35 Updated Sep 11, 2023

hytseng0509 / CrossDomainFewShot

Cross-Domain Few-Shot Classification via Learned Feature-Wise Transformation (ICLR 2020 spotlight)

Python 351 61 Updated Apr 12, 2020

keonlee9420 / DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Python 347 44 Updated Feb 21, 2022

rishikksh20 / VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Python 321 59 Updated Jul 25, 2024
