joannahong

Joanna Hong joannahong

Research Scientist @ Google DeepMind

45 followers · 14 following

New York, New York
UTC -05:00
https://joannahong.github.io/

Stars

prabhupant / python-ds

No non-sense and no BS repo for how data structure code should be in Python - simple and elegant.

Python 3,042 624 Updated Apr 6, 2024

choijeongsoo / av2av

[CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Python 43 3 Updated Sep 6, 2024

ms-dot-k / Image-to-Speech

Pytorch implementation of "Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens"

Python 12 Updated Mar 9, 2024

choijeongsoo / utut

[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation

Python 31 4 Updated Sep 6, 2024

joannahong / AV-RelScore

Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" in CVPR23

Python 35 2 Updated Jun 20, 2023

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 16,037 1,269 Updated Jan 18, 2025

ms-dot-k / AVSR

PyTorch implementation of "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR2023) and "Visual Context-driven Audio Feature Enhan…

Python 20 2 Updated Apr 3, 2024

xriley / Shine-Theme

FREE Bootstrap 5 Light Mode Resume/CV Template for Developers

SCSS 31 43 Updated Sep 16, 2024

ms-dot-k / Visual-Audio-Memory

PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)

Python 20 4 Updated Apr 11, 2022

ms-dot-k / Lip-to-Speech-Synthesis-in-the-Wild

PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023)

Python 70 7 Updated Mar 9, 2024

ahaliassos / raven

Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)

Python 77 7 Updated Feb 27, 2025

facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Python 962 154 Updated Dec 7, 2023

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 92,130 11,547 Updated Dec 15, 2025

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,892 2,679 Updated Dec 15, 2025

facebookresearch / muavic

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 401 34 Updated Sep 11, 2023

keonlee9420 / DiffSinger

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

Python 242 33 Updated Feb 3, 2022

ermongroup / ddim

Denoising Diffusion Implicit Models

Python 1,754 228 Updated Jul 26, 2024

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Python 10,304 1,250 Updated Aug 4, 2025

keonlee9420 / DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Python 343 45 Updated Feb 21, 2022

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,172 736 Updated May 31, 2024

ByungKwanLee / Masking-Adversarial-Damage

[CVPR 2022] Official PyTorch Implementation for "Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network"

Python 32 4 Updated Mar 13, 2023

ByungKwanLee / Double-Debiased-Adversary

[ICCV 2023] Official PyTorch Implementation for "Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning"

Python 32 3 Updated Oct 13, 2023

rishikksh20 / VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Python 321 59 Updated Jul 25, 2024

NVIDIA / BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 1,157 143 Updated Sep 5, 2024

ivanvovk / WaveGrad

Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.

Jupyter Notebook 405 53 Updated Jul 7, 2021

kan-bayashi / ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Jupyter Notebook 1,630 349 Updated Apr 22, 2024

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,647 2,364 Updated Dec 16, 2025

ms-dot-k / Visual-Context-Attentional-GAN

PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)

Python 25 5 Updated Mar 9, 2024

kuai-lab / sound-guided-semantic-image-manipulation

Sound-guided Semantic Image Manipulation - Official Pytorch Code (CVPR 2022)

Python 80 11 Updated Aug 14, 2023

WonhoZhung / ee474

EE474 Term Project

Python 3 1 Updated Nov 26, 2025
