LindgeW

🎯

Focusing

Lam Chi LindgeW

Research Interests: audio-visual speech recognition, lip-reading, NLP, deep learning

32 followers · 75 following

UESTC PhD, TJU Master's

Lists (6)

AVSE

AVSR

29 repositories

Lip2Speech/Speech2Lip

PaperReading

Super Star

Mark some fundamental multimodal repos

14 repositories

VAE

Starred repositories

598 stars written in Python

LAION-AI / CLAP

Contrastive Language-Audio Pretraining

Python 1,885 191 Updated May 15, 2025

Fafa-DL / Awesome-Backbones

A collection of deep learning models for image classification | A project for learning, comparing, and modifying backbones

Python 1,871 276 Updated Jan 17, 2025

lucidrains / byol-pytorch

Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch

Python 1,852 248 Updated Jul 15, 2024

Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"

Python 1,834 138 Updated Nov 26, 2024

QwenLM / Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,817 135 Updated Jul 5, 2024

microsoft / Cream

This is a collection of our NAS and Vision Transformer work.

Python 1,809 238 Updated Jul 25, 2024

facebookresearch / TimeSformer

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Python 1,789 242 Updated Apr 9, 2024

rosinality / vq-vae-2-pytorch

Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch

Python 1,780 283 Updated Feb 15, 2023

qibin0506 / Cortex

Building an MoE large model as an individual: a complete walkthrough from pre-training to DPO

Python 1,773 139 Updated Nov 5, 2025

idiap / fast-transformers

Pytorch library for fast transformer implementations

Python 1,748 188 Updated Mar 23, 2023

facebookresearch / deepcluster

Deep Clustering for Unsupervised Learning of Visual Features

Python 1,734 322 Updated Oct 12, 2021

salesforce / ALBEF

Code for ALBEF: a new vision-language pre-training method

Python 1,729 222 Updated Sep 20, 2022

LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,703 475 Updated Oct 27, 2025

facebookresearch / multimodal

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,664 158 Updated Nov 3, 2025

invictus717 / MetaTransformer

Meta-Transformer for Unified Multimodal Learning

Python 1,633 117 Updated Dec 5, 2023

SwinTransformer / Video-Swin-Transformer

Forked from open-mmlab/mmaction2

This is an official implementation for "Video Swin Transformers".

Python 1,593 210 Updated Mar 8, 2023

FireRedTeam / FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 1,571 138 Updated Sep 22, 2025

lucidrains / soundstorm-pytorch

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch

Python 1,535 92 Updated Apr 24, 2025

IBM / pytorch-seq2seq

An open source framework for seq2seq models in PyTorch.

Python 1,516 376 Updated Sep 17, 2025

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Python 1,403 131 Updated Apr 24, 2024

TorchSSL / TorchSSL

A PyTorch-based library for semi-supervised learning (NeurIPS'21)

Python 1,364 188 Updated Aug 28, 2023

sail-sg / poolformer

PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)

Python 1,354 119 Updated Jun 1, 2024

k2-fsa / icefall

Python 1,274 376 Updated Oct 5, 2025

jishengpeng / WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,224 104 Updated Mar 2, 2025

Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with Pytorch, the well known deep learning toolkit.

Python 1,212 316 Updated Dec 19, 2020

KMnP / vpt

❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119

Python 1,183 100 Updated Sep 2, 2023

lucidrains / perceiver-pytorch

Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch

Python 1,175 137 Updated Aug 22, 2023

clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition

Python 1,143 286 Updated Mar 26, 2024

iver56 / torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Python 1,100 96 Updated Jan 15, 2025

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,084 186 Updated Dec 22, 2023

Starred topics

vector-quantization

speaker-embedding

language-modelling

beam-search

seq2seq

Machine learning

variational-inference

information-bottleneck

listen-attend-and-spell

chinese-speech-recognition
