LindgeW
🎯 Focusing
  • UESTC PhD, TJU Master's

Starred repositories

598 stars written in Python

Contrastive Language-Audio Pretraining

Python 1,885 191 Updated May 15, 2025

Integrated deep learning models for image classification | A project for learning, comparing, and heavily modifying backbones

Python 1,871 276 Updated Jan 17, 2025

Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch

Python 1,852 248 Updated Jul 15, 2024

Official implementation of "Separate Anything You Describe"

Python 1,834 138 Updated Nov 26, 2024

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,817 135 Updated Jul 5, 2024

This is a collection of our NAS and Vision Transformer work.

Python 1,809 238 Updated Jul 25, 2024

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Python 1,789 242 Updated Apr 9, 2024

Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch

Python 1,780 283 Updated Feb 15, 2023

Building an MoE large model on your own: a complete walkthrough from pretraining to DPO

Python 1,773 139 Updated Nov 5, 2025

Pytorch library for fast transformer implementations

Python 1,748 188 Updated Mar 23, 2023

Deep Clustering for Unsupervised Learning of Visual Features

Python 1,734 322 Updated Oct 12, 2021

Code for ALBEF: a new vision-language pre-training method

Python 1,729 222 Updated Sep 20, 2022

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Python 1,703 475 Updated Oct 27, 2025

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,664 158 Updated Nov 3, 2025

Meta-Transformer for Unified Multimodal Learning

Python 1,633 117 Updated Dec 5, 2023

This is an official implementation for "Video Swin Transformers".

Python 1,593 210 Updated Mar 8, 2023

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 1,571 138 Updated Sep 22, 2025

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch

Python 1,535 92 Updated Apr 24, 2025

An open source framework for seq2seq models in PyTorch.

Python 1,516 376 Updated Sep 17, 2025

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Python 1,403 131 Updated Apr 24, 2024

A PyTorch-based library for semi-supervised learning (NeurIPS'21)

Python 1,364 188 Updated Aug 28, 2023

PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)

Python 1,354 119 Updated Jun 1, 2024
Python 1,274 376 Updated Oct 5, 2025

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,224 104 Updated Mar 2, 2025

This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with Pytorch, the well known deep learning toolkit.

Python 1,212 316 Updated Dec 19, 2020

❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119

Python 1,183 100 Updated Sep 2, 2023

Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch

Python 1,175 137 Updated Aug 22, 2023

In defence of metric learning for speaker recognition

Python 1,143 286 Updated Mar 26, 2024

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Python 1,100 96 Updated Jan 15, 2025

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,084 186 Updated Dec 22, 2023