LindgeW
  • UESTC PhD, TJU Master's

Starred repositories

Implementation of Sparsemax activation in PyTorch

Python 165 26 Updated May 27, 2020

Simple conversion and localization between simplified and traditional Chinese using tables from MediaWiki.

Python 555 40 Updated Apr 17, 2024

A python binding for FFmpeg which provides sync and async APIs

Python 374 53 Updated Jul 31, 2024

Pythonic bindings for FFmpeg's libraries.

Python 3,010 409 Updated Oct 13, 2025

Python bindings for FFmpeg - with complex filtering support

Python 10,836 928 Updated Aug 4, 2024

Python interface to the WebRTC Voice Activity Detector (VAD) [released with binary wheels!]

C 33 2 Updated Oct 27, 2025

Python interface to the WebRTC Voice Activity Detector

C 2,386 423 Updated Jul 4, 2024

Advanced data structures for handling temporal segments with attached labels.

Jupyter Notebook 122 50 Updated Sep 16, 2025

Identifying "who speaks when" using visual speech input and a pretrained lip-sync expert

Python 12 Updated Jul 1, 2023

A modern implementation of SyncNet (Python 3.9 to 3.13)

Python 1 Updated Jul 6, 2025

MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]

Python 275 31 Updated Jul 7, 2024
Python 3 Updated Sep 5, 2024

Official implementation of "Speaker-Adaptive Lipreading via Spatio-Temporal Information Learning"

Python 6 Updated Jan 17, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 17,538 2,178 Updated Dec 25, 2024

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 52,376 6,131 Updated Sep 18, 2024

A novel cross-modal decoupling and alignment framework for multimodal representation learning.

JavaScript 36 1 Updated Mar 19, 2025

Face recognition using Tensorflow

Python 14,228 4,808 Updated Jul 24, 2023

Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting"

Python 37 3 Updated May 10, 2025

The official implementation of "StimuVAR: Spatiotemporal Stimuli-aware Video Affective Reasoning with Multimodal Large Language Models", accepted by IJCV

Python 5 Updated Jul 10, 2025

VQVAE for video prediction

Python 29 7 Updated Apr 22, 2022
Python 6 Updated Oct 18, 2022

A WaveRNN implementation

Python 201 48 Updated Oct 14, 2019

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Python 431 34 Updated Jan 25, 2024

audio-visual, embedding fusion, inter-attention

Shell 3 Updated Oct 12, 2023

CAS-VSR-MOV20: A challenging dataset for Chinese visual speech recognition, consisting of video clips from 20 movies.

3 Updated Jun 5, 2025

A collection of resources and papers on the Vector Quantized Variational Autoencoder (VQ-VAE) and its applications

320 10 Updated Jan 31, 2025

[ICCV 2025] Code release of "Harmonizing Visual Representations for Unified Multimodal Understanding and Generation"

Python 177 5 Updated May 21, 2025

Awesome Unified Multimodal Models

850 25 Updated Aug 17, 2025

This repo contains a from-scratch PyTorch implementation of VQGAN ("Taming Transformers for High-Resolution Image Synthesis"). I have added support for custom datasets, testing, experiment track…

Python 37 4 Updated Aug 20, 2024

This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.

Python 49 10 Updated Dec 25, 2024