Technically-oriented PDF Collection (Papers, Specs, Decks, Manuals, etc)

HTML 9,632 1,811 Updated Dec 25, 2025

[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

Python 61 3 Updated Oct 8, 2025

Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)

Python 42 4 Updated Sep 10, 2025

A 360-degree video dataset designed for 360-degree video-to-spatial audio generation.

4 Updated Feb 17, 2025

[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"

Python 363 13 Updated Jun 27, 2025

Implementation of the paper "T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis", accepted at ICASSP 2024

Python 34 2 Updated May 25, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,877 234 Updated Aug 11, 2024

Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement [ACL 2026 Findings]"

Python 53 2 Updated Apr 7, 2026

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 2,141 251 Updated Feb 23, 2026

A curated list of recent diffusion models for video generation, editing, and various other applications.

5,579 353 Updated Apr 3, 2026

Event Relation in Text-to-Audio (TTA) Generation

Python 21 Updated Feb 26, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 28,485 2,668 Updated Apr 11, 2026

[3DV 2025] MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Python 86 4 Updated Nov 28, 2024

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 371 24 Updated Sep 3, 2024

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python 1,019 112 Updated Jan 15, 2026

[ICRA 2025] Integrates the vision, touch, and common-sense information of foundational models, customized to the agent's perceptual needs.

Python 47 4 Updated Apr 4, 2025

DN-Splatter + AGS-Mesh: Depth and Normal Priors for Gaussian Splatting

Python 780 65 Updated Jul 5, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 14,320 2,113 Updated Apr 4, 2026

Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko,…

Python 396 63 Updated Oct 12, 2025

Subsurface Scattering for Gaussian Splatting

Python 166 9 Updated Mar 10, 2026

The open source code for LLM-Codec

Python 146 10 Updated Aug 18, 2024

Generative models for conditional audio generation

Python 3,665 440 Updated Feb 14, 2026

The implementation of MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis

42 1 Updated Jul 15, 2024

[ECCV 2024 Oral] Audio-Synchronized Visual Animation

Python 61 1 Updated Mar 15, 2026

[TVCG 2024] PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction

Python 990 79 Updated Dec 25, 2024

Codebase for the WayveScenes101 Dataset

Python 193 6 Updated Sep 25, 2024

PyTorch implementation of paper: GaussNav: Gaussian Splatting for Visual Navigation

Python 203 20 Updated Nov 11, 2024

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python 6,792 392 Updated Mar 27, 2026

Localized Gaussian Point Management

Python 81 6 Updated Feb 23, 2026

[NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis

Python 36 1 Updated Feb 15, 2024