Python 8 Updated Oct 8, 2025

Make beautiful isometric infrastructure diagrams

TypeScript 13,833 886 Updated Dec 8, 2025

Collection of works for evaluating (and analyzing) large audio-language models (LALMs)

40 Updated Aug 11, 2025

Open-source unified multimodal model

Python 5,491 480 Updated Oct 27, 2025
Jupyter Notebook 46 Updated Apr 13, 2025

small audio language model for reasoning

Python 83 4 Updated Dec 4, 2025

A Conversational Speech Generation Model

Python 14,369 1,458 Updated May 27, 2025

Explaining audio differences using language

Python 16 Updated Feb 11, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 649 48 Updated Jun 5, 2025

Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems

Python 13 1 Updated Jan 16, 2025

Audio Entailment: Deductive Reasoning for Audio Understanding

16 1 Updated Dec 10, 2024

Awesome speech/audio LLMs, representation learning, and codec models

1,191 74 Updated Aug 13, 2025

A simple library for Fréchet Audio Distance (FAD) calculation

Python 240 24 Updated Aug 22, 2025

PAM is a no-reference audio quality metric for audio generation tasks

Python 76 6 Updated Jul 19, 2024

Repository for "Training Audio Captioning Models without Audio"

10 1 Updated Sep 26, 2023

An Audio Language model for Audio Tasks

Python 317 15 Updated Apr 19, 2024

Tracking the state of the art and recent results (bibliography) on sound tasks.

32 2 Updated Jan 10, 2023

Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"

Python 50 1 Updated Nov 10, 2022

Learning audio concepts from natural language supervision

Python 621 42 Updated Sep 18, 2024

Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection"

Python 17 4 Updated Nov 9, 2022

speech enhancement\speech separation\sound source localization

1,209 224 Updated Nov 14, 2023

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,490 520 Updated Jun 13, 2025

Reading list for research topics in Sound AI

193 8 Updated Aug 8, 2024