Repositories starred by marypilataki

Training, validation, and inference code for various SSL approaches and architectures.

Python 88 1 Updated Apr 7, 2026

Audio Dataset for training CLAP and other models

Python 2 Updated Jan 10, 2025

MIDI / symbolic music tokenizers for Deep Learning models 🎶

Python 1 Updated Apr 25, 2025

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

Jupyter Notebook 871 184 Updated Jul 22, 2023

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 10,209 1,514 Updated Apr 24, 2024

Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'

Python 101 5 Updated Jul 24, 2024

A library for efficient similarity search and clustering of dense vectors.

C++ 40,284 4,421 Updated Jun 13, 2026

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,668 491 Updated Aug 7, 2024

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 19,376 1,787 Updated Jan 30, 2026

LeetCode for PyTorch

Jupyter Notebook 2,241 285 Updated Jun 14, 2026

Foundation Architecture for (M)LLMs

Python 3,131 225 Updated Apr 11, 2024

CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models [NAACL 2025]

Python 65 4 Updated Feb 28, 2025

LLaQo, a Large Language Query-based Coach in the domain of expressive performance

Python 115 13 Updated Jan 20, 2026

Xournal++ is a handwriting notetaking software with PDF annotation support. Written in C++ with GTK3, supporting Linux (e.g. Ubuntu, Debian, Arch, SUSE), macOS and Windows 10. Supports pen input fr…

C++ 14,881 1,068 Updated Jun 14, 2026

A benchmark for evaluating audio encoders on various audio tasks.

Python 53 8 Updated Apr 27, 2026

Automated Machine Learning on Kubernetes

Python 1,683 527 Updated Jun 12, 2026

Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Built upon the I-JEPA paradigm, it uses a Vision Transformer (Vi…

Python 57 4 Updated Apr 17, 2026

Distributed AI Model Training and LLM Fine-Tuning on Kubernetes

Go 2,115 969 Updated Jun 13, 2026

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 3,116 232 Updated Feb 9, 2026

Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

Python 2,944 308 Updated Jan 26, 2026

Arduino documentation (docs.arduino.cc)

Python 353 552 Updated Jun 11, 2026

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

946 94 Updated Jul 8, 2025

Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".

Python 467 33 Updated May 25, 2025

JEPAs for audio representation learning

Python 26 3 Updated Jun 11, 2026

EVAR ~ Evaluation package for Audio Representations

Jupyter Notebook 80 5 Updated Feb 19, 2026

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,953 400 Updated Feb 27, 2025

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Jupyter Notebook 156 8 Updated Feb 23, 2026

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

1,142 96 Updated Dec 15, 2025

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 15,411 5,358 Updated Sep 22, 2025

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.

C++ 1,575 78 Updated Mar 1, 2026