Starred repositories
C++ library for audio and music analysis, description and synthesis, including Python bindings
Library using the Web Audio API to analyse BPM from files and audio nodes. It can also compute BPM from streams, as well as in real time using a microphone. This tool might be useful for music producers and…
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
🎨 ArtPlayer.js is a modern and full-featured HTML5 video player
Collection of decent Community-made GRUB themes. Contributions welcome!
Neural Network that is able to translate any sign language into text.
Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"
This is the repo for ICCV 2025 Paper: Multi-Modal Few-Shot Temporal Action Segmentation
Plug-and-play CPU percentage and icon indicator for Tmux.
🎮 ⌨ An easy-to-use tool to change the behaviour of your input devices.
Lightweight coding agent that runs in your terminal
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
[CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
Simple, powerful, and fast logging for Python.
Segment a given audio into utterances using a trained end-to-end ASR model.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Gemma open-weight LLM library, from Google DeepMind
The official Python library for the OpenAI API