Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
-
Updated
Jan 31, 2025 - Python
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Pytorch port of Google Research's VGGish model used for extracting audio features.
Audio classification with VGGish as feature extractor in TensorFlow
Machine learning model for bird songs recognition
Query service to serve the JibJib TensorFlow model
Re-Implementation of Google Research's VGGish model used for extracting audio features using Pytorch with GPU support.
mono-kit is a lightweight library that lets you build your own Google Lens, hum-to-search, and RAG-style applications. It supports text, audio, and image embeddings using both default and custom-trained models — with simple tools for processing single or batch data across modalities.
This is where I store files and documents relating to my graduation project
Add a description, image, and links to the vggish topic page so that developers can more easily learn about it.
To associate your repository with the vggish topic, visit your repo's landing page and select "manage topics."