Stars
Browser automation CLI for AI agents
Robust Speech Recognition via Large-Scale Weak Supervision
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
An agentic skills framework & software development methodology that works.
SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages and outperforming Silero-VAD, TEN-VAD, FunASR-VAD, and WebRTC-VAD
Align Anything: Training All-modality Model with Feedback
The official PyTorch implementation of Google's Gemma models
Deezer source separation library including pretrained models.
Muzic: Music Understanding and Generation with Artificial Intelligence
An open-source tool-augmented conversational language model from Fudan University
VQLite - Simple and Lightweight Vector Search Engine based on Google ScaNN
A playbook for systematically maximizing the performance of deep learning models.
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
[ICCV-2021] TransReID: Transformer-based Object Re-Identification
Top 200 deep learning GitHub repositories sorted by the number of stars.
Implementation of the Wave-U-Net for audio source separation
🎙️ Node.js Bot to identify songs in Twitter videos
Children's sitting posture detection with PifPaf, YOLOv5, DSST, and ReID.
A Node client for the Facebook Messenger Platform