Starred repositories
21 Lessons, Get Started Building with Generative AI
🔊 Text-Prompted Generative Audio Model
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
Anthropic's Interactive Prompt Engineering Tutorial
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
A guidance language for controlling large language models.
Anthropic's educational courses
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Examples and guides for using the Gemini API
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Foundational Models for State-of-the-Art Speech and Text Translation
QLoRA: Efficient Finetuning of Quantized LLMs
AirLLM 70B inference with single 4GB GPU
Reference PyTorch implementation and models for DINOv3
A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like RF-DETR, YOLO11, SAM …
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Zero-Shot Speech Editing and Text-to-Speech in the Wild
This repository accompanies the book "Grokking Deep Learning"
Overview and tutorial of the LangChain Library
A unified framework for 3D content generation.
The fastest way to create an HTML app
A course on aligning smol models.
A collection of examples that show how to use CrewAI framework to automate workflows.
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340