Starred repositories
Learn how to develop, deploy and iterate on production-grade ML applications.
Google Research
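"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment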
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Companion webpage to the book "Mathematics For Machine Learning"
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
PyTorch code and models for the DINOv2 self-supervised learning method.
LAVIS - A One-stop Library for Language-Vision Intelligence
Best Practices, code samples, and documentation for Computer Vision.
Image restoration with neural networks but without learning.
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
Python code for the "Probabilistic Machine Learning" book by Kevin Murphy
COCO API - Dataset @ http://cocodataset.org/
Silero Models: pre-trained text-to-speech models made embarrassingly simple
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
From the basics to slightly more interesting applications of TensorFlow
Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
Single Shot MultiBox Detector in TensorFlow
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…
Efficient neural feature detector and descriptor
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
An unsupervised learning framework for depth and ego-motion estimation from monocular videos
For extensive instructor-led learning