Stars
🔊 Text-Prompted Generative Audio Model
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Stable Diffusion WebUI Colab
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
Python for Data Analysis, 2nd Edition (2017): Chinese translation notes
Line drawing colorization using Chainer
An interactive book on deep learning. Much easy, so MXNet. Wow. [Straight Dope is growing up] ---> Much of this content has been incorporated into the new Dive into Deep Learning Book available at …
Puzzles for learning Triton
Handout for the tutorial "Creating publication-quality figures with matplotlib"
Keras implementation of "One pixel attack for fooling deep neural networks" using differential evolution on Cifar10 and ImageNet
What would you do with 1000 H100s...
Replika.ai Research Papers, Posters, Slides & Datasets
ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs).
Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KN…
Data and tooling to compare the API surfaces of various array libraries.
Reimplementation of the paper "On Data-Driven Saak Transform"
Image2Emoji — Predict the most relevant emoji for a given image 🐱