Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,850 160 Updated Oct 9, 2025

Liuziyu77 / Visual-RFT

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,246 100 Updated Oct 29, 2025

amazon-science / auto-cot

Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)

Jupyter Notebook 1,968 175 Updated Mar 13, 2024

microsoft / mup

Maximal Update Parametrization (µP)

Jupyter Notebook 1,621 104 Updated Jul 17, 2024

mdeff / cnn_graph

Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

Jupyter Notebook 1,363 390 Updated Jun 13, 2020

bckenstler / CLR

Jupyter Notebook 1,209 245 Updated Jun 5, 2020

rhymes-ai / Aria

Codebase for Aria - an Open Multimodal Native MoE

Jupyter Notebook 1,079 85 Updated Jan 22, 2025

togethercomputer / together-cookbook

A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.

Jupyter Notebook 1,071 190 Updated Nov 4, 2025

allenai / OLMoE

OLMoE: Open Mixture-of-Experts Language Models

Jupyter Notebook 901 83 Updated Sep 23, 2025

SunzeY / AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Jupyter Notebook 852 56 Updated Jul 20, 2025

raoyongming / DynamicViT

[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

Jupyter Notebook 635 76 Updated Jul 11, 2023

zhunzhong07 / person-re-ranking

Person Re-ranking (CVPR 2017)

Jupyter Notebook 614 174 Updated Oct 26, 2021

jazzsaxmafia / show_attend_and_tell.tensorflow

Jupyter Notebook 504 190 Updated Oct 23, 2019

zcaceres / spec_augment

🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Jupyter Notebook 499 61 Updated Jun 11, 2021

kohjingyu / fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

Jupyter Notebook 483 37 Updated Oct 30, 2023

project-numina / aimo-progress-prize

Jupyter Notebook 466 34 Updated Jul 22, 2024

kracwarlock / action-recognition-visual-attention

Action recognition using soft attention based deep recurrent neural networks

Jupyter Notebook 352 158 Updated Oct 30, 2016

Glanvery / LLM-Travel

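Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the technologies, principles, and applications related to large models.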

Jupyter Notebook 351 39 Updated Jul 21, 2024

yuhuayc / da-faster-rcnn

An implementation of our CVPR 2018 work 'Domain Adaptive Faster R-CNN for Object Detection in the Wild'

Jupyter Notebook 349 70 Updated Oct 9, 2019

zhaochen0110 / OpenThinkIMG

OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.

Jupyter Notebook 323 6 Updated Jun 1, 2025

Anonym0u3 / AttentiveEraser

Official implementation of the paper "Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance" (AAAI 2025 Oral)

Jupyter Notebook 197 9 Updated May 9, 2025

Xiaoye Qu XiaoYee
