Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.

Jupyter Notebook 3,854 303 Updated Jun 12, 2025

Sakshi113 / MMAU

Python 127 5 Updated Sep 4, 2025

AudioLLMs / AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 287 14 Updated Jun 17, 2025

vinaychetnani / Q-Learning-for-Non-Competitive-Bridge-Bidding

Python 6 Updated Jan 23, 2018

zizhang-qiu / BridgeBidding

Open source code for supervised learning of bridge bidding.

Python 4 Updated Oct 31, 2023

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

923 74 Updated Dec 15, 2025

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,040 1,271 Updated Oct 11, 2025

wdndev / llm_interview_note

Mainly records knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 11,437 1,155 Updated Apr 30, 2025

LLaVA-VL / LLaVA-NeXT

Python 4,461 435 Updated Sep 14, 2025

OpenGVLab / VideoChat-Flash

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Python 490 15 Updated Nov 18, 2025

VectorSpaceLab / Video-XL

🔥🔥First-ever hour-scale video understanding models

Python 593 40 Updated Jul 14, 2025

yongliang-wu / DFT

[Preprint] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.

Python 513 20 Updated Nov 5, 2025

facebookresearch / MetaCLIP

NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024

Python 1,782 76 Updated Nov 27, 2025

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,456 1,999 Updated Nov 1, 2025

kaixuanwang2003 / zju-welcome

Brief guides for ZJU freshmen. [site](https://zjuers.com/welcome/)

HTML 124 19 Updated Oct 24, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 16,740 2,372 Updated Dec 22, 2025

mlfoundations / open_clip

An open source implementation of CLIP.

Python 13,148 1,220 Updated Nov 4, 2025

ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 10,158 1,394 Updated Jul 15, 2025

openai / CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

Jupyter Notebook 32,042 3,865 Updated Jul 23, 2024

FreddeFrallan / Multilingual-CLIP

OpenAI CLIP text encoders for multiple languages!

Jupyter Notebook 823 69 Updated May 15, 2023

Serapay Sonder-zyz

Stars

karpathy / nanochat

Gar-b-age / CookLikeHOC

coin-dataset / annotations

google-deepmind / limit

LAION-AI / audio-dataset

LAION-AI / CLAP

apple / ml-mobileclip-dr

audio-captioning / clotho-dataset

cdjkim / audiocaps

AudioLLMs / Awesome-Audio-LLM

QwenLM / Qwen2.5-Omni