Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.

Jupyter Notebook 3,913 316 Updated Jun 12, 2025

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,383 209 Updated Jan 8, 2026

z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 3,102 356 Updated Jan 20, 2026

onnx / tensorflow-onnx

Convert TensorFlow, Keras, TensorFlow.js, and TFLite models to ONNX

Jupyter Notebook 2,514 466 Updated Sep 12, 2025

MarkTechStation / VideoCode

Jupyter Notebook 2,061 519 Updated Aug 13, 2025

lyhue1991 / torchkeras

Pytorch❤️ Keras 😋😋

Jupyter Notebook 2,012 256 Updated Sep 22, 2025

bbruceyuan / LLMs-Zero-to-Hero

From a nobody to a large language model (LLM) hero~ Stay tuned for more!!!

Jupyter Notebook 1,999 135 Updated Nov 22, 2025

shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 1,237 179 Updated Jan 19, 2026

jiangxinyang227 / textClassifier

TensorFlow implementation

Jupyter Notebook 1,149 552 Updated Jul 19, 2019

vincentherrmann / pytorch-wavenet

An implementation of WaveNet with fast generation

Jupyter Notebook 1,022 233 Updated Sep 17, 2020

dennybritz / rnn-tutorial-rnnlm

Recurrent Neural Network Tutorial, Part 2 - Implementing a RNN in Python and Theano

Jupyter Notebook 900 465 Updated Aug 14, 2023

FlashLabs-AI-Corp / FlashLabs-Chroma

World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.

Jupyter Notebook 521 50 Updated Jan 28, 2026

zcaceres / spec_augment

🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Jupyter Notebook 499 61 Updated Jun 11, 2021

lianfei

Stars

CompVis / stable-diffusion

google-research / google-research

datawhalechina / llm-cookbook

bloc97 / Anime4K

tloen / alpaca-lora

Lordog / dive-into-llms

instillai / TensorFlow-Course

CompVis / latent-diffusion

chenyuntc / pytorch-book

artidoro / qlora

Baiyuetribe / paper2gui

TencentARC / PhotoMaker

pyannote / pyannote-audio

jasonppy / VoiceCraft

geekyutao / Inpaint-Anything

cocodataset / cocoapi

NVIDIA / tacotron2

QwenLM / Qwen2.5-Omni