A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

19,149 1,998 Updated Dec 12, 2025

TEN-framework / ten-vad

Voice Activity Detector (VAD): low-latency, high-performance, and lightweight

C 1,828 143 Updated Dec 23, 2025

TEN-framework / ten-framework

Open-source framework for conversational voice AI agents

Python 9,414 1,102 Updated Dec 25, 2025

Tongyi-MAI / Z-Image

Python 7,882 465 Updated Dec 25, 2025

MuyangDu / T5Voice

T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech synthesis with zero-shot capabilities.

Python 28 5 Updated Nov 7, 2025

ZHZisZZ / dllm

dLLM: Simple Diffusion Language Modeling

Python 1,511 155 Updated Dec 25, 2025

facebookresearch / omnilingual-asr

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,512 214 Updated Dec 16, 2025

Alibaba-NLP / OmniSearch

Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

Python 402 29 Updated Apr 22, 2025

scambier / obsidian-omnisearch

A search engine that "just works" for Obsidian. Supports OCR and PDF indexing.

TypeScript 1,736 81 Updated Nov 18, 2025

lattifai / lattifai-python

Precision Alignment, Infinite Possibilities

Python 100 7 Updated Dec 22, 2025

stepfun-ai / Step-Audio-EditX

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech

Python 791 53 Updated Dec 22, 2025