- Ho Chi Minh City, Vietnam
NeMo Public
Forked from NVIDIA-NeMo/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Python · Apache License 2.0 · Updated Sep 15, 2025
parlant Public
Forked from emcie-co/parlant
LLM agents built for control. Designed for real-world use. Deployed in minutes.
Python · Apache License 2.0 · Updated Sep 14, 2025
coze-studio Public
Forked from coze-dev/coze-studio
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
TypeScript · Apache License 2.0 · Updated Sep 10, 2025
FunASR Public
Forked from modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Python · MIT License · Updated Sep 9, 2025
dots.ocr Public
Forked from rednote-hilab/dots.ocr
Multilingual Document Layout Parsing in a Single Vision-Language Model
Python · MIT License · Updated Sep 8, 2025
LLaMA-Factory Public
Forked from hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Python · Apache License 2.0 · Updated Sep 3, 2025
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python · MIT License · Updated Sep 3, 2025
tiny-transducer Public
Forked from yinruiqing/tiny-transducer
Tiny Transducer: A Highly-Efficient Speech Recognition Model on Edge Devices
Python · Updated Sep 1, 2025
viF5TTS Public
Forked from EraX-AI/viF5TTS
EraX Text to Speech based on F5-TTS Base V1
Python · MIT License · Updated Sep 1, 2025
MiniCPM-V-CookBook Public
Forked from OpenSQZ/MiniCPM-V-CookBook
Cook up amazing multimodal AI applications effortlessly with MiniCPM-o
Python · Apache License 2.0 · Updated Aug 31, 2025
LEANN Public
Forked from yichuan-w/LEANN
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
Python · MIT License · Updated Aug 30, 2025
espnet Public
Forked from espnet/espnet
End-to-End Speech Processing Toolkit
Python · Apache License 2.0 · Updated Aug 29, 2025
sim Public
Forked from simstudioai/sim
Sim is an open-source AI agent workflow builder. Sim's interface is a lightweight, intuitive way to rapidly build and deploy LLMs that connect with your favorite tools.
TypeScript · Apache License 2.0 · Updated Aug 29, 2025
sherpa-onnx Public
Forked from k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…
C++ · Apache License 2.0 · Updated Aug 27, 2025
VibeVoice Public
Forked from microsoft/VibeVoice
Frontier Open-Source Text-to-Speech
Python · MIT License · Updated Aug 27, 2025
system_prompts_leaks Public
Forked from asgeirtj/system_prompts_leaks
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
JavaScript · Updated Aug 26, 2025
Wan2.2 Public
Forked from Wan-Video/Wan2.2
Wan: Open and Advanced Large-Scale Video Generative Models
Python · Apache License 2.0 · Updated Aug 26, 2025
espeak-ng Public
Forked from espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
C · GNU General Public License v3.0 · Updated Aug 25, 2025
ultralytics Public
Forked from ultralytics/ultralytics
Ultralytics YOLO 🚀
Python · GNU Affero General Public License v3.0 · Updated Aug 24, 2025
lhotse Public
Forked from lhotse-speech/lhotse
Tools for handling multimodal data in machine learning projects.
Python · Apache License 2.0 · Updated Aug 22, 2025
nanoVLM Public
Forked from huggingface/nanoVLM
The simplest, fastest repository for training/finetuning small-sized VLMs.
Python · Apache License 2.0 · Updated Aug 20, 2025