Stars
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
A multi-voice TTS system trained with an emphasis on quality
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, and of generating speech in real time.
serp-ai / bark-with-voice-clone
Forked from suno-ai/bark
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
[ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
A collection of guides and examples for the Gemma open models from Google.
Cell2Sentence: Teaching Large Language Models the Language of Biology
Fast parallel LLM inference for MLX
RO is an ontology of relations for use with biological ontologies