Showing 87 open source projects for "ai voice"

  • 1
    XiaoZhi AI Chatbot

    Build your own AI friend

    xiaozhi-esp32 is an open-source project that guides users in building their own AI-powered conversational companion using the ESP32 microcontroller. The project provides detailed instructions on assembling the hardware, setting up the software, and integrating AI models to enable natural language interactions. This DIY approach offers an accessible entry point into AI and hardware development.
    Downloads: 290 This Week
  • 2
    Alan AI

    In-app assistant SDK to build a multimodal conversational UX for websites

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions. A powerful web-based IDE where you can write, test and debug dialog scenarios for your voice assistant or chatbot. Alan's AI...
    Downloads: 8 This Week
  • 3
    OpenVINO AI Plugins for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC; no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system, such as CPU, GPU, and NPU.
    Downloads: 77 This Week
  • 4
    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility...
    Downloads: 0 This Week
  • 5
    Alan AI for iOS

    In-App assistant SDK to build a multimodal conversational UX for iOS

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions. A powerful web-based IDE where you can write, test and debug dialog scenarios for your voice assistant or chatbot. Alan's AI...
    Downloads: 0 This Week
  • 6
    Alan AI for Android

    Assistant SDK to build a multimodal conversational UX for Android

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions. A powerful web-based IDE where you can write, test and debug dialog scenarios for your voice assistant or chatbot. Alan's AI...
    Downloads: 0 This Week
  • 7
    Piper TTS

    A fast, local neural text to speech system

    Piper is a fast, local neural text-to-speech (TTS) system developed by the Rhasspy team. Optimized for devices like the Raspberry Pi 4, Piper enables high-quality speech synthesis without relying on cloud services, making it ideal for privacy-conscious applications. It utilizes ONNX models trained with VITS to deliver natural-sounding voices across various languages and accents. Piper is particularly suited for offline voice assistants and embedded systems.
    Downloads: 114 This Week
  • 8
    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...
    Downloads: 22 This Week
  • 9
    Parlant

    The behavior guidance framework for customer-facing LLM agents

    Parlant is a framework for guiding and controlling the behavior of customer-facing LLM agents.
    Downloads: 4 This Week
  • 10
    Voice Accounting For Blind & Mute People

    Free & Easy AI Voice Accounting Software For Blind & Speechless People

    Download the zip file above, extract it, and open the index.html file in a web browser such as Firefox (preferred) or Google Chrome. You can also view and download my full collection of software for people with disabilities here: https://sourceforge.net/projects/softwares-for-disabled-people/ The full collection includes the Voice Accounting Software as well.
    Downloads: 0 This Week
  • 11
    TEN Framework

    TEN, a voice agent framework to create conversational AI.

    TEN (Transformative Extensions Network) is a voice agent framework for creating conversational AI applications, focusing on high performance and modularity.
    Downloads: 3 This Week
  • 12
    Bolna

    Conversational voice AI agents

    Bolna is an end-to-end open-source platform for building conversational voice AI agents, enabling developers to create voice-first conversational assistants efficiently.
    Downloads: 1 This Week
  • 13
    Eliza

    Autonomous agents for everyone

    Build and deploy autonomous AI agents with consistent personalities across Discord, Twitter, and Telegram. Full support for voice, text, and media interactions. Built-in RAG memory system, document processing, media analysis, and autonomous trading capabilities. Supports multiple AI models including Llama, GPT-4, and Claude. Create custom actions, add new platform integrations, and extend functionality through a modular plugin system. Full TypeScript support.
    Downloads: 4 This Week
  • 14
    PyGPT

    Open source personal AI Assistant for Linux, Windows and Mac

    PyGPT is a desktop application that lets you talk to OpenAI's LLMs, such as GPT-4 and GPT-3, from your own computer using the OpenAI API. It supports both chat mode and completion mode, and can generate images using DALL-E 2. PyGPT also gives GPT access to the Internet via the Google Custom Search API and the Wikipedia API, and includes voice synthesis using the Microsoft Azure Text-to-Speech API. Moreover, the application implements context memory support, context storage, history...
    Downloads: 11 This Week
  • 15
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 8 This Week
  • 16
    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight: model size under 25 MB. CPU-optimized: runs without a GPU on any device. High-quality voices: several premium voice options available. Fast inference: optimized for real-time speech synthesis.
    Downloads: 7 This Week
  • 17
    Amical

    Open Source AI Dictation App

    Amical is an open source, AI-powered desktop dictation and note-taking application that enables users to dictate hands-free, transcribe meetings, and capture notes effortlessly with unmatched speed, accuracy, and privacy. It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate...
    Downloads: 5 This Week
  • 18
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech and other sounds) and to perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive, voice-only input) and Audio Analysis (audio plus text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 4 This Week
  • 19
    VoltAgent

    Open Source TypeScript AI Agent Framework

    An AI Agent Framework provides the foundational structure and tools needed to build applications powered by autonomous agents. These agents, often driven by Large Language Models (LLMs), can perceive their environment, make decisions, and take actions to achieve specific goals. Building such agents from scratch involves managing complex interactions with LLMs, handling state, connecting to external tools and data, and orchestrating workflows. VoltAgent is an open source TypeScript framework...
    Downloads: 3 This Week
  • 20
    elevenlabs-api

    elevenlabs-api is an open source Java wrapper around the ElevenLabs

    ... source code. The most realistic and versatile AI speech software, ever. Eleven brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling. Generate top-quality spoken audio in any voice and style with the most advanced and multipurpose AI speech tool out there. Our deep learning model renders human intonation and inflections with unprecedented fidelity and adjusts delivery based on context.
    Downloads: 3 This Week
  • 21
    Better Chatbot

    Just a Better Chatbot. Powered by MCP Client & Workflows

    Better-chatbot is an AI chatbot framework powered by MCP and workflows, allowing developers to deploy and integrate AI-powered chat systems with ease. Integrates all major LLMs: OpenAI, Anthropic, Google, xAI, Ollama, and more. MCP protocol, web search, JS/Python code execution, data visualization. Custom agents, visual workflows, artifact generation. Realtime voice chat with full MCP tool integration.
    Downloads: 3 This Week
  • 22
    Rhino

    On-device Speech-to-Intent engine powered by deep learning

    Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real-time. The end-to-end platform for embedding private voice AI into any software in a few lines of code. Design with no limits on top of a modular platform. Create use-case-specific voice AI models in seconds. Develop voice features with a few lines of code using intuitive and cross-platform SDKs. Deliver voice AI everywhere: on-device, mobile, web browsers...
    Downloads: 1 This Week
  • 23
    TEN

    Open-source framework for conversational voice AI agents

    TEN (Transformative Extensions Network) is an open source framework designed to empower developers to build real-time multimodal AI agents capable of voice, video, text, image, and data-stream interaction with ultra-low latency. It includes a full ecosystem, TEN Turn Detection, TEN Agent, and TMAN Designer, allowing developers to rapidly assemble human-like, responsive agents that can see, speak, hear, and interact. With support for languages like Python, C++, and Go, it offers flexible...
    Downloads: 1 This Week
  • 24
    Pedalboard

    A Python library for audio

    ... is used for data augmentation to improve machine learning models and to help power features like Spotify’s AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.
    Downloads: 5 This Week
  • 25
    Feishu ChatGPT

    Voice dialogue, role-playing, multi-topic discussion, picture creation

    Feishu × (GPT-3.5 + DALL·E + Whisper) = a work experience that feels like flying. Voice dialogue, role-playing, multi-topic discussion, picture creation, table analysis, document export. Golang goes without saying! Proficient with the gin framework, so backend development comes as naturally as breathing! Familiar with the SDKs of DingTalk, Feishu, Qiwei, and other platforms, and able to develop and integrate a range of impressive features! Proficient in platform-based detail thinking, let...
    Downloads: 3 This Week