A fully local, offline-capable voice assistant that lets you speak a question and get a spoken response — powered by Whisper, LLaMA2 via Ollama, and pyttsx3. Built with a modern and professional Gradio interface.
- 🎤 Voice Input (Record your question via mic)
- 🧠 Whisper (STT) + LLaMA2 (via Ollama) for question answering
- 🔊 Text-to-Speech using pyttsx3
- 💬 Beautiful chat-style UI with transcript and assistant response
- 🖥️ Runs 100% locally – No API keys needed
| Layer | Technology |
|---|---|
| Speech-to-Text | faster-whisper |
| LLM Engine | llama2 via Ollama |
| Text-to-Speech | pyttsx3 |
| GUI Frontend | Gradio (Blocks) |
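These layers chain together in a single pass: audio is transcribed, the text is sent to the local model, and the reply is spoken aloud. The sketch below shows one way to wire that pipeline; the function name, model size, and HTTP call are illustrative assumptions, not the repo's actual modules.

```python
# Illustrative end-to-end pipeline: mic audio -> Whisper -> LLaMA2 -> speech.
# The function name and model choices here are assumptions, not the repo's code.
import pyttsx3
import requests
from faster_whisper import WhisperModel

def answer(audio_path: str) -> str:
    # 1. Speech-to-text with faster-whisper (int8 on CPU keeps it lightweight).
    stt = WhisperModel("base", device="cpu", compute_type="int8")
    segments, _info = stt.transcribe(audio_path)
    question = " ".join(seg.text for seg in segments).strip()

    # 2. Query the local llama2 model through Ollama's HTTP API.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2", "prompt": question, "stream": False},
        timeout=120,
    )
    reply = resp.json()["response"]

    # 3. Speak the reply aloud with pyttsx3 (fully offline).
    engine = pyttsx3.init()
    engine.say(reply)
    engine.runAndWait()
    return reply
```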
- Python 3.8+
- Windows/macOS/Linux
- 8GB+ RAM (recommended)
- For GPU support (optional): NVIDIA GPU with CUDA
```bash
git clone https://github.com/your-username/ai-voice-assistant.git
cd ai-voice-assistant
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
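If you are assembling the environment by hand instead, the stack above implies at least the following packages. This is a hypothetical sketch of the requirements, not the repo's actual pinned file:

```text
# Hypothetical minimal requirements for the stack above (no pins shown).
faster-whisper
pyttsx3
gradio
requests
```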
💡 If you get CUDA or cuDNN errors, set `compute_type="int8_float32"` and `device="cpu"` in `whisper_module.py`.
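For reference, those two settings map onto faster-whisper's `WhisperModel` constructor; a minimal sketch (the `"base"` model size is an assumption):

```python
from faster_whisper import WhisperModel

# GPU path (requires a CUDA-capable NVIDIA GPU plus cuDNN):
# model = WhisperModel("base", device="cuda", compute_type="float16")

# CPU fallback if you hit CUDA/cuDNN errors:
model = WhisperModel("base", device="cpu", compute_type="int8_float32")
```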
- Download and install Ollama: 👉 https://ollama.com/download
- Pull a local LLaMA2 model:

  ```bash
  ollama pull llama2
  ```

- Keep Ollama running in the background:

  ```bash
  ollama run llama2
  ```
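To confirm the server is reachable before launching the app, you can send a one-off prompt to Ollama's default local endpoint (port 11434). This check is a suggestion, not part of the repo:

```python
# One-off sanity check against Ollama's default local endpoint.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Reply with one word: ready", "stream": False},
    timeout=60,
)
print(resp.json()["response"])  # should print a short reply like "Ready"
```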
```bash
python app.py
```
- Click "🎧 Start Talking"
- Speak your question (e.g., "What is the capital of Japan?")
- Watch the transcript + assistant response appear
- Listen to the assistant speak back
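Under the hood, a Gradio Blocks layout along these lines can drive that record-and-reply loop. This is a minimal sketch assuming the Gradio 4.x API; `answer` is a stub standing in for the app's actual pipeline:

```python
import gradio as gr

def answer(audio_path):
    # Stub for the STT -> LLM -> TTS pipeline; the real app's logic goes here.
    return "What is the capital of Japan?", "The capital of Japan is Tokyo."

def on_recording(audio_path, history):
    history = history or []
    question, reply = answer(audio_path)
    history.append((question, reply))  # transcript + assistant response
    return history

with gr.Blocks(title="AI Voice Assistant") as demo:
    chatbot = gr.Chatbot(label="Assistant")
    mic = gr.Audio(sources=["microphone"], type="filepath", label="🎧 Start Talking")
    # When the user stops recording, run the pipeline and update the chat.
    mic.stop_recording(on_recording, inputs=[mic, chatbot], outputs=chatbot)

demo.launch()
```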
Contributions are welcome. If you find a bug, feel free to open an issue.
MIT License. Use freely for personal, academic, or demo purposes.
Deshan Senanayake