Misaki is a lightweight offline desktop voice assistant built in Python that runs smoothly even on low-end laptops.
It listens for a wake word ("Hey Misaki"), understands speech, thinks using a local LLM, and replies with a natural voice, all without internet.
It also includes a cute animated robot face GUI that changes expressions while:
- Idle 😴
- Listening 🎧
- Thinking 🤔
- Speaking 🗣️

Features:
- ✅ Wake word detection
- ✅ Real-time mic streaming
- ✅ Silence detection (auto stop recording)
- ✅ Offline Speech-to-Text (Whisper)
- ✅ Local LLM brain (Ollama → phi3 mini)
- ✅ Offline Text-to-Speech (Piper TTS)
- ✅ Animated GUI with GIF expressions
- ✅ Works on low-end CPU laptops
- ✅ No cloud / no API cost
- ✅ Takes photos and screenshots (say "take picture" or "screenshot")
- ✅ Can access the internet whenever she wants
- ✅ Misaki also has memory to keep information
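
The silence detection feature listed above can be illustrated with a small, dependency-free sketch. It is not the code in `ears/mic.py`; the function names, the RMS threshold of `0.02`, and the three-chunk limit are assumptions chosen for illustration, and the chunks are plain lists of float samples rather than a live `sounddevice` stream.

```python
import math
from typing import Iterable, List, Sequence


def rms(chunk: Sequence[float]) -> float:
    """Root-mean-square loudness of one chunk of samples in [-1.0, 1.0]."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def record_until_silence(
    chunks: Iterable[Sequence[float]],
    threshold: float = 0.02,      # assumed loudness cutoff, tune per microphone
    max_silent_chunks: int = 3,   # assumed number of quiet chunks that ends a recording
) -> List[Sequence[float]]:
    """Collect chunks until speech has been heard and then goes quiet."""
    recorded: List[Sequence[float]] = []
    silent_run = 0
    heard_speech = False
    for chunk in chunks:
        recorded.append(chunk)
        if rms(chunk) >= threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Only count silence after speech, so a quiet start does not stop early.
            silent_run += 1
            if silent_run >= max_silent_chunks:
                break
    return recorded
```

In a real recorder the `chunks` iterable would be fed from the microphone callback; the stopping rule stays the same.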
| Part | Library |
|---|---|
| GUI | Tkinter |
| STT | OpenAI Whisper |
| TTS | Piper (fast + offline) |
| Brain | Ollama (phi3:mini) |
| Audio | sounddevice |
| Camera | OpenCV (cv2) |
| Screenshot | pyautogui |
| GIF | Pillow |
| Language | Python 3.10+ |
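
As a rough illustration of the "Brain" row in the table above, the sketch below sends a prompt to a locally running Ollama server over its HTTP API using only the standard library. It is not the code in `brain/cerebrum.py`; the function names and the 60-second timeout are assumptions, and it presumes Ollama is listening on its default address with the `phi3:mini` model already pulled.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "phi3:mini") -> dict:
    """Request body for a non-streaming generation call."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt: str, model: str = "phi3:mini", timeout: float = 60.0) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the full reply arrives in the "response" field.
    return body["response"]
```

Calling `ask_ollama("What time is it?")` would return the model's reply as a string, provided the Ollama server is running.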
MyFriend/
│
├── main.py                      # CLI assistant
├── gui_main.py                  # GUI version
│
├── 📁brain/
│   ├── cerebrum.py              # LLM logic (Ollama)
│   ├── 📁stt/
│   │   └── auditoryCortex.py    # Speech → text (Whisper)
│   └── 📁tts/
│       └── motorCortex.py       # Text → speech (Piper)
├── 📁face/                      # GIF animations
│   ├── Idle.gif
│   ├── listening.gif
│   ├── thinking.gif
│   └── Talking1.gif
├── 📁ears/
│   ├── mic.py                   # record until silence
│   └── mic_small.py             # wake word chunk recorder
├── 📁ears/
│   ├── gif.py                   # GifPlayer
│   └── get_path.py              # resource path helper
├── voice/                       # Piper voices
│
└── README.md
Processor (CPU): 2 GHz or faster.
Memory (RAM): 8 GB or higher.
Storage: 3 GB.
Install the dependencies:
pip install -r requirements.txt
Download Ollama: 👉 https://ollama.com
Then:
ollama pull phi3:mini
Download voice files if you want a different voice:
👉 https://Piper.com
Example:
en_US-amy-medium.onnx
en_US-amy-medium.onnx.json
Put inside:
voice/
Run the GUI version:
python gui_main.py
Or run the CLI version:
python main.py
Say:
Hey Misaki
Then speak your command.
Example:
Hey Misaki
What time is it?
Misaki will:
- Wake
- Listen
- Think
- Speak reply
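
The wake step above can be sketched as a text check on whatever the speech-to-text stage returns. This is one possible approach, not necessarily how the project detects the wake word; `extract_command` and `WAKE_PHRASE` are hypothetical names, and the sketch assumes the short wake-word recording has already been transcribed to a string.

```python
import re
from typing import Optional

WAKE_PHRASE = "hey misaki"  # matched case-insensitively, ignoring punctuation


def extract_command(transcript: str, wake_phrase: str = WAKE_PHRASE) -> Optional[str]:
    """Return the text spoken after the wake phrase, or None if it was not said."""
    # Lowercase, turn punctuation into spaces, and collapse repeated whitespace.
    normalized = " ".join(re.sub(r"[^a-z0-9]+", " ", transcript.lower()).split())
    # Pad with spaces so the phrase only matches on whole-word boundaries.
    padded = f" {normalized} "
    marker = f" {wake_phrase} "
    index = padded.find(marker)
    if index == -1:
        return None
    return padded[index + len(marker):].strip()
```

An empty string means the wake phrase was heard with no command yet, so the assistant would start listening for one; `None` means it should stay idle.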
If animations lag, resize the GIFs to:
200x200 px
Flow:
Mic → Whisper (STT)
→ Ollama (Brain)
→ Piper (TTS)
→ Speaker
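
The flow above can be sketched as three pluggable stages chained together. The `VoicePipeline` class and its field names are hypothetical, not taken from the project; in practice the callables would wrap Whisper, Ollama, and Piper, while here they are left abstract so the chaining itself is easy to see and test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains speech-to-text, the LLM, and text-to-speech for one request."""

    transcribe: Callable[[bytes], str]  # audio -> text (STT stage)
    think: Callable[[str], str]         # text -> reply (LLM stage)
    speak: Callable[[str], None]        # reply -> audio output (TTS stage)

    def handle(self, audio: bytes) -> str:
        text = self.transcribe(audio)
        reply = self.think(text)
        self.speak(reply)
        return reply
```

Passing the stages in as callables keeps each one replaceable, for example swapping in stub functions while testing without a microphone or a model.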
GUI states:
Idle → Listening → Thinking → Talking → Idle
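
The state cycle above maps naturally onto a small enum with a transition table. The names `FaceState`, `NEXT_STATE`, `GIF_FILES`, and `advance` are assumptions for illustration; the GIF file names are the ones listed in the `face/` folder.

```python
from enum import Enum


class FaceState(Enum):
    IDLE = "Idle"
    LISTENING = "Listening"
    THINKING = "Thinking"
    TALKING = "Talking"


# Each state has exactly one successor, and Talking loops back to Idle.
NEXT_STATE = {
    FaceState.IDLE: FaceState.LISTENING,
    FaceState.LISTENING: FaceState.THINKING,
    FaceState.THINKING: FaceState.TALKING,
    FaceState.TALKING: FaceState.IDLE,
}

# Animation shown for each state, using the file names from the face/ folder.
GIF_FILES = {
    FaceState.IDLE: "Idle.gif",
    FaceState.LISTENING: "listening.gif",
    FaceState.THINKING: "thinking.gif",
    FaceState.TALKING: "Talking1.gif",
}


def advance(state: FaceState) -> FaceState:
    """Return the state that follows the given one in the cycle."""
    return NEXT_STATE[state]
```

A GUI loop could call `advance` when each stage finishes and load `GIF_FILES[state]` to switch the expression.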
You can share Misaki with friends.
Build a standalone executable:
pip install pyinstaller
pyinstaller --onefile --noconsole gui_main.py
EXE will be in:
dist/gui_main.exe
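
A one-file PyInstaller build unpacks itself into a temporary folder at startup and exposes that folder as `sys._MEIPASS`, so paths to the GIFs and voice files need to be resolved against it. The sketch below shows the common pattern for such a resource path helper; it is an assumption about what a helper like `get_path.py` might do, not a copy of it, and `resource_path` is a hypothetical name.

```python
import os
import sys


def resource_path(relative_path: str) -> str:
    """Resolve a bundled file both in development and inside a PyInstaller EXE."""
    # Frozen one-file builds set sys._MEIPASS; otherwise fall back to the
    # current working directory (assumes the app is started from the project root).
    base_dir = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_dir, relative_path)
```

For example, `resource_path(os.path.join("face", "Idle.gif"))` would point at the idle animation in either mode. The data files themselves would still have to be included in the build, for instance with PyInstaller's `--add-data` option.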
Misaki runs:
- locally
- no internet
- no data sent anywhere
Safe for personal use.
Aryan Gawade
- LinkedIn
- GitHub
- Portfolio