A fork of @brenpoly's be-more-agent project, built to run fully on-device on a Raspberry Pi 5 with the Raspberry Pi AI HAT 2+ (Hailo-10H). BMO listens for its wake word, understands what you say, thinks about it locally, and talks back — no cloud, no subscriptions, no data leaving your house.
This fork adds a browser-based web interface, a shared core/ module layer used by both interfaces, and updated support for the Hailo NPU hardware.
| Component | Where it runs | Notes |
|---|---|---|
| LLM (qwen2.5-instruct:1.5b) | Hailo-10H NPU | via hailo-ollama |
| Vision (Qwen2-VL-2B-Instruct) | Hailo-10H NPU | via HailoRT Python API; optional, requires camera |
| STT (Whisper-Small) | Hailo-10H NPU | via hailo_platform.genai.Speech2Text; falls back to whisper.cpp on CPU if HEF absent or NPU fails |
| TTS (Piper) | CPU | streams sentence-by-sentence while LLM generates |
| Wake word (openWakeWord) | CPU | "Hey BMO" custom model |
STT uses the native Speech2Text API introduced in HailoRT 5.x, which avoids the PCIe timeout issues seen with earlier direct-buffer approaches. whisper.cpp is kept as a CPU fallback — it activates automatically if the HEF is missing or NPU inference throws.
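The fallback order described above can be sketched in a few lines. This is a minimal illustration, not the code in core/stt.py: `transcribe_with_fallback` and the two backend callables are hypothetical stand-ins for the NPU (Speech2Text) path and the whisper.cpp path.

```python
import os
from typing import Callable, Tuple


def transcribe_with_fallback(
    wav_path: str,
    hef_path: str,
    npu_transcribe: Callable[[str], str],  # hypothetical wrapper around the NPU path
    cpu_transcribe: Callable[[str], str],  # hypothetical wrapper around whisper.cpp
) -> Tuple[str, str]:
    """Return (text, backend), preferring the NPU and falling back to the CPU."""
    if os.path.exists(hef_path):
        try:
            return npu_transcribe(wav_path), "npu"
        except Exception as exc:  # any NPU inference error triggers the fallback
            print(f"NPU STT failed ({exc}); falling back to CPU")
    # Reached when the HEF is missing or the NPU call raised.
    return cpu_transcribe(wav_path), "cpu"
```

Passing the backends in as callables keeps the decision logic testable without the Hailo hardware attached.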
BMO in its natural habitat. Plug in a screen, a USB mic, and a USB speaker and you get the full experience: animated faces, wake word detection, and the whole listen → think → speak loop running locally. After a response, tap the screen (or the tap button) to speak again without repeating the wake word — the screen shows "Tap to speak" when BMO is ready.
A FastAPI server with a browser-based UI — useful if you want to talk to BMO from another room, or you'd rather not have a screen hanging off your Pi. Hold a button to record, and BMO responds with audio in your browser.
The web interface includes:
- Debug panel — conversation history and live server logs
- Pronunciation override — corrects how Piper pronounces specific words
- LLM status indicator — shows whether the NPU model is ready
- Hands-free mode — enables wake word detection so you don't need to hold the button
- Pi Audio toggle — routes audio to the Pi's physical speaker instead of browser playback
BMO includes several dynamic, interactive capabilities beyond basic conversation:
- Timers & Alarms: Ask BMO to "Set a timer for 10 minutes" or "Remind me to check the oven". BMO will happily interrupt you later when the time is up!
- Minigames: BMO is a living game console. Say "Let's play Trivia" or "Let's play a guessing game" — BMO will act as the host, wait for your answers, and keep score.
- Vision Analysis: Hold an object up to the camera and say "What am I holding?" or "Does this look good?". BMO will snap a photo, analyze it using the local VLM, and give you its opinion.
- Musical Talent: Ask BMO to "Play some music" or "Sing a song", and BMO will cycle into a dancing Jamming face while playing chiptunes (add your own .wav files to sounds/music/).
- Idle Pet Animations: When left alone in Screensaver mode, BMO will periodically (and silently) show affection by flashing pixelated hearts, getting dizzy, or falling asleep to keep your desk feeling alive.
Modern browsers require HTTPS for microphone access, which makes things awkward when your Pi is just sitting on your local network. Tailscale solves this elegantly — install it on your Pi and your other devices, enable HTTPS certificates, and you get a proper *.ts.net address with a real cert, reachable from anywhere on your Tailnet. No port forwarding, no dynamic DNS nonsense.
Disclosure: I work at Tailscale. That said, I genuinely use it for this project and it's the best solution I've found for exactly this problem.
- Install Tailscale on the Pi and your client device
- Enable HTTPS certificates in the Tailscale admin console
- On the Pi, run:
tailscale serve --bg --https=443 localhost:8080
- Access the web UI at
https://<your-pi-hostname>.ts.net
Your BMO is then reachable from your phone, laptop, or any device on your Tailnet — mic access works, and it's not exposed to the open internet.
- Raspberry Pi 5 (4GB or 8GB recommended)
- Raspberry Pi AI HAT 2+ (Hailo-10H, required for NPU features)
- USB microphone and speaker (for on-device mode)
- HDMI or DSI display (for on-device GUI)
- Raspberry Pi Camera Module (optional, for vision/photo features)
- Original Project: This is a fork of @brenpoly's be-more-agent.
- Custom BMO Voice: Huge thanks to Brenpoly for his work fine-tuning the custom BMO neural voice model (v1.0-voice). This model provides the more accurate, charming BMO voice you hear today!
- Face Artwork: BMO's face animations are rendered from SVG artwork by Cherry Honey, published as a free community resource on Figma. Thank you for the pixel-perfect expressions that bring BMO to life! (Cherry Honey BMO Faces on Figma Community)
- Lip-Sync Visemes: BMO's 6 talking mouth shapes were hand-animated by moorew using Rhubarb Lip Sync and After Effects — properly articulated visemes trained on real speech, replacing the original procedurally-generated shapes.
- Community Features: This fork imports several interactivity and utility features from the upstream be-more-agent project, including DuckDuckGo News search, fast nearest-neighbor audio resampling, and robust silence detection (VAD).
- Hardware Support: Built for the Raspberry Pi 5 + Raspberry Pi AI HAT 2+ (Hailo-10H).
- Gapless TTS: Piper is held open for the entire speaking turn — sentences stream out one after another with no startup gap between them, so long answers sound natural rather than staccato.
- Articulate Lip-Sync: Talking drives a 6-shape viseme palette (closed → tiny lip-crack → small open → round /o/ → wide /a/ → full open) derived from Rhubarb-animated, artist-drawn frames. The OH/WIDE/AH "open vowel" shapes rotate every 120-220 ms during sustained vowels, with an asymmetric attack/release envelope (snappy onsets, slow relax) and a coarticulation gate so the mouth visibly steps through intermediate shapes when closing.
- Touch-Friendly Volume Slider: Tap the top-centre of BMO's face to bring up a chunky, BMO-styled volume slider (60 px knob, 538 px track, big monospace readout, BMO's mouth/tongue palette). Designed for finger taps on the 800×480 panel: tap anywhere on the track to jump, drag the knob to fine-tune. Auto-hides 6 s after the last interaction; settings persist to settings.json 400 ms after release.
- Tap to Speak: After BMO answers, tap the screen to speak again immediately without re-saying the wake word. BMO shows "Tap to speak" when ready.
- Persistent Chat History: Conversations are saved to memory.json and reloaded on restart so BMO remembers previous exchanges.
- Web UI Refactor: Fully responsive, mobile-friendly interface for interacting with BMO from any device.
- Improved Aliveness: Interactive "Pondering" mode — BMO will periodically share fun facts, news, and quirky thoughts when idle.
- Enhanced Search: BMO can now search for current news and regional information (Canada/Ontario prioritized).
- Audio Stability: Fast nearest-neighbor resampling and improved ALSA contention handling for more reliable wake-word detection and voice recording.
- Desktop Ready: Includes a .desktop launcher (install.sh creates it automatically).
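The Gapless TTS item above depends on cutting the LLM's streamed output into sentences as soon as each one completes. A minimal sketch of that idea follows; `sentences_from_stream` is a hypothetical helper, and the sentence-boundary rule (., ! or ? followed by whitespace) is a simplifying assumption rather than the project's actual splitter.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: sentence-ending punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()  # flush whatever is left when the stream ends
```

Each yielded sentence can be handed to the already-running TTS process while the LLM keeps generating the next one.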
be-more-agent/
├── agent_hailo.py # On-device GUI application
├── web_app.py # FastAPI web server
├── core/
│ ├── config.py # All configuration (models, devices, paths, system prompt)
│ ├── llm.py # LLM inference, web search, conversation history
│ ├── tts.py # Text-to-speech via Piper
│ └── stt.py # Speech-to-text via whisper.cpp
├── templates/ # Jinja2 HTML templates for the web UI
├── static/ # CSS, JS, favicon
├── install.sh # Automated installation script
├── upgrade_hailo53.sh # Upgrades HailoRT 5.2 → 5.3 and pulls Qwen3 models
├── setup_services.sh # Installs systemd background services
├── start_web.sh # Starts the web server
├── start_agent.sh # Starts the on-device GUI
├── requirements.txt # Python dependencies
├── wakeword.onnx # OpenWakeWord model
├── piper/ # Piper TTS engine and voice model
├── models/ # Whisper model weights + VLM HEF (auto-downloaded)
├── whisper.cpp/ # Compiled whisper.cpp STT binary
├── generate_faces.py # SVG-based face generator (2× supersampled, auto-normalised)
├── svg_faces/ # Source SVG artwork (33 hand-crafted expression assets)
├── faces/ # Generated face animations (27 expression states, 800×480 PNG)
│ ├── idle/ # Neutral smile with blink cycle
│ ├── speaking/ # Mouth open/close synced to audio volume
│ ├── listening/ # Attentive smile with slow blink
│ ├── thinking/ # Hmm expression with gentle bounce
│ ├── happy/ # Wide smile with bounce
│ ├── sad/ # Frown with slow sway
│ ├── angry/ # Mad expression with horizontal shake
│ ├── surprised/ # Wide eyes with bounce
│ ├── sleepy/ # Half-closed eyes cycling open/shut
│ ├── dizzy/ # Spiral eyes with side-sway
│ ├── cheeky/ # Cheeky grin with blink
│ ├── heart/ # Heart eyes with pulse zoom
│ ├── starry_eyed/ # Star eyes with bounce
│ ├── confused/ # Hmm expression variant
│ ├── shhh/ # Finger-to-lips shush face
│ ├── jamming/ # Happy face with energetic bounce
│ ├── football/ # Shouting face with bounce
│ ├── detective/ # Side-eye with slow blink
│ ├── sir_mano/ # Cheeky face with bounce
│ ├── low_battery/ # Barely-open tired eyes
│ ├── bee/ # Bee critter flying a figure-8 path
│ ├── daydream/ # Relaxed arc eyes with float
│ ├── bored/ # Side-eye with slow blink
│ ├── curious/ # Wide-eyed ooh face with bounce
│ ├── error/ # Exasperated face with shake
│ ├── capturing/ # Wide-eyed bounce (photo mode)
│ └── warmup/ # Eyes opening from closed (boot sequence)
├── sounds/ # GUI sound assets
└── templates/ static/ # Web UI assets
- Raspberry Pi OS (64-bit, current stable)
- hailo-h10-all installed. The setup script handles this, but if installing manually: sudo apt install hailo-h10-all
- hailo-ollama. The setup script builds this from source automatically. If installing manually, see hailo_model_zoo_genai
curl -sSL https://raw.githubusercontent.com/moorew/be-more-hailo/main/install.sh | bash
cd be-more-agent
The script handles everything:
- Installs system packages including libcamera-apps for camera support
- Fixes the Hailo driver conflict (blacklists the legacy hailo_pci module)
- Builds and installs hailo-ollama from source if not already present
- Downloads and extracts the Piper TTS engine
- Downloads the Whisper-Small.hef for NPU speech-to-text
- Clones and compiles whisper.cpp as a CPU fallback for STT
- Downloads the ggml-small.en Whisper model for CPU fallback
- Creates a Python virtual environment and installs dependencies
- Pulls qwen2.5-instruct:1.5b (LLM) via hailo-ollama
- Downloads the Qwen2-VL-2B-Instruct VLM HEF directly from Hailo's CDN (~2.2 GB)
- Enables system site-packages in the venv so Python can use hailo_platform
- Checks camera availability and lets you know if anything's missing
git clone --recurse-submodules https://github.com/moorew/be-more-hailo.git be-more-agent
cd be-more-agent
chmod +x *.sh
./install.sh
Already cloned without --recurse-submodules? Run git submodule update --init --recursive from inside the repo to pull whisper.cpp at the pinned upstream commit. (Or just re-run install.sh, which does this automatically.)
Web Interface (Kiosk Mode):
./setup_web.sh
This script installs all necessary Python and system audio dependencies, sets up the bmo-web.service to start on boot, and configures Chromium to automatically open in full-screen kiosk mode on desktop login.
To manually start/stop the web backend: sudo systemctl start|stop|restart bmo-web
To run manually without the service: . venv/bin/activate && ./start_web.sh
On-device GUI (Tkinter):
source venv/bin/activate
./start_agent.sh
Auto-start LLM & GUI Services:
./setup_services.sh
Then manage with sudo systemctl start|stop|restart bmo-ollama or bmo-gui.
All settings live in core/config.py. The most commonly changed values:
# LLM models (must be pulled via hailo-ollama)
LLM_MODEL = "qwen2.5-instruct:1.5b"
FAST_LLM_MODEL = "qwen2.5-instruct:1.5b"
# Vision model — runs directly via HailoRT Python API (not hailo-ollama)
VLM_HEF_PATH = "./models/Qwen2-VL-2B-Instruct.hef"
# Audio device for local hardware playback (run `aplay -l` to find yours)
# The USB speaker is typically on a different ALSA card from the mic — check both.
ALSA_DEVICE = "plughw:UACDemoV10,0"
# Microphone device index (run `python3 -c "import sounddevice as sd; print(sd.query_devices())"`)
MIC_DEVICE_INDEX = 1
MIC_SAMPLE_RATE = 48000
# STT: NPU path (Whisper-Small on Hailo-10H) and CPU fallback (whisper.cpp)
WHISPER_HEF_PATH = "./models/Whisper-Small.hef"
WHISPER_CMD = "./whisper.cpp/build/bin/whisper-cli"
WHISPER_MODEL = "./models/ggml-small.en.bin"
Environment variables override any of these at runtime:
export ALSA_DEVICE="plughw:2,0"
HailoRT 5.3 adds Qwen3-1.7B-Instruct (LLM) and Qwen3-VL-2B-Instruct (VLM). The Raspberry Pi apt repo lags behind upstream, so 5.3 isn't in apt yet, but the vendor packages are available directly from Hailo's CDN. A direct apt upgrade is blocked by package name conflicts (the Pi repo uses h10-hailort while upstream uses hailort), so the upgrade requires a purge-and-reinstall.
A script handles all of this automatically:
./upgrade_hailo53.sh
The script:
- Stops BMO services
- Downloads the three upstream 5.3 .deb files
- Purges the Pi-repo 5.2 packages
- Installs the 5.3 vendor packages (runtime + DKMS PCIe driver + model zoo)
- Reloads the kernel module
- Pulls qwen3-instruct:1.7b via hailo-ollama
- Downloads Qwen3-VL-2B-Instruct.hef
- Patches core/config.py with the new model names
The systemd service already passes OLLAMA_HOST as an environment variable, so the 5.3 config format change (JSON → env var) requires no changes to the service file.
Kernel module note: The PCIe driver ships as DKMS source and builds against whatever kernel is running. If /dev/hailo0 disappears after the upgrade, sudo reboot is all that's needed.
By default, all queries go to a single model (qwen2.5-instruct:1.5b). If you want to route longer or more complex queries to a larger model:
- Pull the larger model via hailo-ollama
- Set LLM_MODEL to the larger model name in core/config.py
- Keep FAST_LLM_MODEL pointing to qwen2.5-instruct:1.5b
Short, simple prompts (under 15 words, no complex keywords) stay on the fast model. Longer or more complex ones go to LLM_MODEL. Note that swapping models on the Hailo-10H takes a few seconds on the first query after a switch.
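A rough sketch of that routing rule, assuming a plain word count and a keyword set. `pick_model`, `COMPLEX_KEYWORDS`, and the placeholder model name are hypothetical; the real keyword list lives in the project code and may differ.

```python
FAST_LLM_MODEL = "qwen2.5-instruct:1.5b"
LLM_MODEL = "larger-model-name"  # placeholder: whichever larger model you pulled

# Hypothetical keyword list, for illustration only.
COMPLEX_KEYWORDS = {"explain", "compare", "why", "analyze", "summarize"}
MAX_FAST_WORDS = 15  # "under 15 words" stays on the fast model


def pick_model(prompt: str) -> str:
    """Route short, simple prompts to the fast model and everything else to LLM_MODEL."""
    words = prompt.lower().split()
    is_short = len(words) < MAX_FAST_WORDS
    has_complex = any(w.strip(".,!?") in COMPLEX_KEYWORDS for w in words)
    return FAST_LLM_MODEL if is_short and not has_complex else LLM_MODEL
```

Because a model swap on the NPU costs a few seconds, a conservative keyword list keeps most everyday queries on the fast model.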
If you have a Raspberry Pi Camera Module connected:
- Enable the camera interface in raspi-config
- Install camera tools if not already present:
sudo apt install -y libcamera-apps
- Say something like "Hey BMO, take a photo and tell me what you see". The agent captures a frame with rpicam-still and sends it to the vision model (Qwen2-VL-2B-Instruct, or Qwen3-VL-2B-Instruct after the HailoRT upgrade) running natively on the NPU via the HailoRT Python API
The VLM runs as a separate process from the LLM server. Hailo's VDevice sharing allows both to coexist on the same NPU without conflicts. If the VLM HEF file isn't installed, BMO will politely say so rather than crashing.
BMO is pretty easy to make your own:
Personality: Edit get_system_prompt() in core/config.py. This is where BMO's voice, tone, and quirks are defined.
Faces: BMO's faces are rendered from 33 hand-crafted SVGs in svg_faces/ by generate_faces.py. The generator normalises each face — auto-detecting the content bounding box, centring it in the output, and gently scaling down any oversized expressions — so all 27 states appear at a consistent size on screen. Animations (blink, bounce, shake, mouth cycle) are applied by modifying SVG viewBox coordinates and eye ellipse geometry before rendering via cairosvg at 2× resolution (2560×1440) then LANCZOS-downsampling to 800×480. To regenerate all frames: python generate_faces.py.
Expressions: The LLM can trigger any expression by outputting {"action": "set_expression", "value": "happy"}. Available emotions:
| Expression | Description |
|---|---|
| happy | Upturned arc eyes with a bouncing smile |
| sad | Downturned slash eyes with a frown that droops |
| angry | Crossed slash eyes with a flat trembling mouth |
| surprised | Big round eyes with a pulsing O-shaped mouth |
| sleepy | Closed eyes with floating Z letters |
| dizzy | X-shaped eyes with a wavy squiggle mouth |
| cheeky | One open eye, one winking, wagging tongue |
| heart | Beating heart-shaped eyes (scales up and down) |
| starry_eyed | Spinning 4-point sparkle stars for eyes |
| confused | One oversized eye, one flat line, wiggly mouth |
| daydream | Eyes drifted up with floating thought bubbles (screensaver) |
| bored | Eyes shifting left and right (screensaver) |
| jamming | Closed eyes, big smile, bouncing musical notes (screensaver) |
| curious | One eye pulsing larger than the other, tilted look (screensaver) |
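To show how an action like the one above could be picked out of a model reply, here is a small sketch. `extract_expression` is a hypothetical helper, not the project's parser; it assumes the action appears as a flat JSON object somewhere in the text and only accepts the expression names listed in the table.

```python
import json
import re
from typing import Optional

KNOWN_EXPRESSIONS = {
    "happy", "sad", "angry", "surprised", "sleepy", "dizzy", "cheeky",
    "heart", "starry_eyed", "confused", "daydream", "bored", "jamming", "curious",
}

# Matches flat {...} objects (no nesting), which is enough for this action format.
_OBJECT_RE = re.compile(r"\{[^{}]*\}")


def extract_expression(llm_output: str) -> Optional[str]:
    """Return the requested expression name, or None if no valid action is present."""
    for match in _OBJECT_RE.finditer(llm_output):
        try:
            obj = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue  # braces in ordinary prose are not actions
        value = obj.get("value")
        if (
            obj.get("action") == "set_expression"
            and isinstance(value, str)
            and value in KNOWN_EXPRESSIONS
        ):
            return value
    return None
```

Rejecting unknown names means a hallucinated expression leaves the current face unchanged instead of pointing at a folder that doesn't exist.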
Sounds: Put .wav files in sounds/<category>/. BMO picks one at random per event.
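The random pick per event can be pictured like this. `pick_sound` is a hypothetical helper written for illustration; only the sounds/<category>/ layout comes from the project.

```python
import random
from pathlib import Path
from typing import Optional


def pick_sound(category: str, sounds_dir: str = "sounds") -> Optional[Path]:
    """Return a random .wav from sounds/<category>/, or None if there are none."""
    candidates = sorted((Path(sounds_dir) / category).glob("*.wav"))
    return random.choice(candidates) if candidates else None
```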
Wake word: Replace wakeword.onnx with any OpenWakeWord-compatible model.
Image Generation: When BMO discusses highly visual topics (especially during screensaver musings or when explicitly asked), they use the local LLM to generate a descriptive prompt. This prompt is then sent to Pollinations.ai, a free community API that generates the image in the cloud and returns it to the Pi. BMO then applies a custom retro LCD border before displaying it on-screen. This keeps the Pi fast and responsive without needing to run heavy Diffusion models locally!
When BMO has been idle for 60 seconds, it enters screensaver mode and cycles through its expressions. Approximately every 30 minutes, BMO will "think out loud" by:
- Searching the web for a random topic (weather, news, fun facts, quotes, science, jokes)
- Feeding the search result to the on-device LLM with a special prompt
- Speaking the generated thought via Piper TTS
BMO stays quiet during:
- Night hours (10 PM – 8 AM)
- Recent interaction (within 60 seconds of your last conversation)
Apart from the DuckDuckGo search request itself, this all runs on the Pi: the LLM processes the search results on the Hailo NPU and Piper speaks the result.
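The timing rules above reduce to a small predicate. This is a sketch under stated assumptions: `should_ponder` and the constant names are hypothetical, and "approximately every 30 minutes" is modelled as a fixed interval for simplicity.

```python
from datetime import datetime, timedelta

IDLE_BEFORE_SCREENSAVER = timedelta(seconds=60)
PONDER_INTERVAL = timedelta(minutes=30)  # assumption: fixed, not randomised
QUIET_START_HOUR = 22  # 10 PM
QUIET_END_HOUR = 8     # 8 AM


def is_quiet_hours(now: datetime) -> bool:
    """True between 10 PM and 8 AM; the window wraps around midnight."""
    return now.hour >= QUIET_START_HOUR or now.hour < QUIET_END_HOUR


def should_ponder(now: datetime, last_interaction: datetime, last_ponder: datetime) -> bool:
    """Decide whether BMO may think out loud right now."""
    if is_quiet_hours(now):
        return False
    if now - last_interaction < IDLE_BEFORE_SCREENSAVER:
        return False  # still within 60 seconds of the last conversation
    return now - last_ponder >= PONDER_INTERVAL
```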
LLM shows as offline / can't connect to port 8000
Check if hailo-ollama is running:
sudo systemctl status bmo-ollama
If the service isn't set up yet, start it manually:
export OLLAMA_HOST=0.0.0.0:8000
hailo-ollama serve
If hailo-ollama isn't found, re-run ./install.sh, which will build and install it from source.
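To confirm from Python that something is actually listening on port 8000, a plain TCP connect is enough. `llm_server_reachable` is a hypothetical helper; it only checks that the port accepts connections and makes no assumption about the server's HTTP API.

```python
import socket


def llm_server_reachable(host: str = "127.0.0.1", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and unreachable hosts
        return False
```

If this returns False while the service reports as active, check that OLLAMA_HOST matches the host and port you are probing.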
Hailo NPU not detected (/dev/hailo0 missing)
This is usually caused by a driver conflict. The system ships with both hailo_pci (Hailo-8) and hailo1x_pci (Hailo-10H) drivers. If the old one loads first, it blocks the new one from creating the device node. Fix it by blacklisting the old driver:
echo "blacklist hailo_pci" | sudo tee /etc/modprobe.d/blacklist-hailo-legacy.conf
sudo rmmod hailo1x_pci 2>/dev/null; sudo rmmod hailo_pci 2>/dev/null
sudo modprobe hailo1x_pci
ls /dev/hailo0 # should now exist
The setup script handles this automatically, but if you installed manually you may need to do it yourself.
Inference fails with HAILO_OUT_OF_PHYSICAL_DEVICES
This means /dev/hailo0 doesn't exist — see the fix above. Another cause is a process already holding the device; check with lsof /dev/hailo0.
VLM fails with HAILO_INVALID_OPERATION / HailoRTStatusException: 6
This usually means the VLM HEF file was compiled for a different HailoRT version. The HEF must match your installed runtime:
dpkg -l | grep hailort # check your version (e.g. 5.1.1)
Re-download the matching HEF:
HAILORT_VER=$(dpkg-query -W -f='${Version}' h10-hailort)
wget -O models/Qwen2-VL-2B-Instruct.hef \
"https://dev-public.hailo.ai/v${HAILORT_VER}/blob/Qwen2-VL-2B-Instruct.hef"
TTS Audio Stuttering / Staccato Speech
If Piper sounds like it's "tripping" or only plays short bursts of noise, the likely cause is an ALSA buffer underrun under high CPU/NPU load. The default ALSA buffer is 500 ms (--buffer-time=500000) in agent_hailo.py. If the stuttering persists, make sure you are using the official 27W power supply.
Mic stops listening (Watchdog Trigger)
If BMO stops responding to the wake word, the mic stream may have stalled. We've added a 10-second watchdog in the agent_hailo.py ear loop that automatically restarts the stream if no data is received.
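The watchdog idea is simple enough to sketch. `StreamWatchdog` is a hypothetical class written for illustration, not the code in agent_hailo.py; the injectable clock is there only so the logic can be tested without waiting 10 seconds.

```python
import time
from typing import Callable


class StreamWatchdog:
    """Flags a stalled audio stream when no data has arrived for timeout_s seconds."""

    def __init__(self, timeout_s: float = 10.0, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_data = clock()

    def feed(self) -> None:
        """Call this whenever the mic callback delivers audio."""
        self._last_data = self._clock()

    def stalled(self) -> bool:
        """True once the silence since the last feed() exceeds the timeout."""
        return self._clock() - self._last_data > self.timeout_s
```

In a listening loop you would call `feed()` from the audio callback and, when `stalled()` returns True, close and reopen the input stream before calling `feed()` again.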
Persistent Memory
Chat history is now persisted to memory.json. BMO will remember your previous conversations even after a restart!
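For a sense of what persisting history involves, here is a minimal sketch. The function names and the assumption that memory.json holds a JSON list of messages are hypothetical; check core/llm.py for the actual format.

```python
import json
from pathlib import Path


def load_history(path: str = "memory.json") -> list:
    """Load saved messages, returning an empty history if the file is missing or corrupt."""
    p = Path(path)
    if not p.exists():
        return []
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return []
    return data if isinstance(data, list) else []


def save_history(history: list, path: str = "memory.json") -> None:
    """Write to a temp file first, then rename, so a crash can't leave a half-written file."""
    tmp = Path(path).with_suffix(".tmp")
    tmp.write_text(json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)
```

To make BMO forget everything, stop the services and delete memory.json; with a loader like this one, the next start begins from an empty history.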
Camera vision says "my eyes aren't working"
If the VLM HEF is present but inference still fails, check that hailo_platform is importable:
source venv/bin/activate
python3 -c "from hailo_platform.genai import VLM; print('OK')"
If it fails, ensure system site-packages are enabled: grep include-system venv/pyvenv.cfg should say true.
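The same check can be done from Python by reading pyvenv.cfg directly. `system_site_packages_enabled` is a hypothetical helper; the include-system-site-packages key is the standard one written by Python's venv module.

```python
from pathlib import Path


def system_site_packages_enabled(venv_dir: str = "venv") -> bool:
    """Return True if the venv's pyvenv.cfg sets include-system-site-packages = true."""
    cfg = Path(venv_dir) / "pyvenv.cfg"
    if not cfg.is_file():
        return False
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False
```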
The original project is entirely the work of @brenpoly — the concept, the character, and the original implementation. This fork adds Hailo NPU support, the web interface, dual-interface core/ modules, and various fixes and improvements.
BMO's face artwork is by Cherry Honey, shared freely with the community via the Figma Community. The SVGs are rendered and animated programmatically by generate_faces.py.
"BMO" and "Adventure Time" are trademarks of Cartoon Network (Warner Bros. Discovery). This is a fan project for personal and educational use only, not affiliated with or endorsed by Cartoon Network.
MIT — see LICENSE.