English | 简体中文
A desktop overlay that helps you think through interview / written-exam problems. Capture a screenshot or speak into the mic, hit a hotkey, and a streaming LLM response renders in a translucent always-on-top window that is invisible to screen-capture / screen-share tools.
Built on Tauri 2 (Rust backend + React/Vite frontend). One binary on Windows + macOS.
- Screenshot solve —
Ctrl+Hto capture,Ctrl+Enterto ask. Streams a three-section answer (Problem Analysis / Solution / Key Points). - Voice solve —
Ctrl+Atoggles a live ASR session, thenCtrl+Enterfeeds the transcript + last N rounds of history to the LLM. Multi-round conversation with a sliding window. - Multi-provider LLM — OpenAI, Azure OpenAI (chat-completions API),
Azure OpenAI Responses (for
gpt-5.xdeployments), Anthropic Claude, OpenRouter, or any OpenAI-compatible custom endpoint. - Stealth overlay — on Windows,
SetWindowDisplayAffinitywithWDA_EXCLUDEFROMCAPTUREkeeps the overlay out of every standard screen recorder, browsergetDisplayMedia, and the OS Snipping Tool. On macOS, an NSPanel running an alpha-fade dance hides the overlay fromCGDisplayCreateImageduring the capture window. - Click-through, no-focus, no-taskbar — the overlay never steals focus, doesn't show up in Alt-Tab, and only the toolbar strip captures the cursor (the panel body is click-through).
- Hotkey everything — move / resize / scroll / opacity / theme / zoom / hide are all global shortcuts; see Shortcuts.
- Deep thinking toggle —
Ctrl+Tflips a hint passed to the prompt builder, encouraging the model to slow down on hard problems.
+----------------------+ IPC +-----------------------+
| React + Vite UI | <---------------> | Rust (Tauri 2) |
| (src-ui/) | bindings.ts | (src-tauri/) |
+----------------------+ +-----------------------+
|
+--------------------+-------------------+-------+
| | | |
v v v v
capture.rs audio/ llm/ window.rs
(screen + crop) (cpal mic + (4 providers, (stealth,
Doubao ASR streaming SSE click-through,
WebSocket) + cancel) hit-test)
The full command surface (24 Tauri commands) and the test tiers covering
them live in E2E_COVERAGE_MATRIX.yaml.
Prereqs: Node 18+, Rust 1.95+, Visual Studio 2022 with the Desktop
development with C++ workload. Full toolchain steps:
docs/SETUP.md.
# 1. Bootstrap: checks toolchain, runs npm install, seeds _local/config.toml
.\bootstrap.ps1
# 2. Fill in your LLM api_key in _local/config.toml (see Configuration below).
# 3. Run from a "Developer PowerShell for VS 2022" (so MSVC env is loaded):
.\src-ui\node_modules\.bin\tauri.cmd devIf you launch from a plain PowerShell, the MSVC libs aren't on LIB and
you'll hit could not open 'MSVCRT.lib' at link time. Either use the
Developer PowerShell, or wrap the launch in a .bat that calls
vcvarsall.bat x64 first — example in
docs/SETUP.md.
First compile takes 5-15 minutes (a few hundred crates). Subsequent incremental builds finish in seconds.
Prefer manual control? The steps bootstrap.ps1 runs (cd src-ui && npm install, then cp _local/config.toml.example _local/config.toml) are
fully documented in docs/SETUP.md.
Prereqs: Node 18+, Rust 1.95+, Xcode Command Line Tools.
xcode-select --install # one-time, if not already installed
./bootstrap.sh # checks toolchain, npm install, seed config
# Edit _local/config.toml with your LLM api_key (see Configuration below).
./src-ui/node_modules/.bin/tauri devmacOS 12.3+ is required (set in tauri.conf.json).
The first run will prompt for microphone + screen-recording permissions.
Config lives at _local/config.toml in debug builds (overrides the
OS path so a single file follows the source tree). In release builds:
| OS | Path |
|---|---|
| Windows | %APPDATA%\anti_interview\config.toml |
| macOS | ~/Library/Application Support/anti_interview/config.toml |
| Linux | ~/.config/anti_interview/config.toml |
[app]
language = "Auto" # "Auto" | "zh-CN" | "en-US"
programming_language = "Python"
max_history_rounds = 5 # sliding-window size for voice history
[overlay]
opacity = 0.95
position = "top-right"
[llm]
provider = "azure_responses"
api_key = "<your-azure-key>"
model = "gpt-5.3-codex" # the deployment name on Azure
base_url = "https://<resource-name>.openai.azure.com"
api_version = "2025-04-01-preview"
[asr]
provider = "volcano_ark"
access_key = "" # leave empty to disable voice solve
app_key = ""
language = "zh-CN"provider |
Required fields | Notes |
|---|---|---|
openai |
api_key, model |
base_url defaults to https://api.openai.com/v1. |
azure_openai |
api_key, model, base_url, api_version |
Chat-completions endpoint (gpt-4o etc). |
azure_responses |
api_key, model, base_url, api_version |
Required for gpt-5.x — those only expose /responses. |
anthropic |
api_key, model |
base_url defaults to https://api.anthropic.com. |
openrouter |
api_key, model |
Aggregator for 200+ models. |
custom |
api_key, model, base_url, extra_headers |
Any OpenAI-compatible server. |
For Azure Responses specifically: in the Azure portal, the Target URI on
the deployment page looks like
https://<name>.openai.azure.com/openai/responses?api-version=.... Put
https://<name>.openai.azure.com in base_url (without path) and the
api-version value in api_version. The provider builds the full URL at
request time.
Voice solve uses Volcano Doubao SAUC bigmodel WebSocket ASR. Without
access_key + app_key the voice path returns no_asr_key at toggle
time, but screenshot solve is unaffected. Credentials are issued in the
Volcano Engine console.
Windows uses Ctrl, macOS uses Option (Alt).
| Combo (Win / Mac) | Action |
|---|---|
Ctrl+H / Opt+H |
Capture screenshot |
Ctrl+Enter / Opt+Enter |
Solve (uses screenshot + transcript) |
Ctrl+A / Opt+A |
Toggle voice recording |
Ctrl+R / Opt+R |
Cancel in-flight + clear state |
Ctrl+M / Opt+M |
Hide / show overlay |
Ctrl+Q / Opt+Q |
Quit |
Ctrl+T / Opt+T |
Toggle deep-thinking hint |
Ctrl+5 / Opt+5 |
Cycle theme (dark / light) |
Ctrl+D / Opt+D |
Delete last final transcript segment |
Ctrl+[ Ctrl+] |
Opacity down / up |
Ctrl+8 Ctrl+9 |
Zoom out / in |
Ctrl+Arrows |
Move overlay (30 px steps) |
Alt+↑/↓ / Cmd+↑/↓ |
Scroll the answer panel |
Source of truth: src-ui/src/lib/shortcuts.ts.
The project ships a 5-tier test pyramid documented in
E2E_COVERAGE_MATRIX.yaml:
| Tier | What | How to run |
|---|---|---|
| T0 | Unit / contract (Rust) | cargo test -p anti-interview |
| T1 | Mocked frontend integration | cd src-ui && npx playwright test |
| T2 | Desktop runtime E2E (real binary) | cd src-ui && npx playwright test --project=desktop-runtime |
| T3 | Live provider smoke | Manual, with real Azure / Doubao keys |
| T4 | Manual / exploratory | Tracked as replacement tasks in the coverage matrix |
# Windows -> NSIS installer in src-tauri/target/release/bundle/nsis/
cd src-ui && npm install && cd ..
.\src-ui\node_modules\.bin\tauri.cmd build
# macOS -> .app + .dmg in src-tauri/target/release/bundle/
./src-ui/node_modules/.bin/tauri buildBundle config (NSIS / DMG, icon paths, entitlements) is in
src-tauri/tauri.conf.json.
- Voice ASR is Doubao-only for now. The provider seam in
src-tauri/src/audio/is built to host more, but only Volcano Doubao SAUC is wired in.
MIT — see LICENSE.