11 releases (4 breaking)
| new 0.5.1 | May 8, 2026 |
|---|---|
| 0.5.0 | May 1, 2026 |
| 0.4.0 | Apr 26, 2026 |
| 0.3.0 | Apr 26, 2026 |
| 0.1.6 | Apr 22, 2026 |
#212 in Unix APIs
370KB
6.5K
SLoC
wdotool
Input automation for Wayland. Send keystrokes, move the mouse, focus and close windows. Works as a command-line tool for shell scripts and ad-hoc use, and as a Rust library (wdotool-core) for tools that want to embed it. wflow is the first known library consumer.
Why
If you've been wanting xdotool on Wayland, this is the closest thing. wdotool covers most of the same operations (key, type, mouse, scroll, search, focus) using the protocols Wayland actually provides for this work: libei via the XDG RemoteDesktop portal, wlroots virtual-keyboard and virtual-pointer, KWin scripting on KDE, and a Shell extension on GNOME. It respects compositor focus and permissions; it does not bypass them like ydotool does with /dev/uinput.
"Closest thing" is doing real work in that sentence. wdotool is not a drop-in xdotool replacement. The CLI surface aims to be argv-compatible for the common commands so existing scripts can port without much editing, but full parity is not promised. See docs/xdotool-compat.md for an honest table of what works, what does not, and what is intentionally out of scope.
For comparison with the alternatives: xdotool is X11-only and does not work on Wayland at all. ydotool writes to /dev/uinput, which means root (or carefully tuned udev rules), no focus awareness, and no window management. It bypasses the compositor entirely, which breaks inside sandboxed sessions and loses any security boundary the compositor was enforcing. wdotool stays inside the compositor's permission model.
Status
Early but usable. Actively tested on Hyprland + wlroots. The KDE and GNOME backends are both experimental: code paths compile and unit tests pass, but neither has been smoke-tested end-to-end on real hardware of the target desktop yet. If you run Plasma 6, issue #1 tracks verification — docs/verification/kde-plasma-6.md is a 30–45 minute checklist that closes it. Same shape for GNOME at issue #4.
| Feature | libei | wlroots | kde | gnome | uinput |
|---|---|---|---|---|---|
key / keydown / keyup |
✅ | ✅ | ✅ | ✅ | ✅ |
type (Unicode via keymap) |
partial¹ | ✅ | partial¹ | partial¹ | partial² |
mousemove (relative) |
✅ | ✅ | ✅ | ✅ | ✅ |
mousemove (absolute) |
✅ | ✅ | ✅ | ✅ | ✅ |
click / mousedown / mouseup |
✅ | ✅ | ✅ | ✅ | ✅ |
scroll |
✅ | ✅ | ✅ | ✅ | ✅ |
search / getactivewindow |
— | ✅ | 🧪³ | 🧪⁴ | — |
windowactivate / windowclose |
— | ✅ | 🧪³ | 🧪⁴ | — |
getmouselocation |
— | — | 🧪³ | 🧪⁴ | — |
¹ libei (and kde / gnome, which use libei for input) is a sender context; the EIS server owns the keymap. Characters not in the active layout are skipped with a warning.
² uinput has the same limitation as libei — the kernel doesn't know about keymaps. Best-effort via the env-default xkb layout.
³ Implemented but unverified on a real Plasma session. The KDE backend uses kwin scripting over D-Bus the same way kdotool does; same machinery for getmouselocation reads workspace.cursorPos. Issue #1 tracks verification.
⁴ Requires the companion GNOME Shell extension at packaging/gnome-extension/wdotool@wdotool.github.io/. Shipped in v0.1.6 (window ops) and extended in v0.3.x (GetPointerPosition); not yet smoke-tested on a real GNOME session. Issue #4 tracks verification. Without the extension, gnome falls back to bare libei (input only).
Install
Linux only (Wayland protocols, /dev/uinput).
# Arch Linux (AUR) — three flavors, pick one:
yay -S wdotool # source build at the latest stable tag
yay -S wdotool-bin # prebuilt x86_64 binary (fastest install)
yay -S wdotool-git # rolling build from main
# Debian / Ubuntu (starting v0.2.0). Grab the .deb attached to the
# latest GitHub release: https://github.com/cushycush/wdotool/releases/latest
sudo dpkg -i wdotool_*_amd64.deb
# Fedora / openSUSE / RHEL (starting v0.2.0). Same idea, grab the
# .rpm attached to the latest release.
sudo dnf install ./wdotool-*.x86_64.rpm # Fedora / RHEL
sudo zypper install ./wdotool-*.x86_64.rpm # openSUSE
# Prebuilt binary via shell installer (any glibc x86_64 Linux)
curl -LsSf https://github.com/cushycush/wdotool/releases/latest/download/wdotool-installer.sh | sh
# Nix (flake)
nix run github:cushycush/wdotool -- --help
# or add to a NixOS/home-manager config:
# inputs.wdotool.url = "github:cushycush/wdotool";
# crates.io (Linux, any arch; builds on install)
cargo install wdotool
# From source
git clone https://github.com/cushycush/wdotool && cd wdotool
cargo build --release
./target/release/wdotool --help
Flathub submission is in progress. Until it lands, the manifest at packaging/flatpak/io.github.cushycush.wdotool.yml builds locally for anyone who wants to test it. See packaging/flatpak/README.md for the steps.
Runtime deps: libxkbcommon and libwayland-client. Both are universally present on Wayland systems.
Usage
Drop-in xdotool replacement for the common commands:
wdotool info # show detected backend + capabilities
wdotool diag # env + backend availability report
wdotool diag --copy # same, piped to the clipboard for bug reports
wdotool key ctrl+c # send Ctrl+C
wdotool keydown shift # press and hold Shift
wdotool keyup shift # release it
wdotool type "hello 世界 €" # Unicode works on wlroots
wdotool type --delay 30 "slow typing" # 30ms between chars
wdotool type --file script.txt # read text from a file
echo "hello" | wdotool type --file - # read from stdin
wdotool key --clearmodifiers ctrl+c # release stuck modifiers first
wdotool type --clearmodifiers "hi" # (works on key/keydown/keyup/type)
wdotool mousemove 500 400 # absolute
wdotool mousemove --relative 10 -5 # relative
wdotool click 1 # 1=left, 2=middle, 3=right, 8=back, 9=forward
wdotool scroll 0 3 # scroll 3 units down
wdotool search --name "Firefox" # substring match on title
wdotool search --class firefox # match on Wayland app_id
wdotool search --pid 1234 # match on process id
wdotool search --name 'Fire.*x' --regex # full regex semantics
wdotool search --name FOX --ignore-case # case-insensitive
wdotool search --name fox --pid 1234 --any # OR filters instead of AND
wdotool getactivewindow # print focused window's id
wdotool getwindowname <id> # print a window's title
wdotool getwindowpid <id> # print a window's PID
wdotool getwindowclassname <id> # print a window's app_id (X11 WM_CLASS equivalent)
wdotool windowactivate <id> # focus a window
wdotool windowclose <id> # close a window
wdotool record -o trace.json # capture input until Ctrl-C, write JSON
wdotool record --max-duration 30 # auto-stop after 30 seconds
wdotool record --backend evdev # /dev/input/event* (needs `input` group)
wdotool replay trace.json # replay a captured trace at original speed
wdotool replay trace.json --speed 4 # 4x faster
cat trace.json | wdotool replay - # read trace from stdin
wdotool prime & # hold virtual devices alive in the background
wdotool key a; wdotool key b # subsequent calls skip per-call setup latency
kill %1 # release devices when done
Global flags:
--backend <libei|wlroots|kde|gnome|uinput> # force a specific backend
-v, --verbose # show detection + bind logs
Heads up: wdotool injects keystrokes and pointer events into whichever window has focus when the command runs. Wayland's security model has no "target a specific window" API for input — focus the target first, then run. --clearmodifiers is approximate on Wayland for the same reason: we can't observe current modifier state, so the flag unconditionally releases Ctrl/Shift/Alt/Super/AltGr rather than doing xdotool's save-and-restore.
Backends
libei
Opens an org.freedesktop.portal.RemoteDesktop session via the XDG portal, negotiates a libei socket, handshakes as a sender context, and emits input through the compositor's own libei implementation. Full focus awareness and permission prompts. Preferred on GNOME and KDE.
The portal asks for consent on the first run (a "Allow this app to control your computer?" dialog). wdotool caches the issued restore_token at $XDG_STATE_HOME/wdotool/portal.token (mode 0600). Every later run presents the token and skips the dialog. The flow stays silent until the user revokes from the portal's settings UI, then the next run prompts again.
To reset the consent flow manually, delete the cache file:
rm ~/.local/state/wdotool/portal.token
Requirements:
xdg-desktop-portal+ a backend that exportsorg.freedesktop.portal.RemoteDesktop.xdg-desktop-portal-gnome(GNOME 46+) andxdg-desktop-portal-kde(KDE Plasma 6) both ship it.- On Hyprland,
xdg-desktop-portal-hyprland1.3.11 does not yet expose RemoteDesktop. Use the wlroots backend instead.
wlroots
Binds zwp_virtual_keyboard_v1, zwlr_virtual_pointer_v1, and zwlr_foreign_toplevel_management_v1 directly. Uploads a transient xkb keymap at type time (one keycode per unique character as a Level-0 Unicode keysym) — this is how arbitrary Unicode works without a server-side keymap change. Tracks wl_output modes so motion_absolute coordinates are real pixels on the primary output. Preferred on Sway, Hyprland, river, Wayfire.
Requirements:
- A wlroots-based compositor that exposes the three protocols above. Sway, Hyprland, river, and Wayfire all do by default.
kde
Composes libei (for input) with KWin-scripting window management over D-Bus. Generates small JS snippets, hands them to org.kde.KWin.Scripting.loadScriptFromText, runs them, and receives results on a transient com.wdotool.KdeBridge D-Bus service the backend registers at startup. Same trick kdotool uses, adapted to both the Plasma 6 (workspace.windowList, workspace.activeWindow) and Plasma 5 (workspace.clientList, workspace.activeClient) APIs. For Plasma 5/6 where you want window management alongside input.
Requirements:
xdg-desktop-portal-kdefor the libei input path.- KWin 5.22+ for the Scripting D-Bus interface.
gnome
Pairs libei (for input, via GNOME's RemoteDesktop portal) with a companion GNOME Shell extension that exposes ListWindows, GetActiveWindow, ActivateWindow, and CloseWindow on the session bus. GNOME Shell has no generic external window API, so the extension is the only way to do window management. Without the extension, the detector falls back to bare libei (input only) automatically.
Status: shipped in v0.1.6 but the project is maintained from Hyprland, so the GNOME backend has not been smoke-tested on a real session yet. If you run GNOME, install the extension and try a few wdotool commands. Bug reports go to #2. wdotool diag will tell you whether the extension is found.
Requirements:
xdg-desktop-portal-gnomefor the libei input path (GNOME 46+ ships it).- The
wdotool@wdotool.github.ioextension, installable frompackaging/gnome-extension/:
The extension targets GNOME Shell 45 through 48. It has no background activity; it only responds to D-Bus method calls from the wdotool CLI.cp -r packaging/gnome-extension/wdotool@wdotool.github.io ~/.local/share/gnome-shell/extensions/ # log out, log back in, then enable: gnome-extensions enable wdotool@wdotool.github.io
uinput
Creates a virtual input device via /dev/uinput and writes raw input_event structs through the kernel. Compositor-agnostic — works on Wayland, X11, or a bare framebuffer session — but has no focus awareness and no window API. Fallback for environments without libei or wlroots.
Requirements:
- Write access to
/dev/uinput. The usual setup isusermod -aG uinput $USER(orinputon some distros) plus a udev rule if the device isn't created with the group by default:
On Arch withKERNEL=="uinput", GROUP="uinput", MODE="0660"systemd-tmpfiles,uaccesstags work too. - xkb configuration present on the system (for keysym → keycode translation). Installed by default on every Wayland/X11 install.
Supported compositors
| Compositor | Input backend | Window backend |
|---|---|---|
| Hyprland | wlroots | wlroots |
| Sway | wlroots | wlroots |
| river | wlroots | wlroots |
| Wayfire | wlroots | wlroots |
| GNOME (46+) | libei | companion Shell extension (D-Bus) — #2 |
| KDE Plasma 6 | libei | KWin scripting (D-Bus) — #1 |
| Anything else | uinput | — |
Backend selection is automatic based on XDG_CURRENT_DESKTOP and compositor hints (SWAYSOCK, HYPRLAND_INSTANCE_SIGNATURE, etc.). The detector tries the preferred backend first and falls through to alternatives if it fails to bootstrap — use --backend to force a specific one.
Use as a library
The engine lives in a separate wdotool-core crate, so other Rust projects can drive input directly without spawning a subprocess. The CLI is a thin wrapper around the same crate.
# Cargo.toml
[dependencies]
wdotool-core = "0.3"
Each backend is gated behind a Cargo feature (libei, wlroots, kde, gnome, uinput). Default-on enables all five. To skip a backend, opt out:
# Drop uinput's input-linux + libc deps; useful in sandboxed builds (Flatpak)
# where /dev/uinput is not reachable.
wdotool-core = { version = "0.3", default-features = false, features = ["libei", "wlroots", "kde", "gnome"] }
Public API (intentionally small):
use wdotool_core::{detector, Backend, KeyDirection, MouseButton};
let env = detector::Environment::detect();
let backend = detector::build(&env, None).await?;
backend.key("ctrl+c", KeyDirection::PressRelease).await?;
The full list: Backend trait, DynBackend, detector::{build, Environment, BackendKind}, keysym::parse_chain, the value types (Capabilities, WindowInfo, etc.), and WdoError / Result. Per-backend types stay private; you build through detector::build.
For JS or TS tools that parse wdotool capabilities output, the typed schema ships separately on npm as @wdotool/capabilities. It exports the JSON Schema document plus hand-written TypeScript types and a small isCapabilitiesReport runtime guard, so a typical consumer goes from JSON.parse(stdout) (untyped) to a fully typed CapabilitiesReport with autocomplete on the locked enums (type_unicode, match_by, etc.).
Building
Requires Rust 1.82+ (the CLI uses Option::is_none_or). Builds cleanly on stable.
cargo build # dev (workspace root builds both crates)
cargo build --release # release binary at target/release/wdotool
cargo test # ~135 unit + integration tests across the workspace
For the testing strategy (unit tests, mock-backend integration tests, the planned headless-compositor harness, and the pre-release manual matrix), see docs/testing.md.
System libraries at build time: libxkbcommon-dev and libwayland-dev (Debian/Ubuntu names; same underlying libraries elsewhere).
Known limitations
- Unicode
typeworks fully on wlroots via transient-keymap injection. On libei, kde, and uinput it's best-effort — the compositor or kernel owns the keymap, so characters outside the active layout are skipped with a warning. - Multi-output absolute pointer on wlroots uses whichever
wl_outputreplied first. Fine for single-monitor setups; targeting a specific output would need a new CLI arg. - GNOME window management (
search/windowactivate/windowclose) ships a companion Shell extension atpackaging/gnome-extension/wdotool@wdotool.github.io/. Unverified on a live GNOME session — see issue #2. - KDE backend is implemented but unverified on a real Plasma session. See issue #1.
- No multi-seat handling — the first seat gets everything.
--clearmodifierson Wayland can't do xdotool's save-and-restore dance (we can't observe current modifier state); see the Usage note above.
Project layout
The repo is a Cargo workspace with two crates.
wdotool-core/ # library: the engine other Rust code can use
src/
lib.rs # public API re-exports
backend/
mod.rs # Backend trait (async_trait)
detector.rs # runtime backend selection
libei.rs # RemoteDesktop portal + reis
wlroots.rs # virtual-keyboard/pointer + foreign-toplevel + wl_output
kde.rs # libei input + KWin-scripting window ops via D-Bus
gnome.rs # libei input + GNOME Shell extension D-Bus bridge
uinput.rs # /dev/uinput fallback
types.rs # Capabilities, KeyDirection, MouseButton, WindowInfo
error.rs # WdoError, Result
keysym.rs # ctrl+shift+a parser
portal_token.rs # libei restore_token cache (no more consent spam)
wdotool/ # binary: the CLI
src/
main.rs # tokio entrypoint
cli.rs # clap subcommand definitions
diag.rs # `wdotool diag` (env + backend availability)
Every real backend runs on a dedicated OS thread because the underlying event streams (reis::EiConvertEventStream, wayland_client::EventQueue) are not Send. Input ops dispatch to those threads via channels.
Packaging artifacts live under packaging/: the AUR PKGBUILD templates (packaging/aur/), the Flathub manifest plus AppStream metainfo (packaging/flatpak/), the GNOME Shell extension that powers the gnome backend's window ops (packaging/gnome-extension/), and the @wdotool/capabilities npm package (packaging/npm/).
See CHANGELOG.md for release notes and docs/xdotool-compat.md for the honest xdotool parity table.
Contributing
Pull requests and issue reports welcome. Two open issues specifically want outside help, both about verifying backends on real desktops since the project is maintained from Hyprland:
- #1
help wanted,kde,needs-testing. The KDE backend is implemented but has never been exercised against an actual Plasma session. Runningwdotool --backend kde infoon Plasma 5 or 6 and posting the output (or awdotool diag --copypaste) would meaningfully de-risk the backend. - #2
help wanted,gnome,needs-testing. The GNOME backend is fully shipped, including the companion Shell extension atpackaging/gnome-extension/wdotool@wdotool.github.io/. What's missing is real-session smoke testing on GNOME 46+. Install the extension, run a fewwdotoolcommands, and post the result.
For other bugs or features, open an issue first for anything non-trivial so the shape can be agreed before code lands.
License
MIT OR Apache-2.0
Dependencies
~11–23MB
~347K SLoC