CanvasClip

Chrome extension (Manifest V3). A clipboard bridge for browser-based remote desktops that render the session into an HTML <canvas>: Apache Guacamole, noVNC, AWS WorkSpaces Web, Citrix HDX in-browser, Parsec web client, Cato Networks, and similar. These tools paint the remote screen as pixels on a canvas, so there's no DOM input to target and no text to select, and the browser's clipboard is usually not forwarded into the remote session. In practice, copy-paste between the host OS and the remote machine doesn't work.

CanvasClip fixes this in both directions:

Paste-in (Typer tab) — synthesizes real keyboard events that the canvas-based client picks up and forwards over the wire, the same way it forwards real keystrokes from your physical keyboard.
Copy-out (OCR tab) — screenshots a user-drawn rectangle of the tab and runs Tesseract.js on it, so you can extract text rendered as pixels in the remote session.

Why this exists

If you've used a browser-based remote desktop for any length of time, you know the drill:

You need to type a 40-character password or an API key into the remote session.
You copy it on your host OS.
You paste into the remote canvas — nothing happens, because the canvas isn't an input field and the remote protocol (RDP/VNC/PCoIP) has clipboard passthrough disabled by admin policy.
You end up typing it by hand, character by character, hoping you don't mistype.

Or the reverse: an error dialog, a long filename, a UUID rendered in the remote session. You can see it but not select it. You re-type it into a local text file while squinting at the screen.

CanvasClip automates both. For outgoing text, it dispatches KeyboardEvents into the canvas, falling back to document / window, which is where most canvas-based clients install their keyboard handlers. For incoming text, it runs Tesseract.js locally on a screenshot of the selected region.
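
As a rough illustration of the outgoing path, the sketch below builds the keydown / keypress / keyup sequence for one character and picks a dispatch target: the focused canvas if there is one, otherwise document (events dispatched there bubble on to window). The function names (keyInitFor, pickTarget, sendKey) and the keyCode rule for printable characters are assumptions made for this example, not code taken from content.js.

```javascript
// The two non-printable keys the Typer supports, with their legacy keyCodes.
const SPECIAL_KEYS = {
  "\n": { key: "Enter", code: "Enter", keyCode: 13 },
  "\t": { key: "Tab", code: "Tab", keyCode: 9 },
};

// Describe one character as a KeyboardEvent init dictionary.
// Assumption: for printable characters the upper-cased character code is
// used as keyCode, which simplifies real keyboard layouts.
function keyInitFor(ch) {
  const special = SPECIAL_KEYS[ch];
  const base = special ?? { key: ch, code: "", keyCode: ch.toUpperCase().charCodeAt(0) };
  return { ...base, bubbles: true, cancelable: true };
}

// Prefer the focused canvas; otherwise fall back to the document.
function pickTarget(doc) {
  const focused = doc.activeElement;
  if (focused && focused.tagName === "CANVAS") return focused;
  return doc;
}

// Dispatch the three-event sequence for a single character.
// EventCtor is injectable so the function can be exercised outside a browser.
function sendKey(target, ch, EventCtor = globalThis.KeyboardEvent) {
  const init = keyInitFor(ch);
  for (const type of ["keydown", "keypress", "keyup"]) {
    target.dispatchEvent(new EventCtor(type, init));
  }
}
```

In a page context this could be called as sendKey(pickTarget(document), "a"). Events created this way are script-generated, so a client that checks isTrusted may still ignore them.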

Privacy

No text you type or OCR ever leaves your browser. There's no server, no telemetry, no error reporting. The only external request the extension makes is a download of Tesseract's language data files (eng.traineddata.gz, ita.traineddata.gz, etc.) from tessdata.projectnaptha.com, once per language. These are static binary weights, roughly 5–10 MB each, and the browser caches them after the first use.

What works

Typer tab:

Canvas-based remote desktops: Guacamole, noVNC, AWS WorkSpaces Web, Citrix HDX HTML5, Parsec Web, etc. Click into the session canvas once to give it focus, then open CanvasClip and Type It.
Plain HTML forms: <input>, <textarea>, contenteditable. Works with React-controlled inputs — the native value setter is used to bypass React's synthetic-event value tracker, so controlled components see the change.
Same-origin nested iframes.
Enter and Tab: sent as key events with keyCode: 13 and keyCode: 9 respectively.
Configurable per-character delay (default 40ms). Raise it if the remote is laggy or throttling input.
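
The React-controlled input case in the list above relies on a well-known trick: React installs its own value property on the element instance to track changes, so writing through the prototype's setter and then firing an input event lets the controlled component observe the new value. The sketch below shows the general technique; setNativeValue is a hypothetical name and this is not necessarily how content.js implements it.

```javascript
// Write a value through the prototype's setter, skipping any own-property
// override that a framework installed on the element instance, then notify
// listeners with a bubbling "input" event.
function setNativeValue(el, value) {
  const proto = Object.getPrototypeOf(el); // e.g. HTMLInputElement.prototype
  const descriptor = Object.getOwnPropertyDescriptor(proto, "value");
  if (descriptor && descriptor.set) {
    descriptor.set.call(el, value);
  } else {
    el.value = value; // fallback when the prototype has no value setter
  }
  el.dispatchEvent(new Event("input", { bubbles: true }));
}
```

In a page, setNativeValue(document.querySelector("input"), "hello") would be a typical call. contenteditable elements have no value property and need a different approach.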

OCR tab:

Draw a rectangle on the current tab, extract text with Tesseract.js.
Language selector — English + Italian by default, plus Spanish, French, German, Portuguese, Dutch, Russian, Japanese, Chinese (Simplified / Traditional), Korean, Arabic, and a few English-paired combos.
One-click copy to clipboard.

Limits

Cross-origin iframes are unreachable by browser security policy. If your remote desktop webapp embeds the session canvas in a cross-origin iframe, CanvasClip can't type into it without a broader host-permissions model.
Modifier-key combos (Ctrl+C, Shift+Tab, Alt+F4, etc.) are not supported — the Typer handles printable characters, Enter, and Tab only. The goal is pasting text, not scripting keyboard shortcuts.
Some canvas clients install their keyboard listener on a very specific element. If Type It reports Done but nothing appears in the remote session, click directly into the canvas first and try again. Raising the delay helps with clients that throttle rapid keystrokes.
OCR language data is fetched from tessdata.projectnaptha.com on first use of each language (~5–10 MB). This is a one-time cost per language; the browser caches the file afterwards.
OCR accuracy is Tesseract-level — good for clean rendered text, flaky for small fonts, antialiased subpixel text, or unusual color schemes. Zoom the remote session before selecting, or raise the DPI of the remote display, if the text is small.
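
To make the cross-origin limit above concrete: a script can walk into a nested iframe only when contentDocument is non-null, which the browser guarantees only for same-origin frames. The sketch below collects every reachable document; reachableDocuments is a hypothetical helper written for this example, and the extension itself may rely on all_frames injection instead.

```javascript
// Collect every document reachable from rootDoc through same-origin iframes.
// For a cross-origin frame contentDocument is null, so the frame is skipped.
function reachableDocuments(rootDoc) {
  const docs = [rootDoc];
  for (const frame of rootDoc.querySelectorAll("iframe")) {
    let inner = null;
    try {
      inner = frame.contentDocument;
    } catch {
      inner = null; // defensive: treat any access error as unreachable
    }
    if (inner) docs.push(...reachableDocuments(inner));
  }
  return docs;
}
```

Called as reachableDocuments(document), it returns the top document first, followed by nested same-origin documents in depth-first order.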

Install (unpacked)

Download the latest zip from the Releases page, or grab the build artifact from the Actions tab.
Unzip it somewhere.
Open chrome://extensions, enable Developer mode, click Load unpacked, and pick the unzipped folder.

Usage — Typer

Open your browser-based remote desktop tab and click into the session canvas to give it focus.
Click the CanvasClip toolbar icon.
Paste the text, tune the delay if needed (40ms default; try 80–120ms for laggy remotes), click Type It.
The popup stays open during typing and shows live progress.
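
The typing loop behind these steps can be pictured as follows: send one character, report progress, wait for the configured delay, repeat. This is a minimal sketch under assumptions; typeWithDelay, totalDelayMs, and the sendChar / onProgress callbacks are hypothetical names, not the extension's actual API.

```javascript
// Promise-based pause used between characters.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Total time spent waiting between characters for a given text and delay.
function totalDelayMs(text, delayMs) {
  return Math.max(0, Array.from(text).length - 1) * delayMs;
}

// Send text one character at a time.
// sendChar(ch) delivers a character; onProgress(done, total) reports progress.
// wait is injectable so the loop can be tested without real timers.
async function typeWithDelay(text, delayMs, sendChar, onProgress = () => {}, wait = sleep) {
  const chars = Array.from(text); // iterate by code point so emoji stay whole
  for (let i = 0; i < chars.length; i++) {
    sendChar(chars[i]);
    onProgress(i + 1, chars.length);
    if (i < chars.length - 1) await wait(delayMs); // no pause after the last one
  }
  return chars.length;
}
```

With this loop, a 40-character password at the default 40 ms spends 39 × 40 = 1560 ms waiting between keystrokes; at 100 ms the same text takes about 3.9 seconds.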

Usage — OCR

Open the CanvasClip popup and switch to the 📷 OCR tab.
Pick a language (or combo), click Select Area. The popup closes and a crosshair overlay appears.
Click-drag a rectangle around the text you want. Release — the overlay disappears, the tab is screenshotted, and the cropped region is sent to Tesseract.
Reopen the popup to see the extracted text. A progress bar shows OCR % while processing.
Click Copy to put the text into your clipboard.
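
One detail in the crop step is worth spelling out. The selection rectangle is drawn in CSS pixels, while a tab screenshot is typically captured in device pixels, so the rectangle has to be scaled by devicePixelRatio before cropping. The sketch below shows that arithmetic; the helper names are hypothetical and the scaling assumption should be checked against the actual screenshot dimensions.

```javascript
// Normalize a drag (start and end points) into a rectangle with a
// non-negative width and height, whichever direction the user dragged.
function rectFromDrag(start, end) {
  return {
    x: Math.min(start.x, end.x),
    y: Math.min(start.y, end.y),
    width: Math.abs(end.x - start.x),
    height: Math.abs(end.y - start.y),
  };
}

// Convert a CSS-pixel rectangle into screenshot pixels, clamped to the image.
function cssRectToImageRect(rect, devicePixelRatio, imageWidth, imageHeight) {
  const left = Math.max(0, Math.round(rect.x * devicePixelRatio));
  const top = Math.max(0, Math.round(rect.y * devicePixelRatio));
  const right = Math.min(imageWidth, Math.round((rect.x + rect.width) * devicePixelRatio));
  const bottom = Math.min(imageHeight, Math.round((rect.y + rect.height) * devicePixelRatio));
  return {
    x: left,
    y: top,
    width: Math.max(0, right - left),
    height: Math.max(0, bottom - top),
  };
}
```

The resulting rectangle can be passed to a canvas drawImage call to cut the region out of the screenshot before handing it to Tesseract.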

Files

manifest.json — MV3, permissions: activeTab, scripting, tabs, storage, offscreen, clipboardWrite. CSP adds 'wasm-unsafe-eval' so Tesseract's WASM can instantiate.
popup.html / popup.js — UI with two tabs (Typer / OCR) and the language selector.
content.js — injected on every page with all_frames: true at document_start; runs the focus tracker and does the actual typing.
background.js — service worker; handles captureVisibleTab and orchestrates the offscreen document.
overlay.js — injected on demand into the active tab; draws the selection rectangle.
offscreen.html / offscreen.js — runs Tesseract.js in an offscreen document (service workers can't spawn Web Workers or use the Image API).
vendor/ — Tesseract.js v5 bits vendored locally: tesseract.min.js (~65KB wrapper), worker.min.js (~120KB internal worker), tesseract-core-simd-lstm.wasm.js (~3.8MB, SIMD+LSTM core with WASM inlined). Vendored because MV3's script-src 'self' CSP applies transitively to spawned Workers. Only language data is fetched from tessdata.projectnaptha.com at runtime (via fetch(), which falls under the unrestricted connect-src).
icons/ — toolbar icons (16 / 48 / 128 px) plus a Pillow script (build_icons.py) to regenerate them.
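
To show how the vendored files and the runtime language download fit together, here is a sketch of an options object for Tesseract.js's createWorker. The workerPath, corePath, langPath, and logger option names come from Tesseract.js; the tesseractWorkerOptions helper, its parameters, and the exact way offscreen.js wires this up are assumptions made for the example.

```javascript
// Build createWorker options so the worker and core load from the extension
// package and only language data comes from the network.
// resolveUrl maps a package-relative path to a full URL
// (in an extension, chrome.runtime.getURL would be a natural choice).
// onPercent receives OCR progress as an integer from 0 to 100.
function tesseractWorkerOptions(resolveUrl, langPath, onPercent) {
  return {
    workerPath: resolveUrl("vendor/worker.min.js"),
    corePath: resolveUrl("vendor/tesseract-core-simd-lstm.wasm.js"),
    langPath,
    logger: (message) => {
      // Only the recognition phase is reported; loading phases are ignored.
      if (message.status === "recognizing text") {
        onPercent(Math.round(message.progress * 100));
      }
    },
  };
}

// Possible use inside the offscreen document (not executed here):
//   const options = tesseractWorkerOptions(
//     (path) => chrome.runtime.getURL(path), languageDataUrl, updateProgressBar);
//   const worker = await Tesseract.createWorker(["eng", "ita"], 1, options);
//   const { data } = await worker.recognize(croppedImage);
//   await worker.terminate();
```

Here languageDataUrl and updateProgressBar are placeholders. The logger callback is one way the OCR percentage shown in the popup could be driven.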

License

MIT.
