On-Device Prompt Chain Agent

An interactive, on-device AI agent platform that runs a ReAct (Reasoning + Action) loop inside a Web Worker. It leverages Chrome's experimental built-in Gemini Prompt API for private, local, and cost-free inference, combining custom tools, modular skills, and persistent long-term memory.

Key Features

On-Device LLM Inference: Runs entirely in the browser using Chrome's built-in LanguageModel API (window.LanguageModel), so inference needs no external API keys and incurs no network round-trips.
Asynchronous Web Worker Architecture:
- prompt-chain-host.js runs on the main browser thread to manage the LLM session.
- prompt-chain-worker.js runs in a background thread to orchestrate the agent loop, execute tools, and handle errors, keeping the user interface completely responsive.
Dynamic Skill & Tool Retrieval (Lightweight RAG): Scores each loaded skill and tool against the user prompt with a token-overlap scorer, so that only relevant context is added to the prompt and the token budget is conserved.
Long-term Memory with Auto-Summarization:
- Uses agent-memory.js to store conversation histories locally in the browser via IndexedDB.
- Automatically summarizes the conversation (logic defined in utils.js) once the chat history exceeds 5 turns, keeping the prompt within the model's context window.
Interactive UI Stream: An HTML/CSS interface that displays the agent's reasoning steps (Thoughts, Actions, and Observations) in real time alongside the final response.
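
To make the token-overlap retrieval idea concrete, here is a minimal sketch. The function names (tokenize, overlapScore, retrieveRelevant), the candidate shape ({ name, description }), and the scoring formula are all assumptions chosen for illustration; the project's actual scorer may differ.

```javascript
// Hypothetical sketch: names and scoring formula are assumptions,
// not the project's actual implementation.

/** Lowercase a string and return the set of its alphanumeric tokens. */
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

/** Fraction of the candidate's tokens that also appear in the prompt (0..1). */
function overlapScore(prompt, candidateText) {
  const promptTokens = tokenize(prompt);
  const candidateTokens = tokenize(candidateText);
  if (candidateTokens.size === 0) return 0;
  let shared = 0;
  for (const token of candidateTokens) {
    if (promptTokens.has(token)) shared += 1;
  }
  return shared / candidateTokens.size;
}

/** Return up to `limit` candidates scoring above `minScore`, best first. */
function retrieveRelevant(prompt, candidates, { limit = 3, minScore = 0 } = {}) {
  return candidates
    .map((c) => ({ ...c, score: overlapScore(prompt, `${c.name} ${c.description}`) }))
    .filter((c) => c.score > minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Given candidates such as { name: "GetWeather", description: "current weather for a city" } and { name: "Calculator", description: "evaluate math expressions" }, a prompt about the weather in Tokyo would keep only the first, because the second shares no tokens with the prompt. Exact token matching is cheap but misses synonyms and inflections, which is the usual trade-off of this kind of scorer.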

File Directory & Architecture

index.html & styles.css: The frontend user interface, containing input fields, suggestion chips, the reasoning stream log, and loaded-skill indicators.
my-agent.js: The Web Worker entry point. It instantiates the worker, defines global tools (like Calculator and FetchData), and loads dynamic skills.
prompt-chain-worker.js: The core ReAct loop manager. Parses the LLM's JSON output, executes the requested tools, handles timeouts and retries, and updates IndexedDB.
prompt-chain-host.js: Manages main thread events, initializes Chrome's built-in model, translates LLM requests from the worker, and dispatches log streams to the UI.
prompt-template.js: Assembles the prompt structure including system rules, few-shot examples, relevant tools, active skill instructions, and prior history summary.
skills/:
- weather/: A sample modular skill containing:
  - SKILL.md: Markdown file containing attributes (YAML frontmatter) and system instructions.
  - tools.js: Local implementation of weather-fetching mock tools.
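
As an illustration of the kind of loop prompt-chain-worker.js is described as managing, the following is a minimal sketch of a ReAct cycle with JSON parsing and a tool timeout. Everything here is an assumption made for illustration: the reply schema (thought, action, actionInput, finalAnswer), the askModel callback, the tools map, and the helper names are hypothetical and not taken from the project's code.

```javascript
// Hypothetical sketch: the reply schema and all helper names are assumptions.

/** Parse the JSON object spanning the first "{" to the last "}" in a reply. */
function parseStep(reply) {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("No JSON object in model reply");
  return JSON.parse(reply.slice(start, end + 1));
}

/** Reject after `ms` milliseconds if `promise` has not settled by then. */
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

/**
 * Ask the model, run the requested tool, feed the observation back,
 * and stop at a final answer or after `maxSteps` iterations.
 */
async function runReactLoop(askModel, tools, question, { maxSteps = 5, toolTimeoutMs = 5000 } = {}) {
  const transcript = [`Question: ${question}`];
  for (let i = 0; i < maxSteps; i += 1) {
    const reply = await askModel(transcript.join("\n"));
    let step;
    try {
      step = parseStep(reply);
    } catch (err) {
      // Malformed output: record the problem and let the model retry.
      transcript.push(`Observation: invalid reply (${err.message}); respond with JSON only.`);
      continue;
    }
    if (step.finalAnswer !== undefined) return { answer: step.finalAnswer, transcript };
    transcript.push(`Thought: ${step.thought}`, `Action: ${step.action}`);
    let observation;
    try {
      const tool = tools[step.action];
      if (!tool) throw new Error(`Unknown tool "${step.action}"`);
      observation = await withTimeout(Promise.resolve(tool(step.actionInput)), toolTimeoutMs);
    } catch (err) {
      // Tool failures become observations so the model can react to them.
      observation = `Error: ${err.message}`;
    }
    transcript.push(`Observation: ${JSON.stringify(observation)}`);
  }
  return { answer: null, transcript }; // step budget exhausted
}
```

Turning parse and tool errors into observations, rather than aborting, gives the model a chance to correct itself within the step budget. In this sketch askModel stands in for whatever message round-trip the worker uses to reach the host-managed LLM session.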

Prerequisites (How to Setup Chrome Built-in AI)

This project requires a Chrome build (for example, Chrome Canary) with the experimental Prompt API enabled.

1. Open Google Chrome.
2. Navigate to chrome://flags/#optimization-guide-on-device-model and set it to Enabled BypassPerfRequirement (or Enabled).
3. Navigate to chrome://flags/#prompt-api-for-gemini-nano and set it to Enabled.
4. Relaunch Chrome.
5. Wait for the on-device model to download in the background. To verify that the API is exposed, open the DevTools console and check that window.LanguageModel is defined.
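
A defined window.LanguageModel only shows that the API is exposed, not that the model has finished downloading. The sketch below shows one way to check the download state. It assumes the current shape of the experimental Prompt API, in which LanguageModel.availability() resolves to a status string; the function name checkModelStatus and the "api-missing" label are hypothetical, and the API may change between Chrome versions.

```javascript
// Sketch only: assumes the experimental Prompt API exposes a global
// LanguageModel object with an async availability() method.

/**
 * Report whether the on-device model can be used.
 * `scope` defaults to globalThis so the function can be exercised with a stub.
 */
async function checkModelStatus(scope = globalThis) {
  if (!("LanguageModel" in scope)) {
    // Hypothetical label: flags not enabled, or the browser lacks the API.
    return "api-missing";
  }
  // Expected values: "unavailable", "downloadable", "downloading", or "available".
  return scope.LanguageModel.availability();
}
```

In the DevTools console, checkModelStatus().then(console.log) would print the current state; the model is ready for use once the result is "available".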

How to Run the Project

Because the project loads ES modules (import/export) and starts Web Workers dynamically, opening index.html directly from your file system (file:// protocol) fails: browsers block module and worker loads from file:// origins and report them as CORS errors. You must serve the project from a local web server.

Option 1: Using Node.js (npx)

If you have Node.js installed, open a terminal in the project directory and run:

npx serve

Or:

npx http-server

Then navigate to the URL provided in the console (usually http://localhost:3000 or http://localhost:8080).

Option 2: Using Python

If you have Python 3 installed, run the following command in your terminal from the project directory (on some systems the executable is named python3):

python -m http.server 8000

Then open your browser and navigate to http://localhost:8000.

Option 3: Using VS Code Live Server

If you are using Visual Studio Code, you can install the Live Server extension, open the project workspace, and click the Go Live button in the status bar.

Usage Guide

1. Make sure your browser has the Prompt API enabled.
2. Launch the local server and open the page.
3. Confirm that the WeatherExpert skill is shown as loaded in the sidebar.
4. Select one of the pre-defined suggestions (e.g., Weather in Tokyo) or type your own query.
5. Click Run Agent.
6. Follow the Agent Reasoning Stream to see the agent's thought process, how it chooses between the weather and math tools, and its final response.

License

This project is licensed under the MIT License.
