Open Enrich

The open-source enrichment engine for go-to-market teams. A LangGraph agent swarm wired into Bright Data's web infrastructure. Drop in a CSV, get back company intelligence, person contacts, and buying signals, with a citation on every cell.

▶ Demo

Why this exists

Commercial enrichment tools cost $0.50 to $2.00 per row, lock data behind contracts, and rarely tell you where a field came from. Generic scrapers are cheap but ship raw HTML — you still have to build the agent, the extraction logic, the confidence scoring, the CRM mapping.

Open Enrich is the missing layer in between. The web is the dataset, Bright Data is the access layer, LangGraph is the orchestrator, and the output is a CSV your SDRs can paste into Salesforce or HubSpot at 9am Monday.

	Commercial enrichment tools	Generic scrapers	Open Enrich
Cost per row	$0.50 to $2.00	low, but you build everything	~$0.01 to $0.05 in API calls
Source attribution	rare	none	every cell, with URL + snippet
LinkedIn / Crunchbase	yes (contracted)	blocked or unreliable	yes (Bright Data structured scrapers)
Person-level emails	yes (gated)	no	yes (SERP pattern discovery + LinkedIn fallback)
Self-hosted	no	yes	yes
CRM-formatted export	yes	no	yes (Salesforce + HubSpot mappings)

How it works

A single CSV row becomes a stateful LangGraph run. Discovery first, then specialist agents fan out in parallel, then a validator merges results.

Each agent is a typed LangChain tool() graph with structured Zod output. Tools wrap Bright Data's SERP API, Web Unlocker, LinkedIn / Crunchbase scrapers, and Deep Lookup. Failures are isolated per-agent so one bad scrape never kills a row.

What you can extract

ICP qualification — employee count, revenue, industry, HQ, company type, funding stage Buying signals — recent funding, hiring velocity, leadership changes, job postings, tech changes Personalization hooks — company mission, recent news, blog topics, conference activity Contact data — work emails (SERP pattern + verification), phone, LinkedIn, GitHub, Twitter, title, seniority, location Social / web — LinkedIn URL, Twitter, GitHub org, company blog, careers page Custom fields — describe what you want in plain English, an agent does the research

Confidence scoring is built in. A field corroborated by three independent sources scores 0.95; a single-source heuristic scores 0.50. Every value carries the URL it came from.

The monorepo

Three packages, one engine.

Package	What it is	Install
`@brightdata/enrich-core`	The engine. LangGraph, agents, Bright Data tool wrappers, Zod schemas. Used by the other two.	workspace only
`@brightdata/enrich`	Headless CLI. `enrich leads.csv --describe "..."`. NDJSON streaming, resumable runs, dry-run cost estimates.	`npm i -g @brightdata/enrich`
`open-enrich` web	Next.js 16 web app. Upload, configure, watch agents run live, export to CRM.	self-hosted or hosted demo

Quick start

Prerequisites

You'll need a Bright Data account with two zones provisioned: SERP API and Web Unlocker. Plus an OpenRouter (or OpenAI / Anthropic) key for the LLM layer.

# 1. Clone and install
git clone https://github.com/brightdata/open-enrich.git
cd open-enrich
pnpm install

# 2. Configure
cp packages/web/.env.example packages/web/.env.local
# fill in:
#   BRIGHT_DATA_API_KEY
#   BRIGHT_DATA_SERP_ZONE
#   BRIGHT_DATA_UNLOCKER_ZONE
#   OPENROUTER_API_KEY

# 3. Run the web app
pnpm dev:web
# open http://localhost:3000

Or skip the UI

# CLI: describe what you want in plain English
npx @brightdata/enrich leads.csv \
  --describe "company size, recent funding, work email of contact"

# Or pick from presets
npx @brightdata/enrich leads.csv \
  --fields employee_count,industry,funding_stage,person_email

# Programmatic estimate before kicking off a 5k-row run
npx @brightdata/enrich leads.csv --describe "..." --dry-run

Or embed the engine

import { runEnrichment } from "@brightdata/enrich-core";

for await (const event of runEnrichment({
  rows: csvRows,
  fields: ["employee_count", "funding_stage", "person_email"],
  identifierColumn: "email",
  credentials: { brightDataApiKey, serpZone, unlockerZone, openrouterKey },
})) {
  if (event.type === "result") console.log(event.row, event.fields);
}

Or drive it from your coding agent

Open Enrich ships an Agent Skill at skills/enriching-tables that teaches Claude Code, Cursor, Codex, Gemini CLI, and other agents how to enrich any table with the @brightdata/enrich CLI — onboarding, prerequisites, the dry-run-then-confirm workflow, the full field catalog, and troubleshooting.

# Install just this skill into your agent
npx skills add brightdata/open-enrich/skills/enriching-tables

# Or every skill in the repo
npx skills add brightdata/open-enrich

It's listed on skills.sh — discovery is automatic from the skills/<name>/SKILL.md layout, so no submission step is needed. After it's installed, just ask your agent to "enrich this CSV with company size and funding" and it takes over.

CSV-aware from the first row

Open Enrich fingerprints the file before it asks you anything. Drop a Salesforce export, a HubSpot contact list, a LinkedIn Sales Navigator dump, or any CRM CSV. It detects the format, picks the best identifier column, and skips fields you already have.

Detected: HubSpot Export
Identifier: Company Domain (98% coverage)
Skipping 4 fields already present in your CSV
Enriching 2,847 of 2,900 rows with 9 new fields
Estimated cost: $42.71  |  Estimated time: 18m

Coverage gauges, format-specific presets, smart column mapping. The boring part of enrichment, handled.

Source transparency

The trust layer most commercial tools don't ship. Every enriched cell carries:

the source URL the value came from
the quote that supports it
a confidence score (0 to 1) factoring corroboration across sources
a freshness timestamp (amber after 24h, red after 7d)

Hover a cell, see the receipts. Click a row, get a full evidence panel side by side with the original CSV data. Export a Markdown audit report for sales ops.

Outbound-ready, not just enriched

Post-enrichment is where most tools dump you. Open Enrich segments the run into three queues:

Fully enriched — every requested field at high confidence. Ship today.
Has buying signal — funding, hiring spike, leadership change. Prioritize.
Needs review — partial data or low confidence. Manual triage.

Filter, search, segment. Then export with field mappings that drop into Salesforce or HubSpot without column-rename gymnastics. Or copy as Google Sheets / TSV / JSON.

Tech stack

Layer	Choice	Why
Data access	Bright Data (SERP API, Web Unlocker, structured scrapers, Deep Lookup)	150M+ residential IPs, dedicated LinkedIn / Crunchbase scrapers, AI search across 1000+ sources
Orchestration	LangGraph 1.2 + Deep Agents 1.9	stateful agent graphs, parallel fan-out, auto-eviction for large scrapes, subagent task delegation
LLM layer	OpenRouter (default), works with OpenAI / Anthropic / any compatible provider	model-agnostic, swap providers via env var
Validation	Zod 4	typed structured output on every agent response
Web app	Next.js 16 + React 19 + Tailwind v4 + Framer Motion	App Router, RSC, SSE streaming
CLI	commander + tsup	bin distribution, NDJSON output, resumable runs
Demo quota	Turso libSQL	edge-friendly per-IP lifetime limits

Cost

Bright Data bills per tool call at roughly $0.0015 each, uniform across SERP, Unlocker, and structured scrapers. A typical row touches 5 to 15 tool calls depending on what you ask for. LLM cost is on top, paid to your provider.

Field set	Tool calls / row	Bright Data cost	LLM cost (OpenRouter / GPT-4o-mini)
Quick CRM fill (5 fields)	~4	$0.006	~$0.003
Startup prospecting (14 fields)	~10	$0.015	~$0.012
Enterprise research (11 fields, deep)	~15	$0.023	~$0.018
With person enrichment	+6	+$0.009	+$0.005

Compare to $0.50 to $2.00 per row at commercial enrichment vendors. Run enrich --dry-run for a per-run estimate before you commit.

Configuration

All config lives in env vars. The CLI also accepts a stored config via enrich login.

# Required
BRIGHT_DATA_API_KEY=          # https://brightdata.com/cp/setting/users
BRIGHT_DATA_SERP_ZONE=        # zone name from your Bright Data dashboard
BRIGHT_DATA_UNLOCKER_ZONE=    # zone name from your Bright Data dashboard
OPENROUTER_API_KEY=           # https://openrouter.ai/keys

# Optional (web app)
NEXT_PUBLIC_BASE_PATH=        # serve under /open-enrich for subpath deploys
DEMO_MODE=1                   # enable 50-row-per-IP lifetime cap (requires Turso)
TURSO_DATABASE_URL=
TURSO_AUTH_TOKEN=

Provider-agnostic LLM: swap OPENROUTER_API_KEY for OPENAI_API_KEY or ANTHROPIC_API_KEY and createLLM() picks the right adapter.

Project layout

open-enrich/
├── packages/
│   ├── core/                 @brightdata/enrich-core   ← the engine
│   │   ├── agents/           LangGraph nodes, Deep Agents, prompts
│   │   ├── tools/            Bright Data tool wrappers
│   │   ├── services/         SERP, Web Unlocker, structured scrapers
│   │   ├── schemas/          Zod schemas for structured output
│   │   ├── csv/              parsing, format detection, CRM export
│   │   └── run-enrichment.ts public API entrypoint
│   │
│   ├── cli/                  @brightdata/enrich        ← terminal UX
│   │   └── src/              commander commands, login, NDJSON streaming
│   │
│   └── web/                  open-enrich               ← Next.js 16 app
│       ├── app/              App Router pages + API routes (SSE)
│       ├── components/       upload, config, table, modals, panels
│       ├── hooks/            useEnrichment, useCSVParser, useFilters
│       └── lib/              demo-mode, MCP, quality metrics
│
└── docs/
    └── assets/               README assets (logos, demo GIF)

Roadmap

Done and shipping today: company enrichment, person enrichment, source attribution, confidence scoring, CSV format detection, CRM export, MCP server integration, cost tracking, demo quota.

batch resumable runs for 100k+ row CSVs from the web UI
a webhook / queue mode for always-on enrichment from a CRM source
richer signal detection (G2 mentions, podcast appearances, conference rosters)
agent observability via LangSmith traces in the UI

See TASKS.md for the full historical changelog.

Contributing

PRs welcome. Three things worth knowing before you open one:

Run typecheck and build before pushing. pnpm typecheck && pnpm build.
The web app uses Next.js 16, which has breaking API changes from older versions. Read the relevant doc in node_modules/next/dist/docs/ before touching App Router internals.
Don't add agents speculatively. New agents should solve a named extraction problem the current set can't handle, with a benchmark to prove it.

Issues are the place to file bugs and propose features.

License

MIT. Use it, fork it, ship it.

_{Built on Bright Data's web infrastructure. The internet is the dataset.}

Name		Name	Last commit message	Last commit date
Latest commit History 73 Commits
.github/workflows		.github/workflows
docs/assets		docs/assets
examples		examples
packages		packages
skills/enriching-tables		skills/enriching-tables
.gitignore		.gitignore
README.md		README.md
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.base.json		tsconfig.base.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Open Enrich

▶ Demo

Why this exists

How it works

What you can extract

The monorepo

Quick start

Prerequisites

Or skip the UI

Or embed the engine

Or drive it from your coding agent

CSV-aware from the first row

Source transparency

Outbound-ready, not just enriched

Tech stack

Cost

Configuration

Project layout

Roadmap

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Open Enrich

▶ Demo

Why this exists

How it works

What you can extract

The monorepo

Quick start

Prerequisites

Or skip the UI

Or embed the engine

Or drive it from your coding agent

CSV-aware from the first row

Source transparency

Outbound-ready, not just enriched

Tech stack

Cost

Configuration

Project layout

Roadmap

Contributing

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages