Define the output schema, pass the image, pick the AI model, and get parsed structured output back instead of free-form text.
⭐ If Viscribe helps your project, please leave a star. ⭐
Python:

    pip install viscribe

TypeScript:

    npm install viscribe

- One schema-driven image extraction API for documents, screenshots, product photos, visual checks, and agent tools
- Structured output with Pydantic models, JSON Schema, or simple field definitions
- Python sync and async helpers: extract and aextract
- Reusable Python and TypeScript clients with matching images.extract namespaces
- OpenAI-compatible model configuration so you can bring your provider setup
- Local image paths, base64 images, and remote image URLs
- Automatic retries through the underlying provider client

The first example uses a Pydantic schema so you can see the intended shape: the
schema is the contract, and extract fills it from the image.

    from pydantic import BaseModel, Field
    from viscribe.images import extract


    class Receipt(BaseModel):
        merchant_name: str | None = Field(description="Store or business name")
        total_amount: float | None = Field(description="Final total on the receipt")
        date: str | None = Field(description="Receipt date if visible")
        line_items: list[str] = Field(description="Visible purchased items")


    result = extract(
        image_path="examples/receipt.png",
        output_schema=Receipt,
        instruction="Extract the receipt fields visible in the image.",
        model_config={
            "model": "gpt-5-mini",
            "api_key": "sk-...",
            "temperature": 1,
        },
    )
    print(result.data.model_dump())

TypeScript:

    import { images } from "viscribe";

    const result = await images.extract({
      imagePath: "examples/receipt.png",
      outputSchema: [
        { name: "merchant_name", type: "text", description: "Store or business name" },
        { name: "total_amount", type: "number", description: "Final total on the receipt" },
        { name: "date", type: "text", description: "Receipt date if visible" },
        { name: "line_items", type: "array_text", description: "Visible purchased items" },
      ],
      instruction: "Extract the receipt fields visible in the image.",
      modelConfig: {
        model: "gpt-5-mini",
        apiKey: "sk-...",
        temperature: 1,
      },
    });
    console.log(result.data);

Note: Viscribe works with OpenAI-compatible chat completion providers that support vision. Load your API key from an environment variable instead of hardcoding it in your code.

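One way to follow the note about API keys is to build the model configuration from the environment. The sketch below is illustrative only: `load_model_config` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `VISCRIBE_MODEL` are assumptions, not names documented by Viscribe.

```python
import os
from typing import Mapping


def load_model_config(env: Mapping[str, str] = os.environ) -> dict:
    """Build a model_config dict without hardcoding the API key."""
    # OPENAI_API_KEY and VISCRIBE_MODEL are assumed variable names;
    # rename them to match your own provider setup.
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before calling extract().")
    return {
        "model": env.get("VISCRIBE_MODEL", "gpt-5-mini"),
        "api_key": api_key,
        "temperature": 1,
    }
```

The returned dictionary has the same keys as the `model_config` in the receipt example, so it could be passed as `model_config=load_model_config()`.
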
Use extract for every image workflow. Change the schema and instruction to
change the job.
| Workflow | Schema fields you might request |
|---|---|
| Receipt or invoice parsing | merchant_name, total_amount, date, line_items |
| Product cataloging | name, brand, category, attributes, tags |
| Visual summaries | summary, visible_objects, scene_type, evidence |
| Category routing | category, confidence, rationale |
| Visual checks | status, issues, review_notes, requires_review |
| Agent tools | answer, evidence, next_action |
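Since each row in the table differs only in schema and instruction, the workflows can be kept as plain data. The following is a minimal sketch under that assumption: `WORKFLOWS` and `extract_kwargs` are hypothetical names, and the descriptions and instructions are placeholders rather than recommended prompts. The field types are the simple field types Viscribe accepts.

```python
# Each workflow is a schema plus an instruction; the extract() call stays the same.
WORKFLOWS = {
    "category_routing": {
        "output_schema": [
            {"name": "category", "type": "text", "description": "Best-fitting category"},
            {"name": "confidence", "type": "number", "description": "Confidence from 0 to 1"},
            {"name": "rationale", "type": "text", "description": "Short reason for the choice"},
        ],
        "instruction": "Assign the image to one category and explain why.",
    },
    "product_cataloging": {
        "output_schema": [
            {"name": "name", "type": "text", "description": "Product name"},
            {"name": "brand", "type": "text", "description": "Brand if visible"},
            {"name": "tags", "type": "array_text", "description": "Search tags"},
        ],
        "instruction": "Extract catalog fields for the product shown.",
    },
}


def extract_kwargs(workflow: str, image_path: str) -> dict:
    """Return keyword arguments for extract() for a named workflow."""
    spec = WORKFLOWS[workflow]  # raises KeyError for an unknown workflow
    return {"image_path": image_path, **spec}
```

With this in place, a call would look like `extract(**extract_kwargs("category_routing", "examples/receipt.png"))`.
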

    from viscribe.images import extract

    result = extract(
        image_path="examples/venice.png",
        output_schema=[
            {"name": "location", "type": "text", "description": "Likely place shown"},
            {
                "name": "visible_elements",
                "type": "array_text",
                "description": "Objects and structures",
            },
            {"name": "colors", "type": "array_text", "description": "Dominant colors"},
        ],
        instruction="Extract useful scene metadata for a travel catalog.",
    )
    print(result.data)

TypeScript:

    import { images } from "viscribe";

    const result = await images.extract({
      imagePath: "examples/venice.png",
      outputSchema: [
        { name: "location", type: "text", description: "Likely place shown" },
        {
          name: "visible_elements",
          type: "array_text",
          description: "Objects and structures",
        },
        { name: "colors", type: "array_text", description: "Dominant colors" },
      ],
      instruction: "Extract useful scene metadata for a travel catalog.",
    });
    console.log(result.data);

Python accepts simple field definitions, JSON Schema dictionaries, and Pydantic model classes.
TypeScript accepts simple field definitions and JSON Schema objects.

Simple field types:

- text: single text value
- number: single numeric value
- array_text: array of text values
- array_number: array of numeric values

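To show how the four simple field types relate to JSON Schema, here is a small illustrative converter. It is a sketch of one plausible correspondence, not Viscribe's internal conversion: `SIMPLE_TO_JSON_SCHEMA` and `to_json_schema` are hypothetical names, and marking every field as required is an assumption.

```python
# Assumed mapping from the simple field types to JSON Schema fragments.
SIMPLE_TO_JSON_SCHEMA = {
    "text": {"type": "string"},
    "number": {"type": "number"},
    "array_text": {"type": "array", "items": {"type": "string"}},
    "array_number": {"type": "array", "items": {"type": "number"}},
}


def to_json_schema(fields: list[dict]) -> dict:
    """Convert simple field definitions into a JSON Schema object."""
    properties = {}
    for field in fields:
        # Copy the fragment so the shared mapping is never mutated;
        # an unknown type raises KeyError.
        prop = dict(SIMPLE_TO_JSON_SCHEMA[field["type"]])
        if "description" in field:
            prop["description"] = field["description"]
        properties[field["name"]] = prop
    # Treating every field as required is an assumption of this sketch.
    return {"type": "object", "properties": properties, "required": list(properties)}
```

Writing the JSON Schema by hand becomes worthwhile once a field needs something the simple types cannot express, such as nested objects.
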

    from pydantic import BaseModel, Field
    from viscribe.images import extract


    class Scene(BaseModel):
        location: str | None = Field(description="Likely city, place, or landmark")
        visible_elements: list[str] = Field(description="Objects and structures")
        colors: list[str] = Field(description="Dominant colors")
        evidence: list[str] = Field(description="Visible clues used for the output")


    result = extract(
        image_path="examples/venice.png",
        output_schema=Scene,
        instruction="Extract scene metadata for indexing.",
    )
    print(result.data.model_dump())

Python has direct async support with aextract:

    import asyncio

    from viscribe.images import aextract


    async def main() -> None:
        result = await aextract(
            image_path="examples/receipt.png",
            output_schema=[{"name": "total_amount", "type": "number"}],
            instruction="Extract the visible receipt total.",
        )
        print(result.data)


    asyncio.run(main())

You can also reuse a client:

    from viscribe import ViscribeAI

    client = ViscribeAI(model_config={"model": "gpt-5-mini", "temperature": 1})
    result = client.images.extract(
        image_path="examples/receipt.png",
        output_schema=[{"name": "total_amount", "type": "number"}],
    )

TypeScript client:

    import { ViscribeAI } from "viscribe";

    const client = new ViscribeAI({
      modelConfig: { model: "gpt-5-mini", temperature: 1 },
    });
    const result = await client.images.extract({
      imagePath: "examples/receipt.png",
      outputSchema: [{ name: "total_amount", type: "number" }],
    });
    console.log(result.data);

For detailed documentation, visit docs.viscribe.ai.

Python:

    cd python
    uv sync
    uv run ruff check .
    uv run python -m pytest
    uv build

TypeScript:

    cd typescript
    npm install
    npm test
    npm run typecheck
    npm run build
    npm pack --dry-run

Docs:

    python3 -m json.tool docs/docs.json >/dev/null
    npm run format:check
    git diff --check

For more about setting up the development environment and contributing to the project, see the Contributing Guide.

- Email: support@viscribe.ai
- GitHub Issues: Create an issue
- Feature Requests: Request a feature
- Discord: Join the community
Feel free to contribute and join our Discord server to discuss improvements, share use cases, and give suggestions.
Please see the contributing guidelines.
This project is licensed under the MIT License. See the LICENSE file for details.
Made with ❤️ by ViscribeAI