ViscribeAI

Extract structured data from images using AI models.

X: @itsperini | LinkedIn: itsperini | Discord | Docs: docs.viscribe.ai | Python 3.10+ | Node.js 20+ | License: MIT

Define the output schema, pass the image, pick the AI model, and get parsed structured output back instead of free-form text.

⭐ If Viscribe helps your project, please leave a star. ⭐

📦 Installation

Python:

pip install viscribe

TypeScript:

npm install viscribe

🚀 Features

  • 🖼️ One schema-driven image extraction API for documents, screenshots, product photos, visual checks, and agent tools
  • 📊 Structured output with Pydantic models, JSON Schema, or simple field definitions
  • 🔄 Python sync and async helpers: extract and aextract
  • ⚙️ Reusable Python and TypeScript clients with matching images.extract namespaces
  • 🧩 OpenAI-compatible model configuration so you can bring your provider setup
  • 📁 Local image paths, base64 images, and remote image URLs
  • ⚡ Automatic retries through the underlying provider client
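
For the base64 input listed above, the image bytes have to be encoded before they are passed along. The following is a minimal standard-library sketch, not Viscribe's own code. `to_data_url` is a hypothetical helper name, and the data-URL form is an assumption: this README does not show which extract parameter accepts base64 input or what exact format it expects, so check the docs before wiring it in.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    # Read the raw bytes and encode them as ASCII-safe base64 text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

If the provider expects a bare base64 string rather than a data URL, the part after the comma is the encoded payload.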

🎯 Quick Start

The first example uses a Pydantic schema so you can see the intended shape: the schema is the contract, and extract fills it from the image.

from pydantic import BaseModel, Field
from viscribe.images import extract


class Receipt(BaseModel):
    merchant_name: str | None = Field(description="Store or business name")
    total_amount: float | None = Field(description="Final total on the receipt")
    date: str | None = Field(description="Receipt date if visible")
    line_items: list[str] = Field(description="Visible purchased items")


result = extract(
    image_path="examples/receipt.png",
    output_schema=Receipt,
    instruction="Extract the receipt fields visible in the image.",
    model_config={
        "model": "gpt-5-mini",
        "api_key": "sk-...",
        "temperature": 1,
    },
)

print(result.data.model_dump())

TypeScript

import { images } from "viscribe";

const result = await images.extract({
  imagePath: "examples/receipt.png",
  outputSchema: [
    { name: "merchant_name", type: "text", description: "Store or business name" },
    { name: "total_amount", type: "number", description: "Final total on the receipt" },
    { name: "date", type: "text", description: "Receipt date if visible" },
    { name: "line_items", type: "array_text", description: "Visible purchased items" },
  ],
  instruction: "Extract the receipt fields visible in the image.",
  modelConfig: {
    model: "gpt-5-mini",
    apiKey: "sk-...",
    temperature: 1,
  },
});

console.log(result.data);

Note: Viscribe works with OpenAI-compatible chat completion providers that support vision input. Load your API key from an environment variable rather than hardcoding it in your code.
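
One way to follow the advice about environment variables is to build the model_config dictionary from the environment. This is a sketch under assumptions: `model_config_from_env` is a hypothetical helper, and `OPENAI_API_KEY` is an assumed variable name that this README does not specify. The dictionary keys match the ones used in the Quick Start example.

```python
import os


def model_config_from_env(model: str = "gpt-5-mini") -> dict:
    """Build a model_config dict without hardcoding the key (hypothetical helper)."""
    # OPENAI_API_KEY is an assumed variable name; use whatever your provider setup defines.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before calling extract")
    return {"model": model, "api_key": api_key, "temperature": 1}


# Usage with the Quick Start example (requires viscribe to be installed):
# from viscribe.images import extract
# result = extract(
#     image_path="examples/receipt.png",
#     output_schema=Receipt,
#     model_config=model_config_from_env(),
# )
```

Failing early with a clear error keeps a missing key from surfacing later as a provider authentication failure.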

📚 Extract API

Use extract for every image workflow. Change the schema and instruction to change the job.

| Workflow | Schema fields you might request |
| --- | --- |
| Receipt or invoice parsing | merchant_name, total_amount, date, line_items |
| Product cataloging | name, brand, category, attributes, tags |
| Visual summaries | summary, visible_objects, scene_type, evidence |
| Category routing | category, confidence, rationale |
| Visual checks | status, issues, review_notes, requires_review |
| Agent tools | answer, evidence, next_action |
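
To make the "change the schema and instruction" idea concrete, two of the workflows listed above can be written as simple field definitions. This is only a sketch: the field names come from the list, but the types, descriptions, and instructions are assumptions chosen for illustration, and `WORKFLOWS` is a hypothetical name.

```python
# Simple field types listed in this README.
SIMPLE_TYPES = {"text", "number", "array_text", "array_number"}

# Field types and descriptions below are illustrative assumptions.
ROUTING_SCHEMA = [
    {"name": "category", "type": "text", "description": "Best matching category"},
    {"name": "confidence", "type": "number", "description": "Confidence from 0 to 1"},
    {"name": "rationale", "type": "text", "description": "Why this category fits"},
]

AGENT_TOOL_SCHEMA = [
    {"name": "answer", "type": "text", "description": "Answer to the agent's question"},
    {"name": "evidence", "type": "array_text", "description": "Visible clues used"},
    {"name": "next_action", "type": "text", "description": "Suggested next step"},
]

# One entry per job: same call, different schema and instruction.
WORKFLOWS = {
    "category_routing": {
        "instruction": "Classify the image into one category.",
        "output_schema": ROUTING_SCHEMA,
    },
    "agent_tool": {
        "instruction": "Answer the question using only what is visible.",
        "output_schema": AGENT_TOOL_SCHEMA,
    },
}

# Usage (requires viscribe to be installed):
# from viscribe.images import extract
# result = extract(image_path="examples/receipt.png", **WORKFLOWS["category_routing"])
```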

Python

from viscribe.images import extract

result = extract(
    image_path="examples/venice.png",
    output_schema=[
        {"name": "location", "type": "text", "description": "Likely place shown"},
        {
            "name": "visible_elements",
            "type": "array_text",
            "description": "Objects and structures",
        },
        {"name": "colors", "type": "array_text", "description": "Dominant colors"},
    ],
    instruction="Extract useful scene metadata for a travel catalog.",
)

print(result.data)

TypeScript

import { images } from "viscribe";

const result = await images.extract({
  imagePath: "examples/venice.png",
  outputSchema: [
    { name: "location", type: "text", description: "Likely place shown" },
    {
      name: "visible_elements",
      type: "array_text",
      description: "Objects and structures",
    },
    { name: "colors", type: "array_text", description: "Dominant colors" },
  ],
  instruction: "Extract useful scene metadata for a travel catalog.",
});

console.log(result.data);

🧱 Schema Options

Python accepts simple field definitions, JSON Schema dictionaries, and Pydantic model classes.

TypeScript accepts simple field definitions and JSON Schema objects.

Simple field types:

  • text: single text value
  • number: single numeric value
  • array_text: array of text values
  • array_number: array of numeric values
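
The four simple field types have natural JSON Schema counterparts. The sketch below shows one plausible mapping for readers who want to move from simple field definitions to a JSON Schema dictionary. It is an illustration, not Viscribe's internal conversion: `simple_fields_to_json_schema` is a hypothetical helper, and marking every field as required is an assumption.

```python
# Assumed mapping from the README's simple field types to JSON Schema fragments.
SIMPLE_TYPE_TO_JSON_SCHEMA = {
    "text": {"type": "string"},
    "number": {"type": "number"},
    "array_text": {"type": "array", "items": {"type": "string"}},
    "array_number": {"type": "array", "items": {"type": "number"}},
}


def simple_fields_to_json_schema(fields: list[dict]) -> dict:
    """Convert simple field definitions into a JSON Schema object (hypothetical helper)."""
    properties = {}
    for field in fields:
        # Copy the fragment so the shared mapping is never mutated.
        # An unknown type raises KeyError.
        prop = dict(SIMPLE_TYPE_TO_JSON_SCHEMA[field["type"]])
        if "description" in field:
            prop["description"] = field["description"]
        properties[field["name"]] = prop
    # Assumption: every declared field is required.
    return {"type": "object", "properties": properties, "required": list(properties)}
```

For example, a field of type array_text becomes an array whose items are strings, with the description carried over unchanged.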

More Complex Python Schema

from pydantic import BaseModel, Field
from viscribe.images import extract


class Scene(BaseModel):
    location: str | None = Field(description="Likely city, place, or landmark")
    visible_elements: list[str] = Field(description="Objects and structures")
    colors: list[str] = Field(description="Dominant colors")
    evidence: list[str] = Field(description="Visible clues used for the output")


result = extract(
    image_path="examples/venice.png",
    output_schema=Scene,
    instruction="Extract scene metadata for indexing.",
)

print(result.data.model_dump())

⚑ Async Usage

Python has direct async support with aextract:

import asyncio
from viscribe.images import aextract


async def main() -> None:
    result = await aextract(
        image_path="examples/receipt.png",
        output_schema=[{"name": "total_amount", "type": "number"}],
        instruction="Extract the visible receipt total.",
    )

    print(result.data)


asyncio.run(main())

You can also reuse a client:

from viscribe import ViscribeAI

client = ViscribeAI(model_config={"model": "gpt-5-mini", "temperature": 1})

result = client.images.extract(
    image_path="examples/receipt.png",
    output_schema=[{"name": "total_amount", "type": "number"}],
)

TypeScript client

import { ViscribeAI } from "viscribe";

const client = new ViscribeAI({
  modelConfig: { model: "gpt-5-mini", temperature: 1 },
});

const result = await client.images.extract({
  imagePath: "examples/receipt.png",
  outputSchema: [{ name: "total_amount", type: "number" }],
});

console.log(result.data);

📖 Documentation

For detailed documentation, visit docs.viscribe.ai.

🛠️ Development

Python:

cd python
uv sync
uv run ruff check .
uv run python -m pytest
uv build

TypeScript:

cd typescript
npm install
npm test
npm run typecheck
npm run build
npm pack --dry-run

Docs:

python3 -m json.tool docs/docs.json >/dev/null
npm run format:check
git diff --check

For more about setting up the development environment and contributing to the project, see the Contributing Guide.

💬 Support & Feedback

🤝 Contributing

Feel free to contribute and join our Discord server to discuss improvements, share use cases, and give suggestions.

Please see the contributing guidelines.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

Made with ❤️ by ViscribeAI
