Skip to content

dzzie/ai4vb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ai4vb

Agentic AI for VB6. A small, swappable multi-provider AI client suite for VB6, plus a set of examples that put a model to work against live things — an object graph, a database, an image, several models at once, a photo library, a folder of documents. The unifying idea: give a model direct access to something live and stay out of its way. Cloud or fully local.

Note: each AI class still has a Form1.list1.additem call for debugging, you can just nuke it, ive kept it so far


Repository layout

The base classes live at the root — the reusable AI client suite, shared by everything. Each example is a peer folder that builds its own host, prompt, and domain on top of them. No example is privileged; they're six different domains driven by the same backends.

/                      base classes: the four AI backend clients + CJSON, CLogger, CFileSystem2
/1_automation_example  agent over a live VB6 object graph via the JScript bridge   (see its README)
/2_db_example          agent given dynamic access to a SQLite database through the app
/3_image_example       upload an image, ask a question about it
/4_multiagent_chat     round-robin debate across multiple backends
/5_img_tagging         resumable image-catalog + vision-tagging app
/6_text_tagging        resumable text-catalog + structured-output extraction app

What it does

ai4vb is two things at once:

  • A multi-provider AI client suite — four backend clients behind one interface, swappable without touching the code that uses them.
  • Six worked examples — different ways to hand a model live access to data and let it answer questions, from reasoning over VB6 objects to cataloging a folder of photos to pulling structured fields out of a stack of documents.

Four backends sit behind one interface. Each example wires in whichever it needs:

  • OpenAI/v1/responses, server-side context via previous_response_id.
  • Anthropic/v1/messages, client-maintained history (their API is stateless).
  • Google Gemini — native generateContent.
  • Ollama — local /api/chat, no API key, nothing leaves the box.

That last one carries weight: with Ollama the entire loop — model, data, images, reasoning — runs on your own hardware. Sensitive structures never touch a cloud service, there's no key in the product, and no service that can be discontinued.

The deepest pattern here — an iterative loop where the model drives a live object graph through a JScript bridge — is demonstrated by 1_automation_example. Its mechanics (the run loop, the host.describe contract, the earned gotchas) live in that folder's own README.


The base classes

The reusable substrate at the root. Everything else is built per example.

File Role
COpenAI.cls OpenAI client. Async HTTP with poll-loop and cancel support. Uses previous_response_id for context.
CClaudeAI.cls Anthropic client. Same surface as COpenAI; maintains conversation history in-memory because Anthropic's API is stateless.
CGemini.cls Google Gemini client. Native generateContent; auth via x-goog-api-key, model in the URL path, vision via inline_data.
COllama.cls Local Ollama client. Native /api/chat, keyless, stream:false; vision via base64 image array. Localhost or a remote box via the RemoteIP property.
CJSON.cls JSON parser/path-accessor over ScriptControl.
CLogger.cls Append-only logger with timestamps and section headers. Every run produces a re-readable trail.
CFileSystem2.cls File/folder helpers (exists, read, enumerate).

The two layers

The framework layer (the base classes): a swappable multi-provider AI client suite plus the utilities every example leans on — JSON parsing, logging, filesystem access. Backend-agnostic, domain-agnostic.

The example layer: each folder adds its own host form, prompt, domain, and — where it uses the JScript bridge — its own bridge module (modProtoGen in the automation example, modDbHost in the db example). This is where domain expertise plugs in.

The framework doesn't try to make the AI smart. It tries to give the AI everything it needs and stay out of the way.


The examples

Six domains, one set of base classes. Each folder is self-contained with its own host form, prompt, and supporting modules.

1_automation_example/ — the original demo and the reference implementation of the core pattern: an agent reasoning over a live VB6 object graph (CManager / CUser / CProject) through a JScript bridge, backed by any of the four clients. The run loop, the introspection contract, and the prompt gotchas are documented in its own README.

2_db_example/ — gives the AI dynamic access to a SQLite database through the app to answer user questions about the data. modDbHost exposes the live DB to the bridge (cSQLiteTable, cSQLiteField, CJsonParser); the agent introspects tables and columns and queries through the host rather than walking a hand-built object graph. Proof the "model" can be a data source, not just VB classes.

3_image_example/ — the simplest case. Upload an image and ask a user question about it: image in, answer out, one shot through the backend clients. No agent loop, no bridge — the minimal multimodal path.

4_multiagent_chat/frmAiChatRoom: several backends in a round-robin debate over one shared transcript, each assigned a fixed role under an anti-convergence prompt, with mid-run user interjection. Where the single-agent loop reasons, this one argues — and one agent catching another's fabricated statistic turned out to be an emergent feature, not a designed one.

5_img_tagging/ — a complete image-catalog application (a direct VB6 app on the base clients — no bridge). CImageCatalog recursively scans a folder, MD5-hashes each file (modHash), tags it with a vision model (local qwen2.5-vl via Ollama, or Gemini), and stores everything in SQLite (images.db), searchable by tag. Two-phase and resumable: a mechanical scan pass, then a separate AI classify pass you can stop and restart over thousands of images. The prompt is tuned to name the primary subject first and refuse invented scene tags — earned from watching the model miss the car and the house and hallucinate "outdoor" onto studio shots.

6_text_tagging/ — the text sibling of 5_img_tagging (a direct VB6 app on the base clients — no bridge). CTextCatalog scans a folder of text files, MD5-hashes each (shared modHash), and runs each through a schema-constrained extraction prompt: the model returns a JSON object — category, priority, sentiment, summary, keywords — which the harness parses with CJSON, validates against the allowed enum values, and stores in SQLite (docs.db), searchable by category or keyword. Two-phase and resumable like the image app. The load-bearing idea is that the harness never trusts the model's format: it strips markdown fences, coerces any out-of-range enum to a fallback, and records both a valid flag and the raw JSON, so a model that emits malformed output is caught and auditable rather than silently wrong. A bundled labeled test set run through four backends made the point — the models agreed on the categorization; what separated them was JSON well-formedness and priority calibration, not comprehension.

A longer writeup of the underlying approach is in Agentic_Coding_Against_Live_Object_Models.pdf.


Setup

Each example is self-contained. Open its Project1.vbp in VB6, set your API keys in the form (one per cloud backend; keys persist to the registry, Ollama needs ip/model), pick a backend, and run. The bridge examples (automation, db) also need a Regen to generate their protos. See each example's own notes for the specifics.


What this is not

  • Not a chatbot. The single-agent examples run to completion or to MAX_STAGES; they're not for free-form conversation. (The multi-agent room is the one deliberate exception.)
  • Not a code generator. The agent introspects and reasons about live data, not source files.
  • Not async-first. The HTTP layer is async with polling, but the agent loops are otherwise synchronous and run on the UI thread.
  • Not a typelib reflector. The bridge examples use source-parsed protos rather than ITypeInfo, because Standard EXE VB6 projects don't register typelibs and this works the same on Standard EXE, ActiveX EXE, and ActiveX DLL alike.

Status

The framework is solid and has outgrown its original single example. Tested end-to-end across the cloud and local backends on single-task and multi-task prompts. Self-discovery works (the agent finds new classes added to a project without prompt updates). Cancel works. Hermetic per-stage execution works. Logging captures enough for full post-hoc debugging.

The pattern has now been carried — unchanged at the core — onto a live object graph, a database, a vision model, two resumable cataloging pipelines (image and text), and a multi-agent room. Which was the whole bet: deliver a live model to the agent and stay out of the way.

About

AI integration classes for Claude, ChatGPT, Gemini, and Ollama with demos

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors