This repository contains a TypeScript implementation of a "Chain of Experts" (CoE) LLM application featuring dynamic expert management, resilient processing with retries, cloud deployment using Terragrunt/Terraform (AWS primary, GCP optional), and observability using Langfuse.
It is an experimental project and is not meant to be fully functional.
The application implements a Chain of Experts pattern where multiple specialized components (experts) process data sequentially. Key features include:
- Dynamic Expert Management: A UI allows creating, configuring, and removing custom experts at runtime.
- Resilient Processing: Includes retry logic with exponential backoff for handling transient errors during expert execution.
- Parallel Execution: Supports running experts concurrently for tasks that allow parallel processing (e.g., processing multiple inputs independently).
- Multi-Provider LLM Support: A flexible abstraction layer supports multiple LLM providers (OpenAI, Gemini) with provider selection strategies and per-expert configurations.
- Vector Database Integration: Uses ChromaDB for document retrieval (can be extended).
- Multi-Cloud Deployment: Infrastructure managed by Terragrunt and Terraform modules for AWS (primary) and GCP (optional).
- Observability: Integrated with Langfuse for detailed tracing and monitoring of chain executions and LLM calls.
The default chain includes:
- Data Retrieval Expert: Retrieves relevant documents.
- LLM Summarization Expert: Summarizes retrieved documents.
See the Architecture Guide and LLM Providers Guide for more details.
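To make the pattern concrete, the sketch below shows what an expert and a sequential chain might look like in TypeScript. The `Expert` interface and `runChain` helper are illustrative assumptions for this README, not the repository's actual types; see the Architecture Guide for the real abstractions.

```typescript
// Hypothetical shape of an expert and a sequential chain runner.
// Names and signatures here are assumptions, not the repository's actual types.
interface ExpertInput {
  type: string;
  [key: string]: unknown;
}

interface Expert {
  name: string;
  process(input: ExpertInput): Promise<ExpertInput>;
}

// Runs experts in order, feeding each expert's output into the next one.
async function runChain(experts: Expert[], input: ExpertInput): Promise<ExpertInput> {
  let current = input;
  for (const expert of experts) {
    current = await expert.process(current);
  }
  return current;
}
```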
- Node.js 20.x
- npm (or yarn)
- Docker and Docker Compose (for running ChromaDB locally)
- Terraform v1.5+
- Terragrunt v0.45+ (for cloud deployment)
- AWS CLI (configured with credentials)
- Google Cloud SDK (optional, configured if deploying to GCP)
- Langfuse account and API keys
- OpenAI API key (and/or Google API key if using Gemini)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd <repository-name>
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Create a `.env` file in the project root with the required environment variables:

  ```
  # Langfuse Configuration
  LANGFUSE_SECRET_KEY=your_langfuse_secret_key
  LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
  LANGFUSE_BASEURL=https://cloud.langfuse.com # Or your self-hosted URL

  # LLM Provider Configuration
  DEFAULT_LLM_PROVIDER=openai # Options: openai, gemini
  DEFAULT_LLM_STRATEGY=fallback-default # Provider selection strategy

  # OpenAI Configuration
  OPENAI_API_KEY=your_openai_api_key
  OPENAI_MODEL=gpt-4o # Options: gpt-4o, gpt-4-turbo, gpt-3.5-turbo

  # Google Gemini Configuration (Optional)
  GOOGLE_API_KEY=your_gemini_api_key
  GEMINI_MODEL=gemini-1.5-pro # Options: gemini-1.5-pro, gemini-1.5-flash, gemini-1.0-pro

  # Per-Expert LLM Configuration (Optional)
  SUMMARIZATION_PROVIDER=openai # Provider for summarization expert
  SUMMARIZATION_MODEL=gpt-4o # Model for summarization expert
  SUMMARIZATION_STRATEGY=fallback-default # Strategy for summarization expert
  QUERY_REFORMULATION_PROVIDER=openai # Provider for query reformulation expert
  QUERY_REFORMULATION_MODEL=gpt-4o # Model for query reformulation expert
  QUERY_REFORMULATION_STRATEGY=quality-based # Strategy for query reformulation expert

  # Optional: For local Terraform testing (if applicable)
  # AWS_ACCESS_KEY_ID=...
  # AWS_SECRET_ACCESS_KEY=...
  # AWS_REGION=...
  ```

- Start the ChromaDB vector database:

  ```bash
  npm run db:start
  ```

- Initialize the vector database with sample documents:

  ```bash
  npm run db:init
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

  Alternatively, you can run all of the above with a single command:

  ```bash
  npm run start:all
  ```

The API will be available at http://localhost:8080 (or the port specified in `config.ts`).
The application includes a UI section (accessible via the sidebar) for managing experts:
- View currently registered experts and their configurations.
- Create new custom experts (Note: requires backend implementation for custom logic).
- Edit the description and parameters of custom experts.
- Delete custom experts (built-in experts are protected).
A visual representation of the selected expert chain is displayed in the sidebar, showing the flow and status of each expert during processing.
The ChainManager automatically retries failed expert processing steps using exponential backoff (default 3 attempts) to handle transient errors.
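As an illustration of this retry behavior, a minimal exponential-backoff helper might look like the sketch below. The `withRetries` name and its defaults are assumptions for the example, not the ChainManager's actual API.

```typescript
// Minimal exponential-backoff retry helper (illustrative only).
// Retries a failing async operation up to `maxAttempts` times,
// doubling the delay after each failure.
async function withRetries<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Wait 500ms, 1000ms, 2000ms, ... before the next attempt.
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```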
Health check endpoint. Returns `{"status": "ok"}`.
Returns a list of configurations for all available experts. Response Example:

```json
{
"experts": [
{
"name": "data-retrieval",
"description": "Retrieves relevant documents based on a query",
"parameters": {},
"factory": "Function"
},
{
"name": "llm-summarization",
"description": "Summarizes documents using an LLM",
"parameters": {},
"factory": "Function"
}
// ... other experts
]
}
```

Returns the configuration for a specific expert.
Registers a new custom expert (requires backend logic for the factory).
Request Body: { "name": "my-custom-expert", "description": "...", "parameters": {...} }
Updates the description and parameters of an existing custom expert.
Request Body: { "description": "...", "parameters": {...} }
Deletes a custom expert (cannot delete built-in experts).
Processes input through the specified chain of experts. Handles retries internally. Supports sequential (default) and parallel execution modes.
Request Body Example (Parallel):

```json
{
"input": { ... },
"expertNames": ["expertA", "expertB"],
"options": {
"executionMode": "parallel"
}
}
```

Success Response (Parallel Example):

```json
{
"result": {
"expertA": { /* output from expertA */ },
"expertB": { /* output from expertB */ }
},
"success": true
}
```

Request Body:

```json
{
"input": {
"type": "query",
"query": "What is the Chain of Experts pattern?"
// Add other input fields as needed by experts
},
"expertNames": ["data-retrieval", "llm-summarization"], // Order matters
"userId": "user-123", // Optional
"sessionId": "session-456" // Optional
}
```

Success Response (Example):

```json
{
"result": { // Output from the *last* expert in the chain
"summary": "The Chain of Experts pattern involves sequential processing...",
"summaryLength": 123,
"tokenUsage": {
"promptTokens": 50,
"completionTokens": 73,
"totalTokens": 123
}
},
"success": true
}
```

Error Response (Example):

```json
{
  "result": null,
  "success": false,
  "error": "Error in expert 'llm-summarization': LLM failed to generate a summary."
}
```
The application uses ChromaDB as a vector database for storing and retrieving documents based on semantic similarity. The DataRetrievalExpert queries this database to find documents relevant to the user's query.
- Collection: `sample_documents` - Contains sample documents about various topics
- Document Format: Each document has text content and metadata (title, category, source)
- Embedding Model: OpenAI's text-embedding-ada-002 model is used for creating embeddings
- Start ChromaDB: `npm run db:start` (runs ChromaDB in a Docker container)
- Stop ChromaDB: `npm run db:stop` (stops and removes the Docker container)
- Initialize Database: `npm run db:init` (checks if ChromaDB is running and populates it with sample documents)
- Populate Database: `npm run db:populate` (adds sample documents to the database)
To add your own documents to the vector database, modify the `sampleDocuments` array in `src/vectordb/populateDb.ts` or create a new script that uses the `addDocuments` function from `src/vectordb/chromaClient.ts`.
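For example, a small script along these lines could add custom documents. The document shape and the `addDocuments` signature shown here are assumptions; check `src/vectordb/chromaClient.ts` for the real interface.

```typescript
// Illustrative script for adding custom documents to ChromaDB.
// The document shape and addDocuments signature are assumptions;
// adapt them to the exports in src/vectordb/chromaClient.ts.
import { addDocuments } from "./vectordb/chromaClient";

const myDocuments = [
  {
    text: "The Chain of Experts pattern composes specialized components sequentially.",
    metadata: { title: "CoE Overview", category: "architecture", source: "internal-notes" },
  },
  {
    text: "Exponential backoff doubles the wait time between retry attempts.",
    metadata: { title: "Retry Strategies", category: "resilience", source: "internal-notes" },
  },
];

async function main() {
  await addDocuments("sample_documents", myDocuments);
  console.log(`Added ${myDocuments.length} documents to the collection.`);
}

main().catch((error) => {
  console.error("Failed to add documents:", error);
  process.exit(1);
});
```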
The application supports multiple LLM providers with different selection strategies:
- Default Strategy: Uses the preferred provider if specified, otherwise uses the first available provider.
- Fallback Strategy: Tries the primary provider first, then falls back to other providers if the primary fails (see the sketch after this list).
- Cost-Based Strategy: Selects the cheapest provider that meets the requirements.
- Quality-Based Strategy: Selects the highest quality provider for the specific task.
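As a rough illustration of how a selection strategy might be modeled, the sketch below shows a fallback-style selector. The `LLMProvider` interface and `selectWithFallback` helper are assumptions for this example rather than the project's actual abstractions; see the LLM Providers Guide for those.

```typescript
// Illustrative fallback selection: try providers in priority order and
// return the first one that reports itself as available.
// The LLMProvider interface here is an assumption, not the project's real type.
interface LLMProvider {
  name: string;
  isAvailable(): Promise<boolean>;
  complete(prompt: string): Promise<string>;
}

async function selectWithFallback(providers: LLMProvider[]): Promise<LLMProvider> {
  for (const provider of providers) {
    if (await provider.isAvailable()) {
      return provider;
    }
  }
  throw new Error("No LLM provider is currently available");
}
```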
Each expert can be configured to use a specific LLM provider, model, and selection strategy:

```json
{
"expertName": "llm-summarization",
"provider": "openai",
"model": "gpt-4o",
"selectionStrategy": "fallback-default",
"priority": "quality"
}
```

See the LLM Providers Guide for detailed configuration options and implementation details.
The project uses Jest for automated testing of the backend components.
- Run all tests: `npm run test`
- Run tests with coverage: `npm run test -- --coverage`
- Run specific test file: `npm run test -- src/tests/experts/dataRetrievalExpert.spec.ts`

Test files are located in the `src/tests` directory.
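As a rough template, a new test could look like the sketch below. The `DataRetrievalExpert` import path, constructor, and `process` method are assumptions, so mirror the existing specs in `src/tests` rather than this snippet.

```typescript
// Illustrative Jest test for an expert. The DataRetrievalExpert API used here
// (import path, constructor, and process method) is an assumption; follow the
// existing specs in src/tests for the real interfaces.
import { DataRetrievalExpert } from "../../experts/dataRetrievalExpert";

describe("DataRetrievalExpert", () => {
  it("returns documents relevant to the query", async () => {
    const expert = new DataRetrievalExpert();

    const output = await expert.process({
      type: "query",
      query: "What is the Chain of Experts pattern?",
    });

    expect(output).toBeDefined();
    expect(Array.isArray(output.documents)).toBe(true);
  });
});
```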
The application is deployed using Terragrunt, which orchestrates Terraform modules for AWS (primary) and GCP (optional). See the Deployment Guide for detailed instructions on setting up the backend and running Terragrunt commands.
A GitHub Actions workflow in .github/workflows/deploy.yml is provided for automated CI/CD.
The application is instrumented with Langfuse for LLM observability. See the Monitoring Guide for details on how to use the Langfuse dashboard and evaluators.
MIT