Turn any data source into an MCP server in 5 minutes.
Build knowledge bases that AI assistants like Claude and Cursor can query directly. No infrastructure needed.
This SDK lets you create MCP (Model Context Protocol) servers from any data source. Your docs, PDFs, websites, or any text can become a queryable knowledge base that AI assistants can access directly.
Use cases:
- Make your documentation searchable by Cursor/Claude
- Build RAG (Retrieval-Augmented Generation) pipelines
- Create custom AI assistants with domain knowledge
- Index research papers, guides, or any text content
```shell
npm install akyn-ai
```

```typescript
import { KnowledgeBase } from 'akyn-ai'

// Create a knowledge base
const kb = new KnowledgeBase({
  name: 'my-docs',
  description: 'My project documentation',
})

// Add your content
await kb.addDirectory('./docs')             // Add all docs from a folder
await kb.addFile('./README.md')             // Add a specific file
await kb.addURL('https://docs.example.com') // Scrape a URL
await kb.addText('Important info here')     // Add raw text

// Serve as MCP server
kb.serveStdio() // For Cursor/Claude Desktop
```

Add to your `.cursor/mcp.json`:
```json
{
  "mcpServers": {
    "my-docs": {
      "command": "npx",
      "args": ["ts-node", "./my-kb.ts"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "my-docs": {
      "command": "npx",
      "args": ["ts-node", "/path/to/my-kb.ts"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```

```typescript
// Files (PDF, DOCX, TXT, Markdown)
await kb.addFile('./guide.pdf')
await kb.addFile('./manual.docx')

// Directories (recursive)
await kb.addDirectory('./docs', {
  recursive: true,
  extensions: ['.md', '.txt', '.pdf'],
})

// URLs
await kb.addURL('https://docs.example.com')
await kb.addURLs([
  'https://example.com/page1',
  'https://example.com/page2',
])

// Raw text
await kb.addText('Custom content here', 'My Notes')
```

Text is automatically split into optimal chunks for embedding:
```typescript
const kb = new KnowledgeBase({
  name: 'my-kb',
  chunking: {
    maxSize: 1000, // Max characters per chunk
    overlap: 200,  // Overlap between chunks for context
  },
})
```

Embeddings use OpenAI by default, but you can bring your own provider:
```typescript
import { KnowledgeBase, OpenAIEmbeddings, type EmbeddingsProvider } from 'akyn-ai'

// Use OpenAI (default)
const kb = new KnowledgeBase({ name: 'my-kb' })

// Or customize OpenAI settings
const kb2 = new KnowledgeBase({
  name: 'my-kb',
  embeddings: new OpenAIEmbeddings({
    model: 'text-embedding-3-large', // Better quality
    apiKey: 'sk-...',
  }),
})

// Or bring your own provider
class MyEmbeddings implements EmbeddingsProvider {
  readonly dimensions = 384

  async embed(text: string) {
    // Your embedding logic here
    return { embedding: [/* 384 numbers */], tokenCount: 100 }
  }

  async embedBatch(texts: string[]) {
    return Promise.all(texts.map((t) => this.embed(t)))
  }
}

const kb3 = new KnowledgeBase({
  name: 'my-kb',
  embeddings: new MyEmbeddings(),
})
```

The in-memory vector store is perfect for development and small datasets:
```typescript
import { InMemoryVectorStore } from 'akyn-ai'

const kb = new KnowledgeBase({
  name: 'my-kb',
  vectorStore: new InMemoryVectorStore({
    persistPath: './kb-data.json', // Optional: save to disk
  }),
})
```

For production workloads, use Qdrant, a high-performance vector database:
```typescript
import { KnowledgeBase, QdrantVectorStore } from 'akyn-ai'

const kb = new KnowledgeBase({
  name: 'my-kb',
  vectorStore: new QdrantVectorStore(), // That's it!
})
```

Local setup (Docker):
```shell
# Start Qdrant with one command
docker run -p 6333:6333 qdrant/qdrant

# With persistent storage
docker run -p 6333:6333 -v ./qdrant_data:/qdrant/storage qdrant/qdrant
```
For managed hosting, use Qdrant Cloud:
```typescript
const kb = new KnowledgeBase({
  name: 'my-kb',
  vectorStore: new QdrantVectorStore({
    url: 'https://your-cluster.cloud.qdrant.io',
    apiKey: process.env.QDRANT_API_KEY,
    collection: 'my-docs', // Optional: defaults to 'akyn_documents'
  }),
})
```

| Option | Type | Default | Description |
|---|---|---|---|
| `url` | string | `http://localhost:6333` | Qdrant server URL |
| `apiKey` | string | - | API key (required for Qdrant Cloud) |
| `collection` | string | `akyn_documents` | Collection name |
| `dimensions` | number | auto-detected | Vector dimensions |
Implement the `VectorStore` interface for other databases (Pinecone, Weaviate, etc.):

```typescript
import type { VectorStore } from 'akyn-ai'

class MyVectorStore implements VectorStore {
  async add(document) { /* ... */ }
  async addBatch(documents) { /* ... */ }
  async search(embedding, options) { /* ... */ }
  async delete(id) { /* ... */ }
  async clear() { /* ... */ }
  async count() { /* ... */ }
}
```

```typescript
// Stdio (for Cursor/Claude Desktop)
kb.serveStdio()

// HTTP (for web clients)
await kb.serveHttp({ port: 3000 })
```

You can also use the CLI without writing code:
```shell
# Index a directory
npx akyn-ai --dir ./docs --name "My Docs"

# Use a config file
npx akyn-ai --config ./kb-config.json

# Run as HTTP server
npx akyn-ai --dir ./docs --http 3000
```

Example config file:

```json
{
  "name": "My Knowledge Base",
  "description": "Project documentation",
  "sources": [
    { "type": "directory", "path": "./docs" },
    { "type": "file", "path": "./README.md" },
    { "type": "url", "url": "https://docs.example.com" }
  ]
}
```

`KnowledgeBase` is the main class for creating and managing knowledge bases.
```typescript
const kb = new KnowledgeBase({
  name: string,                    // Required: Name of the knowledge base
  description?: string,            // Optional: Description
  version?: string,                // Optional: Version (default: '1.0.0')
  embeddings?: EmbeddingsProvider, // Optional: Custom embeddings
  vectorStore?: VectorStore,       // Optional: Custom vector store
  chunking?: ChunkOptions,         // Optional: Chunking settings
  retrieval?: RetrievalOptions,    // Optional: Retrieval settings
})
```

Retrieval options control how many results are returned and their minimum quality. These options are configured in your code (not exposed to AI agents), giving you full control over retrieval behavior.
```typescript
const kb = new KnowledgeBase({
  name: 'my-kb',
  retrieval: {
    topK: 10,       // Return up to 10 chunks per query
    threshold: 0.5, // Only return chunks with similarity score >= 0.5
  },
})
```

| Option | Type | Default | Description |
|---|---|---|---|
| `topK` | number | 5 | Maximum number of chunks to retrieve per query |
| `threshold` | number | 0 | Minimum similarity score (0-1). Set to 0 to return all results, or higher (e.g. 0.5, 0.7) to filter out less relevant chunks |
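Conceptually, these two options act as a filter-and-truncate step over similarity-scored chunks. The sketch below is illustrative only (`applyRetrievalOptions` is a hypothetical helper, not an SDK export, and this is not the SDK's actual implementation):

```typescript
// Hypothetical sketch of how topK and threshold shape query results
type ScoredChunk = { text: string; score: number }

function applyRetrievalOptions(
  results: ScoredChunk[],
  { topK = 5, threshold = 0 }: { topK?: number; threshold?: number } = {},
): ScoredChunk[] {
  return results
    .filter((r) => r.score >= threshold) // drop chunks below the similarity cutoff
    .sort((a, b) => b.score - a.score)   // best matches first
    .slice(0, topK)                      // cap how many chunks are returned
}

const hits = applyRetrievalOptions(
  [
    { text: 'auth guide', score: 0.82 },
    { text: 'changelog', score: 0.31 },
    { text: 'api keys', score: 0.67 },
  ],
  { topK: 2, threshold: 0.5 },
)
// hits: 'auth guide' (0.82) then 'api keys' (0.67); 'changelog' falls below the threshold
```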
| Method | Description |
|---|---|
| `addText(text, name?)` | Add raw text content |
| `addFile(path, name?)` | Add a file (PDF, DOCX, TXT, MD) |
| `addDirectory(path, options?)` | Add all files from a directory |
| `addURL(url, name?)` | Add content from a URL |
| `addURLs(urls)` | Add multiple URLs |
| `query(question, options?)` | Query the knowledge base |
| `listSources()` | List all indexed sources |
| `serveStdio(options?)` | Start stdio MCP server |
| `serveHttp(options?)` | Start HTTP MCP server |
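`query` can also be called directly from your own code, not just via MCP. A minimal sketch (requires an OpenAI API key at runtime; the shape of the returned results isn't documented here, so it is simply logged):

```typescript
import { KnowledgeBase } from 'akyn-ai'

const kb = new KnowledgeBase({ name: 'my-docs' })
await kb.addText('Authenticate by passing your API key in the Authorization header.', 'Auth notes')

// Ask a question against the indexed content
const results = await kb.query('How do I authenticate?')
console.log(results)
```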
```typescript
await kb.serveHttp({
  port: 3000,       // Port to listen on (default: 3000)
  host: '0.0.0.0',  // Host to bind to (default: '0.0.0.0')
  cors: true,       // Enable CORS (default: true)
  corsOrigin: '*',  // CORS origin (default: '*')
  debug: false,     // Enable debug logging (default: false)
})
```

The SDK also exports utilities you can use independently:
```typescript
import {
  // Text processing
  normalizeText,
  chunkText,
  extractTextFromHTML,
  stripMarkdown,
  // File loading
  loadFile,
  loadDirectory,
  loadURL,
  // Embeddings
  OpenAIEmbeddings,
  cosineSimilarity,
  // Vector stores
  InMemoryVectorStore,
  QdrantVectorStore,
} from 'akyn-ai'
```

When connected via MCP, your knowledge base exposes these tools:
Search the knowledge base with a natural language question.
```json
{
  "name": "query",
  "arguments": {
    "question": "How do I authenticate?"
  }
}
```

| Parameter | Type | Description |
|---|---|---|
| `question` | string | The question to search for |
Note: The number of results and similarity threshold are configured via the `retrieval` option when creating the KnowledgeBase. See Retrieval Options.
List all indexed sources in the knowledge base.
```json
{
  "name": "list_sources",
  "arguments": {}
}
```

See the examples directory for more.

Requirements:
- Node.js 18+
- OpenAI API key (or custom embeddings provider)
Building something bigger? Check out Akyn for:
- Hosted knowledge bases
- Team collaboration
- Usage analytics
- Monetization (charge for queries)
- API key management
Contributions welcome! Please read our contributing guidelines first.
MIT © Akyn AI