- Generate the code and install the binary in GOBIN:
    make generate-code-design
    make install-bin
- Configure the following in Claude Desktop settings:
    {
      "mcpServers": {
        "greetingmcp": {
          "command": "$GOPATH/bin/mcp-rag-vector",
          "args": ["mcp"]
        }
      }
    }
- Install ollama and run
    ollama serve
- Start the stack
    docker compose up -d --build
- Wait for the model to be downloaded (2 GB)
- Query the API
    curl -X POST http://localhost:8000/api/chat \
      -H "Content-Type: application/json" \
      -d '{
        "model": "llama3.2:3b",
        "messages": [
          { "role": "user", "content": "Use the greet tool with my name thomas, return what it says" }
        ],
        "stream": false
      }'
- Start the API
    docker compose up    # or: make run-server
- Use
    make http-call-mcp
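
The "wait for the model" and "query the API" steps above could be scripted roughly as follows. This is only a sketch under assumptions that the steps do not state: that Ollama listens on its default address `http://localhost:11434`, that the model is pulled into that Ollama instance, and that its `GET /api/tags` response lists the model once the download has finished. The function names (`has_model`, `wait_for_model`, `chat_payload`, `chat`) and the `OLLAMA_URL`, `API_URL`, and `MODEL` variables are hypothetical helpers, not part of the project.

```shell
#!/usr/bin/env bash
# Sketch: wait until the model is available, then send the chat request.
# Assumed (not from the steps above): Ollama's default address and the
# idea that the model shows up in GET /api/tags once downloaded.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"  # assumption: default Ollama port
API_URL="${API_URL:-http://localhost:8000}"         # taken from the curl step
MODEL="${MODEL:-llama3.2:3b}"                       # taken from the curl step

# Reads an /api/tags JSON document on stdin and succeeds if it lists model $1.
has_model() {
  grep -q "\"name\":[[:space:]]*\"$1\""
}

# Polls /api/tags every 5 seconds until model $1 appears.
# Gives up after $2 seconds (default 600) and returns non-zero.
wait_for_model() {
  local deadline=$(( $(date +%s) + ${2:-600} ))
  until curl -fsS "$OLLAMA_URL/api/tags" 2>/dev/null | has_model "$1"; do
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep 5
  done
}

# Builds the same request body as the curl step.
# Limitation: $2 must not contain double quotes or backslashes.
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false}' "$1" "$2"
}

# Sends one non-streaming chat request with prompt $1.
chat() {
  curl -fsS -X POST "$API_URL/api/chat" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$MODEL" "$1")"
}
```

After sourcing the script, a call such as `wait_for_model "$MODEL" && chat "Use the greet tool with my name thomas, return what it says"` would block until the model is listed and then send the same request as the curl step. The script only defines functions, so nothing touches the network until one of them is called.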