RAG Architecture for CIDOC-CRM

Graph-based RAG (Retrieval-Augmented Generation) system for querying CIDOC-CRM RDF data.

Repository Structure

CRM_RAG/
├── config/              Configuration files
│   ├── .env.openai.example
│   ├── .env.claude.example
│   ├── .env.r1.example
│   ├── .env.ollama.example
│   ├── .env.secrets.example
│   ├── interface.yaml        # Chat interface customization
│   └── README.md             # Configuration guide
├── data/                All data files
│   ├── ontologies/      CIDOC-CRM, VIR, CRMdig ontology files
│   ├── labels/          Extracted labels (auto-generated)
│   ├── cache/           Document graph and vector index (auto-generated)
│   └── documents/       Entity documents (auto-generated)
├── docs/                Documentation
│   ├── ARCHITECTURE.md
│   └── REORGANIZATION_PLAN.md
├── scripts/             Utility scripts
│   └── extract_ontology_labels.py
├── logs/                Application logs
├── static/              Web interface CSS and JavaScript
├── templates/           Web interface HTML templates
└── *.py                 Python application files

Setup

1. Install Dependencies

pip install -r requirements.txt

2. Configure API Keys

Create configuration files from templates in the config/ directory:

# Copy the secrets template (for API keys)
cp config/.env.secrets.example config/.env.secrets

# Edit config/.env.secrets and add your actual API keys
OPENAI_API_KEY=your_actual_openai_key_here
ANTHROPIC_API_KEY=your_actual_anthropic_key_here
R1_API_KEY=your_actual_r1_key_here

# Copy the provider configuration you want to use
cp config/.env.openai.example config/.env.openai
# OR
cp config/.env.claude.example config/.env.claude
# OR
cp config/.env.r1.example config/.env.r1
# OR
cp config/.env.ollama.example config/.env.ollama

3. Configure SPARQL Endpoint

The default SPARQL endpoint is http://localhost:3030/asinou/sparql.

To change the endpoint: Edit the FUSEKI_ENDPOINT variable in your config/.env.* files:

# Example: config/.env.openai, config/.env.claude, etc.
FUSEKI_ENDPOINT=http://your-server:3030/your-dataset/sparql

4. Extract Ontology Labels

Extract English labels from ontology files (required on first run):

python scripts/extract_ontology_labels.py

This creates label files in data/labels/ used by the RAG system.

5. Customize Chat Interface (Optional)

Customize the chatbot title, welcome message, and example questions by editing config/interface.yaml:

page_title: "Your Dataset Chat"
header_title: "Your Custom Chatbot"
welcome_message: "Hello! Ask me about your dataset..."
example_questions:
  - "Your first example question?"
  - "Your second example question?"

See config/README.md for detailed customization options.

6. Start Your SPARQL Endpoint

Ensure your SPARQL server is running with your CIDOC-CRM dataset loaded at the configured endpoint.

Usage

# Run with OpenAI
python main.py --env .env.openai

# Run with Claude
python main.py --env .env.claude

# Run with R1
python main.py --env .env.r1

# Run with Ollama (no API key needed)
python main.py --env .env.ollama

# Force rebuild of document graph and vector store
python main.py --env .env.openai --rebuild

Access the chat interface at http://localhost:5001

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
config		config
data/ontologies		data/ontologies
docs		docs
scripts		scripts
static		static
templates		templates
.gitignore		.gitignore
README.md		README.md
config_loader.py		config_loader.py
graph_document_store.py		graph_document_store.py
llm_providers.py		llm_providers.py
main.py		main.py
requirements.txt		requirements.txt
universal_rag_system.py		universal_rag_system.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

RAG Architecture for CIDOC-CRM

Repository Structure

Setup

1. Install Dependencies

2. Configure API Keys

3. Configure SPARQL Endpoint

4. Extract Ontology Labels

5. Customize Chat Interface (Optional)

6. Start Your SPARQL Endpoint

Usage

About

Uh oh!

Releases

Packages

Uh oh!

Languages

ncarboni/CRM-RAG

Folders and files

Latest commit

History

Repository files navigation

RAG Architecture for CIDOC-CRM

Repository Structure

Setup

1. Install Dependencies

2. Configure API Keys

3. Configure SPARQL Endpoint

4. Extract Ontology Labels

5. Customize Chat Interface (Optional)

6. Start Your SPARQL Endpoint

Usage

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages