🧠🔗 CoPCA: Capturing Symbolic Knowledge of Constraints and Incompleteness to Guide Inductive Learning in Neuro-Symbolic Knowledge Graph Completion
Welcome to the official repository for CoPCA, a novel framework that integrates symbolic constraints and incomplete knowledge to guide neuro-symbolic learning. This pipeline enhances the quality of Knowledge Graph Embeddings (KGEs) through logical rule mining, heuristic categorization, and constraint-based learning — paving the way for more explainable and robust downstream tasks such as link prediction.
The CoPCA Pipeline follows these major steps:
- Validation of the Knowledge Graph (KG) using SHACL constraints.
- Mining of Horn rules over the KG using AMIE.
- CoPCA model: categorization of the logical rules into valid and invalid heuristics.
- Transformation of the input KG into a refined KG′ using symbolic knowledge.
- Numerical Knowledge Graph Embedding of KG′ using state-of-the-art KGE models.
- Downstream tasks: link prediction over the vectorized KG representations.
Example downstream task: Predicting whether a football player (yago:Ronaldo) is affiliated with a particular sports team (yago:Portugal).
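As a rough illustration of this downstream task, the sketch below shows how a candidate triple can be scored with a trained KGE model using PyKEEN (referenced below). This is not part of the CoPCA scripts; the entity and relation labels are placeholders that are assumed to exist in the benchmark vocabulary.

```python
# Illustrative sketch only (not part of the CoPCA pipeline scripts).
import torch
from pykeen.pipeline import pipeline

# Train a small TransE model on the built-in YAGO3-10 benchmark (few epochs, demo only).
result = pipeline(model="TransE", dataset="YAGO310",
                  training_kwargs=dict(num_epochs=5))

tf = result.training  # triples factory holding the label-to-id mappings
hrt = torch.as_tensor([[
    tf.entity_to_id["Cristiano_Ronaldo"],                 # head entity (assumed label)
    tf.relation_to_id["isAffiliatedTo"],                  # relation (assumed label)
    tf.entity_to_id["Portugal_national_football_team"],   # tail entity (assumed label)
]])

# Higher (less negative) scores indicate a more plausible triple.
print(result.model.score_hrt(hrt))
```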
├── KG/ # Original, valid, and invalid KGs for benchmarks
│ ├── french_royalty/
│ ├── YAGO3-10/
│ └── DB100K/
│
├── Rules/ # AMIE-mined Horn rules over KGs
│
├── Constraints/ # SHACL constraints for respective KGs
│
├── Symbolic Learning/ # Scripts for heuristic transformation and categorization
│
├── Numerical Learning/ # KGE pipeline scripts (kge.py) and configs (input.json)
├── requirements.txt # Necessary dependencies
└── README.md
KG Size | Benchmark | #Triples | #Entities | #Relations |
---|---|---|---|---|
Large | DB100K | 695,572 | 99,604 | 470 |
Medium | YAGO3-10 | 1,080,264 | 123,086 | 37 |
Small | French Royalty | 10,526 | 2,601 | 12 |
KG Size | Benchmark | #Constraints | #Valid | #Invalid |
---|---|---|---|---|
Large | DB100K | 6 | 406,533 | 45,842 |
Medium | YAGO3-10 | 4 | 407,480 | 44,444 |
Small | French Royalty | 2 | 1,979 | 243 |
We evaluate KG completion using the following KGE models:
- TransE, TransH, TransD
- RotatE, ComplEx, TuckER
- CompGCN
Metrics reported:
- Hits@1, Hits@3, Hits@5, Hits@10
- Mean Reciprocal Rank (MRR)
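For reference, these ranking metrics are computed from the rank of the true entity for each test triple; the plain-Python sketch below shows the standard definitions (independent of any specific KGE library).

```python
# Plain-Python sketch of the reported ranking metrics, computed from the
# 1-based ranks assigned to the true entities of the test triples.
def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

ranks = [1, 3, 12, 2, 7]  # toy example
print({f"Hits@{k}": hits_at_k(ranks, k) for k in (1, 3, 5, 10)})
print("MRR:", mean_reciprocal_rank(ranks))
```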
git clone https://github.com/SDM-TIB/CoPCA.git
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Navigate to the Symbolic Learning/ directory and follow the steps below:
python Validation.py
This script will execute the SHACL constraints over the respective benchmark KG.
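The implementation details of Validation.py are specific to this repository; as a rough sketch of the underlying idea, SHACL validation of a benchmark KG against a shapes graph can be performed with the pySHACL library as shown below (file paths are hypothetical).

```python
# Illustrative sketch only (not the Validation.py implementation).
from rdflib import Graph
from pyshacl import validate

# File paths are hypothetical placeholders for a benchmark KG and its SHACL shapes.
data_graph = Graph().parse("KG/french_royalty/original.nt", format="nt")
shapes_graph = Graph().parse("Constraints/french_royalty_shapes.ttl", format="turtle")

# Returns whether the KG conforms, plus an RDF report graph and a textual report
# listing the constraint violations.
conforms, report_graph, report_text = validate(data_graph, shacl_graph=shapes_graph)
print("Conforms:", conforms)
print(report_text)
```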
python constraint-driven-pca-calculator.py --input input.json
This script estimates the metrics PCA_valid and PCA_invalid.
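The exact definitions are part of the CoPCA model; assuming they follow AMIE's partial completeness assumption (PCA) confidence, restricted to the constraint-valid and constraint-invalid partitions of the KG, a simplified computation for a rule body(x, y) ⇒ r(x, y) could look as follows (helper names are hypothetical).

```python
# Simplified sketch under an assumption: AMIE-style PCA confidence of a rule
# "body(x, y) => r(x, y)", evaluated separately on the valid and invalid partitions.
def pca_confidence(predictions, triples, head_relation):
    """predictions: set of (x, y) pairs entailed by the rule body;
    triples: set of (s, p, o) facts in one partition (valid or invalid)."""
    known = {(s, o) for s, p, o in triples if p == head_relation}
    subjects_with_head_fact = {s for s, _ in known}
    hits = sum(pair in known for pair in predictions)
    # PCA denominator: predictions whose subject already has some fact for the head relation.
    denom = sum(x in subjects_with_head_fact for x, _ in predictions)
    return hits / denom if denom else 0.0

# PCA_valid / PCA_invalid would then be obtained by applying the same computation
# to the valid and invalid triple partitions produced by the SHACL validation step.
```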
python Symbolic_predictions.py --input input-symbolicPred.json
This script generates the enriched KG based on the selected valid and invalid rules. This enriched KG is then used to measure the performance of the numerical learning models.
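The enrichment logic itself lives in the repository's scripts; at a high level, predictions of selected valid rules can be added to the KG while predictions supported only by invalid rules are discarded, as in the rough sketch below (the rule representation and example facts are hypothetical).

```python
# Rough sketch only (hypothetical rule representation, not the repository's code).
def apply_rule(kg, body_relation, head_relation):
    """For a simple Horn rule body_relation(x, y) => head_relation(x, y),
    return the head triples it entails over the KG."""
    return {(s, head_relation, o) for s, p, o in kg if p == body_relation}

kg = {("Louis_XIV", "hasChild", "Louis_de_France")}              # toy fact
valid_predictions = apply_rule(kg, "hasChild", "hasSuccessor")   # from a rule kept as valid
invalid_predictions = apply_rule(kg, "hasChild", "hasSpouse")    # from a rule flagged as invalid

# KG' keeps the original facts plus predictions of valid rules,
# excluding those produced only by invalid rules.
kg_prime = (kg | valid_predictions) - invalid_predictions
print(sorted(kg_prime))
```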
Now navigate to the Numerical Learning/ directory to execute the KGE models:
python kge.py
This script takes input.json as input to select the respective KGE model, the benchmark KG, and the path where the results (e.g., Hits@1) are stored.
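The concrete schema of input.json is defined by kge.py; purely as a hypothetical illustration of the kind of information it carries (model choice, benchmark KG, and results path), such a configuration could be produced as follows. The field names are assumptions, not the actual keys expected by the script.

```python
# Hypothetical illustration only: the actual keys expected by kge.py may differ.
import json

config = {
    "kge_model": "TransE",           # which KGE model to train (assumed key)
    "benchmark": "french_royalty",   # which benchmark KG variant to embed (assumed key)
    "results_path": "results/french_royalty_transe/",  # where metrics such as Hits@1 are written (assumed key)
}

with open("input.json", "w") as f:
    json.dump(config, f, indent=2)
```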
- French Royalty KG
- YAGO3-10
- DB100K
Each benchmark includes original, validated, and constraint-filtered variants of the KG. Find DB100K and YAGO3-10 benchmarks in Leibniz Data Manager: https://doi.org/10.57702/y3f76e2h
- PyKEEN – Ali et al., 2021: paper
- SPaRKLE – Purohit et al., 2023: paper
- VISE – Purohit et al., 2024: paper
- VANILLA – Purohit et al., 2025: paper
CoPCA has been developed by members of the Scientific Data Management Group at TIB as an ongoing research effort. The development is coordinated and supervised by Maria-Esther Vidal. Developed and maintained by:
- Disha Purohit and Yashrajsinh Chudasama
- Feel free to reach out for any issues related to reproducibility or implementation at: disha.purohit@tib.eu
This project is licensed under the MIT License.
This project builds upon contributions and tools from the neuro-symbolic and knowledge representation communities.