Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph

A PyTorch implementation of Knowledge Graph Pseudo-Labeling (KGPL), inspired by the paper: "Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph" by Riku Togashi, Mayu Otani, and Shin’ichi Satoh. (https://arxiv.org/abs/2011.05061)

Overview

This project provides a functional PyTorch implementation of the KGPL model, which addresses cold-start problems in personalized recommendation systems by pseudo-labeling over knowledge graphs.

The recommendation model uses a knowledge graph to identify potential positive items for each user by focusing on neighbors in the graph structure, and treats unobserved user-item interactions as weakly-positive instances via pseudo-labeling. To mitigate popularity bias, the model uses an improved negative sampling strategy. The recommender also implements a co-training approach with dual student models to improve learning stability and robustness.

Results

The PyTorch reimplementation of KGPL demonstrates stable co-training dynamics and successfully replicates the original model’s behavior. Both student models (f and g) showed synchronized convergence over 40 epochs, with training loss decreasing from ~5.04 to ~1.81, confirming effective learning from both observed and pseudo-labeled instances.

Validation Performance

Recall@20 increased from ~0.67% (epoch 1) to ~15.3% (epoch 40)
Recall@10 reached ~9.4%
Recall@5 reached ~6.2%

Most learning occurred in the first 20 epochs, followed by gradual fine-tuning. Validation metrics plateaued without decline, indicating no overfitting.

Cold-Start Analysis

Users with ≤1 interaction: Recall@20 ~8.3%
Users with ≤2 interactions: Recall@20 ~20.3%

Performance improves steadily as interaction history increases, showing that the KGPL model effectively mitigates cold-start issues using pseudo-labeling.

Top-K Test Set Evaluation

Metric	PyTorch Implementation	TensorFlow Implementation
Recall@5	7.1%	9.93%
Recall@10	12.4%	15.47%
Recall@20	17.6%	22.25%
Precision@20	2.0%	2.3%

The PyTorch model's metrics are closely aligned with the TensorFlow version, showing consistent performance trends. The small differences are likely due to minor implementation or environment setup differences, and overall, the reimplementation was successful.

Repository Structure

KGPL-PyTorch/
├── conf/                   # Configuration files for experiments
├── data/                   # Datasets and data loaders
├── preprocess/             # Data preprocessing scripts
├── utils/                  # Utility functions and evaluation metrics
├── model.py                # Implementation of the KGPL model
├── KGPL_MUSIC_FINAL_40.ipynb  # Example notebook demonstrating usage
├── requirements.txt        # Python dependencies
├── CHANGELOG.md            # Record of changes and updates
├── LICENSE                 # MIT License
└── README.md               # Project overview and instructions

Getting Started

Run the following commands to clone the repository, create a virtual environment, and install the required packages to set up the model environment.

git clone https://github.com/dna-witch/KGPL-PyTorch.git
cd KGPL-PyTorch
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage and Workflow

The KGPL_MUSIC_FINAL_40.ipynb notebook provides a step-by-step example of preprocessing data, co-training, and evaluating the KGPL model on a benchmark dataset. It's a great starting point to understand the workflow and experiment with the recommender model!

Contributors

Shakuntala Mitra @dna-witch

Taylor Hawks @taylorhawks

Changelog

Adding to the changelog

First, identify the last commit hash recorded in CHANGELOG.md. Then, use the following command (replacing LAST_COMMIT_HASH with the actual hash):

git log --pretty=format:"## %h%n #### %ad %n%n%s%n%n%b%n" --date=short LAST_COMMIT_HASH..HEAD >> CHANGELOG.md

This appends all new commits since LAST_COMMIT_HASH to the end of the changelog.

Citation

If you find this implementation useful for your research, please cite the original paper:

@article{togashi2020alleviating,
  title={Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph},
  author={Togashi, Riku and Otani, Mayu and Satoh, Shin’ichi},
  journal={arXiv preprint arXiv:2011.05061},
  year={2020}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph

Overview

Results

Validation Performance

Cold-Start Analysis

Top-K Test Set Evaluation

Repository Structure

Getting Started

Usage and Workflow

Contributors

Changelog

Adding to the changelog

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 141 Commits
conf		conf
data		data
preprocess		preprocess
utils		utils
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
KGPL_MUSIC_FINAL_40.ipynb		KGPL_MUSIC_FINAL_40.ipynb
LICENSE		LICENSE
README.md		README.md
final_report.pdf		final_report.pdf
model.py		model.py
model_f_validation_recall.png		model_f_validation_recall.png
partial_tf_training_logs..txt		partial_tf_training_logs..txt
requirements.txt		requirements.txt
train_loss_model_f.png		train_loss_model_f.png
update_changelog.sh		update_changelog.sh

License

dna-witch/KGPL-PyTorch

Folders and files

Latest commit

History

Repository files navigation

Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph

Overview

Results

Validation Performance

Cold-Start Analysis

Top-K Test Set Evaluation

Repository Structure

Getting Started

Usage and Workflow

Contributors

Changelog

Adding to the changelog

Citation

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages