Query Context-Aware Sequential Ranking

This repository provides the implementation and evaluation of a query-context-aware learning-to-rank model for sequential recommendation. The model improves recommendation accuracy by incorporating the user's target query context into ranking with CLM modeling.

Publications

UMAP '26 Short Research Paper Track: Harnessing Query Context for Personalized User Intent-Aware Sequential Recommendation
Extended Version: arXiv
CARS @ RecSys '25: workshop, presentation

Folder structure

├── datasets/                # Contains preprocessing notebooks
│   └── {preprocessing notebooks for Taobao and RetailRocket}
├── files/                   # Stores processed dataset files
├── model.py                 # Model definition
├── train_model.py           # Script for model training
└── README.md                # Documentation

Data

The notebooks in the datasets directory contain code to automatically download and preprocess the datasets. After running these notebooks, the processed training and test data will be saved as Parquet files in the files directory.
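Before starting a training run, it can help to confirm that the preprocessing notebooks actually produced their output. The snippet below is a minimal sketch of such a check, not part of the repository: the helper name `find_parquet_files` is hypothetical, and it assumes the processed files carry a `.parquet` extension (possibly nested in subfolders).

```python
from pathlib import Path
from typing import List


def find_parquet_files(folder: str) -> List[Path]:
    """Return all *.parquet files under `folder`, sorted by path.

    Hypothetical helper: assumes the notebooks write files with a
    .parquet extension, either directly in `folder` or in subfolders.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Data folder does not exist: {root}")
    files = sorted(p for p in root.rglob("*.parquet") if p.is_file())
    if not files:
        raise FileNotFoundError(
            f"No Parquet files found in {root}; run the preprocessing notebooks first."
        )
    return files
```

Calling `find_parquet_files("datasets/files")` (the folder used in the example commands) either lists the processed files or fails early with a message pointing back at the notebooks.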

Training procedure

The entry point for training the model is train_model.py.

Parameters:

--dataset: either taobao or retailrocket, to select the respective dataset
--train_path: path to the local folder storing the training datasets
--test_path: path to the local folder storing the test datasets
--output_data_path: path for saving the resulting model artifact
--integration: one of the following options (default: NO_QUERY_CONTEXT):
- NO_QUERY_CONTEXT for no query context (the query context input is ignored, and only the item sequence is used)
- OUTSIDE for query context applied outside the transformer blocks
- IN_INPUT for query context concatenated with the preceding item representation in the input
- LAST_LAYER_AND_OUTSIDE for query context included in the last layer's query position and outside the transformer blocks
--num_epochs: number of training epochs (default: 30)
--num_samples: number of samples to use from dataset; use -1 for all samples (default: -1)
--seq_len: maximum sequence length for input data (default: 100)
--batch_size: batch size for training (default: 128)
--past_query_context_in_test: whether to use past query context information during evaluation (0=only the current context is used, 1=historical context is used as well, default: 0)
--query_context_dropout_rate: dropout rate for query context information; applicable only for IN_INPUT integration type (default: 0.0)
--query_context_dropout_in_train: whether to apply query context dropout during training; applicable only for IN_INPUT integration type (0=disabled, 1=enabled, default: 0)
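The parameter list above can be summarized as an argparse definition. This is a sketch that mirrors the documented flags and defaults; it is not the parser in train_model.py. Marking the path arguments as required, restricting the 0/1 flags with `choices`, and rejecting dropout options for non-IN_INPUT integrations are all assumptions made for illustration.

```python
import argparse

INTEGRATIONS = ["NO_QUERY_CONTEXT", "OUTSIDE", "IN_INPUT", "LAST_LAYER_AND_OUTSIDE"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the documented training options")
    # Assumption: dataset and paths are required (the README lists no defaults for them).
    parser.add_argument("--dataset", choices=["taobao", "retailrocket"], required=True)
    parser.add_argument("--train_path", required=True)
    parser.add_argument("--test_path", required=True)
    parser.add_argument("--output_data_path", required=True)
    # Defaults below are the ones stated in the parameter list.
    parser.add_argument("--integration", choices=INTEGRATIONS, default="NO_QUERY_CONTEXT")
    parser.add_argument("--num_epochs", type=int, default=30)
    parser.add_argument("--num_samples", type=int, default=-1)  # -1 means all samples
    parser.add_argument("--seq_len", type=int, default=100)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--past_query_context_in_test", type=int, choices=[0, 1], default=0)
    parser.add_argument("--query_context_dropout_rate", type=float, default=0.0)
    parser.add_argument("--query_context_dropout_in_train", type=int, choices=[0, 1], default=0)
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: treat dropout options as an error outside IN_INPUT, since the
    # README says they apply only to that integration type.
    uses_dropout = args.query_context_dropout_rate > 0 or args.query_context_dropout_in_train == 1
    if uses_dropout and args.integration != "IN_INPUT":
        parser.error("query context dropout options apply only to --integration IN_INPUT")
    return args
```

The real script may instead silently ignore dropout options for other integration types; the strict check here only makes the documented constraint explicit.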

Usage

Below are example commands for training the model with different integration types and options. Replace paths as needed.

Quick experiment with a small subset:

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration IN_INPUT --num_epochs 2 --num_samples 10000 --batch_size 32

No query context (NO_QUERY_CONTEXT):

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration NO_QUERY_CONTEXT

Query context outside transformer blocks (OUTSIDE):

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration OUTSIDE

Query context concatenated in input (IN_INPUT):

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration IN_INPUT

IN_INPUT with query context dropout (rate 0.25, enabled during training):

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration IN_INPUT --query_context_dropout_rate 0.25 --query_context_dropout_in_train 1

IN_INPUT using historical query context during evaluation:

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration IN_INPUT --past_query_context_in_test 1

Query context in last layer and outside transformer blocks (LAST_LAYER_AND_OUTSIDE):

python train_model.py --dataset taobao --train_path datasets/files --test_path datasets/files --output_data_path ./model_output --integration LAST_LAYER_AND_OUTSIDE
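To compare all four integration types, the commands above can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: `build_command` is a hypothetical helper, it reuses the paths from the examples, and it only prints the commands instead of running them.

```python
import shlex
from typing import List, Optional

INTEGRATIONS = ["NO_QUERY_CONTEXT", "OUTSIDE", "IN_INPUT", "LAST_LAYER_AND_OUTSIDE"]


def build_command(
    integration: str,
    dataset: str = "taobao",
    data_dir: str = "datasets/files",
    output_dir: str = "./model_output",
    extra: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a train_model.py invocation using the flags documented above."""
    if integration not in INTEGRATIONS:
        raise ValueError(f"Unknown integration type: {integration}")
    cmd = [
        "python", "train_model.py",
        "--dataset", dataset,
        "--train_path", data_dir,
        "--test_path", data_dir,
        "--output_data_path", output_dir,
        "--integration", integration,
    ]
    if extra:
        cmd.extend(extra)  # e.g. ["--num_epochs", "2"]
    return cmd


if __name__ == "__main__":
    # Print one command per integration type; nothing is executed here.
    for integration in INTEGRATIONS:
        print(shlex.join(build_command(integration)))
```

Passing each list to `subprocess.run(cmd, check=True)` would launch the runs sequentially. Since every command here writes to the same `--output_data_path`, a separate output folder per integration type is probably safer for a real sweep.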
