miladrayka/hca_ml

Python 3.10 License: MIT

Interpretable Machine Learning Unveils Carbonic Anhydrase Inhibition via Conformal and Counterfactual Prediction

All the code needed to reproduce the paper.

Citation

For now, please cite the preprint version.

Contact

Install

1- Clone the hca_ml GitHub repository.

git clone https://github.com/miladrayka/hca_ml.git

2- Change directory to hca_ml and create a new environment from the cheminf_env.yaml file with the Mamba package manager:

mamba env create -f cheminf_env.yaml

Usage

To reproduce all results, tables, and figures, extract the Data.tar.xz and Results.tar.xz archives and follow workflow.ipynb.
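The extraction step can also be scripted. The snippet below is a minimal sketch using only the Python standard library; the helper name `extract_archives` is hypothetical and not part of this repository, and it assumes both archives sit in the repository root.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_names, root="."):
    """Extract each .tar.xz archive found under `root` into `root`.

    `extract_archives` is a hypothetical helper for illustration,
    not a function shipped with hca_ml.
    """
    root = Path(root)
    # Use the safer "data" extraction filter where this Python build has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    for name in archive_names:
        archive = root / name
        with tarfile.open(archive, mode="r:xz") as tar:
            tar.extractall(path=root, **kwargs)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives(["Data.tar.xz", "Results.tar.xz"])` from the hca_ml directory should unpack both archives next to workflow.ipynb.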

CAInsight GUI


CAInsight is an interpretable, uncertainty-aware machine learning tool for predicting compound activity against human carbonic anhydrase (hCA) isoforms. It covers three isoforms: hCA II, hCA IX, and hCA XII.

The primary model combines a Support Vector Machine (SVM) with Extended Connectivity Fingerprints (ECFP). Each hCA isoform has its own SVM-ECFP binary classifier, which labels a compound as active or inactive. We augment the models with conformal prediction (CP) to quantify the uncertainty of each prediction. Depending on the chosen significance level (epsilon), CP returns the active label, the inactive label, both labels, or an empty set. Lastly, we use counterfactual explanations (see exmol) to make the model's predictions interpretable.
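How epsilon leads to one label, both labels, or an empty set can be sketched in a few lines. This is a generic class-conditional inductive conformal predictor in plain Python, not the code used in CAInsight. The function names, the calibration scores, and the test scores are all made up for illustration. A nonconformity score here is any number that grows as a compound looks less like the given class.

```python
from typing import Dict, List, Set


def p_value(calibration_scores: List[float], test_score: float) -> float:
    """Conformal p-value: share of calibration nonconformity scores that are
    at least as large as the test score, counting the test point itself."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def conformal_set(
    calibration: Dict[str, List[float]],
    test_scores: Dict[str, float],
    epsilon: float,
) -> Set[str]:
    """Keep every label whose p-value exceeds the significance level epsilon."""
    return {
        label
        for label, score in test_scores.items()
        if p_value(calibration[label], score) > epsilon
    }


# Hypothetical per-class calibration nonconformity scores (illustrative only).
calibration = {
    "active": [0.1, 0.2, 0.3, 0.4],
    "inactive": [0.15, 0.25, 0.35, 0.45],
}

# A compound that conforms well to "active" and poorly to "inactive".
confident = {"active": 0.05, "inactive": 0.5}
# A compound that conforms poorly to both classes.
strange = {"active": 0.5, "inactive": 0.5}

single = conformal_set(calibration, confident, epsilon=0.25)  # one label
both = conformal_set(calibration, confident, epsilon=0.1)     # both labels
empty = conformal_set(calibration, strange, epsilon=0.25)     # empty set
```

A smaller epsilon asks for higher confidence, so the set tends to grow and may contain both labels. A larger epsilon shrinks the set and can leave it empty when a compound fits neither class well.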

To run CAInsight, change directory to hca_ml, then type the following in the terminal:

streamlit run gui.py

Copyright

Copyright (c) 2025, Milad Rayka, Masoumeh Shams