Pseudomonas aeruginosa dataset

Version 1.2

License

The dataset is released into the public domain, i.e., its use is unrestricted. We ask that you cite the original, unaltered dataset as:

J. Stokes (March 2020), Pseudomonas aeruginosa dataset.
https://www.aicures.mit.edu/data

About the data

The files train.csv and test.csv contain the SMILES of molecules tested for activity against Pseudomonas aeruginosa.
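As a minimal sketch of reading these files, the helper below loads a CSV into a list of SMILES strings and, when present, a list of activity labels. The column names `smiles` and `activity` are assumptions (check the header row of your copy), and `load_molecules` is a hypothetical helper, not part of the dataset.

```python
import csv


def load_molecules(path, smiles_col="smiles", activity_col="activity"):
    """Return (smiles, labels) read from a CSV file.

    The column names are assumptions; adjust them to match the file header.
    labels is None when the file has no activity column (e.g. a test set).
    """
    smiles, labels = [], []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        has_labels = activity_col in (reader.fieldnames or [])
        for row in reader:
            smiles.append(row[smiles_col])
            if has_labels:
                labels.append(float(row[activity_col]))
    return smiles, (labels if has_labels else None)
```

For example, `smiles, labels = load_molecules("train.csv")` would return the training molecules with their labels, while the same call on an unlabeled file returns `None` for the labels.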

To compare your results against our numbers, we have included the 10 splits used for cross-validation in the folder train_cv.
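One way to walk the splits is sketched below. It assumes only that train_cv contains one subdirectory per split, each holding CSV files; the names of the subdirectories and of the files inside them are not specified here, so the hypothetical helper `list_cv_splits` discovers them rather than hard-coding them.

```python
from pathlib import Path


def list_cv_splits(cv_dir="train_cv"):
    """Return one {file_stem: path} dict per split subdirectory, sorted by name.

    Assumes each split lives in its own subdirectory of cv_dir and stores
    its data as CSV files; nothing is assumed about the file names.
    """
    splits = []
    for sub in sorted(p for p in Path(cv_dir).iterdir() if p.is_dir()):
        splits.append({f.stem: f for f in sorted(sub.glob("*.csv"))})
    return splits
```

Calling `list_cv_splits()` from the repository root should return 10 dictionaries if the folder matches the description above; each dictionary maps a file stem to its path so the files of one split can be loaded together.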

Submitting your predictions

We ask that you predict the activity of molecules included in the test set test.csv.

A sample submission file is included: test_predictions_sample.csv. Each value in the activity column should be a float between 0 and 1. Your predictions will be evaluated against the gold labels to compute an AUC.
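To sanity-check predictions on a held-out split before submitting, the sketch below computes an AUC locally. It assumes that "AUC" means the area under the ROC curve and that labels are binary (1 for active, 0 for inactive); it uses the rank-sum (Mann-Whitney U) formulation and is an illustration, not the organizers' evaluation code. The function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formula.

    labels: binary values, 1 for active and 0 for inactive (assumed encoding).
    scores: predicted activities, each a float between 0 and 1.
    Tied scores receive the average of the ranks they span.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("every prediction must lie between 0 and 1")

    # Assign 1-based ranks in ascending score order, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both classes present")

    # Sum of positive ranks, minus its minimum possible value, normalized.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A value of 0.5 corresponds to random ranking and 1.0 to every active molecule scoring above every inactive one. The range check mirrors the requirement that each prediction be a float between 0 and 1.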

Changelog

1.2: CV splits are now in the form of subdirectories.

1.1: Updated test.csv and test_predictions_sample.csv so that predictions are indexed by SMILES.