Open Resources of LucaProt
Welcome to our data repository. You can download the datasets and model files below:
All_Contigs/
All contig sequences.
All_Protein_Sequences/
All protein sequences used as prediction input (length >= 300 aa).
Benchmark/
LucaProtApp: prediction on unlabeled sequences and performance metrics.
Known_RdRPs/
All known RdRPs (5979) for ClstrSearch and LucaProt.
LucaProt/
Raw data, training datasets, model code, trained models, and identification results.
Results/
All the results obtained in our study.
SG_predicted_protein_structure_supplementation/
AlphaFold2-predicted protein structures.
Self_Sequencing_Proteins/
Translated protein sequences from self-sequenced data.
Self_Sequencing_Reads/
Raw RNA/DNA reads (also available on NCBI SRA under PRJNA956286 and PRJNA956287).
Serratus/
Serratus RdRPs and LucaProt prediction results.
README.md
Project documentation.
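
The length cutoff noted for All_Protein_Sequences/ (>= 300 aa) can be reproduced on your own sequences before comparing them with the files here. The following is a minimal sketch, not the pipeline used in the study: it assumes the input is plain FASTA text, and the names `read_fasta` and `filter_by_length` are hypothetical helpers introduced only for illustration. It also assumes a trailing `*` (stop-codon marker) should not count toward the length.

```python
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Record], min_len: int = 300) -> Iterator[Record]:
    """Keep records whose sequence has at least min_len residues.

    Assumption: a trailing '*' is a stop marker and is not counted.
    """
    for header, seq in records:
        if len(seq.rstrip("*")) >= min_len:
            yield header, seq
```

To apply it to a file, open the file in text mode and pass the handle to `read_fasta`, then wrap the result in `filter_by_length`. Both functions are generators, so large files are processed one record at a time.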