
Reliability 6

This document describes the development of reliability scores for five membrane protein topology prediction algorithms (TMHMM, HMMTOP, MEMSAT, PHD, and TopPred). The reliability scores were designed to estimate the probability that a predicted topology is correct. They were evaluated on a test set of 92 bacterial proteins with known topologies, and on predicted topologies from three fully sequenced genomes. The results show that the reliability scores worked well for TMHMM and MEMSAT, allowing the likelihood of a correct prediction to be estimated. Limited experimental data, like the location of protein termini, was also found to improve prediction accuracy by up to 10 percentage points.

Uploaded by

Shampa Sen



doi:10.1016/S0022-2836(03)00182-7 J. Mol. Biol. (2003) 327, 735–744

Reliability Measures for Membrane Protein Topology Prediction Algorithms

Karin Melén1, Anders Krogh2 and Gunnar von Heijne1*

1 Department of Biochemistry and Biophysics, Stockholm Bioinformatics Center, Stockholm University, SE-106 91 Stockholm, Sweden
2 Department of Molecular Biology, Bioinformatics Centre, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen, Denmark

*Corresponding author

We have developed reliability scores for five widely used membrane protein topology prediction methods, and have applied them both on a test set of 92 bacterial plasma membrane proteins with experimentally determined topologies and on all predicted helix bundle membrane proteins in three fully sequenced genomes: Escherichia coli, Saccharomyces cerevisiae and Caenorhabditis elegans. We show that the reliability scores work well for the TMHMM and MEMSAT methods, and that they allow the probability that the predicted topology is correct to be estimated for any protein. We further show that the available test set is biased towards high-scoring proteins when compared to the genome-wide data sets, and provide estimates for the expected prediction accuracy of TMHMM across the three genomes. Finally, we show that the performance of TMHMM is considerably better when limited experimental information (such as the in/out location of a protein’s C terminus) is available, and estimate that at least ten percentage points in overall accuracy in whole-genome predictions can be gained in this way.

© 2003 Elsevier Science Ltd. All rights reserved

Keywords: membrane protein; topology prediction; bioinformatics

Introduction

It is estimated that some 20–25% of all open reading frames (ORFs) in fully sequenced genomes encode integral membrane proteins.1 Strikingly, however, considerably less than 1% of all 3D protein structures deposited in the Protein Data Bank2 are of membrane proteins. Theoretical structure prediction methods are thus of particular importance for membrane proteins. Most current methods in this field do not deal with predicting the 3D structure, but rather try to predict the most likely topology of the protein, i.e. the in/out location of the N and C termini relative to the membrane, and the number and positions of the membrane-spanning regions. Topology information can be generated experimentally by different approaches such as gene fusion, proteolytic digestion in situ, antibody binding, and chemical modification. A good topology model is a necessary prerequisite for experimental structure–function studies and can be used as a starting point for attempts to model the 3D structure.

From a structural point of view, there are two major groups of integral membrane proteins: the helix bundle proteins, in which one or several α-helices span the membrane, and the β-barrel proteins, in which eight or more anti-parallel transmembrane β-strands form a closed barrel. The β-barrel membrane proteins have so far been found only in the outer membranes of Gram-negative bacteria, mitochondria, and chloroplasts, whereas the α-helical membrane proteins are present in all types of membranes. Here, we consider only methods for predicting the topology of helix bundle membrane proteins.

The best current topology prediction methods are claimed to predict the correct topology for some 70–85% of all proteins, although, as will be shown below, this is an overestimate. Rather, we estimate an overall prediction accuracy of 55–60% correctly predicted topologies when entire proteomes are analyzed. Importantly, none of the most widely used methods (except PHD, see below) provides any estimate of the reliability of a given prediction, i.e. some measure of whether the topology of a particular protein is more or less likely to be correct than average.

In this study, we have tried to construct useful reliability scores for five widely used topology prediction methods: TMHMM,1 HMMTOP,3 MEMSAT,4 PHD5 and TopPred.6 The goal has been to use these scores to compare performance characteristics on a test set of proteins with experimentally determined topologies with performance characteristics on three complete genomes, Escherichia coli, Saccharomyces cerevisiae and Caenorhabditis elegans, and to assess to what extent limited, easily obtainable experimental topology information can be used to improve the theoretical predictions.

Abbreviations used: ORF, open reading frame.
E-mail address of the corresponding author: gunnar@dbb.su.se
0022-2836/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved

Results

Construction of reliability scores

Judging by published bench-marking studies, TMHMM, HMMTOP and MEMSAT seem to have the best overall performance characteristics of the available topology prediction programs.7,8 Two less well performing but widely used methods, PHD and TopPred, have been included for comparison. Each method is described below in some detail, together with a discussion of the reliability scores that we have constructed from the raw output from each program.

TMHMM

TMHMM is based on a hidden Markov model with seven types of states (helix core, helix caps on either side of the membrane, short loop on cytoplasmic side/inside, short and long loop on non-cytoplasmic side/outside, and a globular domain state). Each type of state has a probability distribution over the 20 amino acids that have been estimated from membrane proteins with experimentally known topologies. TMHMM outputs the most probable topology of the protein given the model. The output is a labelled sequence of the three classes i (inside or cytoplasmic), h (helix) and o (outside or extra-cytoplasmic) that obeys the “biological grammar” that a helix must be followed by a loop and that inside and outside loops must alternate. Posterior probabilities for being in the three classes (p(i), p(h), and p(o)) are calculated for every residue in the sequence. We have constructed three different reliability scores (S1–S3) for TMHMM (see Methods).

S1: The mean posterior probability of the labelled sequence. A high mean posterior probability indicates that most of the residues have a high probability for their assigned classes and thus that the overall prediction might be considered reliable. The posterior probability values for each residue are calculated as described.1 A possible shortcoming of this score is that a small region with low probabilities embedded in a long sequence with generally high scores will not greatly affect S1, even though it indicates an uncertainty in the prediction.

S2: The minimum posterior probability in the sequence of labelled residues. A low S2 score indicates that there is at least one part of the protein where the prediction is doubtful. Since the probability values close to the borders between different classes often are low, even though the exact point of transition between one class and another generally makes no difference to the overall topology, we mask out a small number of residues (three, five, seven, nine) on each side of each border before locating the minimum probability value. For the score evaluation presented below, we masked out nine residues at each side of each border; the results are essentially the same in the whole interval three to nine masked residues (data not shown).

S3: The quotient p(best topology)/p(all possible topologies), calculated after a masking step as described below. The two probability values are included in the standard TMHMM output, where p(best topology) is calculated with the N-best algorithm and p(all possible topologies) is calculated with the forward algorithm, as described.1 A quotient close to 1 implies that the best path through the model (i.e. the predicted topology) is much more probable than all alternative paths (i.e. all other topologies). TMHMM can generate a list of several high-scoring paths where the top ones frequently have very similar topologies (corresponding to shifts of one or a few residues at the borders between different classes that do not change the overall topology). Since the exact borders between the classes are not generally known even for the experimentally determined topologies, it is reasonable to mask out some residues (we have used ten) on each side of a class border and consider all topologies compatible with the “best” topology after masking as the same prediction. We thus sum the probabilities for all paths that give the same topology prediction after masking as the best path before dividing by p(all possible paths) as obtained from the raw output.

HMMTOP

HMMTOP is a hidden Markov model with five states (inside loop, inside helix tail, helix, outside helix tail and outside loop). For a given amino acid sequence it finds the most probable path through the model. Instead of taking into account only the absolute amino acid composition in the separate parts of the protein, it searches for the combination of states that gives the highest difference in the amino acid distributions. The idea is that a switch in the topology should be reflected in a large amino acid distribution change (maximum divergence). In the raw output, numbers are given for the entropy of the best path (i.e. the most probable topology) and the entropy of the whole model. We have used the difference in entropy (i.e. entropy of best path − entropy of model) as a measure of the reliability. The smaller the difference, the better the best path represents the whole model, and the more likely to be correct the predicted topology should be.
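As an illustration only, the three TMHMM scores described in this section (S1, S2 and S3) might be computed along the following lines. This is a minimal sketch and not the authors' code. The input format is assumed: a label string over i/h/o, one posterior probability per residue for the assigned label, and a list of (label string, probability) pairs for the high-scoring paths. It does not parse real TMHMM output. The function names are hypothetical, and defining the masked region relative to the borders of the best path is one possible reading of the masking step.

```python
from typing import List, Optional, Set, Tuple


def masked_positions(labels: str, mask: int) -> Set[int]:
    """Indices lying within `mask` residues on either side of a class border."""
    masked: Set[int] = set()
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:  # border between residues i-1 and i
            masked.update(range(max(0, i - mask), min(len(labels), i + mask)))
    return masked


def s1_score(probs: List[float]) -> float:
    """S1: mean posterior probability of the assigned labels."""
    return sum(probs) / len(probs)


def s2_score(labels: str, probs: List[float], mask: int = 9) -> Optional[float]:
    """S2: minimum posterior probability outside the masked border regions."""
    skip = masked_positions(labels, mask)
    kept = [p for i, p in enumerate(probs) if i not in skip]
    return min(kept) if kept else None  # None if masking removes every residue


def s3_score(paths: List[Tuple[str, float]], p_all: float, mask: int = 10) -> float:
    """S3: summed probability of all paths that agree with the best path
    outside the masked regions, divided by the probability of all paths."""
    best_labels = max(paths, key=lambda path: path[1])[0]
    skip = masked_positions(best_labels, mask)

    def compatible(labels: str) -> bool:
        return len(labels) == len(best_labels) and all(
            a == b
            for i, (a, b) in enumerate(zip(labels, best_labels))
            if i not in skip
        )

    return sum(p for labels, p in paths if compatible(labels)) / p_all
```

With a short toy sequence and `mask=1`, a path whose helix border is shifted by one residue counts towards S3, whereas a path with the opposite orientation does not.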

Figure 1. Relation between test set cumulative coverage and the fraction of correct topology predictions for five different prediction methods over a set of 92 prokaryotic membrane proteins with experimentally determined topologies. TMHMM S3 score, filled squares; MEMSAT, open squares; HMMTOP, open circles; PHDhtm (web version, multi-sequence mode), filled circles; PHDhtm (single-sequence mode), filled triangles; TopPred, open triangles (for TopPred, many sequences did not generate more than one topology. For those cases no reliability score could be calculated, which explains the total TopPred coverage of only 36%).
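A curve of the kind shown in Figure 1 can be sketched as follows: predictions are ranked by reliability score, and for each cut-off the cumulative coverage and the fraction of correct predictions among the proteins covered so far are recorded. This is a hedged illustration with a hypothetical function name and input format (score, correct flag), not the procedure actually used to draw the figure. Proteins without a score are given as `None`; they count towards the size of the test set but are never covered, which is how a method can end at a coverage below 100%.

```python
from typing import List, Optional, Tuple


def coverage_accuracy_curve(
    predictions: List[Tuple[Optional[float], bool]],
    higher_is_better: bool = True,
) -> List[Tuple[float, float]]:
    """Return (cumulative coverage, fraction correct) points.

    predictions: one (reliability score or None, prediction correct?) pair
    per test set protein. Set higher_is_better=False for scores where a
    small value indicates a reliable prediction.
    """
    total = len(predictions)
    scored = [(s, ok) for s, ok in predictions if s is not None]
    scored.sort(key=lambda item: item[0], reverse=higher_is_better)

    curve: List[Tuple[float, float]] = []
    n_correct = 0
    for rank, (_, ok) in enumerate(scored, start=1):
        n_correct += int(ok)
        curve.append((rank / total, n_correct / rank))
    return curve
```

For a score defined so that smaller values indicate more reliable predictions, such as the HMMTOP entropy difference, `higher_is_better=False` would be the appropriate setting.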

MEMSAT

MEMSAT is based on a model with five structural states (inside loop, inside helix end, helix middle, outside helix end, outside loop). Each state is associated with a statistical table (log likelihoods) of the frequency of the 20 amino acids. The tables have been constructed from membrane proteins of known topologies and treat single- and multispanning membrane proteins separately. A dynamic programming algorithm solves the problem of finding the optimal state assignments for the query sequence. The algorithm computes scores for all possible topologies starting with one helix, and then increases the number of helices one at a time until the scores become too low. The output produces a list of topologies representing all possible number of TM helices (in both orientations) and their scores. The topology with the highest score is the final prediction. To assess the reliability, we have calculated the difference in scores between the best and the second best prediction. If the difference is high, the top-scoring topology should be more likely to be correct.

PHD

PHD is a general tool for predicting secondary structure of proteins, and the PHDhtm routine is the part handling membrane proteins. It is designed to use information from homologous proteins. The first step in the method is a BLAST search9 against the SWISSPROT database.10 A multiple sequence alignment of the hits is constructed and a neural network then estimates the preference for each residue to be in a transmembrane helix or in a loop. The highest-scoring putative transmembrane segment is used in a second step to decide whether the protein is a helix bundle integral membrane protein. The third step is a dynamic program algorithm that finds the optimal number and locations of transmembrane regions (the model). Finally, the overall orientation of the protein in the membrane is predicted by applying the “positive-inside” rule.11,12

PHDhtm is the only method in our study that automatically provides some sort of reliability measure. In the output, there is one reliability index for the model (i.e. for the number and locations of the transmembrane regions) that is based on a comparison between the two highest-scoring models, and a second reliability index for the orientation that is proportional to the charge difference between the outside and inside parts of the protein. Both indices range from 0 (low) to 9 (high). However, the two indices are not combined into a single reliability score for the overall topology. We have evaluated both the two existing indices and the mean value of the two indices as reliability scores.

Because the other four methods only use information in a single query sequence (and not information from homologous sequences) we decided to run PHDhtm in single-sequence mode for the main analysis. However, we have also used the multi-sequence mode for comparison.

TopPred

TopPred was the first topology prediction method that combined hydrophobicity analysis and the positive-inside rule. It first calculates a standard hydrophobicity profile for the query protein. Peaks above an upper cut-off (i.e. regions rich in hydrophobic residues) are considered to be confident transmembrane helix predictions whereas peaks between the upper and a lower cut-off are regarded as putative transmembrane helices. Consequently, several topologies can be constructed with or without the putative helix/helices. Out of these possible topologies, the one with the largest difference in the number of positively charged amino acids between the two sides of the membrane is given as the best prediction. We have calculated a reliability score as the difference between the charge-difference values for the two top-scoring topologies. If no putative helices are identified from the hydrophobicity plot, only one topology is predicted, and thus no reliability score can be calculated in such cases.
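The MEMSAT and TopPred scores share the same form: the value for the best topology minus the value for the second best. A minimal sketch of that margin, together with the mean of the two PHDhtm indices, is given below. The function names and the plain list-of-numbers input are assumptions made for illustration; nothing here parses the programs' actual output.

```python
from typing import Optional, Sequence


def margin_score(candidate_scores: Sequence[float]) -> Optional[float]:
    """Best value minus second-best value over all candidate topologies.

    For MEMSAT the values are topology scores; for TopPred they are the
    charge-difference values. Returns None when fewer than two candidates
    exist, in which case no reliability score can be calculated.
    """
    if len(candidate_scores) < 2:
        return None
    best, second = sorted(candidate_scores, reverse=True)[:2]
    return best - second


def phdhtm_mean_index(model_index: int, orientation_index: int) -> float:
    """Mean of the two PHDhtm reliability indices (each from 0 to 9)."""
    return (model_index + orientation_index) / 2
```

A large margin means the top-scoring topology stands well clear of its closest competitor, while a margin near zero flags a prediction that could easily have gone the other way.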

Reliability scores correlate with prediction accuracy

The five methods and their corresponding reliability scores described above were evaluated over a previously collected test set (see Methods) composed of 92 prokaryotic helix bundle membrane proteins with experimentally determined topologies. For each method and score, the 92 topology predictions were ranked from high to low scores. The results are summarized in Figure 1 in the form of a plot of prediction accuracy versus cumulative coverage of the test set.

As is clear from this Figure, TMHMM and MEMSAT have the best prediction characteristics according to this test (for TMHMM, only the S3 score is shown, as the S1 and S2 scores yield essentially the same results). For both methods, ~50% of the predictions have reliability scores corresponding to a prediction accuracy of ~90%, and ~70% of the proteins have scores corresponding to a prediction accuracy of ~80%. If the entire test set is considered (100% coverage), the prediction accuracy is 65–70%.

For HMMTOP, PHDhtm, and TopPred, our definitions of reliability scores do not seem very useful. We repeated the PHDhtm analysis by running the web version in multi-sequence mode, which improved the overall accuracy on the whole test set from 51% to 63%, but did not improve the discrimination between good and bad predictions based on the reliability score. The two individual reliability indices given by PHDhtm were no better than the mean reliability score shown in the Figure (data not shown).

Interestingly, the top-scoring proteins are, to a significant extent, different for the two best methods, TMHMM and MEMSAT. By simply combining the two scores as shown in Figure 2 (TMHMM score S3 > 0.7 and/or MEMSAT score > 4) we reach a prediction accuracy of ~95% for the ~60% top-scoring proteins in the test set. However, this apparent improvement needs to be confirmed on a larger data set. A more elaborate scheme for combining different topology prediction methods has been presented,8 and it is possible that one can find “optimized” combinations of reliability scores that perform better than the individual scores discussed here.

Figure 2. TMHMM S3 and MEMSAT scores for 92 test set proteins. Open circles, both predictions correct; filled circles, both predictions false; open squares, TMHMM prediction correct, MEMSAT prediction false; filled squares, TMHMM prediction false, MEMSAT prediction correct.

Proteins with known topology constitute a biased set compared to full-size proteomes

The development and evaluation of topology prediction methods is, to some extent, limited by the available experimental data, and a significant fraction of the proteins in our test set have been used in the original construction of the different prediction methods. This has made it difficult to obtain realistic estimates of the expected performance characteristics when the methods are applied to previously uncharacterized proteins, and different authors come to different conclusions on this point.7,8 From a couple of recent studies,13,14 it is clear, however, that the available test sets of proteins with experimentally determined topologies is biased, although the extent of the bias is unknown.

The reliability scores constructed here make it possible to address this question using whole-genome data. We have therefore calculated the TMHMM S3 score distributions for the predicted helix bundle membrane protein proteomes of one prokaryotic, E. coli15 and two eukaryotic, S. cerevisiae16 and C. elegans,17 organisms, and have compared these distributions to the distributions obtained for the test set.

As TMHMM has been shown to be able to discriminate between soluble and integral membrane proteins with very great accuracy,1 the three membrane protein proteomes were defined as all ORFs for which TMHMM predicts at least two transmembrane helices. Predicted single-spanning proteins were not included, since cleavable signal peptides are often predicted as transmembrane helices, thus erroneously identifying many secreted proteins as single-spanning membrane proteins. Even so, an unknown proportion of the membrane proteins identified in this way will contain cleavable signal peptides, in contrast to the test set proteins, which all lack cleavable signal

Figure 3. TMHMM S3 score distributions. The fraction of all predicted membrane proteins with two or more TM helices in each genome or in the test set (76 proteins) and for each score interval is shown.

peptides. This may reduce the S3 scores slightly for some of the predicted proteins, but we consider it unlikely that this is enough to explain the differences between the proteome sets and the test set reported below.

The results are presented in Figure 3, where the percentages of membrane proteins are plotted for different score intervals. To be able to compare the score distributions for the three proteomes with the test set, we removed all single-spanning sequences in the test set, ending up with 76 sequences and a TMHMM accuracy for this reduced set of 63%. The most striking result is that there is a much larger fraction of high-scoring proteins in the test set compared to the three proteomes, and thus that the overall prediction accuracy of ~66% reported in Figure 1 is a clear overestimate. To obtain a more realistic estimate, we first derived an empirical relation between the prediction accuracy and the S3 score by dividing the 92 test set predictions, ranked from high to low scores, into four equal-size groups and then plotting the average prediction accuracy in each group against the mean score for that group, Figure 4(A). The accuracy/score relation is reasonably well described by the straight line A = 80 × S3 + 20. Using this relation, we calculated the expected A-values for all proteins in the respective membrane protein proteomes, which is plotted against the cumulative coverage in Figure 4(B). As a control, we plotted the real mean accuracy and the calculated accuracy (A) for the test set; the two latter curves agree well and we thus conclude that the expected accuracy A is a reasonable representation of the real data. The mean prediction accuracies estimated in this way for the whole proteomes (56% for E. coli, 53% for S. cerevisiae and 59% for C. elegans) are significantly lower than the ~66% obtained for the test set, suggesting that the widely quoted prediction accuracies of 70–85% are serious overestimates.

There are several possible explanations for the test set bias. First, even though jack-knife procedures were used in the development of the prediction methods, there are many subtle ways in which the methods may have been overtrained. It is quite likely that the proteins for which experimental topologies have been reported have some characteristics such as unusually hydrophobic transmembrane segments that simultaneously simplify both experimental mapping and prediction. There are many families of membrane proteins for which no experimental topology is available and which have thus not been seen by the prediction methods.

Looking more carefully at the results for the individual genomes (Figure 3), it is interesting to note that S. cerevisiae has a particularly large fraction of low-scoring proteins, while C. elegans and E. coli have more similar score distributions. We did not expect C. elegans to have the greatest predicted accuracy, since it is a eukaryote and the relationship A = 80 × S3 + 20 was derived from prokaryotic proteins. However, we suspected that the family of 7TM-receptors, known to be exceptionally large in C. elegans,18 might have contributed to the results. We therefore identified all C. elegans proteins predicted to have seven transmembrane helices and an extracellular N terminus (985 out of totally 4059) and analyzed the 7TM and non-7TM sets separately. The 7TM set was found to have a score distribution similar to that of the test set, whereas the score distribution for the remaining C. elegans membrane proteins almost coincided with that of E. coli (data not shown). Finally, the combination of the TMHMM S3 and MEMSAT scores discussed above (Figure 2), gave the following coverages for the three proteomes: 45% for E. coli, 46% for S. cerevisiae and 56% for C. elegans,

Figure 4. Expected performance of TMHMM over all predicted membrane proteins with two or more TM helices in each genome. (A) Mean fraction of correctly predicted proteins versus the mean TMHMM S3 score for each quartile of the test set of 92 proteins. The least-squares fit is given by A = 80 × S3 + 20, where A is the expected accuracy (i.e. the probability that a prediction with a given S3 score is correct). (B) Estimated relation between cumulative coverage and the fraction of correct topology predictions for the test set of 92 proteins and for all predicted membrane proteins with two or more TM helices in each genome. Test set (original data), open circles; test set (calculated data), filled circles; C. elegans, filled triangles; E. coli, open squares; S. cerevisiae, filled squares.
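The two steps behind Figure 4 can be sketched roughly as follows: quartile points are formed from the ranked test set predictions, and the fitted line A = 80 × S3 + 20 (A in percent) is then applied to every protein in a proteome to obtain a mean expected accuracy. The function names are hypothetical and the input data in any usage are illustrative. The slope and intercept are taken from the caption above. How the groups are split when the number of predictions is not divisible by four is an assumption.

```python
from typing import List, Tuple


def quartile_points(predictions: List[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Rank (S3 score, correct?) pairs from high to low score, split them
    into four groups, and return (mean score, percent correct) per group."""
    ranked = sorted(predictions, key=lambda item: item[0], reverse=True)
    n = len(ranked)
    points = []
    for q in range(4):
        group = ranked[q * n // 4:(q + 1) * n // 4]
        mean_score = sum(s for s, _ in group) / len(group)
        percent_correct = 100.0 * sum(ok for _, ok in group) / len(group)
        points.append((mean_score, percent_correct))
    return points


def expected_accuracy(s3: float, slope: float = 80.0, intercept: float = 20.0) -> float:
    """Expected accuracy A (percent) for one prediction with score S3."""
    return slope * s3 + intercept


def mean_expected_accuracy(s3_scores: List[float]) -> float:
    """Mean expected accuracy (percent) over a set of predicted proteins."""
    return sum(expected_accuracy(s) for s in s3_scores) / len(s3_scores)
```

Under this relation a prediction with S3 = 0.5 has an expected accuracy of 60%, and a proteome dominated by low-scoring proteins will have a correspondingly low mean.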

which should be compared to the 60% coverage of the test set.

Inclusion of limited experimental information: a strategy for large-scale topology mapping

Given the rather low estimates for the expected mean prediction accuracy over full-size proteomes discussed above, it is clear that topology predictions, in general, provide only a rough guide to the true topology of a protein. On the other hand, the reliability scores presented here can be used to reduce considerably the necessary experimental work required to reach a satisfactory level of prediction accuracy.

We have shown that limited experimental information such as a determination of the in/out location of the C terminus of a protein can be used in conjunction with topology prediction to rapidly provide a very reliable topology model, at least in certain cases.19 With the introduction of reliability scores, it is now possible to extend this strategy to entire proteomes. The basic TMHMM algorithm allows one to fix the class-assignment for any position in the sequence by setting the probability for a position to belong to a certain class to 1.0 a priori. If the C-terminal residue of each protein in the test set is assigned to its experimentally known class, the relation between accuracy and coverage becomes much more favourable and the overall mean accuracy increases from 66% to 77% (Figure 5(B)). Similarly, if the N terminus is fixed, the overall mean accuracy increases to 79%, and if both termini are fixed it reaches 88% (data not shown). Again, there is an approximately linear relationship between the accuracy and the S3 score; with a fixed C terminus, the relation is Ac = 70 × S3c + 30 (data not shown).

Finally, we tried to estimate how much the prediction accuracy across the E. coli, S. cerevisiae and

Figure 5. Influence of experimental information on TMHMM performance. (A) Relation between increase in S3 score for the test set of 92 proteins with the C-terminal residue fixed to its experimentally known location and the value of p(last aa); ΔS3c = −0.57 × p(last aa) + 0.57. (B) Relation between cumulative coverage and fraction of correct predictions. Observed accuracy for the test set with fixed C-terminal locations, filled circles; and with fixed N-terminal locations, open circles. Expected accuracy, Acp, for the three genomes assuming that the C-terminal location is known: C. elegans, filled triangles; E. coli, open squares; S. cerevisiae, filled squares.
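The estimate plotted in Figure 5(B) chains two linear relations: the expected gain ΔS3c = −0.57 × p(last aa) + 0.57 from panel (A), and the fixed-C-terminus accuracy relation Ac = 70 × S3c + 30 quoted in the text. A minimal sketch under stated assumptions follows. The function names are hypothetical, and the optional cap of the adjusted score at 1.0 is an addition made here (an S3 quotient cannot exceed 1) that is not described in the source.

```python
def delta_s3c(p_last_aa: float) -> float:
    """Expected increase in S3 when the C-terminal location is fixed,
    from the linear trend in Figure 5(A)."""
    return -0.57 * p_last_aa + 0.57


def s3_with_c_terminus(s3: float, p_last_aa: float, cap: bool = True) -> float:
    """Estimated score S3cp = S3 + delta S3c.

    cap=True limits the result to 1.0; this cap is an assumption and is
    not part of the published procedure.
    """
    s3cp = s3 + delta_s3c(p_last_aa)
    return min(s3cp, 1.0) if cap else s3cp


def expected_accuracy_c_fixed(s3: float, p_last_aa: float, cap: bool = True) -> float:
    """Expected accuracy Acp (percent), using Ac = 70 x S3c + 30."""
    return 70.0 * s3_with_c_terminus(s3, p_last_aa, cap) + 30.0
```

When p(last aa) is already 1.0 the adjustment vanishes, so fixing the C terminus adds nothing to the score for proteins whose C-terminal location TMHMM is already certain about.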

C. elegans membrane protein proteomes would p(last aa), the larger is the mean increase in the S3
improve if the location of each protein’s C termi- score when the C-terminal residue is assigned to
nus was known. To this end, we used the test set its known class. This expression was used for esti-
to measure the difference in S3 score, DS3c, mating the increase in S3 score for all proteins in
between the score obtained with the C terminus the three proteomes from which the estimated S3c
fixed and the score obtained in the absence of any scores can be calculated; S3cp ¼ S3 þ DS3c, assum-
experimental information (DS3c ¼ S3c 2 S3) and ing that the C-terminal location is known. The
plotted DS3c versus the probability value for the expected accuracy, A cp, was then calculated from
location of the C-terminal residue obtained in the the expression for A c above. The results are shown
absence of experimental information, p(last aa), i.e. in Figure 5(B). The estimated increase in overall
the probability value for the assigned class of the accuracy for the proteomes is from 56% to 67% for
last amino acid in the sequence (Figure 5(A)). E. coli, from 53% to 67% for S. cerevisiae, and
Although the data are rather scattered, there is a from 59% to 71% for C. elegans. It should be
linear trend described by DS3c ¼ 2 0.57 £ p(last emphasised that these numbers are only rough
aa) þ 0.57. In other words, the smaller the value of estimates, but they nevertheless suggest that
742 Topology Prediction

prediction performance would improve signifi- Finally, we have tried to estimate the expected
cantly if C-terminal mapping data were available. improvement in prediction accuracy if the in/out
Generally applicable methods for determining location of the C terminus of every protein in a
the location of the C-terminal end of a protein on proteome was known from experimental data,
the basis of either reporter fusions or engineered since relatively rapid methods for such determi-
acceptor sites for N-linked glycosylation exist for nations are now available. For all three proteomes,
E. coli, S. cerevisiae and mammalian proteins,20 – 22 we find that TMHMM will predict the correct top-
and we have shown that such methods can be ology for , 70% of all membrane proteins, given
used on a relatively large scale (our unpublished that the C-terminal location is known. Again, the
work). On the basis of TMHMM-predictions,1 we likelihood that a given prediction is correct can be
have estimated that the membrane protein pro- estimated from the reliability score.
teome of E. coli consists of 769 proteins with two In summary, we describe new reliability scores
or more transmembrane helices, and that of for TMHMM and MEMSAT, two of the currently
S. cerevisiae of 847 such proteins. The results pre- best-performing topology prediction methods, that
sented above suggest that highly reliable topology make it possible to estimate the likelihood that a
models for a majority of these proteins should be given prediction is correct and that can be used in
obtainable from a simple experimental determi- conjunction with limited experimental information
nation of the C-terminal location. to provide high-quality topology models for entire
proteomes.

Discussion

Membrane protein topology prediction is an important area in contemporary bioinformatics, and provides a useful starting point for experimental studies of membrane proteins. While the overall performance of different topology prediction methods has been much discussed lately [7, 8, 13], essentially no work has been done trying to estimate the reliability of individual predictions. Here, we have constructed simple reliability scores for five widely used methods, TMHMM, HMMTOP, MEMSAT, PHDhtm and TopPred, and have applied them to a test set of 92 prokaryotic proteins with experimentally determined topologies and to the full-size membrane protein proteomes from E. coli, S. cerevisiae and C. elegans.

For TMHMM and MEMSAT, there is a good correlation between the reliability scores we have defined and the expected accuracy of a prediction. For both methods, ~50% of the predictions have reliability scores corresponding to a prediction accuracy of ~90%, and ~70% of the proteins have scores corresponding to a prediction accuracy of ~80% over the test set. For the remaining three methods, we were unable to derive useful reliability scores.

We have further used the TMHMM reliability score to assess the degree of bias in the test set as compared to the predicted membrane protein proteomes of E. coli, S. cerevisiae and C. elegans. In conformity with the results of two recent studies [13, 14], we find that the test set is biased towards high-scoring proteins, and we estimate that only some 53–59% of all predicted topologies for these proteomes are correct, compared to 63% for the test set when only proteins with two or more transmembrane helices are considered (or 66% for the whole test set). The reliability scores make it possible to estimate the likelihood that a given prediction is correct, allowing experimental topology mapping efforts to be focused on proteins with low reliability scores.

Methods

Prediction methods

TMHMM2.0 [1], HMMTOP2.0 [3, 23], MEMSAT version 1.8 [4], PHDhtm version 1998.01 [5] and TopPred version 1.0 [6] were used in single-sequence mode and with default parameter settings. PHDhtm was also run in its multiple sequence alignment mode on the website.†

† http://cubic.bioc.columbia.edu/predictprotein/

Definition of reliability scores

$$\text{TMHMM } S_1 = \frac{p_1(\text{label}) + p_2(\text{label}) + \cdots + p_N(\text{label})}{N}$$

where $N$ is the sequence length and $p_i(\text{label})$ is the posterior probability for the assigned class (label = i, o or h) for residue $i$.

$$\text{TMHMM } S_2 = \min\left[p_1(\text{label}),\, p_2(\text{label}),\, \ldots,\, p_N(\text{label})\right]$$

$$\text{TMHMM } S_3 = \frac{p(\text{best topology})}{p(\text{all possible topologies})}$$

To calculate p(best topology) we first identify all high-scoring predictions that are compatible with the highest-scoring one by masking ten residues on either side of each class border. All predictions that have the same class assignments as the highest-scoring one after masking are considered as being the same, and p(best topology) is the summed probabilities (as given by TMHMM) for these predictions. These individual probabilities as well as p(all possible topologies) are calculated as described [1].

$$\text{HMMTOP: score} = \text{entropy}(\text{best path}) - \text{entropy}(\text{model})$$

$$\text{MEMSAT: score} = \text{score}(\text{best topology}) - \text{score}(\text{second best topology})$$

$$\text{PHDhtm: score} = \frac{\text{index}(\text{model}) + \text{index}(\text{orientation})}{2}$$
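The three TMHMM scores S1, S2 and S3 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names (`s1`, `s2`, `s3`, `mask_borders`), the input format (per-residue posteriors as a list, candidate topologies as label strings paired with probabilities) and the toy numbers are assumptions. The posteriors and the total probability `p_all` are taken as given, since in practice they come from TMHMM itself. The compatibility test is one reading of the masking rule: residues near the borders of the highest-scoring topology are ignored, and a candidate counts as the same topology if it agrees with the best one at every remaining position.

```python
from typing import List, Sequence, Tuple


def s1(posteriors: Sequence[float]) -> float:
    """Mean posterior probability of the assigned label over all residues."""
    return sum(posteriors) / len(posteriors)


def s2(posteriors: Sequence[float]) -> float:
    """Lowest posterior probability of the assigned label in the sequence."""
    return min(posteriors)


def mask_borders(labels: str, flank: int = 10) -> str:
    """Replace `flank` residues on either side of each class border by '-'."""
    masked = list(labels)
    for k in range(1, len(labels)):
        if labels[k] != labels[k - 1]:  # border lies between k-1 and k
            for j in range(max(0, k - flank), min(len(labels), k + flank)):
                masked[j] = "-"
    return "".join(masked)


def s3(candidates: List[Tuple[str, float]], p_all: float,
       flank: int = 10) -> float:
    """Summed probability of candidates matching the best one, over p_all.

    candidates: (label string, probability) pairs for high-scoring
    predictions; p_all: probability summed over all possible topologies
    (assumed to be supplied by the predictor).
    """
    best_labels, _ = max(candidates, key=lambda c: c[1])
    mask = mask_borders(best_labels, flank)

    def compatible(labels: str) -> bool:
        # Same class assignment as the best topology at unmasked positions.
        return len(labels) == len(best_labels) and all(
            m == "-" or a == b
            for m, a, b in zip(mask, labels, best_labels)
        )

    p_best = sum(p for labels, p in candidates if compatible(labels))
    return p_best / p_all


# Toy example with a six-residue "protein" and a flank of one residue.
posteriors = [0.9, 0.8, 1.0, 0.7]
toy = [("iihhoo", 0.5), ("iiihoo", 0.2), ("oohhii", 0.1)]
print(s1(posteriors), s2(posteriors), s3(toy, p_all=1.0, flank=1))
```

In the toy example the second candidate differs from the best one only next to a border, so its probability is added to p(best topology); the third candidate has the opposite orientation and is excluded.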
$$\text{TopPred: score} = \Delta\,\text{positive charges}(\text{best topology}) - \Delta\,\text{positive charges}(\text{second best topology})$$

Definition of correct predictions

A predicted topology is considered correct if it has the correct number of transmembrane segments and the correct location of the N terminus.

Data sets

The test set used is a collection of 92 prokaryotic helix bundle membrane proteins with experimentally known topologies [24]. We selected proteins belonging to “trust levels” A, B and C, but excluded level C proteins with only partial topologies. We removed all sequences that were annotated to contain an N-terminal signal or a pro-peptide.

The highest level of sequence identity (as determined by ClustalW [25] alignments) between any two proteins in the test set was 59%, and 71 sequences had less than 30% mutual identity as determined by the Hobohm 2 algorithm [26].

For the proteome analysis, all predicted ORFs from three fully sequenced genomes, E. coli†, S. cerevisiae‡ and C. elegans§, were downloaded.

To extract the membrane proteins, TMHMM was run on all ORFs in the respective genomes and all proteins with two or more predicted transmembrane segments were retained. Proteins with a single predicted transmembrane segment were not included, since a considerable but unknown fraction of these segments are cleavable signal peptides rather than transmembrane helices [1]. The numbers of proteins analyzed were 749 for E. coli, 847 for S. cerevisiae and 4059 for C. elegans.

† http://bmb.med.miami.edu/EcoGene/EcoWeb/
‡ ftp://genome-ftp.stanford.edu/pub/yeast/yeast_ORFs/
§ ftp://ftp.sanger.ac.uk/pub/wormbase/

Acknowledgements

This work was supported by a grant from the Swedish Knowledge Foundation via the Research School of Medical Bioinformatics and AstraZeneca to K.M., and by grants from the Foundation for Strategic Research and the Swedish Research Council to G.v.H.

References

1. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. (2001). Predicting transmembrane protein topology with a hidden Markov model. Application to complete genomes. J. Mol. Biol. 305, 567–580.
2. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H. et al. (2000). The Protein Data Bank. Nucl. Acids Res. 28, 235–242.
3. Tusnady, G. E. & Simon, I. (1998). Principles governing amino acid composition of integral membrane proteins: application to topology prediction. J. Mol. Biol. 283, 489–506.
4. Jones, D. T., Taylor, W. R. & Thornton, J. M. (1994). A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry, 33, 3038–3049.
5. Rost, B., Fariselli, P. & Casadio, R. (1996). Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci. 5, 1704–1718.
6. von Heijne, G. (1992). Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. J. Mol. Biol. 225, 487–494.
7. Möller, S., Croning, M. & Apweiler, R. (2001). Evaluations of methods for the predictive evaluation of membrane spanning regions. Bioinformatics, 17, 646–653.
8. Ikeda, M., Arai, M., Lao, D. & Shimizu, T. (2001). Transmembrane topology prediction methods: a re-assessment and improvement by a consensus method using a data-set of experimentally characterized transmembrane topologies. In Silico Biol. 2, 1–15.
9. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
10. O’Donovan, C., Martin, M. J., Gattiker, A., Gasteiger, E., Bairoch, A. & Apweiler, R. (2002). High-quality protein knowledge resource: SWISS-PROT and TrEMBL. Brief. Bioinformat. 3, 275–284.
11. von Heijne, G. (1986). The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology. EMBO J. 5, 3021–3027.
12. von Heijne, G. (1989). Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues. Nature, 341, 456–458.
13. Käll, L. & Sonnhammer, E. (2002). Reliability of transmembrane predictions in whole-genome data. FEBS Letters, 532, 415–418.
14. Nilsson, J., Persson, B. & von Heijne, G. (2002). Prediction of partial membrane protein topologies using a consensus approach. Protein Sci. 11, 2974–2980.
15. Blattner, F. R., Plunkett, G., Bloch, C. A., Perna, N. T., Burland, V., Riley, M. et al. (1997). The complete genome sequence of Escherichia coli K-12. Science, 277, 1453–1462.
16. Goffeau, A., Aert, R., Agostini-Carbone, M., Ahmed, A., Aigle, M., Alberghina, L. et al. (1997). The yeast genome directory. Nature, suppl. 387, 1–105.
17. Stein, L., Sternberg, P., Durbin, R., Thierry-Mieg, J. & Spieth, J. (2001). WormBase: network access to the genome and biology of Caenorhabditis elegans. Nucl. Acids Res. 29, 82–86.
18. Bargmann, C. (1998). Neurobiology of the Caenorhabditis elegans genome. Science, 282, 2028–2033.
19. Drew, D., Sjöstrand, D., Nilsson, J., Urbig, T., Chin, C. N., de Gier, J. W. & von Heijne, G. (2002). Rapid topology mapping of Escherichia coli inner-membrane proteins by prediction and PhoA/GFP fusion analysis. Proc. Natl Acad. Sci. USA, 99, 2690–2695.
20. Manoil, C. (1991). Analysis of membrane protein topology using alkaline phosphatase and β-galactosidase gene fusions. Methods Cell. Biol. 34, 61–75.
21. Deak, R. & Wolf, D. (2001). Membrane topology and function of Der3/Hrd1p as a ubiquitin-protein ligase (E3) involved in endoplasmic reticulum degradation. J. Biol. Chem. 276, 10663–10669.
22. Popov, M., Tam, L. Y., Li, J. & Reithmeier, R. A. F. (1997). Mapping the ends of transmembrane segments in a polytopic membrane protein. Scanning N-glycosylation mutagenesis of extracytosolic loops in the anion exchanger, Band 3. J. Biol. Chem. 272, 18325–18332.
23. Tusnady, G. E. & Simon, I. (2001). The HMMTOP transmembrane topology prediction server. Bioinformatics, 17, 849–850.
24. Möller, S., Kriventseva, E. & Apweiler, R. (2000). A collection of well-characterised integral membrane proteins. Bioinformatics, 16, 1159–1160.
25. Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4680.
26. Hobohm, U., Scharf, M., Schneider, R. & Sander, C. (1992). Selection of representative protein data sets. Protein Sci. 1, 409–417.

Edited by F. E. Cohen

(Received 19 November 2002; received in revised form 30 January 2003; accepted 31 January 2003)
