8 How to evaluate the quality of a model
8.1 General considerations.
- A model is considered wrong if at least some of its structural
features are misplaced relative to the rest of the model. Errors of
that type can very easily slip into a model when erroneous sequence
alignments are used during the building procedure. Such models can
nevertheless have proper stereochemistry if great care is given to this
aspect during the building procedure.
- In absolute terms, a model can be declared inaccurate or imprecise
if its atomic co-ordinates are not within 0.5 Å rmsd of a control
experimental structure. This value comes from the structure/sequence
similarity study of Chothia and Lesk [6], in which they demonstrate
that different structures of the same protein can deviate by as much as
0.5 Å. This criterion can, however, only be assessed after the fact,
once a control structure exists, and is therefore of no use while the
model is being built. In relative terms, however, a model can be
considered "accurate enough" or as "accurate as you can get"
when its rmsd is within the spread of deviations observed for
experimental structures displaying a sequence identity level similar to
that of the target and template sequences.
Another source of inaccuracy is the deviation from ideal stereochemical
values for bond lengths and angles. Such inaccuracies can easily be
detected with the program WhatCheck, developed by G. Vriend at the EMBL
(it can be reached from the PDB Web site).
- It is crucial to realise that proper stereochemistry, as can be
assessed with WhatCheck, is not a criterion for model correctness. In
other words, it is possible to build models which would comply with
such criteria and have strictly no biological meaning.
- Empirical pair-potentials allow, to some degree, the detection of
such errors in models. These algorithms are not sensitive enough to
detect subtle differences in conformation, but they are quite efficient
at pointing out regions where sequence and structure do not fit.
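The rmsd criterion discussed above can be illustrated with a minimal Python sketch. It assumes that the two structures are already superimposed and that atoms are listed in matching order; the function names, the toy co-ordinates and the reading of "within the spread" as "not larger than the largest reference deviation" are assumptions made for this illustration only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes both structures are already superimposed and that point i in
    coords_a corresponds to point i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("co-ordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def within_spread(model_rmsd, reference_rmsds):
    """True if model_rmsd does not exceed the largest reference deviation.

    reference_rmsds stands for deviations observed between experimental
    structures of comparable sequence identity (hypothetical input).
    """
    return model_rmsd <= max(reference_rmsds)

# Toy co-ordinates (made up): the control is the model shifted by 0.5 Å in z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
control = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5), (2.0, 0.0, 0.5)]
print(f"rmsd = {rmsd(model, control):.2f} A")
```

In practice the superposition step and the choice of atoms (for instance backbone atoms only) strongly influence the value obtained, so both should be stated whenever an rmsd is reported.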
8.2 What are the sources of errors and inaccuracies?
The quality of a model is determined by two criteria, which will define its applicability (see Part IV):
- The correctness of a model is essentially dictated by the
quality of the sequence alignment used to guide the modelling process.
If the sequence alignment is wrong in some regions, then the spatial
arrangement of the residues in those portions of the model will be
incorrect.
- The accuracy of a model is essentially limited by the
deviation of the template structure(s) used relative to the (possibly
future) experimental control structure. This limitation is inherent to
the methods used, since the models result from an extrapolation. As a
consequence, the Cα atoms of protein models which share 35 to 50%
sequence identity with their templates will generally deviate by 1.0
to 1.5 Å from their experimental counterparts, as do similarly related
experimental structures [6]. Furthermore, structural differences
between predicted and experimental structures have two sources:
- The errors inherent to the modelling procedures.
- The variations caused by the molecular environment and the data
collection method, which are incorporated into the experimentally
elucidated structures used as modelling templates. Indeed,
crystallographic structures of identical proteins can vary not only
because of experimental errors and differences in data collection
conditions (illustrated in [32]) and refinement, but also because of
different crystal lattice contacts and the presence or absence of
ligands. One of the most interesting examples in which several
structures of the same protein, determined by different methods, were
compared involves interleukin-4 (IL-4) [33 and references therein].
This cytokine consists of a 130-residue four-helix bundle, and its
structure was elucidated by X-ray crystallography as well as by NMR.
The backbones of three IL-4 crystal structures (PDB entries 1RCB, 2INT
and 1HIK) show an rmsd of 0.4 to 0.9 Å, while those of three IL-4 NMR
forms (PDB entries 1ITM, 1CYL and 2CYK) give an rmsd of 1.2 to 2.6 Å.
These values illustrate the structural differences due to experimental
procedures and the molecular environment at the time of data
collection. Therefore, "a protein model derived by comparative
methods cannot be more accurate than the difference between the NMR and
crystallographic structure of the same protein." [33].
8.3 Protein core and loops.
Almost every protein model contains non-conserved loops which are
expected to be the least reliable portions of a protein model. Indeed,
these loops
often deviate markedly from experimentally determined control
structures. In many cases, however, these loops also correspond to the
most flexible parts of the structure as evidenced by their high
crystallographic temperature factors (or multiple solutions in NMR
experiments). On the other hand, the core residues - the least variable
in any given protein family - are usually found in essentially the same
orientation as in experimental control structures, while far larger
deviations are observed for surface amino acids. This is expected since
the core residues are generally well conserved and the conformation of
their side chains is constrained by neighbouring residues. In
contrast, the more variable surface amino acids tend to show larger
deviations since few steric constraints are imposed upon them.
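Once a control structure is available, the contrast between core and loop regions can be quantified per residue. The following Python sketch assumes one representative atom per residue, structures that are already superimposed, and a residue classification supplied by the user; all names and co-ordinates are hypothetical and serve only as an illustration.

```python
import math

def per_residue_deviation(model, control):
    """Distance (in Å) between matching atoms, keyed by residue number.

    model and control map residue numbers to (x, y, z) tuples; residues
    missing from either structure are skipped.
    """
    return {res: math.dist(model[res], control[res])
            for res in model if res in control}

def mean_deviation(deviations, residues):
    """Mean deviation over the given residue numbers."""
    values = [deviations[r] for r in residues if r in deviations]
    if not values:
        raise ValueError("none of the requested residues has a deviation")
    return sum(values) / len(values)

# Made-up example: residues 1-2 are treated as core, 3-4 as a loop.
model = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0),
         3: (7.6, 0.0, 0.0), 4: (11.4, 0.0, 0.0)}
control = {1: (0.0, 0.0, 0.3), 2: (3.8, 0.4, 0.0),
           3: (7.6, 0.0, 2.0), 4: (11.4, 3.0, 0.0)}

deviations = per_residue_deviation(model, control)
print(f"core: {mean_deviation(deviations, [1, 2]):.2f} A")
print(f"loop: {mean_deviation(deviations, [3, 4]):.2f} A")
```

Reporting the two averages separately avoids a situation in which a few badly placed loop residues dominate a single global figure.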
8.4 Detecting major errors using empirical pair potentials.
Some structural aspects of a protein model can be verified using methods
based on the inverse folding approach. Two of them, namely the 3D-1D
profile based verification method [15] and ProsaII developed by
M. Sippl [16], are widely used. The 3D-1D profile of a protein
structure is calculated by adding the probability of occurrence for
each residue in its 3D-context [15]. Each of the twenty amino acids has
a certain probability of being located in each of the environmental
classes defined by Eisenberg and colleagues (using criteria such as
solvent-accessible surface, buried polar area, exposed non-polar area
and secondary structure). In contrast, ProsaII [16] relies on empirical
energy potentials derived from the pairwise interactions observed in
well defined protein structures. These terms are summed over all
residues in a model and result in a more or less favourable energy.
Both methods can detect a global sequence to structure incompatibility
and errors corresponding to topological differences between template
and target. They also allow the detection of more localised errors such
as β-strands that are "out of register" or buried charged residues.
These methods
are however unable to detect the more subtle structural inconsistencies
often localised in non-conserved loops, and cannot provide an
assessment of the correctness of their geometry.
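The summation underlying a 3D-1D profile, and the way a local window can point to a suspicious region, can be sketched in Python as follows. The two environment classes, the score table and the threshold are invented for this illustration and are not the values used by the published method [15]; the function names are likewise hypothetical.

```python
def profile_scores(sequence, environments, table):
    """Per-residue score of each amino acid in its environment class."""
    if len(sequence) != len(environments):
        raise ValueError("one environment class is needed per residue")
    return [table[env][aa] for aa, env in zip(sequence, environments)]

def windowed_average(scores, window):
    """Average score over each run of `window` consecutive residues."""
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def suspicious_windows(scores, window, threshold):
    """Start indices of the windows whose average falls below threshold."""
    return [i for i, avg in enumerate(windowed_average(scores, window))
            if avg < threshold]

# Invented score table: higher means the residue fits the environment better.
TOY_TABLE = {
    "buried":  {"L": 1.0, "K": -1.0, "D": -1.5},
    "exposed": {"L": -0.5, "K": 0.5, "D": 0.5},
}

sequence = "LLKDLK"
environments = ["buried", "buried", "exposed", "buried", "buried", "exposed"]

scores = profile_scores(sequence, environments, TOY_TABLE)
print("total score:", sum(scores))
print("low-scoring windows start at:", suspicious_windows(scores, 2, 0.0))
```

In this toy case the two low-scoring windows both contain the buried aspartate at index 3, which is the kind of localised incompatibility (a buried charged residue) mentioned above.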
8.5 Applicability of model structures.
Protein models obtained with comparative modelling methods can be classified into three broad categories:
- Models which are based on incorrect alignments between target and template sequences.
Such alignment errors, which generally reside in the inaccurate
positioning of insertions and deletions, are caused by the weaknesses
of the alignment algorithms and generally cannot be resolved in the
absence of a control experimental structure. It is, however, often
possible to correct such errors by producing several models based on
alignment variants and by selecting the most "sensible" solution.
Nevertheless, such models often turn out to be useful, as the errors
are frequently not located in the area of interest, such as a well
conserved active site.
- Models based on correct alignments are of course much
better, but their accuracy can still be medium to low when the templates
used during the modelling process have a medium to low sequence
similarity with the target sequence. Such models, like the ones
described above, are nevertheless very useful tools for the rational
design of mutagenesis experiments. They are, however, of very limited
assistance during detailed ligand binding studies.
- The last category of models comprises all those which were built from templates which share a high degree of sequence identity (> 70%) with the target.
Such models have proven useful during drug design projects and have
supported key decisions in compound optimisation and chemical
synthesis. For instance, models of several species variants of a given
enzyme can guide the design of more specific non-natural inhibitors.
However, nothing is absolute, and there are numerous occasions in
which models falling in any of the above categories either could not
be used at all or, in contrast, proved to be more useful and correct
than estimated.