
Concepts and Tools for NMR Restraint Analysis and Validation
SANDER B. NABUURS,1 CHRIS A.E.M. SPRONK,1 GERT VRIEND,1 GEERTEN W. VUISTER2
1 Center for Molecular and Biomolecular Informatics, University of Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands
2 Department of Biophysical Chemistry, University of Nijmegen, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands

ABSTRACT: The quality of NMR-derived biomolecular structure models can be assessed by validation on the level of structural characteristics as well as the NMR data used to derive the structure models. Here, an overview is given of the common methods to validate experimental NMR data. These methods provide measures of quality and goodness of fit of the structure to the data. A detailed discussion is given of newly developed methods to assess the information contained in experimental NMR restraints, which provide powerful tools for validation and error analysis in NMR structure determination. © 2004 Wiley Periodicals, Inc. Concepts Magn Reson Part A 22A: 90–105, 2004

KEY WORDS: structure validation; experimental restraints; restraint validation; structure refinement

Received 2 March 2004; revised 13 April 2004; accepted 13 April 2004
Correspondence to: Geerten W. Vuister; E-mail: vuister@nmr.kun.nl
Concepts in Magnetic Resonance Part A, Vol. 22A(2) 90–105 (2004)
Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/cmr.a.20016
© 2004 Wiley Periodicals, Inc.

INTRODUCTION

The result of a biomolecular structure determination by solution nuclear magnetic resonance (NMR) spectroscopy is typically a family of structural models describing the accessible molecular conformations. This family, or ensemble, of structure models should agree as a whole with the experimental NMR data used in the procedure, as well as other additional data. Typically, rather than using the spectroscopic data directly, geometric conformational restraints are derived from these data, which are subsequently used to calculate the structures (1). Derivation of such structural restraints from NMR spectra is complicated because spectral overlap, spin diffusion, local dynamics, and interconverting conformations have to be taken into account. The traditional manual assignment of NMR resonances and conversion of NMR peaks into structural restraints is an extremely time-consuming process, even for experienced spectroscopists. Further, manual interpretation of NMR data is prone to human error and, possibly, manipulation. These problems are being alleviated by the recent development of several automated methods (for detailed discussions see [2, 3]), which have increased the speed, reliability, and reproducibility of the interpretation and analysis of NMR data and the subsequent structure calculation process.
NMR RESTRAINT VALIDATION 91

It is of great importance that the structural restraints are subjected to thorough validation. First, in the process of optimization and interpretation of the structure model, a good assessment of the reliability of the data and structures has to be made before publication and deposition in the BioMagResBank (BMRB) (4) and the Protein Data Bank (PDB) (5, 6). Second, users of structures from the PDB have at present access only to the derived structural restraints and not the original spectroscopic data. Thus, a good assessment of the reliability of the complete structure determination procedure is difficult after deposition of the structure models and data and therefore relies on proper procedures by the submitting authors.

In this review, we discuss various common and new methods for validation of NMR-derived structural restraints. An overview will be given of the different types of restraints that can be derived from NMR data, their application in structure calculations, and the concepts and tools that are available for their validation.

DATASETS

Throughout this review we use experimental datasets as deposited in the PDB to demonstrate applications of the different restraint analysis methods discussed. To calculate trends or averages for multiple datasets, we use the experimental data of a set of 100 proteins, ranging in size between 20 and 370 amino acids, which we analyzed and validated previously (7). As an individual example, we use the NMR dataset of a recently determined high-resolution structure of the B1 immunoglobulin binding domain of protein G (8) (GB1), as shown in Fig. 1.

RESTRAINT TYPES

In NMR structure determination, three major classes of structural restraints can be distinguished: distance restraints, torsion angle restraints, and orientational restraints. Additionally, other sources of information have been used in the refinement of the derived structures, such as information from chemical shifts (9) and high-resolution structure databases (10).

Distance Restraints

NOE-Derived Distance Restraints. Traditionally, nuclear Overhauser effect (NOE)-derived distance restraints are the most important source of structural information in the structure determination of biomolecules using NMR spectroscopy (11). NOEs provide essential information to define secondary and tertiary structure as they connect pairs of atoms that are in close proximity in Cartesian space but potentially far apart in sequence space. Peaks observed in NOESY spectra (12–14) are translated into distance restraints using the volume of the cross-peak (V), which is related to the distance $d_{ij}$ between the two interacting atoms i and j by

$$V = \left\langle d_{ij}^{-6} \right\rangle f(\tau_c) \qquad [1]$$

The transfer of magnetization depends on both the variation in $d_{ij}$, as expressed in the averaging of this term, and effects of internal and global motions, as accounted for in the function $f(\tau_c)$. Although average distances are sometimes calculated directly from the NOE peak volumes, it is customary to classify peak volumes into weak, medium, and strong signals (15), where each class corresponds to a set of approximate distance bounds that should reflect the uncertainties in the derived distances (1). These uncertainties may arise from effects such as spin diffusion and local dynamics and from problems such as errors in peak integration or spectral overlap.

Apart from taking the various origins of uncertainties in NOE peak volumes into account, possible multiple contributions to the peak volumes have to be incorporated in the treatment of the distance restraints. These multiple contributions may arise from true ambiguities due to spectral overlap of resonances from different (groups of) atom(s) or from degeneracy of resonances within a group of covalently constrained atoms. An example of the first would be the two methyl groups of two different alanines that overlap in the spectra and of which the resonances show an NOE with a third (group of) atom(s). An example of the latter type of degeneracy would be the three protons within a methyl group that show a single NOE cross peak, with three contributions, to another proton.

Various different ways have been used in the past to take into account spectral degeneracy of, e.g., the protons in methyl groups or aromatic rings within a residue. Among these is the use of pseudoatoms in combination with corrections on the distance bounds (denoted in XPLOR/CNS terminology as center averaging) or methods that sum up the individual contributions (denoted in XPLOR/CNS terminology as $r^{-6}$ sum averaging) (16).

True ambiguities in the assignment of NOE cross-peaks can often only be resolved on the basis of a structural model, which is usually not available at this
92 NABUURS ET AL.

stage of spectral assignment. To address this problem, the concept of ambiguous distance data was introduced (17). An ambiguous distance restraint (ADR) is described in terms of the distances between all pairs of protons that may be involved, as shown in Eq. [2].


$$d_{F_1,F_2} = \left( \sum_{k=1}^{N_{F_1,F_2}} d_k^{-6} \right)^{-1/6} \qquad [2]$$

where k runs through all N(F1, F2) contributions to a crosspeak at frequencies F1 and F2, and $d_k$ is the distance between the two protons corresponding to the contribution k calculated from a model structure. By restraining d, using an appropriate target function, to the distance bounds derived from the volume of the NOE crosspeak, ambiguous restraints are now commonly included in NMR structure calculations (18).

Hydrogen Bond Restraints. Hydrogen bond restraints can be very useful during structure calculations, especially in defining secondary structure elements such as α-helices and β-sheets. A problem with hydrogen bond restraints is that they are commonly assigned based on indirect experimental information, such as nonexchanging amide protons (19), chemical shifts (20), and one-bond coupling constants (21), which only identify the donor atom involved in the hydrogen bond. However, $^{3h}J$ and $^{2h}J$ spin-spin couplings can now be measured across hydrogen bonds, providing for direct evidence of the acceptor atom (22, 23). This procedure works well for oligonucleotides and in favorable cases for proteins. In the latter case, the heteronuclear $^{2h}J$(CH) and $^{3h}J$(CN) couplings provide the experimental evidence for the presence of a hydrogen bond (23, 24).

Even though the acceptor atom can now be identified experimentally, it is not uncommon to infer the acceptor from structural models or assumptions about regular secondary structure. In these cases, ambiguous distance restraints are useful to allow for some ambiguity or multiple possibilities in the assignment of potential hydrogen bond acceptor(s). An example of how the hydrogen bonding network can be defined in an α-helix using ambiguous distance restraints is shown in Fig. 2. The ambiguous distance restraint illustrates the formation of either the i, i+3, i, i+4, or the i, i+5 hydrogen bond, thus allowing for the various types of helices defined by these different hydrogen bonding patterns.

Figure 1 (A) Ensemble of 30 structural models of GB1 (8). The α-helix is shown as a blue ribbon, the β-sheets are indicated with red ribbons. Hydrogen atoms have been omitted for clarity. (B) Restrained minimized average structure of GB1, with the 659 experimental distance restraints in the experimental dataset shown in yellow. Restraints involving groups of hydrogen atoms are, for clarity reasons, only shown for one of the protons involved. Figure made using YASARA (http://www.yasara.org).

Torsion Angle Restraints

Torsion angle restraints are derived from the vicinal coupling constant, $^3J$, by means of the well known Karplus relations (25):

$$^3J = A \cos^2\theta + B \cos\theta + C \qquad [3]$$

where $\theta$ describes the torsion angle between the four involved atoms. The three parameters A, B, and C have been empirically parameterized using molecules for which X-ray models are known and depend on the coupling constant under investigation. The $^3J$-coupling constant information can be translated into allowed ranges for the relevant torsion angles and as such can be incorporated in the structure calculation. In case of $^{13}$C and $^{15}$N isotope labeling coupling constants involving these nuclei can provide many useful additional parameters that can provide structural information on backbone and side chain torsion angles (26).

Whereas the atoms involved in an NOE restraint can be far apart in sequence space, torsion angle restraints provide only information on local conformations. As a result, the contribution of torsion angle restraints to the global fold is limited (27), though they do provide important information to restrain the

backbone torsion angles φ and ψ, and the side chain torsion angles.

Figure 2 An ambiguously assigned hydrogen bond restraint indicated in an α-helix. The three contributing distances are shown in yellow, allowing for the formation of either the i, i+3, the i, i+4, or the i, i+5 hydrogen bond. In this case, the distance would be restrained to 2 Å. For any given structure model, the effective distance is calculated using Eq. [2]. Figure made using YASARA (http://www.yasara.org). [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

Orientational Restraints

A more recent addition to the arsenal of experimental restraints available to the NMR spectroscopist are the orientational restraints derived from residual dipolar couplings (RDCs). To obtain structural information from dipolar couplings, the internuclear dipolar interactions must be observable. In solution, the distribution of the molecular orientations is normally random in the absence of a magnetic field, and as a result the internuclear dipolar interactions average to zero and thus cannot be observed. In the presence of a magnetic field, most biomolecules will still be oriented randomly, with the exception of molecules that have a large magnetic dipole moment, such as nucleic acids, which may show a small degree of alignment with the magnetic field (28).

Alignment of biomolecules can be achieved by immersing them into a slightly anisotropic environment, such as solutions containing phages, bicelles, or polyacrylamide gels with anisotropic cavities. Phages and bicelles readily align with the magnetic field, and steric or electrostatic interactions with these anisotropic molecules introduce a small net alignment of the biomolecule of interest. As a result, dipolar couplings no longer completely average out and give rise to an observable residual dipolar coupling, typically scaled down by a factor of $10^3$ relative to its static value. In this case, the overall appearance of the NMR spectrum is not altered, allowing for the direct measurement of these couplings as small contributions to the J-couplings of the involved nuclei (29, 30). The observed residual dipolar coupling (D) between two nuclei, p and q, is given by

$$D^{pq}(\theta, \phi) = D_a \left[ \left( 3\cos^2\theta - 1 \right) + \frac{3}{2} D_r \sin^2\theta \cos 2\phi \right] \qquad [4]$$

where $D_a$ is the magnitude of the dipolar coupling tensor and $D_r$ the rhombicity. For a given value of $D^{pq}$, the cylindrical coordinates $\theta$ and $\phi$ describe a cone of solutions for the orientation of the vector pq in the principal axis system of the molecular alignment tensor. Provided that the molecular alignment tensor is known, residual dipolar couplings present information on the orientation of internuclear vectors relative to an external reference frame. This external frame can be introduced in the structure calculation process as a tetra-atomic pseudomolecule, which represents the alignment tensor as an orthogonal axis system (28). In addition to providing constraints on local geometry, the residual dipolar couplings also restrain the orientation of all the involved bond vectors relative to this common reference frame. This provides long-range order information, which is not directly accessible from any of the other commonly used NMR parameters. During structure calculations, the orientational restraints derived from residual dipolar couplings can be used together with distance restraints and torsion angle restraints to yield more accurate NMR structures (28, 31).

ANALYZING SETS OF RESTRAINTS

Sets of experimentally derived restraints, as discussed in the previous sections, can be used for different types of analyses, pertaining to four different aspects of validation discussed throughout this section. The first and most widely used method is a statistical analysis of the restraints, e.g., the total number of restraints that were obtained from the different experiments. Second, once a structure model has been calculated, it can be compared to the restraints and a measure of how well the structure fits the data can be obtained. This normally includes a restraint violation analysis and calculations of the root mean square of the restraint violations. Third, a measure of accuracy is desirable, providing an estimate of how accurately the structure model represents the truth. Like in X-ray crystallography, this may be done by calculation of the completeness of a dataset, R-factors, and by cross-validation. The last type of analysis discussed relates to the information contained in experimental restraints, i.e., how important a particular (type of) restraint is for the final structure that is derived. One of the classical measures of information is the classification of distance restraints in short-, medium-, and long-range restraints. This classification is based on the distance a restraint spans in sequence space, a measure independent of its Cartesian distance. More recently, we proposed measures based on information theory (27).

Here, we explain the classical and new methods for validation and analysis of experimental restraints. An overview of the various validation tools applied to the GB1 dataset is given in Table 1 and will be discussed throughout the text. Possible drawbacks and advantages of different methods are discussed and we attempt to assess their value for judging the quality of a structure model.

Table 1 Restraint Validation Scores for the GB1 Dataset (PDB Entry 3GB1)

                                                   Distance               Torsion Angle   Dipolar Coupling
                                                   Restraints             Restraints      Restraints
Number of restraints                               735                    145             300
                                                   (37/17/11/35%)^e
Restraints per residue                             13.1                   2.6             5.4
NOE completeness^a                                 49%
Average number of violations^b,c                   1.0                    1.0
Cross-validated average number of violations^b,c   4.5                    1.5
RMS violations^b                                   0.02                   0.67
Cross-validated RMS violations^b                   0.16                   19.0
Independent RDC R factor^d                                                                26%/18%
RDC R factor^d                                                                            8%/7%
QUEEN set information                              98.4%                  1.6%
                                                   (0.4/2.4/11.8/85.4%)^e
Nonredundant distance restraints                   480
                                                   (9/18/19/53%)^e

a Determined at a 4 Å cutoff with all restraints included.
b Determined in structures calculated without the inclusion of residual dipolar coupling data.
c A violation threshold of 0.2 Å and 2° was used for the violation analysis.
d The independent dipolar coupling R factor was calculated with only distance and dihedral restraints included in the structure calculation. The first and second values relate to NH dipolar coupling measured using tobacco mosaic virus and bicelles, respectively (31).
e The numbers in brackets relate to the distribution (expressed in percentages) over the intraresidual, sequential, medium-range, and long-range restraint classes, respectively.

Number of Restraints

One of the most straightforward indicators of the quality of an experimental restraint set is the number of restraints, which is sometimes reported as the number of restraints per residue [cf. Table 1 and Fig. 3B]. This indicator can be heavily biased, however, if not derived carefully. Fewer than average, or no NOEs at all, are typically observed for the regions of molecules with conformational disorder, e.g., flexible termini or large external loops. When disordered residues are included in the restraint count, this can result in an artificially low number of restraints per residue. To prevent this, the analysis is generally restricted to the well-defined regions of the molecules under investigation, e.g., those areas with a low ensemble RMSD or circular variance, or those residues involved in secondary structure elements. For our test set of 100 NMR-derived structural ensembles, the number of distance restraints per residue increases from 12 ± 5, calculated using all residues, to 16 ± 7, if only those residues involved in secondary structure elements are included in the analysis. Adding to the difficulty of interpreting the total number of distance restraints is

Figure 3 NOE completeness values and number of NOEs for the GB1 dataset (8). (A) The cumulative and per-shell completeness (left y-axis) versus the radius of the distance shell and the number of observed and expected NOEs (right y-axis) versus the distance shell radius. (B) NOE completeness at 4 Å cutoff radius (left y-axis) and number of NOE restraints per residue (right y-axis) for the 56 residues of GB1.

their dependence on the size and shape of the biomolecule, and to some extent on the residue-type composition. Plotting the number of distance restraints per residue corrects to some extent for the size and shape dependence; however, for very small molecules the maximum number of restraints per residue that are expected will be lower than for large molecules. Further, the residue-type dependence will be expressed clearly in such plots, e.g., glycines will generally be involved in fewer NOEs than the larger leucines (32).

Yet another pitfall in deriving the number of restraints per residue is redundancy in the experimental input data. It has been shown that a significant fraction of the distance restraints used in a structure determination is typically redundant (27, 33). In order to use the number of restraints as a measure for the amount of information in the data set, redundant distance restraints have to be removed prior to the analysis. Methods to do so are discussed in some of the following sections.

For restraints derived from chemical shifts and through-bond interactions, such as J-couplings and residual dipolar couplings, the number of restraints is more indicative of the quality of the experimental data, since the total number of measurable interactions is known from the primary structure of the molecule. Restraints that are derived from J-couplings across hydrogen bonds present an exception to this, as these cannot be predicted unambiguously from sequence information alone.

NOE Completeness

The NOE completeness was developed in the late nineties to provide a more informative measure of the information contained in data sets than the number of restraints per residue. The completeness is defined as the ratio, expressed as a percentage, of the number of matched experimentally observed NOEs ($N_{\mathrm{observed}}$) and the number of expected NOEs ($N_{\mathrm{expected}}$) (32):

$$\mathrm{completeness} = \frac{N_{\mathrm{observed}}}{N_{\mathrm{expected}}} \times 100\% \qquad [5]$$

and thus normalizes the number of observed restraints. Unfortunately, there is no way to determine the number of expected restraints a priori, and therefore a structural model has to be known to determine the NOE completeness. Given this structural model, the expected NOE restraints are derived by selecting all typically observable proton-proton distances below a given cutoff value. Protons that are rarely observed in NMR experiments, such as those in amino (Lys,

N-terminus), carboxyl (Asp, Glu, C-terminus), sulfhydryl (Cys), and hydroxyl (Ser, Thr, Tyr) groups, and those connected to the imidazole group of histidine and the guanidinium group in arginine, are excluded from the analysis. The set of expected NOE restraints is then compared to those actually observed within the limits imposed by the cutoff value. The number of matched restraints is then used to calculate the NOE completeness for that particular cutoff value using Eq. [5]. By increasing the cutoff value in small steps, gradually all observed NOEs are included in the analysis. In case of structural variability, the expected NOE distance is defined as the average distance in the different members of the ensemble. If this average distance is above the cutoff value, it will be discarded from the set of observable contacts. This way, a flexible region of a protein can have a similar completeness as a rigid region although the latter is defined by more NOEs per residue.

Figure 3A and Table 1 show the NOE completeness values for the GB1 dataset. The cutoff value was increased from 2.5 to 8.0 Å with a step size of 0.5 Å. The completeness drops from 55% in the 2.0–2.5 Å shell to 0% in the 7.5–8.0 Å shell. In addition to the completeness per shell, the cumulative completeness is also shown. The cumulative completeness for the GB1 dataset is 53%, 49%, and 33% up to 3, 4, and 5 Å cutoff distances, respectively. A comparison between the NOE completeness per residue, calculated with a 4 Å cutoff, and the number of restraints per residue is shown in Fig. 3B. Although the overall trend between the two graphs is similar, the two quantities are only weakly correlated as expressed by a correlation coefficient of 0.4, illustrating the different nature of the two information measures.

Restraint Violations

The structural models resulting from an NMR structure calculation are rarely in exact agreement with the experimental input data used to calculate them. Possible reasons include inconsistencies in the input data, which can originate from assignment errors, calibration problems, or the presence of several different conformers in solution. The best models generated in a structure calculation can be selected using different, often subjective, criteria, e.g., force field energies or the number and size of the experimental restraint violations. In the latter case, there is general consensus in the NMR community that structures without violations > 0.5 Å can be considered acceptable, although over the past years lower cutoffs of 0.3 Å have also been reported. When selecting structures based on a violation size criterion, it can be informative to also study the size and number of smaller violations below the cutoff value, as they can also pinpoint problematic regions in the structure or erroneous restraints in the dataset.

The agreement of an ensemble of structural models with the experimentally derived distance restraints can be judged by the root mean square (RMS) NOE deviation:

$$\mathrm{RMS}_{\mathrm{NOE}} = \sqrt{\frac{1}{N_r N_m} \sum_{k=1}^{N_r} \sum_{l=1}^{N_m} \Delta_{kl}^{2}} \quad \text{with} \quad \Delta_{kl} = \begin{cases} d_{kl} - r_k^{\mathrm{upper}} & d_{kl} > r_k^{\mathrm{upper}} \\ 0 & r_k^{\mathrm{lower}} \le d_{kl} \le r_k^{\mathrm{upper}} \\ r_k^{\mathrm{lower}} - d_{kl} & d_{kl} < r_k^{\mathrm{lower}} \end{cases} \qquad [6]$$

with $d_{kl}$ the actual distance for restraint k in model l, $r_k^{\mathrm{upper}}$ the upper bound of restraint k and $r_k^{\mathrm{lower}}$ the lower bound of restraint k. The sum is calculated over all $N_r$ distance restraints and $N_m$ structural models. An identical expression can be used to calculate the RMS torsion angle restraint deviations (accounting for the circular nature of this property), with $d_{kl}$ the actual torsion angle measured in model l, and $r_k^{\mathrm{upper}}$ and $r_k^{\mathrm{lower}}$ the upper and lower limit of the torsion angle range allowed for by restraint k, respectively.

Fig. 4 shows the distribution of the size and number of distance restraint violations in the reference set of 100 NMR structural ensembles before and after refinement of these structures in explicit solvent (7). In the original structures a significant number of violations between 0.5 and 1.0 Å are found, most of which are absent in the rerefined structural models. This improvement in the fit of the models to the experimental data is also reflected in the RMS NOE violation for the original and rerefined models, which is 0.089 and 0.029 Å, respectively (7).

NMR R Factor

A more direct indicator for the quality of structural models calculated on the basis of experimental NMR data is the agreement between the original experimental data and the data back-calculated from the proposed structures. This agreement can be expressed using the R factor, the normalized mean deviation between the measured and back-calculated data, a quality measure often used in X-ray crystallography. The lower the R factor, the better the model represents the experimental data. As the principal data in NMR structure determinations are the NOE signals with their corresponding intensities, the NMR R factor is defined as the measure of agreement between the

Figure 4 Occurrence of NOE-derived distance restraint violations in 100 NMR-derived structural ensembles as function of the violation size shown before (red) and after (blue) refinement in explicit solvent. The 100 ensembles consist of 1567 structural models in total. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

observed and back-calculated intensities observed in the NOESY spectra (34). Theoretical NOE signal intensities can be calculated using relaxation matrix theory, which solves the Bloch equations for the complete spin system for a given mixing time $\tau_{\mathrm{mix}}$ (35, 36). In this approach, all interacting spins are treated as a network, and the volumes of the NOE cross peaks are calculated by exponentiating the matrix of cross-relaxation rates: the relaxation matrix.

An R factor definition to determine the agreement between the measured and calculated intensities, analogous to that used in X-ray crystallography, is

$$R = \frac{\sum_{i,j} W_{ij}(\tau_{\mathrm{mix}}) \left| A_{ij}^{\mathrm{calc}}(\tau_{\mathrm{mix}})^{p} - A_{ij}^{\mathrm{exp}}(\tau_{\mathrm{mix}})^{p} \right|}{\sum_{i,j} W_{ij}(\tau_{\mathrm{mix}}) \, A_{ij}^{\mathrm{exp}}(\tau_{\mathrm{mix}})^{p}} \qquad [7]$$

with $A_{ij}^{\mathrm{calc}}$ and $A_{ij}^{\mathrm{exp}}$ the calculated and experimental cross-peak intensities resulting from the interacting spins i and j, respectively, $W_{ij}(\tau_{\mathrm{mix}})$ a mixing time-dependent weighting factor, and the power p can simply be 1 or, e.g., 1/6 to reflect the asymptotic behavior of the NOE. The weighting factor should account for the uncertainties arising from the experimental and computational procedures. It is also here that the problems with an NMR R factor lie. Estimating the error of the signals observed in the NOESY spectrum is complicated. Together with the assumptions and uncertainties involved in the computational procedures, these problems make the R factor a rarely used validation criterion in structure determinations by NMR spectroscopy.

Complete Cross Validation

As mentioned, in X-ray crystallography the R factor is a much more commonly used criterion for judging the quality of structural models. But the crystallographic R factor is not without problems, as it can be artificially reduced by introducing more parameters to describe the model. Kleywegt and Jones (37) have shown that it is possible to overfit diffraction data to the extent that a structural model with every amino acid at an incorrect position can still be refined to a reasonable R factor. The $R_{\mathrm{free}}$ factor was introduced as a quality indicator less prone to overfitting (38). Its definition is identical to that of the traditional R factor, except that the $R_{\mathrm{free}}$ value is calculated for a small subset of the data which was not used in the refinement of the model. Therefore, the $R_{\mathrm{free}}$ measures how well the model predicts experimental data not used to derive the model, and hence can detect overfitting of the data. This method is also commonly referred to as cross-validation.

In the application of cross-validation in crystallography, typically 10% of the reflections are omitted (the test set) and the remainder of the data (the working set) is used for model refinement. In contrast to X-ray crystallography, where each single reflection contains information about the whole structure, each NOE signal in NMR spectroscopy provides only local information. Hence, where deletion of 10% of the data in a crystallographic dataset results in a 10% loss in information content, this will not be the case for NMR-derived datasets. Additionally, the information content of the separate restraints is not identical throughout an NMR dataset, e.g., intermolecular restraints versus intraresidual restraints (27, 33). As a result of this, cross-validation with a single test set is not appropriate for NMR structure determinations.

To circumvent these problems, the concept of complete cross-validation was introduced (39). In this approach the NMR restraints are randomly partitioned into test sets of roughly equal size, and cross-validation is performed with each of the test sets. Statistical quantities are then averaged over the different test sets. By doing so, the differences in information content between the test sets are expected to average out, resulting in more meaningful cross-validated measures of fit for NMR datasets. Despite this, the fact remains that NOEs are not independent observations. NOEs can often only be assigned based on information provided by earlier assigned NOEs, e.g., from structures calculated using these NOEs. If one of these early assigned NOEs is then moved to the test set, it will still influence the structure if assignments made based on this restraint are still present in the working set. Thus, if not treated carefully, the calculated free R values are not as free as one might assume.

Using the deposited GB1 restraint set (8), we calculated and cross-validated an ensemble of 30 structures using the default CNS (40) protocols. Cross-validated values typically are significantly higher when compared to their full dataset counterparts (39) [see Table 1], and no direct conclusion can be derived from their comparison. They are, however, useful when comparing the effects of variations in the structure calculation protocol, such as the usage of multiple models (41).

Quality Factors

In a so-called independent validation of structures, some of the experimental data is completely kept out of the structure calculation and refinement process and used only to validate the resulting structural models. For example, using anisotropic carbonyl chemical shifts, different X-ray and NMR-derived structural models were validated in this way (42). Changes in the chemical shifts of carbonyl carbons were measured ($\Delta\delta_{\mathrm{meas}}$) in a dilute liquid crystalline phase and back-calculated from structures ($\Delta\delta_{\mathrm{pred}}$) that were determined without the inclusion of the chemical shifts as structural restraints. Structures calculated with residual dipolar couplings as orientational restraints showed an improved fit to the carbonyl chemical shift data as compared to structures calculated without the inclusion of residual dipolar couplings. The measure for the agreement between the structures and the observed property, in this case the changes in carbonyl chemical shift, was obtained by using the quality or Q factor (42), defined as:

$$Q = \frac{\mathrm{rms}(\Delta\delta_{\mathrm{meas}} - \Delta\delta_{\mathrm{pred}})}{\mathrm{rms}(\Delta\delta_{\mathrm{meas}})} \qquad [8]$$

Like the R factor, the closer the Q factor is to zero, the better the agreement between the model and the observed independent data.

A similarly defined Q factor has been used as a measure of fit for residual dipolar couplings (43, 44), and is equivalent to $\sqrt{2}$ times the dipolar coupling R factor (45). When not directly used in the calculation protocol, the RDC R factors for GB1 have values of 26% and 18% for data recorded in tobacco mosaic virus and bicelles, respectively. In contrast, these values drop to 8% and 7% upon their inclusion in the calculation, indicating that they convey unique information about the biomolecular structure (31).

As a second example of independent validation, we mention the use of the residual dipolar coupling Q factors to check improvements in refinement schemes and force fields (46, 47). Structures of the protein ubiquitin were calculated and refined using distance and torsion angle restraints, and independently validated against different sets of residual dipolar couplings. Structures refined in explicit solvent showed significant decreases in the residual dipolar coupling Q factors as compared to the original structures, which was accompanied by a concomitant decrease in NOE RMS values, indicating a better overall agreement with the experimental data.

Information Content

Recently, we proposed a method for the quantitative evaluation of experimental NMR restraints (QUEEN) (27). QUEEN is based on a representation of the structure in distance space and concepts derived from information theory. As many of the commonly used experimental input data can be readily represented in distance space, it is possible to construct a distance matrix representing all available distance information. In this matrix, a structure with $N$ atoms can be uniquely defined by $N(N-1)/2$ interatomic distances. As structures are typically not exactly defined by NMR-derived data, distances are described by an upper and lower bound between which the true value for that distance must lie. QUEEN uses an $N \times N$ matrix for storing the upper and lower bounds. By means of a bound-smoothing algorithm (48, 49), the distance information contained in the experimental restraints can be introduced in this distance matrix.

Figure 5 The structural uncertainty, $H_{\mathrm{structure}}$, of GB1 as a function of the number of included distance restraints. The distance restraints are grouped into four sets: intraresidual restraints ($|i - j| = 0$, IR), sequential restraints ($|i - j| = 1$, SQ), medium-range restraints ($|i - j| < 5$, MR), and long-range restraints ($|i - j| \geq 5$, LR). Two different orders of addition of the experimental data are shown: (A) IR-SQ-MR-LR and (B) LR-MR-SQ-IR. (C) Comparison of the average relative dataset size and average set information content for 100 NMR-derived experimental datasets. The fraction of restraints in these datasets that belongs to each of the four mentioned restraint classes is shown in the left bar. The right bar shows the average distribution of the structural information over the different restraint classes as determined by the QUEEN (27) method. [Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com.]

We express the amount of structural knowledge about a specific system in terms of an uncertainty, $H$. Using concepts derived from information theory (50) we derived (27) the following definition for the structural uncertainty, $H_{\mathrm{structure}}$:

$$H_{\mathrm{structure}} = \frac{1}{N(N-1)} \sum_{s=1}^{N} \sum_{t \neq s}^{N} \log\left(d_{st}^{\mathrm{upper}} - d_{st}^{\mathrm{lower}}\right) \qquad [9]$$

where $N$ is the number of atoms in the structure, $d_{st}^{\mathrm{upper}}$ is the upper bound of the distance between atoms $s$ and $t$, and $d_{st}^{\mathrm{lower}}$ is the lower bound of that same distance, with all bounds given in Ångström units. With this definition for the structural uncertainty, the information $I_r$ contained in an experimental restraint (or a set of restraints) is defined as the difference in uncertainty of the system before ($H_{\mathrm{structure}}$) and after ($H_{\mathrm{structure}|r}$) addition of the experimental data $r$:

$$I_r = H_{\mathrm{structure}} - H_{\mathrm{structure}|r} \qquad [10]$$

A bound-smoothed distance matrix containing only the covalent constraints is taken as the initial state, $H_{\mathrm{structure}|0}$, prior to the addition of any experimental restraints. After addition of a restraint, or a set of restraints, the upper and lower bounds on all distance limits are adjusted and the uncertainty of the structure can then be calculated using Eq. [9]. Fig. 5A and 5B show the decrease in the structural uncertainty for GB1 as the 659 distance restraints of its dataset are incorporated into the distance matrix. The two graphs clearly illustrate the context dependency of restraint information: e.g., the information contained in the medium-range restraints is strongly reduced if the long-range restraints are already incorporated in the distance matrix. The context dependency can be removed by averaging the information content of the different restraint sets over all their possible permutations. Fig. 5C shows the average set information content for the 100 experimental datasets, together with the distribution of the restraints in these datasets over the different restraint classes. Traditionally, the classification of restraints into intraresidual, sequential, medium range, and long range is used to provide a qualitative measure of the information content in NMR-derived distance restraints (15), as it is generally recognized that the longer the range of the observed interactions, the larger their impact on the structure determination and thus the more information this category contains. Fig. 5C shows that this assumption is indeed true; calculating the information content of the different classes using the QUEEN procedure shows that the long-range restraints are the most informative class of restraints. However, it also shows that the distribution of the information over the four restraint classes is completely different from the distribution of the actual number of restraints over those classes, rendering the latter a rather poor indicator for the information content in experimental restraint sets.
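
The definitions in Eq. [9] and Eq. [10], and the averaging over orders of addition, can be illustrated with a minimal sketch. This is not the QUEEN implementation: the function names, the four-atom toy system, and all distances are hypothetical, the natural logarithm is assumed because the text does not state the base, and only the upper bounds are smoothed (a full bound-smoothing algorithm also adjusts the lower bounds).

```python
import math
from itertools import permutations


def smooth_upper_bounds(upper):
    """Tighten upper bounds via the triangle inequality u[i][j] <= u[i][k] + u[k][j].

    Simplification (assumption): lower bounds are left untouched.
    """
    n = len(upper)
    u = [row[:] for row in upper]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]
    return u


def structural_uncertainty(upper, lower):
    """Eq. [9]: mean of log(upper - lower) over all ordered atom pairs s != t.

    Natural log is an assumption; every pair must satisfy upper > lower.
    """
    n = len(upper)
    total = sum(math.log(upper[s][t] - lower[s][t])
                for s in range(n) for t in range(n) if s != t)
    return total / (n * (n - 1))


def add_restraint(upper, restraint):
    """Return a new, re-smoothed upper-bound matrix after adding (i, j, max_distance)."""
    i, j, dist = restraint
    u = [row[:] for row in upper]
    u[i][j] = u[j][i] = min(u[i][j], dist)
    return smooth_upper_bounds(u)


def information_content(upper, lower, restraint):
    """Eq. [10]: uncertainty before minus uncertainty after adding the restraint."""
    after = add_restraint(upper, restraint)
    info = structural_uncertainty(upper, lower) - structural_uncertainty(after, lower)
    return info, after


def average_information(upper, lower, restraints):
    """Average each restraint's information over every possible order of addition."""
    totals = [0.0] * len(restraints)
    orders = list(permutations(range(len(restraints))))
    for order in orders:
        current = upper
        for idx in order:
            info, current = information_content(current, lower, restraints[idx])
            totals[idx] += info
    return [t / len(orders) for t in totals]


# Hypothetical toy system: a chain of four atoms, distances in angstrom.
N = 4
LOWER = [[0.0 if s == t else 1.0 for t in range(N)] for s in range(N)]
UPPER0 = smooth_upper_bounds(
    [[0.0 if s == t else (2.0 if abs(s - t) == 1 else 10.0) for t in range(N)]
     for s in range(N)])
LONG_RANGE = (0, 3, 3.0)    # atoms 0 and 3 at most 3.0 apart
MEDIUM_RANGE = (0, 2, 2.5)  # atoms 0 and 2 at most 2.5 apart

if __name__ == "__main__":
    _, after_lr = information_content(UPPER0, LOWER, LONG_RANGE)
    alone, _ = information_content(UPPER0, LOWER, MEDIUM_RANGE)
    in_context, _ = information_content(after_lr, LOWER, MEDIUM_RANGE)
    print("medium-range restraint added first:", alone)
    print("medium-range restraint added after long-range:", in_context)
    print("order-averaged:", average_information(UPPER0, LOWER, [LONG_RANGE, MEDIUM_RANGE]))
```

In this toy system the medium-range restraint carries less information when the long-range restraint is already present, which mirrors the context dependency described above. The exhaustive loop over permutations is only practical for a handful of restraints or restraint sets.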

ANALYZING INDIVIDUAL RESTRAINTS

In addition to analyzing restraint sets as a whole, it is also informative to analyze experimental restraints individually. During the structure determination process, this approach can be especially helpful in the interpretation and validation of the derived restraints and their resulting structures. At this stage it is still possible to return to the experimental data and verify the assignment and volume integrations of those restraints that are identified as problematic. Once the structures have been deposited, and the original spectroscopic data often is no longer available, the different restraint analyses can only be used to identify problematic regions in structural models, without the possibility of reevaluation against the original spectroscopic data.

Consistent Violations

Violation analyses and statistics for complete datasets have been discussed in the previous sections. Here, we briefly mention the approach of analyzing individual restraint violations throughout the different models in an NMR ensemble. When analyzing individual restraints, it is important to distinguish between consistent violations, occurring throughout the majority of the generated models, and those occurring randomly. Consistent violations are a powerful indicator of inconsistencies in the experimental data, whereas those violations occurring less frequently might just indicate that the structure calculation algorithm has not reached convergence yet. The consistently violating restraints are not necessarily also the incorrect restraints. Often, incorrect restraints induce violations in nearby correct restraints. Therefore, if a consistently violating restraint is found, one should reexamine all restraints in that region of the structure.

Restraint Information Content

Next to analyzing the agreement of the applied experimental restraints with the generated structures, the experimental restraints can also be analyzed independently of the structural models. Using QUEEN, the overall importance and the unique contribution of each restraint can be determined (27). Similar to calculating the information content for a set of restraints, the average information content ($I_{\mathrm{ave},r}$) of a single restraint $r$ can be calculated as:

$$I_{\mathrm{ave},r} = H_{\mathrm{structure}} - H_{\mathrm{structure}|r} \qquad [11]$$

The average information content provides a measure of the overall importance of a restraint within the complete dataset. Due to the context dependency of the information content a meaningful comparison of information measures between different datasets is not possible, though we are currently in the process of developing methods to allow such analyses. Fig. 6A shows the relative importance of each of the 659 distance restraints in the GB1 dataset. As expected from the results shown in Fig. 5C, the medium- and long-range restraints contain the most experimental information. However, a large variation in information content is observed, even between restraints within one restraint class.

Figure 6 (A) The relative information content, $I_{\mathrm{ave}}/I_{\mathrm{total}}$, and (B) the relative unique information content, $I_{\mathrm{uni}}/I_{\mathrm{total}}$, both plotted as a function of the NOE restraint index of the GB1 dataset (8). The different restraint classes are labeled as in Fig. 5. Note that the similar figures in (27) are generated using the similar but not identical original GB1 dataset (57).

In addition to the average contribution of a restraint, the unique information a restraint adds to the dataset can also provide useful insights. The unique information content ($I_{\mathrm{uni},r}$) of a restraint $r$ is defined as the information it adds given knowledge of all other ($R - r$) restraints in the dataset ($R$):

$$I_{\mathrm{uni},r} = H_{\mathrm{structure}|R-r} - H_{\mathrm{structure}|R} \qquad [12]$$

The unique information content of the distance restraints in the GB1 dataset is shown in Fig. 6B and shows a very different pattern compared to that observed for the average information. Restraints with a high unique information content are important because they are less well supported by the remainder of the dataset, indicating that they either provide crucial knowledge about this particular structure or alternatively suggest a potential error. In either case, these restraints are interesting and definitely warrant further investigation.

We have shown previously that combining the unique and average information content can be useful in the identification of problematic restraints in an experimental dataset (27). Furthermore, when analyzing large sets of restraints, QUEEN can be used to sort the restraints based on their information content. The concept of placing a lower confidence on restraints that are not well supported by other experimental data has been applied previously in spectral assignment methods (51, 52). By verifying those restraints with a high average and unique information content first, inconsistencies and errors are likely to be found faster, as the lesser-supported data is found at the beginning of such a sorted restraint list. Approaches like this can potentially shorten the time needed to manually verify and validate experimental restraint datasets and can be used in intelligent automated procedures.

Redundant Restraints

In a study assessing the quality of NMR structures, Doreleijers and colleagues (33) observed that for a group of 15 entries (those with the highest fraction of intraresidual restraints), 58% of these restraints were redundant. To detect these redundant restraints, they implemented a check in the AQUA program, which is aimed at the analysis of biomolecular structures determined by NMR (53). AQUA identifies redundant intraresidual restraints by comparing the upper and lower bounds of the experimental restraints to the distance limits observed for that particular intraresidual distance in a reference molecular dynamics simulation. Restraints are considered redundant when both the upper and lower bound are within the distance limits imposed by the molecular geometry. Because of the limited sampling possible in a molecular dynamics simulation, AQUA only considers redundancy in the intraresidual restraints (33).

With the information measures calculated by QUEEN, it is straightforward to identify redundant restraints in any of the restraint classes. If a restraint has a unique information content of zero, it contains no information that is not already present in the dataset, and is therefore redundant (cf. Eq. [12]). However, because of the context dependency, not all restraints with a unique information content of zero can be immediately removed from the dataset. To ensure consistency, the unique information content has to be reevaluated after each restraint deletion. By doing so, a minimal set of restraints can be constructed, which no longer contains structural redundancy in any of the restraint classes.

It is important to note here that we do not advise the use of redundancy checks to filter restraint datasets. Removal of the redundant restraints results in smaller datasets without any loss of structural information, which are therefore useful in retrieving more sensible restraint counts, e.g., for the number of informative restraints per residue. For the GB1 dataset, Table 1 shows that about 35% of the experimental distance restraints are redundant. However, nonredundant datasets are deprived of valuable confirmative information. All restraints in the dataset will now have high values for the amount of unique information they carry, making validation by, e.g., the QUEEN software, uninformative.

CONCLUSIONS

In this review we have discussed different validation tools available to NMR spectroscopists to assess the quality of experimentally derived restraint datasets. In practice, the approaches discussed are often only applied at the end of the structure determination process. We feel, however, that the quality of both the structures and the generated datasets would benefit from restraint analyses as an integral part throughout the iterative assignment and structure calculation process. Fortunately, this is becoming more feasible as ever more sophisticated automated assignment and structure calculation approaches are becoming available (54–56). Additionally, more frequent usage of robust quality indicators, such as complete cross-validation, would provide a better quality assessment of the derived structural models. Deposition of the experimental restraints together with the structures ensures that other researchers can reproduce the structures and use the restraints for development and testing of new structure calculation, refinement, and validation protocols. Fortunately, the number of structures that is deposited together with the experimental restraints used to calculate them gradually increases over the years [as shown in Fig. 7] and provides the ground for development and testing of more elaborate validation procedures.

Figure 7 Fraction of NMR structures in the PDB with associated deposited experimental restraints. The best linear fit is shown as a dashed line.

Selected Hyperlinks

The QUEEN software package can be downloaded from http://www.cmbi.kun.nl/software/queen/. The DRESS database of refined structures and validation reports is available at http://www.cmbi.kun.nl/dress/. The Collaborative Computing Project for the NMR Community (CCPN) is accessible at http://www.ccpn.ac.uk/. The PROCHECK_NMR program suite can be downloaded from http://www.biochem.ucl.ac.uk/roman/procheck_nmr/procheck_nmr.html. Finally, the AQUA program suite can be downloaded from http://urchin.bmrb.wisc.edu/jurgen/aqua/.

ACKNOWLEDGMENTS

We thank all NMR spectroscopists who deposited the experimental restraints together with their structures, especially G.M. Clore and coworkers for making available the high-quality datasets of GB1. Financial support from the Netherlands Foundation for Chemical Research (NWO/CW) to C.S. and from the European community (5th Framework program NMR-QUAL contract number QLG2-CT-2000-01313) to S.N. is gratefully acknowledged.

REFERENCES

1. Wüthrich K. 1986. NMR of proteins and nucleic acids. New York: Wiley.
2. Moseley HNB, Montelione GT. 1999. Automated analysis of NMR assignments and structures for proteins. Curr Opin Struct Biol 9(5):635–642.
3. Güntert P. 2003. Automated NMR protein structure calculation. Prog Nucl Magn Reson Spectrosc 43(3–4):105–125.
4. Seavey BR, Farr EA, Westler WM, Markley JL. 1991. A relational database for sequence-specific protein NMR data. J Biomol NMR 1(3):217–236.
5. Bernstein FC, Koetzle TF, Williams GJ, Meyer EF, Jr., Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M. 1977. The protein data bank: a computer-based archival file for macromolecular structures. J Mol Biol 112(3):535–542.
6. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The protein data bank. Nucleic Acids Res 28(1):235–242.
7. Nabuurs SB, Nederveen AJ, Vranken W, Doreleijers JF, Bonvin AMJJ, Vuister GW, Vriend G, Spronk CAEM. 2004. DRESS: a database of refined solution NMR structures. Proteins 55:483–486.
8. Kuszewski J, Gronenborn AM, Clore GM. 1999. Improving the packing and accuracy of NMR structures with a pseudopotential for the radius of gyration. J Am Chem Soc 121(10):2337–2338.
9. Case DA. 1998. The use of chemical shifts and their anisotropies in biomolecular structure determination. Curr Opin Struct Biol 8(5):624–630.
10. Kuszewski J, Gronenborn AM, Clore GM. 1996. Improving the quality of NMR and crystallographic protein structures by means of a conformational database potential derived from structure databases. Protein Sci 5(6):1067–1080.
11. Clore GM, Robien MA, Gronenborn AM. 1993. Exploring the limits of precision and accuracy of protein structures determined by nuclear magnetic resonance spectroscopy. J Mol Biol 231(1):82–102.
12. Jeener J, Meier BH, Bachmann P, Ernst RR. 1979. Investigation of exchange processes by two-dimensional NMR-spectroscopy. J Chem Phys 71(11):4546–4553.
13. Macura S, Ernst RR. 1980. Elucidation of cross relaxation in liquids by two-dimensional NMR-spectroscopy. Mol Phys 41(1):95–117.
14. Kumar A, Ernst RR, Wüthrich K. 1980. A two-dimensional nuclear Overhauser enhancement (2d Noe) experiment for the elucidation of complete proton-proton cross-relaxation networks in biological macromolecules. Biochem Biophys Res Comm 95(1):1–6.
15. Markley JL, Bax A, Arata Y, Hilbers CW, Kaptein R, Sykes BD, Wright PE, Wüthrich K. 1998. Recommendations for the presentation of NMR structures of proteins and nucleic acids. J Mol Biol 280(5):933–952.
16. Fletcher CM, Jones DNM, Diamond R, Neuhaus D. 1996. Treatment of NOE constraints involving equivalent or nonstereoassigned protons in calculations of biomacromolecular structures. J Biomol NMR 8(3):292–310.
17. Nilges M. 1995. Calculation of protein structures with ambiguous distance restraints. Automated assignment of ambiguous NOE crosspeaks and disulphide connectivities. J Mol Biol 245(5):645–660.
18. Nilges M. 1997. Ambiguous distance data in the calculation of NMR structures. Fold Des 2(4):S53–57.
19. Wagner G, Wüthrich K. 1982. Amide proton exchange and surface conformation of the basic pancreatic trypsin inhibitor in solution. Studies with two-dimensional nuclear magnetic resonance. J Mol Biol 160(2):343–361.
20. Wishart DS, Sykes BD, Richards FM. 1992. The chemical shift index: a fast and simple method for the assignment of protein secondary structure through NMR spectroscopy. Biochemistry 31(6):1647–1651.
21. Juranic N, Ilich PK, Macura S. 1995. Hydrogen bonding networks in proteins as revealed by the amide 1JNC coupling constant. J Am Chem Soc 117(1):405–410.
22. Dingley AJ, Grzesiek S. 1998. Direct observation of hydrogen bonds in nucleic acid base pairs by internucleotide (2)J(NN) couplings. J Am Chem Soc 120(33):8293–8297.
23. Cordier F, Grzesiek S. 1999. Direct observation of hydrogen bonds in proteins by interresidue (3h)J(NC) scalar couplings. J Am Chem Soc 121(7):1601–1602.
24. Cordier F, Rogowski M, Grzesiek S, Bax A. 1999. Observation of through-hydrogen-bond (2h)J(HC) in a perdeuterated protein. J Magn Reson 140(2):510–512.
25. Karplus M. 1959. Contact electron-spin coupling of nuclear magnetic moments. J Chem Phys 30(1):11–15.
26. Vuister GW, Tessari M, Karimi-Nejad Y, Whitehead B. 1998. Pulse sequences for measuring coupling constants. In: Berliner LJ, Krishna NR, editors. Modern techniques in protein NMR. Vol. 16. New York: Plenum p 195–257.
27. Nabuurs SB, Spronk CA, Krieger E, Maassen H, Vriend G, Vuister GW. 2003. Quantitative evaluation of experimental NMR restraints. J Am Chem Soc 125(39):12026–12034.
28. Tjandra N, Omichinski JG, Gronenborn AM, Clore GM, Bax A. 1997. Use of dipolar 1H–15N and 1H–13C couplings in the structure determination of magnetically oriented macromolecules in solution. Nat Struct Biol 4(9):732–738.
29. Tolman JR, Flanagan JM, Kennedy MA, Prestegard JH. 1995. Nuclear magnetic dipole interactions in field-oriented proteins: information for structure determination in solution. Proc Natl Acad Sci USA 92(20):9279–9283.
30. Tjandra N, Bax A. 1997. Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science 278(5340):1111–1114.
31. Clore GM, Starich MR, Bewley CA, Cai ML, Kuszewski J. 1999. Impact of residual dipolar couplings on the accuracy of NMR structures determined from a minimal number of NOE restraints. J Am Chem Soc 121(27):6513–6514.
32. Doreleijers JF, Raves ML, Rullmann T, Kaptein R. 1999. Completeness of NOEs in protein structure: a statistical analysis of NMR. J Biomol NMR 14(2):123–132.
33. Doreleijers JF, Rullmann JA, Kaptein R. 1998. Quality assessment of NMR structures: a statistical survey. J Mol Biol 281(1):149–164.
34. Gonzalez C, Rullmann JAC, Bonvin AMJJ, Boelens R, Kaptein R. 1991. Toward an NMR R factor. J Magn Reson 91:659–664.
35. Bonvin AM, Boelens R, Kaptein R. 1991. Direct NOE
refinement of biomolecular structures using 2D NMR data. J Biomol NMR 1(3):305–309.
36. James TL. 1991. Relaxation matrix analysis of two-dimensional nuclear Overhauser effect spectra. Curr Opin Struct Biol 1(6):1042–1053.
37. Kleywegt GJ, Jones TA. 1995. Where freedom is given, liberties are taken. Structure 3(6):535–540.
38. Brünger AT. 1992. Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature 355:472–475.
39. Brünger AT, Clore GM, Gronenborn AM, Saffrich R, Nilges M. 1993. Assessing the quality of solution nuclear magnetic resonance structures by complete cross-validation. Science 261(5119):328–331.
40. Brünger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, et al. 1998. Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr 54(Pt 5):905–921.
41. Bonvin AM, Brünger AT. 1996. Do NOE distances contain enough information to assess the relative populations of multi-conformer structures? J Biomol NMR 7(1):72–76.
42. Cornilescu G, Marquardt JL, Ottiger M, Bax A. 1998. Validation of protein structure from anisotropic carbonyl chemical shifts in a dilute liquid crystalline phase. J Am Chem Soc 120(27):6836–6837.
43. Meiler J, Peti W, Griesinger C. 2000. DipoCoup: a versatile program for 3D-structure homology comparison based on residual dipolar couplings and pseudocontact shifts. J Biomol NMR 17(4):283–294.
44. Zweckstetter M, Bax A. 2000. Prediction of sterically induced alignment in a dilute liquid crystalline phase: aid to protein structure determination by NMR. J Am Chem Soc 122(15):3791–3792.
45. Clore GM, Garrett DS. 1999. R-factor, free R, and complete cross-validation for dipolar coupling refinement of NMR structures. J Am Chem Soc 121(39):9008–9012.
46. Spronk CA, Linge JP, Hilbers CW, Vuister GW. 2002. Improving the quality of protein structures derived by NMR spectroscopy. J Biomol NMR 22(3):281–289.
47. Linge JP, Williams MA, Spronk CA, Bonvin AM, Nilges M. 2003. Refinement of protein structures in explicit solvent. Proteins 50(3):496–506.
48. Crippen GM. 1977. A novel approach to the calculation of conformation: distance geometry. J Comp Phys 26:449–452.
49. Havel TF, Kuntz ID, Crippen GM. 1983. The theory and practice of distance geometry. Bull Math Biol 45:665–720.
50. Shannon CE, Weaver W. 1949. The mathematical theory of communication. Urbana, IL: University of Illinois Press.
51. Englander SW, Wand AJ. 1987. Main-chain-directed strategy for the assignment of 1H NMR spectra of proteins. Biochemistry 26(19):5953–5958.
52. Herrmann T, Güntert P, Wüthrich K. 2002. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J Mol Biol 319(1):209–227.
53. Laskowski RA, Rullmann JA, MacArthur MW, Kaptein R, Thornton JM. 1996. AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR 8(4):477–486.
54. Linge JP, Habeck M, Rieping W, Nilges M. 2003. ARIA: automated NOE assignment and NMR structure calculation. Bioinformatics 19(2):315–316.
55. Herrmann T, Güntert P, Wüthrich K. 2002. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J Mol Biol 319(1):209–227.
56. Gronwald W, Moussa S, Elsner R, Jung A, Ganslmeier B, Trenner J, Kremer W, Neidig KP, Kalbitzer HR. 2002. Automated assignment of NOESY NMR spectra using a knowledge based method (KNOWNOE). J Biomol NMR 23(4):271–287.
57. Gronenborn AM, Filpula DR, Essig NZ, Achari A, Whitlow M, Wingfield PT, Clore GM. 1991. A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein G. Science 253(5020):657–661.

BIOGRAPHIES

Sander Nabuurs (second from left) studied chemistry at the University of Nijmegen. During his studies he participated in research projects at the Department of Biophysical Chemistry under the supervision of Geerten Vuister and at the Center for Molecular and Biomolecular Informatics (CMBI) with Gert Vriend. They now both act as his Ph.D. supervisors on the topic of NMR structure validation. In addition to his graduate research, he very much enjoys building homology models in collaboration with several experimental groups.

Chris Spronk (second from right) studied chemistry at the University of Nijmegen and received his Ph.D. in 1999 at Utrecht University. Starting with the application of solution NMR to solve and study protein and DNA structures, his main field of interest shifted toward the computational aspects of NMR structure determination, such as improved NMR refinement protocols and development of new NMR validation tools. His other goals are to educate scientists in the basic principles of structure validation, provide automated tools and improved structure models to the NMR community, and stress the importance of good validation methods for NMR structures.

Gert Vriend (left) studied biochemistry at Utrecht University and received his Ph.D. in 1983 at the University of Wageningen. He held postdoctoral positions in Purdue, Indiana, USA, and in Groningen, The Netherlands, working on several structures while starting the WHAT IF project. In 1989 he moved to the EMBL in Heidelberg, Germany, where he continued working on (and with) the WHAT IF program. In the summer of 1999 he was appointed a full professor at the University of Nijmegen, establishing the Center for Biomolecular Informatics. His main research topics are model building by homology, structure verification, specialized databases, and the application of computers in wet biology.

Geerten Vuister (right) studied chemistry at the University of Groningen and received his Ph.D. at Utrecht University. Between 1991 and 1994 he held a postdoctoral position at the NIH, Bethesda, Maryland, USA. Subsequently, he was appointed assistant professor in the Department of NMR Spectroscopy at Utrecht University. In 1997 he became associate professor at the Department of Biophysical Chemistry, NSRIM at the University of Nijmegen, where his research focuses on high-resolution NMR protein structure determination and validation.
