Proc. Natl. Acad. Sci. USA
Vol. 75, No. 8, pp. 3540-3547, August 1978
Applied Physical and Mathematical Sciences

Structure determination of crystalline substances by diffraction methods: Philosophic concepts and their implementation (A Review)*
(phase problem/non-negativity/ordinary structures/biological macromolecules)
JEROME KARLE
Laboratory for the Structure of Matter, Naval Research Laboratory, Washington, D.C. 20375
Contributed by Jerome Karle, June 1, 1978

ABSTRACT Structure determination of single crystals by diffraction methods is reviewed in terms of the philosophical and mathematical aspects of the analytical techniques that are used. A key problem concerns the need to determine the relative phases of the scattered rays. A major advance in treating this problem resulted from recognition that useful consequences ensue from the fact that electron densities in crystals are non-negative functions. Advantage is also taken of the fact that the problem is usually greatly overdetermined by the number of experimental data. For biological macromolecules, the data available are more limited in range and accuracy and great use is made of the introduction of suitable heavy atom moieties. The developments in crystal structure analysis have had a considerable impact on progress in many scientific disciplines.

The determination of the atomic arrangements in crystals by diffraction techniques affords especially good examples of the usefulness that is derived from combining physical constraints with diffraction theory. In applications to single crystals, use is made of the non-negativity of the electron density distribution to provide formulas for determining the relative phases of the scattered rays. The virtue of the concept of non-negativity in diffraction analysis has been discussed in a previous article (1) concerning the structures of gaseous and amorphous substances. In this earlier article, several aspects of diffraction analysis were discussed. It was pointed out in some detail that diffraction experiments are generally ambiguous in the sense that nonphysical models can be formulated that satisfy the diffraction data. The ambiguities are overcome by the introduction of constraints that are based on physical and chemical considerations. Also discussed in some detail in the previous article (1) is the concept of bridging, which concerns the adjustment of theoretical formulas and the treatment of experimental data to achieve useful results and improve their accuracy. The ensuing discussion, particularly of single crystals, provides many illustrations of these many aspects of diffraction analysis.

A characteristic of true crystals is three-dimensional periodicity. Very great numbers and varieties of substances can agglomerate to form crystals. Monatomic crystals can be formed from the elements, and complex ones can be formed from macromolecules such as proteins and viruses. Crystalline materials provide, as a rule, a large amount of diffraction information relative to the number of unknown structural parameters to be determined, although this advantageous ratio decreases dramatically for macromolecular substances. When the advantageous ratio prevails, it accounts for the fact that simple mathematical formulations can provide the solution to a rather complex analytical problem.

CHARACTERIZATION AND METHOD

The determination of the structures of single crystals has a very broad range of application. This follows not only from the propensity that all sorts of materials have to crystallize readily, but also from developments in techniques for overcoming a key obstacle to the deduction of atomic arrangements from the observed diffraction intensity data. The diffraction pattern taken by the Weissenberg technique and shown in Fig. 1 is typical. Four numbers are associated with each diffraction maximum. Three numbers are the Miller indices, which identify the set of planes in the crystal that is involved in producing the particular diffraction maximum, and the fourth number is a measure of the intensity. The development of a useful theory for addressing this problem affords a good illustration of the use of physical constraints in structure analysis, and the development of a suitable procedure provides a worthwhile opportunity to observe the process of bridging between mathematical results and practical application. Despite the complexity of the structures of interest and of the mathematical relationship between the atomic positions and the diffracted intensities, considerable progress has been made toward making the direct determination of crystal structures from the measured intensities a fairly routine operation.

The method for accomplishing this is known as the "direct method." This terminology implies that the structure is determined without the use of special information, such as the known positions of heavy atoms that may be present. The direct method involves the determination of the phases associated with the scattered amplitudes directly from the measured intensities. An intensity is the absolute magnitude squared of the amplitude. Once the phases are known, appropriate Fourier series that use as coefficients the observed magnitudes of the amplitudes and their corresponding phases can immediately provide the desired structural information.

A key feature of the attack on the problem of determining the phases of the diffraction amplitudes was the recognition that the non-negativity of the electron density in a crystal imposes a crucial constraint on the system that could possibly lead to a practical solution of the problem. It is also valuable that the electron distributions about the individual atoms are known to a good approximation and that it is possible to transform the observed x-ray diffraction intensities to those that would accrue from essentially point atoms. In point atoms, the electron density is concentrated in a central point in an atom, rather than being distributed in space. With the knowledge of the atomic electron densities at hand, it is easy to show that the problem is greatly overdetermined by the number of diffracted intensities normally obtained for crystals, except for the very complex ones such as are formed by proteins and nucleic acids. The overdeterminacy can inspire the hope that relatively simple mathematical relationships might be found among the phases that can have a high probability of being correct. This is, indeed, what happens, and some of the mathematical details will be outlined below showing the key role played by the non-negativity of the electron density.

* By invitation. From time to time, reviews on scientific and technological matters of broad interest are published in the PROCEEDINGS.
FIG. 1. X-ray diffraction pattern of a single crystal by the Weissenberg technique.

Let us recall that a diffracted ray has an amplitude composed of a magnitude and a corresponding phase. Measuring instruments such as photographic film or photon counters normally record only the intensity, which is the square of the magnitude of the amplitude. The phase is therefore omitted in the recording and apparently lost in the experiment. It is therefore most remarkable that within the context and the constraints of the crystal structure problem the phases are recoverable at all, and that they are, in fact, recoverable from the measured intensities. Strictly speaking, what is recovered is relative phase; that is, the phase of a scattered amplitude relative to the phase of other scattered amplitudes. It actually makes no sense to consider recovering the phase of an individual reflection in some isolated fashion since, for a number of reasons, this quantity changes continually throughout an experiment. However, given some minimal specifications, the relative phases of the set of reflections are determinate. In phase-determining procedures, it is necessary to specify the values of some phases in order to establish an origin in the crystal of interest to which atomic positions may be referred. The remaining phases that are evaluated have values that are determined relative to those that are specified.

The maxima of the electron density distribution $\rho(\mathbf{r})$ locate the atomic positions. Since crystal structures possess three-dimensional periodicity, $\rho(\mathbf{r})$ may be expressed in terms of a three-dimensional Fourier series

$$\rho(\mathbf{r}) = V^{-1} \sum_{\mathbf{h}=-\infty}^{\infty} F_{\mathbf{h}} \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{r}), \qquad [1]$$

where the coefficients

$$F_{\mathbf{h}} = |F_{\mathbf{h}}| \exp(i\phi_{\mathbf{h}}) \qquad [2]$$

are the crystal structure factors associated with the planes labeled with the vector $\mathbf{h}$ and the $\mathbf{h}$ have integer components h, k, and l (the Miller indices) whose values are inversely proportional to the intercepts of crystal planes on the x, y, and z axes, respectively. The angle $\phi_{\mathbf{h}}$ is the phase associated with $F_{\mathbf{h}}$, and $\mathbf{r}$ is the position of any point in the unit cell. $F_{\mathbf{h}}E$ is the amplitude of the scattered wave associated with the crystal plane labeled by $\mathbf{h}$, where $E$ is the electric field vector of the incident beam. The measured x-ray intensities are proportional to $|F_{\mathbf{h}}|^2$.

If the values for the $\phi_{\mathbf{h}}$ were obtained directly from experiment, the structures could be immediately calculated from Eq. 1. The absence of phase information gives rise to a difficult problem in structure analysis and is generally referred to as the "phase problem."

The Fourier inversion of Eq. 1 followed by the reduction of the integral to the sum of contributions from the N discrete atoms in the unit cell gives, for the Fourier coefficient,

$$|F_{\mathbf{h}}| \exp(i\phi_{\mathbf{h}}) = \sum_{j=1}^{N} f_{j\mathbf{h}} \exp(2\pi i\, \mathbf{h}\cdot\mathbf{r}_j), \qquad [3]$$

where $f_{j\mathbf{h}}$ is the atomic scattering factor (scattering amplitude) of the jth atom in the unit cell and $\mathbf{r}_j$ is its position vector.

To estimate the solvability of the phase problem, Eq. 3 can be considered to be a system of simultaneous equations since measurements of the intensities are made for a large number of $\mathbf{h}$. The unknown quantities are the phases $\phi_{\mathbf{h}}$ and the atomic positions $\mathbf{r}_j$. The known quantities are the $|F_{\mathbf{h}}|$ obtained from experiment and the $f_{j\mathbf{h}}$, the atomic scattering factors, which are tabulated. It should also be noted that each of the simultaneous equations is actually two equations since both the real part and the imaginary part of each equation may be set equal separately. Comparison of the number of unknown quantities to be determined with the number of independent data available from the use of an x-ray tube having a copper target indicates that the overdeterminacy could be as great as a factor of 50 for centrosymmetric crystals and 25 for noncentrosymmetric ones. In the usual practice somewhat less than this degree of overdeterminacy is obtained, but the factor is still quite large.

By multiplying Eq. 3 by its complex conjugate, the phases are eliminated and we obtain

$$|F_{\mathbf{h}}|^2 = \sum_{j=1}^{N} \sum_{k=1}^{N} f_{j\mathbf{h}} f_{k\mathbf{h}} \exp[2\pi i\, \mathbf{h}\cdot(\mathbf{r}_j - \mathbf{r}_k)]. \qquad [4]$$

This suggests the possibility that the atomic coordinates might be obtained directly from the measured intensities without the intermediate use of phases. As a practical matter, it has so far been found to be generally more feasible to determine structures by first determining the phases and then computing Eq. 1 than to find the atomic coordinates directly.

The Fourier transform of Eq. 4 is known as the Patterson function (2, 3),

$$P(\mathbf{r}) = \sum_{\mathbf{h}=-\infty}^{\infty} |F_{\mathbf{h}}|^2 \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{r}). \qquad [5]$$

The maxima of this function represent the interatomic vectors in a structure. It is evident that the coefficients of Eq. 5 are directly obtainable from the experimental measurements. The difficulty with using this function in a general way for structure determination arises from the lack of resolution that usually occurs for the N(N - 1) interatomic vectors. There is, however, a circumstance in which the Patterson function is particularly useful and has found wide application. This is when a structure possesses only a few heavy atoms. The interatomic vectors associated with the heavy atoms are then readily identifiable and atomic positions for them can then be readily deduced. The coordinates for the heavy atoms can be used in Eq. 3 to compute an initial set of phases. There are numerous procedures for developing a complete structure from this information (4-10). The Patterson function also becomes somewhat accessible when it is possible to introduce structural information in the form of known atomic groupings (11-13). A detailed study of the properties of the Patterson function has been presented by Buerger (14).

Despite the limitations on the direct analysis of the Patterson function, the great overdeterminacy of the problem has motivated the search for alternative methods that could extend the range of complexity of structures that could be investigated.
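
The relationship among Eqs. 1, 3, and 5 can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not a procedure taken from this review: it uses a hypothetical one-dimensional cell of unit length (so V = 1) containing three point atoms whose fractional coordinates `x` and constant scattering factors `f` are invented for illustration, and it truncates the Fourier series at |h| <= 20. The function names are likewise illustrative. The synthesis that uses the full complex coefficients (Eq. 1) peaks at the atomic positions, whereas the synthesis that uses only the intensities (Eq. 5) peaks at the interatomic vectors.

```python
import numpy as np

def structure_factors(x, f, hmax):
    """Eq. 3 in one dimension: F_h = sum_j f_j exp(2*pi*i*h*x_j)."""
    h = np.arange(-hmax, hmax + 1)
    phase = np.exp(2j * np.pi * h[:, None] * x[None, :])
    return h, (f[None, :] * phase).sum(axis=1)

def fourier_synthesis(h, coeff, grid):
    """Eqs. 1 and 5 with V = 1: sum_h coeff_h exp(-2*pi*i*h*r)."""
    waves = np.exp(-2j * np.pi * h[:, None] * grid[None, :])
    return (coeff[:, None] * waves).sum(axis=0).real

def strongest_peaks(y, grid, n):
    """Positions of the n highest local maxima of a periodic function."""
    is_peak = (y > np.roll(y, 1)) & (y > np.roll(y, -1))
    idx = np.flatnonzero(is_peak)
    idx = idx[np.argsort(y[idx])[::-1][:n]]
    return np.sort(grid[idx])

# Hypothetical structure (assumption): fractional coordinates and
# point-atom scattering factors of three atoms in a unit-length cell.
x = np.array([0.10, 0.35, 0.70])
f = np.array([1.0, 1.0, 2.0])

h, F = structure_factors(x, f, hmax=20)
grid = np.arange(100) / 100.0

# Eq. 1 needs magnitudes AND phases; Eq. 5 needs only the intensities.
rho = fourier_synthesis(h, F, grid)
patterson = fourier_synthesis(h, np.abs(F) ** 2, grid)

atom_peaks = strongest_peaks(rho, grid, 3)
# The strongest Patterson maximum is the origin (r = 0); drop it.
vector_peaks = strongest_peaks(patterson, grid, 7)[1:]
expected_vectors = np.sort(
    [round((a - b) % 1.0, 2) for a in x for b in x if a != b]
)

print("density maxima:  ", atom_peaks)
print("Patterson maxima:", vector_peaks)
print("x_j - x_k mod 1: ", expected_vectors)
```

In this toy case the three atoms already give N(N - 1) = 6 non-origin Patterson maxima, which hints at how quickly the interatomic vectors crowd together as N grows.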
The direct method for phase determination has this capability. Its philosophic basis and mathematical background will now be outlined.

The usefulness of the non-negativity criterion as described in the previous article (1) for the electron diffraction of gases motivated the search for additional applications. The result of imposing this constraint on the electron density distribution in crystals is an infinite set of determinantal inequalities of increasing order whose elements are the structure factors (15). In order of complexity, the first inequality statement is that $F_{000}$ must be non-negative, the second is that $|F_{\mathbf{h}}| \le F_{000}$, and the third is a relationship among the structure factors that plays a primary role in direct crystal structure determination. The inequality is written

$$\begin{vmatrix} F_{000} & F_{-\mathbf{k}} & F_{-\mathbf{h}} \\ F_{\mathbf{k}} & F_{000} & F_{-\mathbf{h}+\mathbf{k}} \\ F_{\mathbf{h}} & F_{\mathbf{h}-\mathbf{k}} & F_{000} \end{vmatrix} \ge 0. \qquad [6]$$

Its important features may be seen by rewriting it in the form (15)

$$\left| F_{\mathbf{h}} - \frac{F_{\mathbf{k}} F_{\mathbf{h}-\mathbf{k}}}{F_{000}} \right| \le \frac{1}{F_{000}} \begin{vmatrix} F_{000} & F_{-\mathbf{k}} \\ F_{\mathbf{k}} & F_{000} \end{vmatrix}^{1/2} \begin{vmatrix} F_{000} & F_{-\mathbf{h}+\mathbf{k}} \\ F_{\mathbf{h}-\mathbf{k}} & F_{000} \end{vmatrix}^{1/2}. \qquad [7]$$

The interpretation of this inequality is that the structure factor $F_{\mathbf{h}}$ is bounded by a circle in the complex plane that has its center at $F_{\mathbf{k}}F_{\mathbf{h}-\mathbf{k}}/F_{000}$ and a radius given by the right side of inequality 7 (Fig. 2). When the structure factors $F_{\mathbf{k}}$ and $F_{\mathbf{h}-\mathbf{k}}$ have an unusually large magnitude, inequality 7 becomes quite restrictive because the right side becomes rather small. This permits the conclusion that $\phi_{\mathbf{h}}$, the phase of $F_{\mathbf{h}}$, is approximately equal to $\phi_{\mathbf{k}} + \phi_{\mathbf{h}-\mathbf{k}}$, the phase of $F_{\mathbf{k}}F_{\mathbf{h}-\mathbf{k}}$. Inequality 7 can be strengthened by transforming the structure factors $F$ to those that would be obtained from point atoms. Even with this strengthening, if it were necessary to be guided solely by the bounding range of the inequality, the relation

$$\phi_{\mathbf{h}} \approx \phi_{\mathbf{k}} + \phi_{\mathbf{h}-\mathbf{k}} \qquad [8]$$

would be applicable to only the simplest of crystals.

FIG. 2. Circle in complex plane showing interpretation of inequality 7, where $\delta = F_{\mathbf{k}}F_{\mathbf{h}-\mathbf{k}}/F_{000}$ and $r$ represents the right side of the inequality. $F_{\mathbf{h}}$ must be on or within the circle. Knowledge of $|F_{\mathbf{h}}|$ further restricts $F_{\mathbf{h}}$ to the line that intersects the circle. (Axes: Re, Im.)

Expression 8 has, however, extensive application. This is so because inequality 7 has probabilistic implications that, in effect, confine the bounding range, giving $\phi_{\mathbf{k}} + \phi_{\mathbf{h}-\mathbf{k}}$ as the most probable value of $\phi_{\mathbf{h}}$. In fact, expression 8 is more probably correct, the larger the values for the structure factor magnitudes associated with the phases involved. Such probabilistic features can be obtained directly in a quantitative fashion by examination of inequality 7. The quantity $F_{\mathbf{k}}F_{\mathbf{h}-\mathbf{k}}/F_{000}$ can be considered as an expected value for $F_{\mathbf{h}}$, and a measure of the variance can be based on the right side of inequality 7. With these quantities, a probability distribution function for $F_{\mathbf{h}}$ may be immediately written (16) by use of the central limit theorem.

Various aspects of probability theory have been applied to the phase problem over the years (17-21), starting soon after the development of the inequality theory. The probabilistic characteristics of inequalities had been noted by Gillis (22), who was working with a set of inequalities derived by Harker and Kasper (23) somewhat earlier than the determinantal theory. Although the Harker-Kasper inequalities did not include inequality 7, they were stimulating in showing that simple inequality relationships could give phase information (24) and in providing early insight into their probabilistic implications (22). The determinantal inequalities were derived (15) as a consequence of the non-negativity of the electron density distribution. These types of determinantal inequalities were known to mathematicians working with non-negative Fourier series. After the determinantal inequalities were obtained, it was noted that the Harker-Kasper inequalities were contained in them.

Centrosymmetric crystals, i.e., crystals having a center of symmetry, can have phase values that are only 0 or $\pi$ for a properly chosen origin in the crystal. This means that only a plus or minus sign attaches to $|F_{\mathbf{h}}|$ for such crystals. A simple probability measure of this in the convenient hyperbolic tangent form of Woolfson (18) is given by the probability that the structure factor $F_{\mathbf{h}}$ be plus (i.e., has a phase of zero), $P_+(\mathbf{h})$, given several normalized structure factors $E_{\mathbf{k}}$ and $E_{\mathbf{h}-\mathbf{k}}$,

$$P_+(\mathbf{h}) \approx \tfrac{1}{2} + \tfrac{1}{2} \tanh\left( \sigma_3 \sigma_2^{-3/2} |E_{\mathbf{h}}| \sum_{\mathbf{k}} E_{\mathbf{k}} E_{\mathbf{h}-\mathbf{k}} \right). \qquad [9]$$

The normalized structure factors represent scattering from essentially point atoms and have the property that the average value of their magnitudes squared is equal to unity, i.e., $\langle |E_{\mathbf{h}}|^2 \rangle_{\mathbf{h}} = 1$. The quantities $\sigma_n$ are defined in terms of the atomic numbers, $Z_j$,

$$\sigma_n = \sum_{j=1}^{N} Z_j^n. \qquad [10]$$

Certain variance factors that would enhance $P_+(\mathbf{h})$, depending upon how large the $|E|$ in Eq. 9 are, have been omitted.

Crystals that lack a center of symmetry can have any value for a phase. In that case, useful mathematical tools are the appropriate probability distributions (16, 19) and measures of the variance (16, 21).

The development of a practical procedure of broad application from relation 8 and its corresponding probability measures, the symbolic addition procedure (21), was not immediate. It required many years and was, in fact, preceded by a procedure for phase determination for centrosymmetric crystals (17) that afforded experiences that facilitated its development. The general approach in the symbolic addition procedure involves initially the stepwise use of expression 8 and the appropriate corresponding probability measures. Evidently, in order to use expression 8 certain phase values must be available. They are derived from certain allowed specifications that, for example, locate the origin in the crystal and certain others that are specified by symbols. The stepwise procedure involves the definition of a large number of phases in terms of the ones that have been specified, with each step guided by considerations
of high probability. In a very large number of cases, only a few phase values need to be specified symbolically, usually no more than five and often less. Symbols for phases associated with real or imaginary structure factors take on only two values, 0 or $\pi$ for the real factors and $\pm\pi/2$ for the imaginary ones. For general structure factors, it suffices to limit the phase values to these four cardinal points.

Many ways exist for evaluating the symbols. Relationships develop among the symbols as the phase determination proceeds. In addition, there are several auxiliary formulas available for helping with the evaluation of symbols. Finally, Fourier series for the remaining alternative sets of values for the phases can be computed and examined for suitability as an acceptable physical answer.

Other procedures have been developed that use many alternative sets of numerical phase specifications instead of the symbols that can represent these alternatives. The motivation for this has been the ease with which the programming of a computer can be performed and, to some extent, the expectation that certain advantages would accrue from being able to use phase-determining formulas with numbers rather than symbols. There is an ambiguity of $2\pi$ that interferes with the averaging of different symbolic definitions of a phase. It is possible to use symbols, however, in such a way that the anticipated differential advantages from the use of alternative numerical values for phases do not accrue. This is effected by using the capacity of computers to use and store numerous symbolic definitions for a phase and to defer the taking of averages until the symbols are evaluated. In addition, it should be noted that the capacity of symbols to store phase information can far exceed the capacity of computing machines to process the individual numerical alternatives that the symbols represent.

The most difficult structure determinations are those that concern essentially equal atom noncentrosymmetric crystals having many atoms in the asymmetric unit, perhaps 50-100 atoms or more. An asymmetric unit is that part of the unit cell of a crystal whose composition is unrelated to the symmetry elements of the crystal. The composition of the entire unit cell can be generated from the asymmetric unit by application of symmetry operations appropriate to the crystal. Occasionally, a phase determination does not succeed in producing an answer. This can occur because the stepwise procedure is based on probabilities rather than certainties, and errors can accumulate. Even this does not present insuperable difficulties, however, because the stepwise path through a phase determination is far from unique and a fresh start could provide a path to more accurate phase determination.

FIG. 3. The four cocrystallizing conformers of cyclohexaglycyl. The form at the upper left occurs four times in the unit cell, the one at the lower left occurs twice, and the other two occur once. [Reprinted with permission from Karle, I. L., Gibson, J. W. & Karle, J. (1970) J. Am. Chem. Soc. 92, 3755-3760. Copyright by the American Chemical Society.]

A great many structure determinations have been performed by use of the direct method of phase determination; it has stimulated research in numerous fields of science because basic
structural information is now much more readily available than in the past. This is especially true in the fields of organic chemistry, biological chemistry, and natural products chemistry. Structure analysis affords information concerning structural formula, configuration, and conformation and finds application in a variety of investigations concerning, for example, products of syntheses, biosynthetic pathways, reaction intermediates, rearrangements, reaction mechanisms, ion transport, and radiation damage to genetic material.

An interesting example of an application is given by the first substance investigated by the symbolic addition procedure, cyclohexaglycyl (25). This cyclic peptide crystallizes in space group P1 with four different conformers in a unit cell containing 196 nonhydrogen atoms, of which 98 are independently placed (Fig. 3). Not only is the cocrystallization of the four conformers of inherent interest, but the results of this investigation have been used as models for the prediction of conformation (26, 27).

Efforts to advance the theoretical aspects of crystal structure determination continue. One path concerns the development of probabilistic formulas and applications for the higher order phase invariants, e.g., quartets and quintets. The quartet invariants have a long history of development and application, starting with the Harker-Kasper inequalities (23) and the sigma-3 formula (17). Recent interest was stimulated by Schenk and de Jong (28), who suggested a test for selecting a correct set of phases from among several alternatives. The test was based on the use of special quartets that, on the basis of indications from Harker-Kasper inequalities, might be expected to be negative, i.e., to have a phase of $\pi$. Further developments have concerned joint probability distributions that include quartets and their closely related structure factor magnitudes called neighborhoods (29-31). Such formulas have been derived by Hauptman (32) and Giacovazzo (33). Similar derivations have been carried out by Hauptman and Fortier for quintets (34) and sextets (35). The objective of the introduction of numerous structure factor magnitudes into the joint distributions is the expectation that the conclusions from the resulting conditional probability distributions for the higher order phase invariants would be more reliable, given the known values of the magnitudes from experiment.

An alternative path for theoretical investigation has been the development of the probabilistic properties of the higher order determinants, i.e., determinants of the type given by formula 6 but of order 4 and higher. Such studies have been pursued by Tsoucaris (36) and Karle (16, 37) and bear a relationship to those that focus on the individual higher order invariants, such as quartets and quintets, since the determinants contain phase invariants ranging from triplets to n-tets, where n is the same order as the determinants. The determinants also contain information concerning structure factor neighborhoods, and it is possible to derive special probability distributions for quartets and higher invariants. This fact, however, is not the primary motivation for investigating the higher order determinants. The motivation is derived from the increasingly restrictive bound that high order determinantal inequalities can place on a phase or, from another point of view, their closer approach to behaving like strict equalities. This suggests that advantages would be derived from being able to use the determinantal probability distributions and their implications in toto without the reduction to special formulas. One aspect of this is the use of a general form of the maximum determinant rule, an implication of the determinantal probability distributions, to obtain an evaluation of the symbolic definitions of phases, as suggested by Woolfson (38). In a related way, Tsoucaris (36) showed the value of higher order determinants in selecting the correct set of phases from among alternative ones. The higher order determinants also hold promise for application in the initial stages of direct phase determination.

What impact the continuing theoretical investigations will have is a question for the future. The ultimate test is whether they can significantly facilitate current practice or extend the range of complexity of structures that can be presently handled.

For additional reading, there are books on x-ray crystallography by Zachariasen (39) and by Buerger (40), a recent book on crystallographic computing techniques edited by Ahmed et al. (41), an outline of several aspects of phase determination (42), and review articles (43, 44). Aspects of crystal structure determination by neutron diffraction have been discussed in a book by Bacon (45). A book by Vainshtein discusses crystal structure determination by electron diffraction (46).

MACROMOLECULAR CRYSTALS

Macromolecules often fold up to form globular structures and can crystallize with three-dimensional periodicity. Some macromolecules form long, fibrous structures that do not form globules. The fibers generally agglomerate with two-dimensional periodicity, but the third dimension in the direction of the fiber axis often exhibits various kinds of disorder. I discuss here the structures of large molecules whose crystals possess true three-dimensional periodicity.

About 100 structure determinations of protein molecules have been performed to date. They include globins and other reversible oxygen-transporting molecules, lysozymes, nucleases, proteases, cytochromes, synthetases, hormones, immunoglobulins, electron transport proteins, and glycolytic enzymes. Some studies of simple viruses are approaching atomic resolution, and the structure investigation of tRNA has reached the refinement stage. The smallest protein contains about 500 nonhydrogen atoms. This is about as small a molecule as we consider to be a macromolecule. The viruses currently under investigation have molecular weights that range from 2 to 9 × 10^6. They are composed of numerous protein subunits in association with a nucleic acid. The problem is somewhat simplified by the presence of symmetry or near symmetry, so that only part of the atoms need to be independently located. Nevertheless, the simplest problem involves the determination of the coordinates for about 15,000 nonhydrogen atoms.

The development of techniques for determining the structures of macromolecules has been a major stimulus in the biological sciences. The results of structure investigations established the foundations and have continued to motivate progress in the field of molecular biology.

Besides the complexity associated with the large number of atoms that need to be located, macromolecules present additional problems for the analyst. The main problem is the relatively few data that are available compared to the number of unknown parameters. The data are limited because of positional disorder, which occurs in the crystals of these large molecules. Positional disorder means that the locations of the atoms are not precisely the same from one unit cell to the other. This has the effect of damping out the diffraction data at the higher scattering angles. The problems that arise from the limitations in the range of the data and the general complexity of the materials have been overcome by special techniques for treating the phase problem.

The limited data range also affects the accuracy with which final results may be obtained, and much effort is presently being expended on structure refinement in order to optimize the accuracy of the results. The accuracy is often needed for a
clarification of the relationship of the structure to its function.

FIG. 4. Construction based on Eqs. 11 and 12 for multiple isomorphous replacement showing the relationship among the vectors. The manner of choosing between alternative values for the phase angle associated with FN is indicated by the solid lines, which have FN in common. (Axes: Re, Im.)

The theoretical basis for the analysis of macromolecular crystals is derived from the work of Patterson (2, 3) in the context of its implications regarding the usefulness of the presence of a heavy atom in the structure to be determined. As was pointed out previously, the location of a heavy atom and the calculation of phases based on the heavy atom affords a useful first step in structure determination. The situation is somewhat more complicated for macromolecules. With so many atoms present, a single heavy atom has too little effect on the total scattering to sufficiently influence the values of the phases and thus permit the usual application of the heavy atom method. One solution would be to try to introduce many heavy atoms. This makes for other complications, however, such as the accurate location of all heavy atom substituents, so that another approach has been followed. It is called the method of isomorphous substitution and is a more powerful way to use a heavy atom derivative than afforded by the usual heavy atom method. The method involves the use of at least two different crystals that have essentially identical structures except that they differ with respect to their content of heavy atoms. In applications to protein structure analysis, one crystal is usually formed from the native protein and others are made to contain one or a very few heavy atoms. The usual method for introducing the heavy atom is to soak the protein crystal in a solution containing a compound of a heavy atom. The process takes
place by diffusion of the compounds through solvent channels structure affect the accuracy with which the phases can be
in the crystal. There are particular sites on the protein molecule determined. Multiple isomorphous replacement involving
that have a special affinity for certain heavy atom moieties. It many different heavy atom derivatives can resolve such
is common to make several heavy atom derivatives and to problems and, once good heavy atom derivatives are available,
combine the results from all the derivatives. This is called the the practical aspects of the phase determination proceed in a
multiple isomorphous replacement method. In fact, when fairly routine fashion.
applying the method to crystals that lack a center of symmetry, The first application of isomorphous replacement to mac-
as do protein crystals, at least two isomorphous derivatives of romolecules was made in the investigation of myoglobin by
the native protein are required, in the absence of additional Kendrew and collaborators (51) and in the investigation of
information such as is derived from anomalous dispersion, to hemoglobin by Perutz and his coworkers (52) (Fig. 5). This work
avoid an ambiguity in the evaluation of the phases. The analysis and the particular suitability of proteins for application of the
of this aspect of isomorphous replacement was developed by isomorphous replacement technique have facilitated the ex-
Bijvoet and coworkers (47, 48). tensive developments that have taken place in the structure
The isomorphous replacement technique, as applied to investigations of proteins over the past 15 years. As noted, this
area of research has expanded into the investigation of virus
noncentrosymmetric crystals, may be understood by reference structures, with considerable progress in several laboratories
to Fig. 4. The structure factor for the native protein is denoted during the past few years toward the ultimate goal of atomic
by FN, the structure factors for the heavy atom substituents are resolution. First steps have also been taken in extending the
denoted by Fy and FZ, and those for the substituted proteins application to the investigation of the structures of nucleic acids
by FN+Y and FN+Z. They satisfy the equations with the determination of the three-dimensional structure of
FN+ FY = FN+Y [11] phenylalanine tRNA by Rich, Kim, and coworkers (53) and
Klug and coworkers (54).
and Another valuable technique that has found application in
FN+ FZ =FN+Z. [12] phase determination for macromolecules has also been devel-
oped by Bijvoet and his coworkers (55-57). It makes use of the
In the usual experiment, the information available would be anomalous values for atomic scattering factors in the vicinity
IFNI, IFN+yI, IFN+zI, Fy, and Fz. FY and F_ are available
since the method involves the initial locating of the heavy atoms
of an absorption edge for an atom, referred to as anomalous
dispersion (39). An analysis of the anomalous dispersion tech-
in the unit cell. Eqs. 11 and 12 can be satisfied by two config- nique as applied to noncentrosymmetric crystals may be found
urations, each placed symmetrically about the vectors Fy and in a previously cited review article (43). The initial applications
Fz, respectively, 'set at the origin. It is seen in Fig. 4 that only of anomalous dispersion were made by Bijvoet et al. (58) to
one configuration for each of the pairs share FN in common. solve the problem of the absolute configuration of molecules
The ambiguity is thus resolved and the appropriate phase angles whose mirror images are distinct. In this way, they were able
to be associated with the given magnitudes may be measured to show that the convention established by Emil Fischer was,
from the real axis. Harker (49) suggested an alternative con- in fact, consistent with what actually occurs in nature.
struction for resolving the ambiguity in isomorphous replace- There is considerable evidence that structures obtained for
ment. protein molecules in the crystalline state are biologically
Problems arise in the practical application of isomorphous meaningful. It has been shown, for example, that in many cases
replacement because errors in measurement and changes in enzymes in the crystalline state are active. The proteins that
have been studied maintain the watery environment that they have in
solution by cocrystallizing with large amounts of water. Some of the
water is ordered in the crystal and some is not, adding to the
difficulties in analysis and refinement. Because of the fairly large
scattering factors of hydrogen and deuterium for neutrons relative to
those for other atoms, it is possible to investigate the water
structure in materials by neutron diffraction in some detail. This
contrasts with x-ray diffraction, in which the scattering from hydrogen
atoms is quite weak. It is also possible to design experiments that
take advantage of the fact that the scattering factors for hydrogen and
deuterium for neutrons are negative and positive, respectively. An
investigation to this end is the neutron diffraction analysis of the
carbon monoxide derivative of myoglobin by Norvell et al. (59).

FIG. 5. Schematic depiction of the fitting together in space of a
molecular unit of human hemoglobin composed of four protein molecules,
each having a molecular weight of 16,000. The drawing is based on one
by Dickerson and Geis (50).

The limited scattering range that is normally obtained for protein
structures places a severe limit on the resolution and accuracy of the
final results unless use is made of additional information. The first
Fourier maps of the electron density distribution that are obtained
from a phase determination for proteins, in contrast with those for
small structures, are usually poorly resolved and rather inaccurate.
Considerable improvements are to be expected from the use of
mathematical relationships that can refine sets of phases whose values
are approximately known and especially from the introduction of
chemical information in the form of amino acid sequences, the
structures of amino acid residues, and knowledge of bond distances,
bond angles, and intramolecular interactions. Applications of these
various means for improving the experimental results have many aspects.

Model-building techniques have been developed by Diamond (60) that make
use of chemical sequencing and the known structure of residues. Phase
refinement calculations based on a variety of phase-determining
formulas (16, 61-65) have been proposed and performed by several
workers. Other refinement techniques that have been developed involve
the use of noncrystallographic symmetry (66), modification of electron
density (67, 68), coordinate refinement with constraints on the protein
chains (69, 70), interactive display, model fitting, and refinement
(71-73), Fourier and least-squares refinement of coordinates (74), and
least-squares refinement of coordinates coupled with the inclusion of
restraints based on general structural information such as bond
distances and bond angles (75). A major stimulus to the progress in
protein structure refinement has come from the success of the work of
Jensen and Watenpaugh (74, 76). Discussions of several of these topics
may be found in ref. 40.

For readings in protein crystallography there are many articles of a
review or specialized nature. General reviews have been presented by
Phillips (77), Blundell and Johnson (78), and Matthews (79). Articles
have been written on the preparation of isomorphous derivatives by
Blake (80), on the x-ray crystallography of enzymes by Eisenberg (81),
on the molecular architecture of oxygen-carrying proteins by
Hendrickson (82), and the evolutionary aspects of protein structures by
Dickerson (83). Noteworthy additions to the literature are a volume on
the structure and function of proteins (84) and a text by Blundell and
Johnson (85).

CONCLUDING REMARKS
The field of structure determination by diffraction methods has made
major advances in the development of mathematical techniques, data
reduction procedures, and analyses that prepare experimental data for
application to mathematical theory and conversely. In these
developments, advantage has been taken of the numerous mathematical and
physical constraints that can be imposed. Considerable benefits have
been obtained from the use of chemical information and from structural
information available from other techniques such as spectroscopy and
microscopy. Mathematical constraints, such as non-negativity, and their
mathematical consequences have led to greatly enhanced facility and
accuracy in analyses. They have made it possible to effect solutions of
the phase problem in crystal structure analyses. Special experimental
techniques such as isomorphous replacement and their associated theory
have made it possible to develop in depth a heretofore inaccessible
field, the investigation of macromolecular structure and its
correlation with function. It is reasonable to expect that the progress
and ingenuity that has characterized these efforts will continue. Work
is currently under way, for example, that may one day lead to accurate
models of the arrangements of atoms in amorphous materials or reveal
the structures of the complex organizational features in living cells
such as chromosomes and ribosomes. In view of the significance and
broad range of application of many of the present studies, the future
of these activities can be viewed with deep interest and anticipation.

1. Karle, J. (1977) Proc. Natl. Acad. Sci. USA 74, 4707-4713.
2. Patterson, A. L. (1934) Phys. Rev. 46, 372-376.
3. Patterson, A. L. (1935) Z. fur Kristallogr. 90, 517-542.
4. Woolfson, M. M. (1956) Acta Crystallogr. 9, 804-810.
5. Bertaut, E. F. (1957) Acta Crystallogr. 10, 670-671.
6. Sim, G. A. (1960) Acta Crystallogr. 13, 511-512.
7. Hoppe, W. & Huber, R. (1963) in Crystallography and Crystal Perfection, ed. Ramachandran, G. N. (Academic, New York), pp. 61-65.
8. Srinivasan, R. (1966) Acta Crystallogr. 20, 143-144.
9. Karle, J. (1968) Acta Crystallogr. Sect. B 24, 182-186.
10. Sayre, D. (1972) Acta Crystallogr. Sect. A 28, 210-212.
11. Hoppe, W. (1975) Z. Elektrochemie 61, 1076-1083.
12. Nordman, C. E. & Nakatsu, K. (1963) J. Am. Chem. Soc. 85, 353-354.
13. Huber, R. & Hoppe, W. (1965) Chem. Ber. 98, 2403-2424.
14. Buerger, M. J. (1959) Vector Space (Wiley, New York).
15. Karle, J. & Hauptman, H. (1950) Acta Crystallogr. 3, 181-187.
16. Karle, J. (1971) Acta Crystallogr. Sect. B 27, 2063-2065.
17. Hauptman, H. & Karle, J. (1953) American Crystallographic Association, Monograph No. 3 (Polycrystal Book Service, Pittsburgh).
18. Woolfson, M. M. (1954) Acta Crystallogr. 7, 61-64.
19. Cochran, W. (1955) Acta Crystallogr. 8, 473-478.

20. Karle, J. & Hauptman, H. (1956) Acta Crystallogr. 9, 635-651.
21. Karle, J. & Karle, I. L. (1966) Acta Crystallogr. 21, 849-859.
22. Gillis, J. (1948) Acta Crystallogr. 1, 174-179.
23. Harker, D. & Kasper, J. S. (1948) Acta Crystallogr. 1, 70-75.
24. Kasper, J. S., Lucht, C. M. & Harker, D. (1950) Acta Crystallogr. 3, 436-455.
25. Karle, I. L. & Karle, J. (1963) Acta Crystallogr. 16, 969-975.
26. Ramachandran, G. N. & Sasisekharan, V. (1968) in Advances in Protein Chemistry, eds. Anfinsen, C. B., Jr., Anson, M. L., Edsall, J. T. & Richards, F. M. (Academic, New York), pp. 283-438.
27. Venkatachalam, C. M. (1968) Biopolymers 6, 1425-1436.
28. Schenk, H. & de Jong, J. G. H. (1973) Acta Crystallogr. Sect. A 29, 31-34.
29. Hauptman, H. (1977) Acta Crystallogr. Sect. A 33, 553-555.
30. Hauptman, H. (1977) Acta Crystallogr. Sect. A 33, 568-571.
31. Giacovazzo, C. (1977) Acta Crystallogr. Sect. A 33, 933-944.
32. Hauptman, H. (1977) Acta Crystallogr. Sect. A 33, 556-564.
33. Giacovazzo, C. (1976) Acta Crystallogr. Sect. A 32, 91-99.
34. Hauptman, H. & Fortier, S. (1977) Acta Crystallogr. Sect. A 33, 575-580.
35. Hauptman, H. & Fortier, S. (1977) Acta Crystallogr. Sect. A 33, 697-701.
36. Tsoucaris, G. (1970) Acta Crystallogr. Sect. A 26, 492-499.
37. Karle, J. (1978) Proc. Natl. Acad. Sci. USA 75, 2545-2548.
38. Woolfson, M. M. (1977) Acta Crystallogr. Sect. A 33, 219-225.
39. Zachariasen, W. H. (1945) Theory of X-ray Diffraction of Crystals (Dover, New York).
40. Buerger, M. J. (1962) X-ray Crystallography (Wiley, New York).
41. Ahmed, F. R., Huml, K. & Sedlacek, B., eds. (1976) Crystallographic Computing Techniques (Munksgaard, Copenhagen).
42. Ibers, J. A. & Hamilton, W. C., eds. (1974) International Tables for X-ray Crystallography (Kynoch Press, Birmingham, England).
43. Karle, J. (1969) in Advances in Chemical Physics, eds. Prigogine, I. & Rice, S. A. (Interscience, New York), Vol. 16, pp. 131-222.
44. Robertson, J. M., ed. (1972) Chemical Crystallography (Butterworth, London).
45. Bacon, G. E. (1975) Neutron Diffraction (Clarendon, Oxford).
46. Vainshtein, B. K. (1964) Structure Analysis by Electron Diffraction (Pergamon, Oxford).
47. Bokhaven, C., Schoone, J. C. & Bijvoet, J. M. (1951) Acta Crystallogr. 4, 275-280.
48. Bijvoet, J. M. (1952) in Conference on Computing Methods and the Phase Problem in X-ray Crystal Analysis: Penn State College, ed. Pepinsky, R. (X-ray Crystal Analysis Laboratory, Department of Physics, Pennsylvania State College, PA.), pp. 84-89.
49. Harker, D. (1956) Acta Crystallogr. 9, 1-9.
50. Dickerson, R. E. & Geis, I. (1969) The Structure and Action of Proteins (W. A. Benjamin, Menlo Park, CA).
51. Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Davies, D. R., Phillips, D. C. & Shore, V. C. (1960) Nature 185, 422-427.
52. Perutz, M. F., Rossmann, M. G., Cullis, A. F., Muirhead, H., Will, G. & North, A. C. T. (1960) Nature 185, 416-422.
53. Kim, S. H., Suddath, F. L., Quigley, G. J., McPherson, A., Sussman, J. L., Wang, A. H. J., Seeman, N. C. & Rich, A. (1974) Science 185, 435-440.
54. Robertus, J. D., Ladner, J. E., Finch, J. T., Rhodes, D., Brown, R. S., Clark, B. F. C. & Klug, A. (1974) Nature 250, 546-551.
55. Bijvoet, J. M. (1954) Nature 173, 888-891.
56. Peterson, S. W. (1955) Nature 176, 395.
57. Bijvoet, J. M. & Peerdeman, A. F. (1956) Acta Crystallogr. 9, 1012-1015.
58. Bijvoet, J. M., Peerdeman, A. F. & van Bommel, A. J. (1951) Nature 168, 271-272.
59. Norvell, J. C., Nunes, A. C. & Schoenborn, B. P. (1975) Science 190, 568-570.
60. Diamond, R. (1966) Acta Crystallogr. 21, 253-266.
61. Weinzerl, J. E., Eisenberg, D. & Dickerson, R. E. (1969) Acta Crystallogr. Sect. B 25, 380-387.
62. Hendrickson, W. A. & Karle, J. (1973) J. Biol. Chem. 248, 3327-3334.
63. Sayre, D. (1974) Acta Crystallogr. Sect. A 30, 180-184.
64. DeRango, C., Mauguen, Y. & Tsoucaris, G. (1975) Acta Crystallogr. Sect. A 31, 227-233.
65. Podjarny, A. D., Yonath, A. & Traub, W. (1976) Acta Crystallogr. Sect. A 32, 281-292.
66. Bricogne, G. (1974) Acta Crystallogr. Sect. A 30, 393-405.
67. Hoppe, W. & Gassmann, J. (1968) Acta Crystallogr. Sect. B 24, 97-107.
68. Collins, D. M. (1975) Acta Crystallogr. Sect. A 31, 388-389.
69. Diamond, R. (1971) Acta Crystallogr. Sect. A 27, 436-452.
70. Huber, R., Kukla, D., Bode, W., Schwager, P., Bartels, K., Deisenhofer, J. & Steigemann, W. (1974) J. Mol. Biol. 89, 73-101.
71. Hermans, J. & McQueen, J. E. (1974) Acta Crystallogr. Sect. A 30, 730-739.
72. Dodson, E., Isaacs, N. W. & Rollett, J. S. (1976) Acta Crystallogr. Sect. A 32, 311-315.
73. Ten Eyck, L. F., Weaver, L. W. & Matthews, B. W. (1976) Acta Crystallogr. Sect. A 32, 349-350.
74. Watenpaugh, K. D., Sieker, L. C., Herriott, J. R. & Jensen, L. H. (1973) Acta Crystallogr. Sect. B 29, 943-956.
75. Konnert, J. H. (1976) Acta Crystallogr. Sect. A 32, 614-617.
76. Jensen, L. H. (1974) Annu. Rev. Biophys. Bioeng. 3, 81-93.
77. Phillips, D. C. (1966) in Advances in Structure Research by Diffraction Methods, eds. Brill, R. & Mason, R. (Interscience, New York), Vol. 2, pp. 75-140.
78. Blundell, T. L. & Johnson, L. N. (1972) in Chemical Crystallography, ed. Robertson, J. M. (Butterworth, London), pp. 199-246.
79. Matthews, B. W. (1977) in The Proteins, eds. Neurath, H. & Hill, R. L. (Academic, New York), Third Ed., Vol. 3, 403-590.
80. Blake, C. C. F. (1968) Adv. Protein Chem. 23, 59-120.
81. Eisenberg, D. (1970) in The Enzymes, ed. Boyer, P. D. (Academic, New York), pp. 1-89.
82. Hendrickson, W. A. (1977) Trends in Biochem. Sci. 2, 108-111.
83. Dickerson, R. E. (1979) in Second Taniguchi International Symposium in Biophysics on Molecular Evolution and Polymorphism (National Institute of Genetics, Mishima, Japan), in press.
84. Structure and Function of Proteins on the Three-Dimensional Level Cold Spring Harbor Symposia on Quantitative Biology (1971) (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), Vol. 36.
85. Blundell, T. L. & Johnson, L. N. (1976) Protein Crystallography (Academic, New York).
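
Supplementary sketch. The construction described with Eqs. 11 and 12 and Fig. 4 can be illustrated numerically. The following is a minimal sketch, assuming error-free magnitudes and exactly known heavy atom structure factors, which is not the situation in practice. Each derivative restricts the phase of the native structure factor to two values placed symmetrically about the heavy atom vector (law of cosines), and the value common to both derivatives is retained. The function names (`candidate_phases`, `angular_distance`, `resolve_phase`) and the synthetic numbers are hypothetical, chosen only for illustration; they do not come from the review.

```python
import cmath
import math
from itertools import product


def candidate_phases(mag_n, mag_deriv, f_heavy):
    """Two phases (radians) of F_N allowed by |F_N + F_H| = |F_{N+H}|.

    mag_n     : |F_N|, native magnitude (hypothetical input)
    mag_deriv : |F_{N+H}|, derivative magnitude
    f_heavy   : complex F_H computed from the located heavy atoms
    """
    mag_h = abs(f_heavy)
    phi_h = cmath.phase(f_heavy)
    # |F_{N+H}|^2 = |F_N|^2 + |F_H|^2 + 2 |F_N| |F_H| cos(phi_N - phi_H)
    cos_delta = (mag_deriv**2 - mag_n**2 - mag_h**2) / (2.0 * mag_n * mag_h)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding
    delta = math.acos(cos_delta)
    # The two solutions lie symmetrically about the heavy atom vector.
    return (phi_h + delta, phi_h - delta)


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))


def resolve_phase(mag_n, mag_ny, f_y, mag_nz, f_z):
    """Pick the phase of F_N that the two derivatives have in common."""
    pairs = product(candidate_phases(mag_n, mag_ny, f_y),
                    candidate_phases(mag_n, mag_nz, f_z))
    a, b = min(pairs, key=lambda p: angular_distance(*p))
    # Circular mean of the two (nearly equal) selected candidates.
    return cmath.phase(cmath.exp(1j * a) + cmath.exp(1j * b))


if __name__ == "__main__":
    # Synthetic example: choose a "true" F_N, then hide its phase.
    f_n = cmath.rect(10.0, 0.7)
    f_y = cmath.rect(3.0, 2.0)
    f_z = cmath.rect(2.5, -0.5)
    phi = resolve_phase(abs(f_n), abs(f_n + f_y), f_y, abs(f_n + f_z), f_z)
    print("recovered phase (rad):", phi)
```

With measured data the candidates from different derivatives would not coincide exactly, so choosing the closest pair is only a stand-in for the weighting of errors that a practical analysis requires.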
