INTRODUCTION TO BIO-GEOMETRY

Herbert Edelsbrunner Departments of Computer Science and Mathematics Duke University

Table of Contents
Prologue
I Bio-molecules
II Geometric Models
III Surface Meshing
IV Connectivity
V Shape Features
VI Density Maps
VII Match and Fit
VIII Deformation
IX Measures
X Derivatives
Subject Index
Author Index

Preface
[Mention the pioneers who early on recognized the importance of geometry in structural molecular biology: Fred Richards, Michael Levitt, Michael Connolly.]

[Mention that my book on the “Geometry and Topology for Mesh Generation” is complementary/a prerequisite to this book. In particular, it covers the construction of Delaunay triangulations in detail, and it describes the simulation of simplicity as a general idea to deal with non-generic situations.]

[This book is really about alpha shapes in a broad sense. It might be useful to describe the history of that research in short.
1981. Vancouver. Conception of idea with Kirkpatrick and Seidel.
1985-89. Graz and Urbana. SoS, Delaunay software, Alpha Shape software with Ernst Mücke, Harald Rosenberger, and Patrick Moran.
1990-93. Urbana and Berlin. Surface triangulations, Betti numbers, inclusion-exclusion, CAVE with Ping Fu, Ernst Mücke, Cecil Delfinado, Nataraj Akkiraju, and Jiang Qian.
1994-95. Hong Kong. Morphing, molecular skin, with Ping Fu, Siu-Wing Cheng, Ka-Po Lam, and Ho-Lun Cheng.
1995-98. Urbana. Flow and pockets, skin surfaces with Ho-Lun Cheng, Tamal Dey, Michael Facello, Jie Liang, Shankar Subramaniam, Claire Woodworth.
1999-2001. Duke. Skin triangulation, hierarchy, Morse complexes with Ho-Lun Cheng, Alper Üngör, Afra Zomorodian, David Letscher, John Harer, Vijay Natarajan.
2002-2003. Duke and Livermore. Docking, Reeb graphs, Jacobian manifolds with Johannes Rudolph, Sergei Bespamyatnikh, Vicky Choi, John Harer, Valerio Pascucci, Vijay Natarajan, Ajith Mascarenhas.
2000-2005. ITR Project. Derivatives, interfaces, software with Robert Bryant, Patrice Koehl, Michael Levitt, Andrew Ban, Johannes Rudolph, Lutz Kettner, Rachel Brady, and Daniel Filip.]

[This book is based on notes developed while teaching the courses on “Sphere Geometry” in the Spring of 2000, and on “Bio-geometric Modeling” in the Spring of 2001 and the Fall of 2002, all at Duke University. These courses were either taken for credit or audited at least occasionally by Luis von Ahn, Tammy Bailey, Yih-En (Andrew) Ban, Robert Bryant, Ho-Lun Cheng, Vicky Choi, Anne Collins, Abhijit Guria, Tingting Jiang, Loren Looger, Ajith Mascarenhas, Gopi Meenakshisundaram, Nabil Mustafa, Vijay Natarajan, Xiuwen Ouyang, Anindya Patthak, Ken Roberts, Apratim Roy, Scott Schmidler, Xiaobai Sun, Yusu Wang, Shumin Wu, Alper Üngör, Peng Yin and Afra Zomorodian.]

Herbert Edelsbrunner Durham, North Carolina, 2002

To do or think about (March 15, 2004).

General. Fix the software for creating the index and glossary. Should the Exercise sections be labeled so the page heading is more uniform?

Chapter III. Section III.3: mention new results on scheduling. Exercises: add a few more questions.

Chapter V. Should Section V.2 on Topological Persistence be reorganized by first presenting the algebra and second the algorithm? In Section V.3: replace 23- by 03-, 13-, and 23-collapses. Add the interface software description to Section V.4.

Chapter VI. Write Section VI.3 on Construction and Simplification. Write Section VI.4 on Simultaneous Critical Points. Exercises: come up with questions.

Chapter VII. In Section VII.2: find out about finding the best bi-chromatic matching in [...].

Chapter VIII. Write the introduction to Deformation. Write Section VIII.1 on Molecular Dynamics. Write Section VIII.2 on Spheres in Motion. Write Section VIII.3 on Rigidity. Write Section VIII.4 on Shape Space. Exercises: come up with questions.

Chapter IX. Exercises: come up with questions.

Chapter X. Write a new chapter on area and volume derivatives and related topics. Write a section on the Weighted Area Derivative. Write a section on the Weighted Volume Derivative. Exercises: come up with questions.

Chapter I

Bio-molecules

This chapter discusses the three main classes of organic macromolecules involved in the hereditary and life maintenance mechanisms of living beings: DNA, RNA, and proteins. All mentioned molecules are between large and huge. They are relatively simple locally but exceedingly complicated in their totality. Because of the complexity and the large variety, it should not be surprising that there are exceptions to almost everything meaningful that can be said about them. Perhaps it is more surprising that anything of broad validity can be said at all.

According to the central dogma of biology, proteins are created in two steps from DNA, which carries the genetic information:

[Diagram: DNA (replication) -> transcription -> RNA -> translation -> Protein]

We talk briefly about the processes indicated by the three arrows and focus on the structure of the players involved. DNA is the stuff that genetic material is made of. RNA is mostly but not entirely an intermediate product copying portions of the DNA (transcription) and turning this information into working proteins (translation). Proteins act like machines that define the cell cycle as an ongoing process. Each cell is like a society whose members have specialized tasks, which they accomplish in a complicated net of interactions.

We begin by describing the chemical structure of DNA and RNA in Section I.1. We then explain the translation from RNA to proteins in Section I.2 and talk about the structural organization of proteins in Section I.3. Finally, we present some of the fundamental premises and results of molecular mechanics in Section I.4.

I.1 DNA and RNA
I.2 Proteins and Amino Acids
I.3 Structural Organization
I.4 Molecular Mechanics
Exercises

I.1 DNA and RNA

DNA (or deoxyribonucleic acid) is the material that forms the genome, which is a complete set of the genetic material of a living organism. Compared to standard genomics texts, the treatment of DNA in this section is coarse and lacks many important details. We begin by looking at the small level and work our way up the multi-scale structure of DNA.

Double helix. As discovered by Watson and Crick in 1953, DNA consists of two strands of nucleotides twisted into the shape of a double helix, as depicted in Figure I.1. The backbone of each strand is a repeating phosphate-deoxyribose sugar polymer. Interactions between base pairs hold the two strands together, forming the structure of a spiraling staircase.

Figure I.1: A short piece of the DNA double-helix, with atoms shown as tightly packed and partially overlapping spheres.

Chemical structure of DNA. DNA has three chemical components: phosphate, deoxyribose sugar, and four nitrogenous bases, namely adenine, guanine, cytosine, and thymine. The first two bases are double-ring and the last two are single-ring structures. The chemical components are arranged in groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar, and one of the four bases. A nucleotide is conveniently referred to by the first letter of its base. Figure I.2 sketches the chemical structure of the nucleotide A and shows the chemical structures of the remaining three bases. We obtain the nucleotides G, C and T by substituting the corresponding base for adenine in Figure I.2.

Figure I.2: The chemical structure of the DNA nucleotide with adenine as the nitrogenous base above, and the chemical structure of the other three nitrogenous bases below.

The carbons of the sugar group are numbered from 1' to 5'. The bases are attached to the 1'-carbons. The covalent bonding in the ring structures of the nitrogenous bases is more interesting. All atoms in the ring share electrons as a group and we draw some double bonds just to indicate the total number of extra shared electrons. For example, the hexagonal ring of cytosine has a total of eight covalent bonds, which we may think of as four thirds of a covalent bond between every contiguous pair. We use boldface edges to connect atoms that are joined by two covalent bonds.

The phosphate and the sugar groups in the backbone are connected by phosphodiester bonds. One part of the phosphodiester bond is between the phosphate and the 5'-carbon, and the other is between the phosphate and the 3'-carbon. We think of the backbone as oriented in the direction of the path that starts at the 5'-carbon, passes through the 4'-carbon, and ends at the 3'-carbon. The attachment of these bonds to the sugar groups is illustrated in Figure I.3.

Figure I.3: Chemical structure of a very short segment of DNA. The numbers 1' to 5' order the carbon atoms of each sugar group, giving each strand an orientation. The dotted connections between the nitrogenous bases indicate hydrogen bonds.

The two strands of DNA are held together by weak hydrogen bonds between complementary bases. Adenine interacts with thymine and guanine with cytosine. The two bases of a pair are said to be complementary. In the double stranded DNA molecule, the two backbones are in opposite, or anti-parallel, orientation. This implies that the sequence of bases along one strand determines the sequence of bases along the other: reverse the reading direction and replace each base by its complement.

5' AATCGCGTACGCG 3'
3' TTAGCGCATGCGC 5'

Replication is based on this simple rule of complementarity and makes essential use of the relatively weak bonds between the two strands. A protein machine builds new DNA strands by separating the two old strands and complementing each by a new anti-parallel strand.

Chromosomes. Each cell of an organism contains a copy of the entire genome. In the case of a human cell, this amounts to about two meters of DNA partitioned into twenty-three pairs of chromosomes per cell. The body has about $10^{13}$ cells, totaling about $2 \cdot 10^{13}$ meters of DNA, which is more than a hundred times the distance between the earth and the sun. Since humans are small relative to that distance, this implies that the DNA must be thin and efficiently packed. Indeed, each chromosome is a long thread (a double-strand) that is densely folded around protein scaffolds.

How is a long thread of DNA converted into the relatively thick and worm-like structure visible through the electron microscope? On the lowest level, the DNA is wrapped twice around a configuration of eight histones (a special protein). The beads of wrapped histones assume a coiled structure (a solenoid) stabilized by another type of histone that runs along its central axis. It takes one more level of packaging to convert the solenoid into the three-dimensional structure we call a chromosome. This higher level uses a core scaffold made of another enzyme, topoisomerase II. This enzyme has the ability to pass a strand of DNA through another, which is a much needed operation during packing and unpacking the DNA. The best evidence suggests that the solenoid arranges in loops emanating from the scaffold, which itself assumes the form of a spiral.

Chemical structure of RNA. We begin by looking at the chemical features of RNA. There are three main differences to DNA.

1. RNA has ribose sugar in its nucleotides, which differs from deoxyribose sugar by one additional oxygen atom.
2. RNA nucleotides carry the bases adenine, guanine, and cytosine, but substitute uracil for thymine found in DNA. Uracil forms hydrogen bonds with adenine just as thymine does.
3. RNA is a single-stranded nucleotide chain and can therefore assume a much greater variety of geometric shapes than DNA.

Figure I.4 illustrates the chemical difference between RNA and DNA by showing a ribonucleotide containing uracil.

Figure I.4: Chemical structure of the RNA nucleotide with uracil as the nitrogenous base.

A gene is a subsequence of the DNA capable of being transcribed to produce a functional RNA molecule. Note that this definition depends on the rather complicated process of transcription, which can fail for a variety of reasons.
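The complementarity rule for the two DNA strands (reverse the reading direction and replace each base by its complement) can be sketched in a few lines of Python. This is only an illustration; the names `COMPLEMENT` and `complementary_strand` are hypothetical and do not come from the text. Strands are written as strings read in the 5' to 3' direction.

```python
# Base pairing in DNA: adenine with thymine, guanine with cytosine.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the anti-parallel complementary strand, read 5' to 3'.

    The input is assumed to be read 5' to 3' as well, so the reading
    direction is reversed before each base is replaced by its complement.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


if __name__ == "__main__":
    top = "AATCGCGTACGCG"                 # 5' to 3', the example from the text
    bottom = complementary_strand(top)    # also 5' to 3'
    print(bottom[::-1])                   # printed 3' to 5', aligned under `top`
```

Reading the result backwards gives the lower strand of the example in the 3' to 5' direction, and applying the function twice returns the original strand.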

RNA is classified into different types depending on their function. The vast majority is messenger RNA (or mRNA), which acts as an intermediary structure in the synthesis of proteins. There is also functional RNA produced by a small number of genes, which is not translated into protein. Examples are transfer RNA (or tRNA), which brings amino acids to the mRNA during the translation process, and ribosomal RNA (or rRNA), which helps coordinate the assembly of amino acids to proteins.

Transcription. The transcription process, as sketched in Figure I.5, is similar to the replication process of DNA. During the transcription of a gene, the two strands of DNA are separated locally, and one strand acts as a template for RNA synthesis. Free ribonucleotides align along the DNA template. The process is catalyzed by another protein machine, the RNA polymerase complex, which makes RNA by moving along the DNA and adding ribonucleotides to the growing RNA.

Figure I.5: The RNA grows in the 5' to 3' direction, in this case by adding a nucleotide carrying uracil to the chain.

Each individual transcription works in three steps.

Initiation. RNA polymerase binds to a promoter segment of DNA located in front of the gene. It then unwinds the DNA and begins the synthesis of an RNA molecule.
Elongation. RNA polymerase moves along the DNA, maintaining a transcription bubble to expose the template strand. It compares free ribonucleotides with the next exposed DNA base and adds a complementary match. The resulting RNA sequence is the same as the non-template sequence of the gene, except that U replaces T.
Termination. Specific sequences in the DNA signal the chain termination by triggering the release of the RNA strand and the polymerase.

A gene is thus not only marked but indeed defined by the promoter segment preceding and the terminating sequence succeeding it. Electron microscope pictures show that the transcription of DNA to RNA is a highly parallel process in which a row of RNA polymerase complexes follow each other along the gene and produce RNA concurrently.

Bibliographic notes. The idea that traits are hereditary is old, but the detailed mechanism of how it comes about started to unfold only recently. The groundwork for our current understanding was laid in the nineteenth century by Gregor Mendel, when he discovered the basic rules of the hereditary mechanism [2]. An English translation of this work can be found in [3]. It was long known that DNA is critically involved in that mechanism, but it took until the work of Watson and Crick in 1953 to discover the chemical structure of DNA [5, 6]. The book by Watson [4] is an enjoyable personal account of the years preceding the discovery of that structure. Today there are many books on the subject, and most of the material in this section is taken from [1, Chapters 2 and 3].

[1] A. J. F. GRIFFITHS, W. M. GELBART, J. H. MILLER AND R. C. LEWONTIN. Modern Genetic Analysis. Freeman, New York, 1999.
[2] G. MENDEL. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines, Abhandlungen, Brünn 4 (1866), 3–47.
[3] C. STERN AND E. R. SHERWOOD. The Origin of Genetics: A Mendel Source Book. Freeman, 1966.
[4] J. D. WATSON. The Double Helix. Atheneum, New York, 1981.
[5] J. D. WATSON AND F. H. C. CRICK. Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. Nature 171 (1953), 737–738.
[6] J. D. WATSON AND F. H. C. CRICK. Genetic implications of the structure of deoxyribonucleic acid. Nature 171 (1953), 964–967.
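The elongation rule of transcription described in this section can be illustrated with a small Python sketch. It assumes that both DNA strands are given as strings read 5' to 3'; the names `RNA_COMPLEMENT` and `transcribe` are hypothetical. The sketch pairs each exposed template base with its complementary ribonucleotide and confirms that the result equals the non-template strand with U in place of T.

```python
# Pairing of a DNA template base with the ribonucleotide added to the RNA.
# Uracil takes the place of thymine opposite adenine.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA (5' to 3') synthesized from a DNA template strand.

    The template is assumed to be given 5' to 3'. The RNA is anti-parallel
    to it, so the template is traversed in reverse while each base is
    matched with its complementary ribonucleotide.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(template))


if __name__ == "__main__":
    non_template = "ATGGCC"   # coding strand, 5' to 3' (made-up example)
    template = "GGCCAT"       # its anti-parallel complement, 5' to 3'
    rna = transcribe(template)
    # Same as the non-template strand, except that U replaces T.
    print(rna, rna == non_template.replace("T", "U"))
```

The sketch ignores initiation and termination; promoter and terminator recognition would need sequence models that the text does not specify.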

I.2 Proteins and Amino Acids

Proteins are polypeptide chains obtained by translation from strands of messenger RNA. In this section, we sketch the translation process and discuss the chemical structure of proteins.

Amino acids. A protein is a linear sequence of amino acids connected to each other by peptide bonds. Each amino acid consists of a central carbon atom, the α-carbon, linked to an amino group, a carboxyl group, one hydrogen atom, and a side-chain. As illustrated in Figure I.6, two amino acids are linked by a peptide bond whose creation releases water. Amino acids that are linked into a polypeptide chain are referred to as residues. The resulting repeating sequence of nitrogen, α-carbon and carbon atoms is the backbone of the protein. Different residues are distinguished by their side-chains.

Figure I.6: Two amino acid residues joined by a peptide bond.

The four neighbors of an α-carbon are at the vertex positions of a tetrahedron around Cα. This tetrahedron has two orientations, one being the mirror image of the other, as sketched in Figure I.7. The two oriented forms are referred to as isomers and distinguished by letters L and D. Only L-amino acids occur in nature as building blocks of proteins.

Figure I.7: The two isomers of an amino acid.

Chemical structure. Among a much larger variety of amino acids, nature uses only twenty to build proteins. We list their names together with their three-letter codes and single-letter abbreviations in Table I.1.

Alanine        Ala  A      Methionine  Met  M
Cysteine       Cys  C      Asparagine  Asn  N
Aspartate      Asp  D      Proline     Pro  P
Glutamate      Glu  E      Glutamine   Gln  Q
Phenylalanine  Phe  F      Arginine    Arg  R
Glycine        Gly  G      Serine      Ser  S
Histidine      His  H      Threonine   Thr  T
Isoleucine     Ile  I      Valine      Val  V
Lysine         Lys  K      Tryptophan  Trp  W
Leucine        Leu  L      Tyrosine    Tyr  Y

Table I.1: Names, codes and abbreviations of the twenty amino acids that occur as building blocks of natural proteins.

As can be seen in Figures I.8 and I.9, residues differ widely in size and structure. As before, we mark double and partially double bonds by boldface edges. The fifteen amino acids sketched in Figure I.8 may be viewed as trees rooted at the α-carbon, which is part of the backbone. Most of the internal nodes are carbon atoms, with rare occurrences of oxygen, nitrogen and sulfur atoms.

Figure I.8: The fifteen amino acids without cycle in their chemical structure. The shaded circle is the α-carbon on the backbone. All unlabeled nodes are either carbon or hydrogen atoms.

Four of the five amino acids in Figure I.9 have pentagonal and hexagonal ring structures. The fifth amino acid is proline, which forms a cycle by having its chain connect back to the nitrogen next to the α-carbon along the backbone. This unique feature locally restricts the flexibility of the backbone, as will be discussed in Section I.3.

Figure I.9: The five amino acids with cyclic chemical structure.

Genetic code. The translation process is more involved than transcription because it converts information between two languages that use different alphabets. The sequence of nucleotides is read consecutively in groups of three, called codons. Since there are four different types of nucleotides, we have $4^3 = 64$ codons. There are only twenty residues, which implies that the map is not injective but uses redundancy to reduce the number of outcomes. The complete map is shown in Table I.2. Some residues correspond to more codons than others.

          Y = A       Y = G       Y = C       Y = U
X = A     Lys Lys     Arg Arg     Thr Thr     Ile Met
          Asn Asn     Ser Ser     Thr Thr     Ile Ile
X = G     Glu Glu     Gly Gly     Ala Ala     Val Val
          Asp Asp     Gly Gly     Ala Ala     Val Val
X = C     Gln Gln     Arg Arg     Pro Pro     Leu Leu
          His His     Arg Arg     Pro Pro     Leu Leu
X = U     --- ---     --- Trp     Ser Ser     Leu Leu
          Tyr Tyr     Cys Cys     Ser Ser     Phe Phe

Table I.2: The genetic code. The codon XYZ is mapped to one of the residues in the row of X and the column of Y. The four positions inside that slot correspond to Z = A, G in the first row and Z = C, U in the second row. Empty entries (shown as ---) correspond to the stop codons, which are UAA, UAG, and UGA. The start codon is AUG and maps to methionine.

Since codons are triplets of nucleotides, there are apparently three possible reading frames, each producing an entirely different residue sequence. The correct reading frame is identified by starting the translation always at a start codon, AUG. The initiator tRNA is a specific transfer RNA that recognizes this sequence and binds to methionine. Incidentally, it differs from the tRNA that binds to the AUG codon in the middle of the sequence, although that one also binds to methionine.

Translation. As mentioned above, the tRNA molecules are instrumental in translating codons into residues. Each tRNA is a short sequence of about 80 nucleotides. Complementary subsequences form double-helix substructures that further fold up to characteristic ‘clover leaf’ formations, one of which is sketched in Figure I.10.

Figure I.10: Transfer RNA with anti-codon at the bottom, covalently attached amino acid at the top, and complementary substrings shown.

The translation is accomplished by transfer RNA molecules that recognize codons through the same binding mechanism used for replication and transcription. The redundancy is in part due to multiple tRNA molecules carrying the same residue and in part because there is flexibility in how the tRNA reads the codons. In many cases, an accurate match at the first two positions suffices and a mismatch at the third position can be tolerated. This explains the relative uniformity among the four residues in any one slot of Table I.2.

A tRNA molecule matches the exposed codon of the mRNA with its anti-codon and contributes its residue to the polypeptide chain that grows at the other end. The codon and anti-codon are matched in anti-parallel orientation, as always. The translation process is facilitated by the ribosome, which is a large complex made from more than 50 different proteins and several RNA molecules. It consists of a small subunit and a large subunit, which come together around an mRNA strand with the help of the initiator tRNA that contributes the first residue. The ribosome scans through the strand like a tape reader. For each codon, it finds a tRNA with matching anti-codon and appends its amino acid as a residue to the carboxyl end of the growing polypeptide chain. The orientation of the mRNA strand from the 5'- to the 3'-end is thus preserved by the orientation of the polypeptide chain from the amino group of the first to the carboxyl group of the last residue. The translation process ends when a stop codon is read. The protein chain and the mRNA are released and the ribosome dissociates into its two subunits.

Similar to transcription, the translation of an mRNA strand into a protein happens in parallel, with several ribosomes working concurrently and in sequence along the strand. In some cases, the translation even starts during transcription, before the mRNA strand is complete. The geometric structure of the ribosome has recently been resolved by x-ray crystallography [2].

Bibliographic notes. Most of the twenty amino acids that occur in proteins were identified in the nineteenth century. After the determination of the DNA structure in 1953, it took only a few years for the community to agree on the central dogma, and a few more years to decipher the genetic code on which the dogma is based. The material of this section is taken from [1, 3, 6], all three of which are comprehensive texts in their respective fields. Considerably shorter and more focused descriptions of proteins and protein structures can be found in [4, 5].

[1] B. ALBERTS, D. BRAY, A. JOHNSON, J. LEWIS, M. RAFF, K. ROBERTS AND P. WALTER. Essential Cell Biology. An Introduction to the Molecular Biology of the Cell. Garland, New York, 1998.
[2] N. BAN, P. NISSEN, J. HANSEN, P. B. MOORE AND T. A. STEITZ. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 11 (2000), 878–879.
[3] T. E. CREIGHTON. Proteins: Structures and Molecular Properties. Second edition. Freeman, New York, 1993.
[4] N. J. DARBY AND T. E. CREIGHTON. Protein Structure. Oxford Univ. Press, England, 1993.
[5] P. C. E. MOODY AND A. J. WILKINSON. Protein Engineering. Oxford Univ. Press, England, 1990.
[6] L. STRYER. Biochemistry. Third edition. Freeman, New York, 1988.
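The mapping from codons to residues and the role of start and stop codons can be made concrete with a short Python sketch. It is a simplified model under stated assumptions: the table is the standard genetic code written with single-letter abbreviations, translation starts at the first AUG found in the string, and the names `CODON_TABLE` and `translate` are hypothetical. Ribosomes, tRNA wobble and the distinct initiator tRNA are not modeled.

```python
from itertools import product

BASES = "UCAG"
# Residues for all 64 codons, ordered with the third base varying fastest
# (UUU, UUC, UUA, UUG, UCU, ...); '*' marks the stop codons UAA, UAG, UGA.
RESIDUES = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string (5' to 3') into single-letter residues.

    The reading frame is fixed by the first start codon AUG; translation
    ends at the first stop codon or when fewer than three bases remain.
    """
    start = mrna.find("AUG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    print(len(CODON_TABLE))                 # number of codons, 4**3
    print(translate("GGAUGGCUUUCUAAGG"))    # Met-Ala-Phe, then a stop codon
```

Shifting the start position by one or two bases changes every codon that is read, which is the reading-frame sensitivity described in the text.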

glycine is only H. which by convention is for the trans and for the cis form. The structure is stabilized by hydrogen bonds between every CO group and the NH group four residues later. The realizable angle pairs as a subset of the square of angle pairs. same proteins fold up to same shapes. which are flat and made up of several strands. In Figure I.11 its geometric structure. An interesting residue in this respect is proline. whose backbone forms a right-handed helix. and measures the rotation around the C -C bond. .8 I B IO . and this is really the reason why geometry plays an important role in their study. This so-called Ramachandran plot for glycine is sketched in Figure I. Bond rotation. which © ¨ ¢ Another recurring motif are -sheets.12. The stabilizing hydrogen bonds are between neighboring strands. and in this way restricts the rotational degree of freedom to a small region.3 Structural Organization We cannot hope to understand proteins without a good grasp of their multi-level structural organization. The characteristic dihedral angles for a right-handed -helix are roughly and . # ¡   $"! ¥ § ¢   ¡ ¥ £ § ¤  ¡   ¡ © ¥ £ ¦ ¤    ¥ £ ¦     ¨ ¨ © ¢ .6 shows its chemical and Figure I. Contiguous -carbons are separated ˚ in the rotation direction and by about A rise. A rotation takes about residues and produces an axial separation of about ˚ A. In contrast. A motif that is commonly observed in proteins is the -helix. Again by convention. and for the two coplanar trans forms. which can run in the same direction (parallel) or in opposite directions (anti-parallel). £ ¥   Ramachandran plot. A given residue prohibits some angles because of steric hindrances. A strand can be obtained by stretching the -helix until the axial distance between two ˚ contiguous -carbons reaches about A. which is the link between the carbon and the nitrogen atoms. The conformation of the backbone is completely determined when . and the cis form. 
Because of partial doubleCα   prohibited collisions between atoms. The side-chain of ψ φ O C ψ H H N H Cα N φ Cβ C Cα O Figure I.13 the tubes are visible as spiral sections of the ribbon. Figure I. 0     £     ¥ % )(¦ ¥     ©   ¥ % '&£ ¥   ¥  §   ¨ ¢ ¦ £ ¥ bond character. which is the reason that a relatively large portion of the square of angle pairs is realizable. Consider the three bonds from one carbon to the next along a protein backbone. measures the rotation around the N-C bond. in which it curves in one direction (zig-zig). and refer to it as a peptide unit. which is measured along the axis. . ¡ ¡ Two common motifs. As shown in Figure I. .11. Cartoon representations of protein structures usually draw -helices as tubes. and are specified for each residue in the chain. The and angles measure rotations around the bonds preceding and succeeding every -carbon atom. in which C -C-N-C is relatively stretched (zig-zag). All side-chains lie outside the helix structure. A will generally prohibit a larger range of smaller one. They combine strands to sheets. which differs from all others because it binds back to the backbone. The two forms are distinguished by the rotation angle along the C-N bond.12: The square represents all angle pairs and the shading indicates the region of disallowed pairs for glycine.11: The planarity of a peptide bond is caused by its partial double-bond character. Figure I. the links between the -carbon and the carbon and nitrogen atoms are single bonds with one-dimensional rotational degrees of freedom.MOLECULES are physically larger residue angles than a are visualized  ¥ £  ¥ £ § ¤  §   ¥  I. There are however two possibly planar configurations: the trans form. there is no freedom to rotate around the peptide bond. Most surprisingly.

which affect atoms in short distance (within ˚ about A). A protein typically has a few regions embedded in its surface. that are specific to interactions with other molecules. which have to be obtained from the known chemical structure threaded into the density.3 Structural Organization 9 Quaternary structure refers to the spatial arrangement of subunits of a protein. Prepare a protein crystal. it would be desirable to automate the process. £ ¥ Figure I. CO Cα NH OC Cα HN CO Cα NH Cα NH HN CO Cα NH OC OC Cα HN CO HN Cα Cα NH OC Cα Cα OC CO Cα NH OC NH CO HN Cα Figure I. its accumulated influence is significant if two subunits have geometrically complementary shapes that permit a large number of atom pairs within the reach of the force. Evidence for that claim can be provided by mutating a protein and distinguishing between mutations that preserve and that change the active sites. We only scratch the surface by explaining the principle steps in the reconstruction of protein structures from x-ray diffractions: 1. . Secondary structure refers to the spatial arrangement of residues that are near each other along the chain. Tertiary structure refers to the spatial arrangement of residues that are far from each other along the chain. It is common to distinguish four levels of organization in the description of protein architecture: Primary structure refers to the sequence of residues along the oriented polypeptide chain. but there are others and most notably images generated from nuclear magnetic resonance (or NMR) experiments. While active sites usually occupy only a small fraction of the surface.I. In biology. Even though proteins are large molecules that typically consist of a few thousand atoms. Expose the crystal to x-ray beams and collect the diffractions. Each chain forms what we call a subunit. 
The description of quaternary structure includes the rather weak van der Waals forces.14: Two parallel -strands to the left and two antiparallel ones to the right.13: Ribbon diagrams visualize proteins by emphasizing the backbones as it winds its way through the structure. This accumulated effect thus prefers interactions between geometrically complementary shapes. Compute the electron density and from it derive the structure. and quaternary structure addresses questions about their relative position and interaction. Both methods are complicated and laborious. Both options are illustrated in Figure I. That specificity plays a dominant role also in protein-protein and in protein-ligand interactions. How do we then know anything about the structural organization of proteins? The primary source today are xray diffractions from protein crystals. Structure determination. A single protein may indeed contain more than one polypeptide chain. It seems that Step 1 is the main obstacle in reaching this goal. 3. they decide protein function. so-called active sites.   Protein architecture and function. 2. this fact is expressed by saying that the van der Waals force creates specificity in the interaction. The dotted edges represent stabilizing hydrogen bonds. The x-ray experiment does not determine the element identities of the atoms.14. Since there are probably hundreds of thousands of different proteins. Although this force is weak compared to others. they are not visible under an electron microscope.

in part because some proteins are not known to form crystals at all. Step 2 requires an x-ray source, a device to rotate the crystal by small angles ([…] or less), and a detection device. For each angle, we get a two-dimensional picture of diffractions. The three-dimensional electron density is computed from a whole array of such pictures. A typical level surface of an electron density is shown in Figure I.15. The main mathematical tool in the construction of the electron density is the Fourier transform. A fundamental difficulty in this step is that only the amplitudes (intensities) of the waveforms are observable, while the phase information must be obtained by different means.

Figure I.15: The so-called chicken wire representation of a level surface of a three-dimensional density.

Protein data banks. After completing the structural study of a crystallized protein, investigators usually send their results to the Protein Data Bank, which is a public repository of protein structures described in so-called PDB files. At the beginning of each file we find ancillary information, including the header, the name of the protein, the author, the reference to the corresponding journal article, etc. There is also information about non-standard components and about secondary structure elements. The main body of the file lists the coordinates of the observed atoms. They are always given in an orthonormal coordinate system, in which the length unit is one angstrom. Table I.3 illustrates the format by showing a small portion of a PDB file for hemoglobin, listing the coordinates of the atoms of an arginine residue. Note that there are no hydrogen atoms, since they are too small to be resolved by an x-ray experiment.

record  atom  residue  coordinates
ATOM    N     ARG      […]
ATOM    CA    ARG      […]
ATOM    C     ARG      […]
ATOM    O     ARG      […]
ATOM    CB    ARG      […]
ATOM    CG    ARG      […]
ATOM    CD    ARG      […]
ATOM    NE    ARG      […]
ATOM    CZ    ARG      […]
ATOM    NH1   ARG      […]
ATOM    NH2   ARG      […]

Table I.3: Incomplete records of the atoms that belong to an arginine residue. CA is the α-carbon atom, CB the β-carbon, etc.

Bibliographic notes. The Ramachandran plot for realizable bond rotations goes back to work by Ramachandran and Sasisekharan [6]. The α-helix was suggested as a common motif in proteins by Pauling and collaborators in 1951 [4], and in the same year they also identified the β-sheet [3]. This was a few years before these motifs had been observed in x-ray experiments. In the late 1950s, Max Perutz reconstructed the structure of hemoglobin from x-ray diffraction data [5], and John Kendrew did the same for myoglobin. A classic text on the x-ray crystallography method is [2]. The material on x-ray crystallography and PDB files presented in this section is taken from [1].

[1] L. J. BANASZAK. Foundations of Structural Biology. Academic Press, San Diego, California, 2000.
[2] T. BLUNDELL AND L. JOHNSON. Protein Crystallography. Academic Press, New York, 1976.
[3] L. PAULING AND R. B. COREY. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37 (1951), 729–740.
[4] L. PAULING, R. B. COREY AND H. R. BRANSON. The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37 (1951), 205–211.
[5] M. F. PERUTZ. X-ray analysis of hemoglobin. Les Prix Nobel, Stockholm, 1963.
[6] G. N. RAMACHANDRAN AND V. SASISEKHARAN. Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7 (1963), 95–99.
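As an illustration of the PDB records described in this section, the following is a minimal sketch of how one ATOM line might be read. It assumes the fixed-width column layout of the PDB format (serial number, atom name, residue name, chain, residue number, then x, y, z in angstrom). The names `AtomRecord`, `parse_atom_line` and `format_atom_line` are hypothetical, and the coordinates in the usage note are made up for illustration rather than taken from Table I.3.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    serial: int
    name: str      # atom name, e.g. "CA" for the alpha-carbon
    residue: str   # three-letter residue name, e.g. "ARG"
    chain: str
    res_seq: int
    x: float       # coordinates in angstrom
    y: float
    z: float


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one fixed-width ATOM record (0-based slices of the columns)."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return AtomRecord(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def format_atom_line(serial, name, residue, chain, res_seq, x, y, z) -> str:
    """Build an ATOM line with the same column layout (illustration only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {residue:>3s} {chain}{res_seq:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")
```

For example, `parse_atom_line(format_atom_line(1, "CA", "ARG", "A", 92, 1.5, -2.25, 10.0))` recovers the atom name, residue name and the three coordinates. Reading by column position rather than splitting on whitespace is the safer choice for this format, because neighboring fields may touch when values are wide.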

I.4 Molecular Mechanics

After a protein has been created by translation, it folds into a shape, or conformation, that is determined by its sequence of residues. The folding process is a reaction to a multitude of forces that simultaneously act on every part of the protein. This section presents some of the current knowledge and efforts to model these forces. We begin by studying atoms and discuss covalent and non-covalent forces.

Atoms. Each atom has a positively charged massive nucleus, which is surrounded by a cloud of negatively charged electrons. The nucleus consists of protons, each contributing a unit positive charge, and of electrically neutral neutrons. The electrons are held in orbit by electrostatic attraction to the nucleus. Each electron has one unit of negative charge, which exactly neutralizes the positive charge of one proton. In total, we have the same number of protons and electrons and thus an electrically neutral atom, as illustrated in Figure I.16. Different elements consist of atoms with different numbers of protons. The atomic number is by definition the number of protons, which is also the number of electrons. The number of neutrons is usually about the same because too few or too many neutrons destabilize the nucleus. The atomic weight of an atom is the ratio of its mass to the mass of a single hydrogen atom. Because the mass of an electron is negligible, the atomic weight is almost exactly the number of protons plus the number of neutrons. Avogadro's number is useful in translating from the minuscule world of single atoms into a humanly more accessible scale. It is the number of hydrogen atoms in one gram of hydrogen, which is roughly 6.02 × 10^23. The mass of one hydrogen atom is therefore about 1.66 × 10^-24 gram which, by definition, is one dalton. One mole of an element is Avogadro's number of its atoms. In other words, if the mass of one atom of that element is m daltons then the mass of one mole is m grams. Table I.4 lists properties of elements that are commonly found in organic matter.

Figure I.16: A schematic picture of a hydrogen atom to the left and a carbon atom to the right.

element      symbol  #p  #n  electrons in shells 1, 2, 3, 4
Hydrogen     H        1   0  1
Carbon       C        6   6  2  4
Nitrogen     N        7   7  2  5
Oxygen       O        8   8  2  6
Sodium       Na      11  12  2  8  1
Magnesium    Mg      12  12  2  8  2
Phosphorus   P       15  16  2  8  5
Sulfur       S       16  16  2  8  6
Chlorine     Cl      17  18  2  8  7
Potassium    K       19  20  2  8  8  1
Calcium      Ca      20  20  2  8  8  2

Table I.4: Some elements together with their numbers of protons, neutrons and electrons distributed in the shells around the nucleus.
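The relation between daltons, grams and moles can be checked with a few lines of arithmetic. The sketch below assumes a rounded value of Avogadro's number and approximates the atomic weight by the number of protons plus the number of neutrons, as described in the paragraph on atoms; the function names are hypothetical.

```python
AVOGADRO = 6.022e23  # atoms per mole, rounded (assumed value)


def dalton_in_grams() -> float:
    """Mass of one dalton in grams: one gram divided by Avogadro's number."""
    return 1.0 / AVOGADRO


def approx_atomic_weight(protons: int, neutrons: int) -> int:
    """Atomic weight approximated by protons plus neutrons (electrons ignored)."""
    return protons + neutrons


def atom_mass_grams(atomic_weight: float) -> float:
    """Mass of a single atom in grams."""
    return atomic_weight * dalton_in_grams()


def mole_mass_grams(atomic_weight: float) -> float:
    """Mass of one mole: Avogadro's number of atoms, each of the given weight."""
    return AVOGADRO * atom_mass_grams(atomic_weight)
```

With the carbon row of Table I.4 (6 protons, 6 neutrons), `approx_atomic_weight(6, 6)` gives 12, and `mole_mass_grams(12)` returns 12 grams up to rounding, which is the statement that m daltons per atom correspond to m grams per mole.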

Covalent bonds. According to the Bohr model, electrons live in shells around the nucleus and populate inner shells before using outer ones. The first three shells from inside out can hold up to 2, 8 and 8 electrons, as indicated in Table I.4. The chemical properties of an atom are defined by the tendency to either empty or complete its partially filled shell, if any. One way of doing that is by sharing electrons. The shared electrons complete the outermost non-empty shells of both atoms involved. According to Table I.4, carbon, nitrogen and oxygen need four, three and two electrons to fill their outer shells. As illustrated in Figure I.17, this can for example be done by covalently binding to the same number of hydrogen atoms. We can now define a molecule as a connected component of the graph whose vertices are the atoms and whose edges are the covalent bonds.

Figure I.17: The geometry of covalent bonding for carbon, nitrogen, and oxygen.

When an atom covalently bonds to more than one other atom, then there is a preferred angle between pairs of bonds. For example for carbon, this angle is what we get by connecting the centroid of a regular tetrahedron with two of the vertices. Using elementary geometry we find this angle is arccos(−1/3), or about 109.47°. Two atoms can also form a covalent double bond, which forces the nuclei closer together and is stronger than the corresponding single bond. It also prevents any torsional rotation around that bond, which is possible for single bonds. We need a sequence of four atoms and three covalent bonds to define the torsional angle of the middle bond. It is generally parametrized such that […] corresponds to the trans (zig-zag) coplanar configuration. For example for H3CCH3, we have three bonds on each side of the middle bond. There is an energetic preference for staggering the covalent bonds on the two sides, which corresponds to torsional angles of […], […], and […]. When two atoms that covalently bond are of different type then they generally attract the shared electrons to different degrees. The shared electrons will therefore have a bias towards one end of the structure or another. We then have a polar structure in which the positive charge is concentrated on one end and the negative charge on the other. Examples of polar covalent bonds are between hydrogen and oxygen and between hydrogen and nitrogen, as illustrated in Figure I.17. In contrast, the bond between hydrogen and carbon has the electrons attracted much more equally and is relatively non-polar.
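The two angles discussed in the paragraph on covalent bonds can be computed from coordinates. The sketch below uses hypothetical helper names and plain tuples for points. The torsion function returns 180 degrees for the trans (zig-zag) coplanar configuration and 0 degrees for the cis configuration; this sign and zero convention is an assumption of the sketch and need not agree with the parametrization used in the text.

```python
import math


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_angle(center, p, q):
    """Angle in degrees at `center` between the bonds to p and to q."""
    u, v = sub(p, center), sub(q, center)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))


def torsion_angle(p0, p1, p2, p3):
    """Torsional angle in degrees of the middle bond p1-p2 (assumed convention)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two bond planes
    b2_unit = tuple(c / norm(b2) for c in b2)
    m1 = cross(n1, b2_unit)
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))
```

Taking the centroid of a regular tetrahedron at the origin and two of its vertices at (1, 1, 1) and (1, -1, -1), `bond_angle` returns about 109.47 degrees, which matches arccos(-1/3).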

Non-covalent bonds. An atom can also donate an electron to another atom and thus create a complete outer shell. An example is sodium donating the only electron in its third shell to chlorine, which uses it to complete its third shell. As a result we get positively charged sodium cations and negatively charged chloride anions. Both are attracted to each other by electrostatic force and form a regular grid packing, in which each sodium cation is surrounded by six chloride anions, and vice versa. These arrangements are known as table salt. A weaker interaction, also based on electrostatic force, is generated by polar molecules. A prime example is water, which is partially positively charged at the two hydrogen ends. Water molecules thus tend to aggregate in small semi-regular structures, but this force is weak and bonds of this kind are constantly formed and broken. The polarity of water molecules is the basis for the difference between hydrophilic molecules, which are polar and therefore attract water, and hydrophobic molecules, which are non-polar and do not attract water. Another non-covalent force is responsible for the van der Waals interaction. Experimental observations point to a potential energy function roughly as graphed in Figure I.18. The corresponding force is the negative derivative, which is interpreted as a balance between an attractive and a repulsive force. The attraction is due to a dispersive force that can be explained using quantum mechanics. The repulsion also has a quantum mechanical explanation in terms of the Pauli principle, which prohibits any two electrons from having the same set of quantum numbers. It is useful to keep the relative strengths of the various forces in mind. Table I.5 gives estimates of the amount of energy necessary to break one mole of bonds.

Figure I.18 (energy plotted against distance): The van der Waals force is obtained by adding the attractive force (derivative of the dashed curve) and the repulsive force (derivative of the dotted curve).

bond type        strength in vacuum   strength in water
covalent         90.0                 90.0
ionic            90.0                  3.0
hydrogen          4.0                  1.0
van der Waals     0.1                  0.1

Table I.5: Relative strength measured in kilo-calories per mole necessary to break the bonds. Water molecules interfere with ionic and hydrogen bonds, which are therefore considerably weaker in a solution than in a vacuum.
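The balance between attraction and repulsion can be made concrete with a small numerical sketch. It assumes the Lennard-Jones 12-6 form, which is the approximation named in the paragraph on force fields; the parameters `sigma` (collision constant) and `epsilon` (depth of the minimum) are set to illustrative unit values and are not fitted to any atom type.

```python
def lj_energy(r: float, sigma: float = 1.0, epsilon: float = 1.0) -> float:
    """Lennard-Jones 12-6 energy: repulsive r^-12 term plus attractive r^-6 term."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def lj_force(r: float, sigma: float = 1.0, epsilon: float = 1.0) -> float:
    """Negative derivative of the energy; positive means repulsive."""
    s6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r


def lj_minimum(sigma: float = 1.0) -> float:
    """Distance at which attraction and repulsion cancel."""
    return 2.0 ** (1.0 / 6.0) * sigma
```

In this model the energy crosses zero at distance `sigma`, the force vanishes at `lj_minimum(sigma)`, where the energy equals `-epsilon`, and the force is repulsive at shorter and attractive at longer distances.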

Force field. To get a handle on how molecules move, we define the potential energy of a system of atoms. The general assumption is that the system develops towards a minimum. To model the potential energy accurately, we would have to work with quantum mechanics, which is beyond the scope of this book and also beyond the capabilities of current computations for large organic molecules. The alternative is molecular mechanics, which uses classical mechanics to model the forces that act on atoms. The simplest such model sums five contributions to the potential energy, three accounting for covalent bonds and two for non-covalent bonds. We use a vector $x$ to describe the state of a system of atoms and define the potential energy as a function $V$. In its simplest form, that energy is written as

$$V(x) = \sum_{\text{bonds}} \frac{k_i}{2}\,(l_i - l_{i,0})^2 + \sum_{\text{angles}} \frac{k_i}{2}\,(\theta_i - \theta_{i,0})^2 + \sum_{\text{torsions}} \frac{V_n}{2}\,\bigl(1 + \cos(n\omega - \gamma)\bigr) + \sum_{\text{atoms}} \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} + \sum_{\text{atoms}} 4\varepsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right].$$

This formula contains various constants that depend on the type of atom or interaction involved. We briefly look at each one of the five terms.

Bond length. The first sum approximates the energy penalty for differing from the reference length, $l_{i,0}$, by a quadratic function. The strength is relatively large, namely several hundred kilo-calories per mole.

Bond angle. The second sum approximates the energy penalty for differing from the reference angle, $\theta_{i,0}$, again by a quadratic function. The strength is considerably less than for bond length, namely about one one-hundredth or even less.

Torsional rotation. The third sum approximates the energy for different torsional angles around a bond. Angles that lead to staggered arrangements of bonds at both sides are energetically preferred. This preference is modeled by a cosine function with $n$ minima and the same number of maxima.

Electrostatic interaction. The fourth sum adds the electrostatic potential between every pair of atoms in the system. The constants $q_i$ and $q_j$ are the charges, $r_{ij}$ is the distance between the two atoms, and $\varepsilon_0$ is the dielectric constant of the medium.

Van der Waals interaction. The fifth sum approximates the van der Waals potential by the Lennard-Jones 12-6 function. As before, $r_{ij}$ is the distance between the two atoms. The collision constant, $\sigma_{ij}$, marks where the function crosses the zero line, and $-\varepsilon_{ij}$ is the value at the unique minimum.

It is clear that $V$ as defined is only a rough approximation of the real potential energy that drives the behavior of the system. Whether or not that approximation suffices depends on what we use it for.

Molecular dynamics. One of the applications of force fields is the simulation of molecular motion. Recall Newton's three laws of motion:

1. A body continues to move in a straight line at constant velocity unless a force acts upon it.
2. The rate of change of the momentum equals the force.
3. To every action there is an equal and opposing reaction.

Let $x$ be the trajectory of a point with mass $m$. Its location at time $t$ is $x(t)$, its velocity is $\dot{x}(t)$, and its momentum is $m\dot{x}(t)$. Using this notation, Newton's second law is expressed by the differential equation $\frac{d}{dt}(m\dot{x}) = F$, where $F$ is the force acting upon $x$. A trajectory is a solution to this equation. The rate of change of the velocity is also referred to as the acceleration. Suppose we write the force as the negative gradient of a potential function: $F = -\nabla V$. Newton's second law can now be written as $m\ddot{x} = -\nabla V(x)$. In simple cases, the trajectory can be computed analytically. For example, if the potential is stationary and equal to one over the norm, $V(x) = -c/\|x\|$ for some $c > 0$, then $F(x) = -c\,x/\|x\|^3$. Both the gravitational and the electrostatic potentials have this form. In this case, the generic trajectory is an ellipse with one focus at the origin, as illustrated in Figure I.19.

Figure I.19: A generic trajectory when the magnitude of the attraction to the origin decreases with the square distance.

The problem in molecular dynamics is significantly more involved. We have $n$ bodies (atoms) and the energy potential and force depend on the momentary locations of all bodies. As before, we represent the collection of atoms by a point $x$. The energy potential is the function $V$ defined earlier, and the force acting on $x$ is $F = -\nabla V(x)$. Newton's second law of motion can now be written as $m\ddot{x} = -\nabla V(x)$, where the mass vector $m$ multiplies each component of the acceleration vector with the mass of the corresponding atom. The classic two-body problem is the special case in which $n = 2$ and $V$ is the sum of the two corresponding gravitational potentials. In this case, the generic trajectories are again ellipses. Already for three bodies, there is no analytic solution and one has to resort to numerical methods to approximate the trajectories. The problem in molecular dynamics is even more difficult because the potential function is considerably more complicated than a sum of gravitational potentials. One of the difficulties in the simulation is the near cancellation of large forces so that relatively weak residuals gain a decisive influence. Even small inaccuracies in the model or the computation can lead to false decisions and possibly spoil the entire remainder of the simulation. The currently available numerical solutions are inadequate to simulate the entire folding process even for small proteins.

Bibliographic notes. The first half of this section is a highly simplified introduction of atoms and bonds, and we refer to physics texts such as [1, Chapters 19 and 20] for further details. The material on force fields is taken from Leach [4]. The van der Waals potential derives its name from the work of van der Waals, who quantified the deviation of rare gas from ideal gas behavior. The origin of the force is a fluctuation of electrostatic charge in atoms. The explanation of the dispersive contribution in terms of quantum mechanics is due to London [5]. To determine the constants needed to parametrize the mathematical formulation of a force field is far from trivial. The definition of the van der Waals radii used to parametrize the Lennard-Jones functions is just one example. There are various approaches to determine these radii. Bondi [2] looks for the distances of closest approach between atoms to determine van der Waals radii. Tsai et al. [7] analyse the most common distances between atoms in small molecule crystals in the Cambridge Structural Database. Finally, Jorgensen and Tirado-Rives [3] derive parameters in an attempt to reproduce thermodynamic properties in computer simulations. Simulating motion with molecular dynamics is an important topic in computational biology. Numerical algorithms for molecular dynamics can be found in Leach [4] and Schlick [6].

[1] N. W. ASHCROFT AND N. D. MERMIN. Solid State Physics. Harcourt Brace, Orlando, Florida, 1976.
[2] A. BONDI. Molecular Crystals, Liquids and Glasses. Wiley, New York, 1968.
[3] W. L. JORGENSEN AND J. TIRADO-RIVES. The OPLS potential functions for proteins. Energy minimization for crystals of cyclic peptides and crambin. J. Amer. Chem. Soc. 110 (1988), 1657–1666.
[4] A. R. LEACH. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[5] F. LONDON. Zur Theorie und Systematik der Molekularkräfte. Zeitschrift für Physik 63 (1930), 245–279.
[6] T. SCHLICK. Molecular Modeling and Simulation. Springer-Verlag, New York, 2002.
[7] J. TSAI, R. TAYLOR, C. CHOTHIA AND M. GERSTEIN. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 290 (1999), 253–266.
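As an illustration of the numerical methods mentioned in the paragraph on molecular dynamics, here is a minimal sketch of the velocity Verlet scheme applied to a single point attracted to the origin. The choice of integrator, the potential $V(x) = -1/\|x\|$ with unit constants, and all function names are assumptions of this sketch; it is not the method of any particular molecular dynamics package.

```python
import math


def force(x: float, y: float):
    """F = -grad V for V = -1/||x||: attraction toward the origin."""
    r3 = math.hypot(x, y) ** 3
    return -x / r3, -y / r3


def total_energy(x, y, vx, vy, mass=1.0):
    """Kinetic plus potential energy of the point."""
    return 0.5 * mass * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)


def velocity_verlet(x, y, vx, vy, dt, steps, mass=1.0):
    """Integrate m * x'' = F(x); returns the path and the final state."""
    fx, fy = force(x, y)
    path = [(x, y)]
    for _ in range(steps):
        # half step in velocity, full step in position
        vx += 0.5 * dt * fx / mass
        vy += 0.5 * dt * fy / mass
        x += dt * vx
        y += dt * vy
        # recompute the force at the new position, finish the velocity step
        fx, fy = force(x, y)
        vx += 0.5 * dt * fx / mass
        vy += 0.5 * dt * fy / mass
        path.append((x, y))
    return path, (x, y, vx, vy)
```

Starting at (1, 0) with velocity (0, 0.8) and a step of 0.001, the computed path stays on a closed, elliptical curve around the origin and the total energy changes only slightly over one revolution. For many interacting atoms the same loop applies with the force replaced by the negative gradient of the force field, which is where the near cancellation of large forces makes small step sizes necessary.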

Exercises

1. Palindromic Sequences. Call a single strand of DNA a palindromic sequence if it is the same as the complementary strand read backwards. (i) Given a strand, how would you determine whether or not it is a palindromic sequence? (ii) Give an algorithm that finds the longest subsequence that is palindromic.

2. Counting strings. A double-strand of DNA has no preferred direction, but we can orient it so one direction is forward and the other is backward. In either direction, we read the strand in the 5′ to 3′ direction, as usual. Call two linear or cyclic pieces of double-stranded DNA the same if they can be oriented so we read the same string of nucleotides in the two forward directions. (i) How many different linear pieces of double-stranded DNA of length n are there? (ii) How many different cyclic pieces of double-stranded DNA of length n are there? [Beware of palindromic sequences.]

3. Amino Acids. Draw the graph whose nodes are the acyclic amino acids that has an arc connecting two nodes iff one amino acid can be obtained from the other by the replacement or addition of a single atom. (i) Is the graph connected? (ii) Does every connected component have a path that passes through every node exactly once?

4. Structure Repositories. Descriptions of protein structures are publicly available at the Protein Data Base (www.rcsb.org/pdb) and the Swiss Bioinformatics Center (expasy.hcuge.ch). (i) Download a PDB file from either data base and extract the string of single-letter abbreviations describing the amino acid sequence. (ii) Is the relative frequency of amino acids you observe related to the relative number of codons that encode them?

5. Ramachandran Plot. Download a PDB file and extract the sequence of φ and ψ angles along the backbone. Draw the result in form of a Ramachandran plot.

6. Lattices. The arrangement of atoms in a folded protein is often compared to that in a crystal lattice. Sketch two such lattices by drawing the atoms as points and connecting neighboring atoms by straight edges. (i) The face-centered cube (or FCC) lattice consisting of all points with integer coordinates whose sum is even: all integer points (x1, x2, x3) such that x1 + x2 + x3 is even. (ii) The body-centered cube (or BCC) lattice consisting of all points with all even or all odd integer coordinates: all integer points (x1, x2, x3) such that x1, x2, x3 are all even or x1, x2, x3 are all odd.

7. Regular Tetrahedron. A regular tetrahedron has four equilateral triangles as faces, which meet along six equally long edges. (i) Determine the dihedral angle formed by two faces meeting along a common edge. (ii) Determine the solid angle formed by three faces meeting at a common vertex. [By convention, the full dihedral angle is 2π, which is the length of the unit circle, and the full solid angle is 4π, which is the area of the unit sphere.]

8. Elliptic Trajectory. Let the energy potential be defined by […]. The force it exerts on a point is […]. Prove that the generic trajectory in this force field is an ellipse centered at the origin.

Finally in Section II. The details of that shape in terms of its cavities. like proteins fold up to same shapes. The rest of this books takes a complementary view by concentrating on mathematical models and computational data structures that arise in the study of proteins. but this might be a result of evolutionary selection.3. The goal of studying the geometry of proteins is therefore two-fold: the development of new computational tools to help determine or refine structure information and understanding the relationship between shape and function. dynamics. In a natural environment.Chapter II Geometric Models A surprising finding in the research on proteins is the importance of geometric shape in their functioning. this is only a small fraction of the wealth of available sequence information.1. to the near completion of several large-scale genome projects. we introduce some of the basic geometric models useful in representing molecular shape. S EQUENCE S HAPE A protein is a peptide chain of amino acids that folds up and forms a shape. we develop a language suitable for studying details of our models. which is due.3 II. In this chapter.2. which are dual to space-filling diagrams and are our preferred computational representation. By and large. At the current stage of our biological knowledge. In Section II. who aims at pruning the immense variety by limiting attention to physically or chemically likely configurations.4 Space-filling Diagrams Power Diagrams Alpha Shapes Alpha Shape Software Exercises 17 . and in doing   ¡    ¡  F UNCTION II. In Section II. we introduce alpha shapes. protrusions. In Section II. in part. This finding is usually expressed as a causal chain of responsibilities:  so. we talk about the Alpha Shape software and discuss how it can be used.1 II. and energetics determine how it interacts with other molecules. there is an overwhelming accumulation of sequence information. we use Voronoi diagrams to decompose space-filling diagrams.4. 
Although the number of proteins for which the three-dimensional structure has been resolved and is stored in the Protein Data Base is in the thousands. we introduce space-filling diagrams as the primary geometric model of molecules.2 II. We have seen the bio-chemist’s view in Chapter I. the shape seems to determine how proteins interact with each other and with other molecules.

we specify each ball by its center and its radius . We can imagine creating that portion with a milling machine whose material removing stylus has the shape of the rolling circle. this new curve is the boundary of the portion of that is not covered by any placement of the open disk bounded by the rolling circle.2. then the number of arcs cannot exceed . More formally. We specify each disk by its center and its radius .  ¤0    ¥£¢ 0 ¤ ¥  ¢ 0   ¢   ¦¡  ¡ ¥  ¥ d  ¡  ¡  ¡ ¢ 0   ¥¢ 0 ¥¢   £¢ ¢     ¡¡ ¡ ¥ ¡ 0 . The cen- Union of balls. and the portion of the sphere not covered by any cap is the ¢ ¡ ¡ ¢ 0   ¢    ¡ ¡ ¡ ¡ 0 ¢ ¡ of the disks.1. which we denote as . Similar to the two-dimensional case. We can make the boundary of the disk union smoother by substituting blending curves for the vertices where the circular arcs meet. which has no endpoints. as in Figure II. Even if we allow more general configurations.  ¡   §¢ Rolling circle. The front of II. Let now be a finite set of balls (solid spheres) in three-dimensional Euclidean space. which will be explained in Section II. The tacit assumption in constructing such a diagram is that the locations of the atoms in three-dimensional space are known. Four of the eight disks contribute two arcs each to the boundary. The total number of arcs is however rather limited.1 Space-filling Diagrams A space-filling diagram associates a molecule with a portion of the three-dimensional space it occupies. . At any moment during the motion. If there are disks whose union is a simply connected region.18 II G EOMETRIC M ODELS ter of the circle thus traces out a curve at distance away from the boundary. we cannot get more than arcs. A single disk can contribute any non-negative number of arcs. We thus obtain a tangent continuous immersion of a curve in . An example is shown in Figure II. Hints towards proving the upper bound can be found among the exercises at the end of this chapter. Let be a finite set of disks in the Euclidean plane.2. 
To understand the structure of the boundary of the union.1. Figure II. . To this end we roll a circle of radius on the outside about the boundary. has a boundary that consists of circular arcs meeting at common vertices.1: Union of disks in the plane. The sphere bounding intersects the other balls in a finite collection of caps.2. we study the portion contributed by a single sphere. The construction is illustrated in Figure II. but there would be if the two disks to the lower left were just a little smaller. the rolling circle describes the rounded boundary. Union of disks. In cases where tangent continuity is important. The upper bound is a consequence of the relationship between arcs in the boundary of the union and angles in the Delaunay triangulation.3 shows the union of balls that represent gramicidin. and on the inside the rounded boundary of the original union. the boundary of the union of uniformly grown disks. which we denote as . is by and We note that the rounded boundary of large tangent continuous but can have cusps at places where the rolling circle cannot quite squeeze through two disks. This curve is the boundary of obtained by growing every disk to radius . Figure II. The union  Figure II.2: On the outside. which consists of convex and reflex circular arcs. we may turn the cusps into crossings by adding arcs connecting the cusps. We study such unions first in the plane and then in space. An atom is represented by a ball (a solid sphere) and a molecule is the union of balls of its atoms. The interior of each cap lies in the interior of the union. There are no cusps in Figure II. which is a small protein of barely more than 300 atoms. It is also possible that an arc is an entire circle. the circle touches the boundary but never intersects the interior.

To get bounds on the total number of faces. The union of convex patches is sometimes referred to as the contact surface because that is where the rolling sphere touches . contribution of the sphere to the boundary of the union. the number of arcs in the boundary of the union of caps is less than . Similarly. The structural description of a finite union of balls is thus recursive in the dimension. This happens because the tunnel connecting the hole to the outside is slightly too narrow for the rolling sphere to squeeze through. In the application of space-filling diagrams to biology. The center of that sphere moves along the boundary of the union of grown balls. To count the faces contributed by our sphere.4: A molecular surface representation of the gramicidin protein. There is a hole whose rounded surface penetrates through the outer surface roughly in the middle of the picture. However. Figure II. can can detect a self-intersection of the surface in Figure II. The caps form the same structure as the disks discussed earlier. the radii of the balls are usually the van der Waals radii of the atoms. the numbers for well packed sets of spheres. It can be shown that for each value of .4 shows such a rounded surface representation of gramicidin. we multiply by and note that each arc belongs to at least two and each vertex belongs to at least three spheres. we also have no more than vertices. We will see that these components are related to the triangles of the Delaunay triangulation. We conclude that there are fewer than faces. By analogy to disks in the plane.3 have radii sphere patches that correspond to faces of . Since each arc has at most two endpoints (if it is a full circle then it has no endpoints) and each endpoint belongs to two arcs. and reflex sphere patches that correspond to vertices of . reflex torus patches that correspond to arcs of . 
The radius is chosen so that the rolling sphere approximates a water molecule.3: A union of balls representation of the gramicidin protein. and the boundary of is referred to as the solvent accessible 0  ¢ ¡ ¤ ¥  ¢   ¢ Figure II. fewer than arcs. arcs and vertices. Relative to that surface. and the boundary of is referred to as the van der Waals surface. edges and vertices. Rolling sphere. arcs and vertices. only that they live on a (two-dimensional) sphere instead of . and fewer than vertices. we first note that a single sphere intersects the other balls in fewer than caps. .1 Space-filling Diagrams 19 cally tight. This shows that the upper bounds are asymptoti- Figure II. There are convex spheres in Figure II. The same type of symmetry can also be observed in dimensions beyond three. which implies that there are fewer than faces on this one sphere. the .4. and its front sweeps out blending surfaces that cover cusps and crevices of the original boundary. we recall that these are the connected components of the complement of the union of caps.II. ¤ £¢   ¤  ¢ ¤ £¢     ¢ 0 0 ¥ 0 ¤ ¥  ¢ ¥  ¡ ¥¢  ¥ ¥   ¥ ¥  ¥  ¥¢ ¥ ¢ £¡ ¥  ¥  ¡ ¥  ¥ d¥ . To count the faces. there are configurations of balls with at least some constant times faces. which are common for proteins. the union of reflex patches (tori and spheres) is referred to as the re-entrant surface. are much smaller and typically only a constant times . The number of arcs and vertices in the boundary of a union of balls in can be quite a bit higher than the same numbers for a union of disks in . When we look carefully. We can again get a smoother boundary by rolling a sphere of radius about .

The rounded surface is usually referred to as the molecular surface.

Uniform growth. The boundary of the union of the original balls and the boundary of the union of the grown balls do not necessarily have the same combinatorial structure. We can understand structural changes by observing how they are introduced while we continuously grow the balls. Each face of the boundary sweeps out a (three-dimensional) cell in $\mathbb{R}^3$, each arc sweeps out a (two-dimensional) membrane separating two cells, and each vertex sweeps out a curved edge in the common boundary of generically three membranes and three cells.

We describe the same complex as a Voronoi diagram of the set of centers $z_i$ with weights $r_i$. Define the weighted distance of a point $x$ from $z_i$ equal to the Euclidean distance minus the weight: $\pi_i(x) = \|x - z_i\| - r_i$. The cell of $z_i$ is the set of points at least as close to $z_i$ as to any other weighted point. Figure II.5 illustrates the definition in two dimensions.

Figure II.5: Two-dimensional Voronoi diagram generated by uniformly growing the disks.

Consider the case of two weighted points, $z_i$ and $z_j$. If one ball is contained in the interior of the other then its cell is empty. Otherwise, we have two non-empty cells separated by a two-dimensional membrane. The points of this membrane satisfy $\|x - z_i\| - r_i = \|x - z_j\| - r_j$, which is the equation of one sheet of a two-sheeted hyperboloid. Let $H_{ij}$ be the set of points $x$ with $\pi_i(x) \le \pi_j(x)$. Observe that for every point $x \in H_{ij}$, the line segment connecting $x$ and $z_i$ lies entirely in $H_{ij}$. In geometry, this property is expressed by saying that $H_{ij}$ is star-shaped and that $z_i$ lies in its kernel. Since the cell of $z_i$ is the common intersection of the $H_{ij}$, this implies that the cell is also star-shaped and that $z_i$ lies also in its kernel. It follows in particular that the cell is connected. Since the membranes bounding the cell are all sheets of two-sheeted hyperboloids, the boundary of the cell consists of patches of such hyperboloids. All these patches are visible in their entirety if viewed from $z_i$.

We get the boundary of the union of grown balls by drawing the sphere bounding each ball only inside its own Voronoi cell. By construction, the arcs of the patches meet up in pairs along the membranes and in triplets along the curved edges of the Voronoi diagram. We can now see how structural differences between the two boundaries arise: when we grow the balls, the boundary of the union sweeps out the Voronoi diagram, and we get a structural re-arrangement whenever we sweep over a vertex of the Voronoi diagram.

Bibliographic notes. Space-filling diagrams have a long tradition in biochemistry and are similar to the CPK mechanical models named after Corey, Pauling and Koltun [5, chapter 1]. The variations of these models discussed in this section have been introduced by Lee and Richards [6, 7]. The molecular surface is sometimes referred to as the Connolly surface, named after Michael Connolly who wrote early software constructing this surface [3]. The solvent accessible surface in Figure II.3 and the molecular surface in Figure II.4 are computed using the software described in [1].

Increasing all radii of a set of circles or spheres continuously and at the same rate is referred to as the Johnson-Mehl model of growth [4]. It leads to the Voronoi diagram of this section, which is sometimes referred to as the additively weighted Voronoi diagram. We refer to Aurenhammer [2] for a survey of Voronoi diagrams, their algorithms and applications. An algorithm that computes cells of the additively weighted Voronoi diagram in $\mathbb{R}^3$ has been developed and implemented by Will [8].

[1] N. Akkiraju, H. Edelsbrunner, P. Fu and J. Qian. Viewing geometric protein structures from inside a CAVE. IEEE Comput. Graphics Appl. 16 (1996), 58–61.
[2] F. Aurenhammer. Voronoi diagrams - a study of a fundamental geometric data structure. ACM Comput. Surveys 23 (1991), 345–405.
[3] M. L. Connolly. Analytic molecular surface calculation. J. Appl. Crystallogr. 16 (1983), 548–558.
[4] W. A. Johnson and R. F. Mehl. Reaction kinetics in processes of nucleation and growth. Trans. Am. Inst. Mining Metall. Eng. (AIMME) 135 (1939), 416–458.
[5] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[6] B. Lee and F. M. Richards. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55 (1971), 379–400.
[7] F. M. Richards. Areas, volumes, packing and protein structures. Ann. Rev. Biophys. Bioeng. 6 (1977), 151–176.
[8] H.-M. Will. Computation of Additively Weighted Voronoi Cells for Applications in Molecular Biology. Diss. ETH 13188, ETH Zürich, Switzerland, 1999.
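As a small illustration of the weighted distance used in Section II.1, the following Python sketch evaluates the Euclidean distance minus the weight and assigns a query point to the cell of the closest weighted point. The function names and the sample coordinates are made up for this example; they are not taken from the text or from any of the software cited above.

```python
import math

def weighted_distance(x, center, weight):
    """Euclidean distance from x to center, minus the weight (the radius)."""
    return math.dist(x, center) - weight

def closest_weighted_point(x, points):
    """Index of the weighted point whose cell contains x.

    points is a list of (center, weight) pairs; ties go to the lowest index.
    """
    distances = [weighted_distance(x, c, w) for c, w in points]
    return distances.index(min(distances))

# Two weighted points in the plane (hypothetical sample data).
points = [((0.0, 0.0), 2.0), ((5.0, 0.0), 1.0)]

# A point on the membrane has the same weighted distance from both:
# 3 - 2 on the left equals 2 - 1 on the right.
on_membrane = (3.0, 0.0)
```

A point with equal weighted distance from two weighted points, such as `on_membrane`, lies on the hyperboloid sheet separating their cells.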

II.2 Power Diagrams

If we grow the square radii of a finite collection of spheres or balls, we get a decomposition of space into convex polyhedra. This decomposition is known as the power diagram and has a variety of applications in molecular modeling.

Growing square radii. As in Section II.1, we let $B$ be a finite set of balls $B_i$ with centers $z_i$ and radii $r_i$. We grow each ball to radius $\sqrt{r_i^2 + t}$ at time $t$. The set of balls at time $t$ is denoted as $B(t)$. Of course, larger balls grow slower than smaller ones. The Taylor series expansion of the radius as a function of time is

$$\sqrt{r_i^2 + t} = r_i + \frac{t}{2 r_i} - \frac{t^2}{8 r_i^3} + \ldots$$

The first order approximation of the growth is one half the inverse of the radius. Hence, smaller balls never really catch up except in the limit, when $t$ goes to infinity and the difference between the radii goes to zero.

We are interested in the surface swept out by the intersection of the spheres bounding $B_i$ and $B_j$ and claim it is a plane. The points that belong to both spheres at time $t$ satisfy $\|x - z_i\|^2 - r_i^2 - t = 0$ and $\|x - z_j\|^2 - r_j^2 - t = 0$. Varying $t$ has the same effect as dropping the requirement that the two expressions vanish. Instead we just require that they both be equal, so we get $\|x - z_i\|^2 - r_i^2 = \|x - z_j\|^2 - r_j^2$. The quadratic terms cancel, and we have

$$2 \langle x, z_j - z_i \rangle = \|z_j\|^2 - r_j^2 - \|z_i\|^2 + r_i^2,$$

which is linear in $x$. We see that the circle at which the two spheres intersect sweeps out a plane. It follows that the membranes swept out by the arcs of the boundary of the union of $B(t)$ are pieces of planes.

Power distance. We can describe the decomposition of space implied by the square radius growth model as a Voronoi diagram for yet another weighted distance function. The appropriate function in this case is the power distance of a point $x$ from a ball $B_i$ defined as the square distance from the center minus the weight: $\pi_i(x) = \|x - z_i\|^2 - r_i^2$. The square of the radius, $r_i^2$, is sometimes referred to as the weight of the point $z_i$. The power distance is negative if $x$ lies inside $B_i$, zero if $x$ lies on the boundary of $B_i$, and positive if $x$ lies outside $B_i$. If $x$ lies outside $B_i$, the power distance of $x$ is the square length of a tangent line segment from $x$ to the bounding sphere. Using the same algebraic manipulations as above, we can show that the set of points with equal power distance from two balls form a plane. The two planes are indeed the same. As indicated in Figure II.6, this plane may separate the two bounding spheres, intersect both, or lie on the same side of both.

Figure II.6: The line of equal power distance separates if the two circles are disjoint and not nested, it passes through their intersection if that is non-empty, and it passes outside if the two circles are nested.

Think of the three configurations as snap-shots in an animation in which the center of the small circle moves towards the center of the large circle. At first, the line moves in the same direction but then comes to a halt and reverses its direction moving away from the center of the large circle.

Power diagram. The power or (weighted) Voronoi cell of a ball $B_i$ under the power distance is the set of points at least as close to $B_i$ as to any other ball: $V_i = \{x \in \mathbb{R}^3 \mid \pi_i(x) \le \pi_j(x) \mbox{ for all } j\}$. If we denote by $H_{ij}$ the set of points whose power distance from $B_i$ is at most as large as the power distance from $B_j$ then $V_i = \bigcap_j H_{ij}$. In words, $V_i$ is the intersection of a finite number of half-spaces and thus a convex polyhedron. This polyhedron may be bounded or unbounded, and it is even possible that it is empty. The power or (weighted) Voronoi diagram of $B$ is the collection of cells together with the polygons, edges, and vertices shared by the cells. Figure II.7 illustrates the definitions in two dimensions by showing the Voronoi diagram of the same eight disks used in earlier figures. Every polygon is shared by two cells, and in the generic case every edge is shared by exactly three and every vertex is shared by exactly four cells.
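The claim that growing the square radii leaves the separating plane in place can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names are invented here, a ball is given by its center and its weight (the square radius), and the plane is returned as a normal vector and an offset, which is one of several possible representations.

```python
def power_distance(x, center, weight):
    """Square distance from x to the center minus the weight (square radius)."""
    return sum((a - b) ** 2 for a, b in zip(x, center)) - weight

def equal_power_plane(c1, w1, c2, w2):
    """Plane {x : <normal, x> = offset} of points with equal power distance.

    Expanding |x - c1|^2 - w1 = |x - c2|^2 - w2 cancels the quadratic terms
    and leaves 2 <x, c2 - c1> = |c2|^2 - w2 - |c1|^2 + w1.
    """
    normal = tuple(2 * (b - a) for a, b in zip(c1, c2))
    offset = (sum(b * b for b in c2) - w2) - (sum(a * a for a in c1) - w1)
    return normal, offset

# Hypothetical sample: radii 2 and 1, so the weights are 4 and 1.
c1, w1 = (0, 0, 0), 4
c2, w2 = (3, 0, 0), 1
plane_now = equal_power_plane(c1, w1, c2, w2)
# Growing both square radii by the same amount t = 5 adds t to both weights.
plane_later = equal_power_plane(c1, w1 + 5, c2, w2 + 5)
```

Because the same amount is added to both weights, it cancels in the offset, so `plane_now` and `plane_later` describe the same plane.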

Figure II.7: Power or weighted Voronoi diagram of eight disks in the plane.

Delaunay triangulation. The (weighted) Delaunay triangulation of $B$ is dual to the (weighted) Voronoi diagram. It is obtained by connecting $z_i$ and $z_j$ by an edge if the cells $V_i$ and $V_j$ share a common polygon. Similarly, $z_i$, $z_j$ and $z_k$ are connected by a triangle if $V_i$, $V_j$ and $V_k$ share a common edge, and $z_i$, $z_j$, $z_k$ and $z_l$ are connected by a tetrahedron if $V_i$, $V_j$, $V_k$ and $V_l$ share a common vertex. Assuming the balls in $B$ are in general position, this exhausts all possible types of overlap among the Voronoi cells. If the balls are not in general position, we can perturb them ever so slightly to move them into general position. Since complexes of tetrahedra are difficult to draw, we illustrate the definitions by showing a two-dimensional Delaunay triangulation in Figure II.8.

Figure II.8: Delaunay triangulation drawn over the dual Voronoi diagram of eight disks in the plane. The Delaunay triangles are transparent so they do not obstruct the structure of the Voronoi diagram underneath.

Observe that we reverse dimensions when we go from the Voronoi diagram to the Delaunay triangulation: cells become vertices, polygons become edges, edges become triangles, and vertices become tetrahedra. Similarly, we reverse the inclusion direction. For example, a Voronoi polygon belongs to a Voronoi cell iff the corresponding Delaunay edge contains the corresponding Delaunay vertex.

Number of simplices. We refer to an element of a Delaunay triangulation as a simplex, which can be a vertex, an edge, a triangle or a tetrahedron. We can count the simplices using the Euler relation, which says that the alternating sum of simplices is always equal to 1. Before counting the simplices in three dimensions, let us warm up to the challenge by counting the simplices of a two-dimensional Delaunay triangulation. Writing $v$, $e$ and $t$ for the numbers of vertices, edges and triangles, the Euler relation here is $v - e + t = 1$. The number of vertices is at most the number of disks, hence $v \le n$. Observe that every triangle has three edges and every edge belongs to at most two triangles, hence $3t \le 2e$. Combining this inequality with the Euler relation implies $e \le 3v - 3$ and $t \le 2v - 2$.

In three dimensions, writing $v$, $e$, $t$ and $T$ for the numbers of vertices, edges, triangles and tetrahedra, we have $v - e + t - T = 1$. The number of vertices is at most the number of balls, and the number of edges is at most the number of pairs of vertices, hence $v \le n$ and $e \le n(n-1)/2$. Similarly, we note that each tetrahedron has four triangles and each triangle belongs to at most two tetrahedra, hence $4T \le 2t$. Combining this with the Euler relation implies $t \le 2(e - v + 1)$ and $T \le e - v + 1$, so both numbers are less than a constant times $n^2$.

There are Delaunay triangulations that have almost this many simplices, but they require a placement of the balls that would be rather unlike the configurations we observe for proteins. Typically, each atom is surrounded by its neighbors in the Delaunay triangulation. The neighbors are near the central atom and are therefore packed in a small amount of space, implying there can only be a small constant number of them. It follows that the number of edges in the Delaunay triangulation is at most some constant times $n$, and as a consequence, also the number of triangles and tetrahedra are at most some constant times $n$.
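The counting argument can be spelled out step by step. In the sketch below, $v$, $e$, $t$ and $T$ denote the numbers of vertices, edges, triangles and tetrahedra, and $n$ the number of disks or balls; these symbols are chosen for this derivation and may differ from the notation of the original figures and formulas.

```latex
\begin{align*}
\text{plane:}\quad & v - e + t = 1, \qquad 3t \le 2e \\
 & \Rightarrow\; 1 = v - e + t \le v - \tfrac{1}{3}\, e
   \;\Rightarrow\; e \le 3v - 3 \le 3n - 3, \\
 & \Rightarrow\; t = 1 - v + e \le 2v - 2 \le 2n - 2. \\[4pt]
\text{space:}\quad & v - e + t - T = 1, \qquad 4T \le 2t \\
 & \Rightarrow\; e - v + 1 = t - T \ge \tfrac{1}{2}\, t
   \;\Rightarrow\; t \le 2(e - v + 1), \\
 & \Rightarrow\; T = t - (e - v + 1) \le e - v + 1 .
\end{align*}
```

With $v \ge 1$ and $e \le n(n-1)/2$, the last two lines give $t < n^2$ and $T < n^2/2$.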

Orthospheres. Suppose for a moment that the balls all have zero radius. Then each Voronoi vertex is equally far from four points and coincides with the center of the circumsphere of these points. We will use the concept of orthogonality to generalize this property to the case where the $B_i$ have not necessarily zero and not necessarily equal radii. Two spheres or balls $B_i$ and $B_j$ are orthogonal if $\|z_i - z_j\|^2 = r_i^2 + r_j^2$. The name is justified because the two tangent planes defined at any point common to the bounding spheres of $B_i$ and $B_j$ form a right angle between them. Algebraically, there is no difficulty at all if the square radius is negative and the radius is therefore imaginary.

Let now $y$ be a vertex of the Voronoi diagram of $B$. Assuming the generic case, $y$ has equal power distance from four balls, and larger power distance from all others. Let $S$ be the sphere with center $y$ and weight equal to that common power distance. That sphere is orthogonal to the four balls, and we refer to it as the orthosphere of the four balls. If the four balls had zero radius, $S$ would be their circumsphere. Note that $S$ is further than orthogonal from all other balls, that is, the square distance between the centers exceeds the sum of the two weights, for all other balls. This property can be used to characterize Delaunay tetrahedra for a generic set of balls. Specifically, a tetrahedron connecting points $z_i$, $z_j$, $z_k$ and $z_l$ belongs to the Delaunay triangulation of $B$ iff the orthosphere of $B_i$, $B_j$, $B_k$ and $B_l$ is further than orthogonal from all other balls in $B$.

Acyclicity. Given a fixed viewpoint, we can order two tetrahedra if one lies in front of the other one, as seen from the viewpoint. We need some notation. Let $x$ be the viewpoint and write $\sigma \prec \tau$ if there is a half-line that emanates from $x$ and passes through the interior of the Delaunay tetrahedron $\sigma$ before it passes through the interior of the Delaunay tetrahedron $\tau$. We call this the visibility ordering with respect to the given viewpoint. It turns out that this relation can in general have cycles but is acyclic for Delaunay triangulations.

ACYCLICITY LEMMA. The visibility ordering of the Delaunay tetrahedra with respect to any fixed viewpoint is acyclic.

PROOF. We use orthospheres to prove that the relation is acyclic. Let $\sigma \prec \tau$ and let $L$ be a half-line that emanates from $x$ and passes through the interiors of $\sigma$ and $\tau$. We may assume that $L$ does not intersect any edge of the Delaunay triangulation. The half-line passes through a sequence of Delaunay tetrahedra, and $\sigma$ precedes $\tau$ in this sequence. Any two consecutive tetrahedra share a triangle. It follows that the orthospheres of the two tetrahedra are orthogonal to the three balls whose centers span that triangle. The plane of points with equal power distance from the two orthospheres thus contains the shared triangle. The viewpoint is on the first tetrahedron's side of that plane, which implies that the power distance of $x$ from the orthosphere of the first tetrahedron is less than that from the orthosphere of the second. By transitivity, the power distance of $x$ from the orthosphere of $\sigma$ is less than its power distance from the orthosphere of $\tau$. In other words, the power distance increases along chains of the relation $\prec$. Since real numbers are totally ordered, we conclude that $\prec$ is acyclic.

Bibliographic notes. Power diagrams of discrete sets of weighted points have been studied by Carl Friedrich Gauss more than 150 years ago in the context of quadratic forms [6]. In reference to subsequent work by Dirichlet [3] and Voronoi [8], these diagrams are often referred to as Dirichlet tessellations or Voronoi diagrams. The dual triangulations have been introduced considerably later by Boris Delaunay (also Delone) [2]. It is common to reserve the name Delaunay triangulation for unweighted points and to refer to the duals of power diagrams as regular triangulations [1] or coherent triangulations [7]. We prefer to be economical with terms and refer to them as (weighted) Delaunay triangulations.

Algorithms for constructing weighted Delaunay triangulations in $\mathbb{R}^2$ and $\mathbb{R}^3$ are discussed in [4, Chapters I and V]. That reference also explains how to computationally cope with ambiguities in the construction caused by non-generic input sets. Upper bounds on the number of Delaunay simplices for "well-spaced" points in $\mathbb{R}^3$ can be found in [5].

[1] L. J. Billera and B. Sturmfels. Fiber polytopes. Ann. Math. 135 (1992), 527–549.
[2] B. Delaunay. Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskii i Estestvennyka Nauk 7 (1934), 793–800.
[3] P. G. L. Dirichlet. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew. Math. 40 (1850), 209–227.
[4] H. Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.
[5] J. Erickson. Dense point sets have sparse Delaunay triangulations. In "Proc. 13th Ann. ACM-SIAM Sympos. Discrete Alg., 2002", 125–134.
[6] C. F. Gauss. Recursion der Untersuchungen über die Eigenschaften der positiven ternären quadratischen Formen von Ludwig August Seeber. J. Reine Angew. Math. 20 (1840), 312–320.
[7] I. M. Gelfand, M. M. Kapranov and A. V. Zelevinsky. Discriminants, Resultants and Multidimensional Determinants. Birkhäuser, Boston, 1994.
[8] G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine Angew. Math. 133 (1907), 97–178, and 134 (1908), 198–287.
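The notion of orthogonality from Section II.2 reduces to the sign of a single expression, which the following Python sketch makes explicit. It is an illustration only: the function name is invented here, each ball is given by its center and its weight (the square radius, which may be negative for an imaginary radius), and the sign convention of the return value is an assumption of this example.

```python
def orthogonality(c1, w1, c2, w2):
    """Compare two balls given by center and weight (square radius).

    Returns 0 if the balls are orthogonal, +1 if they are further than
    orthogonal, and -1 if they are closer than orthogonal.
    """
    value = sum((a - b) ** 2 for a, b in zip(c1, c2)) - w1 - w2
    return (value > 0) - (value < 0)

# Radii 3 and 4 with centers at distance 5: 25 = 9 + 16, so orthogonal.
right_angle = orthogonality((0, 0), 9, (5, 0), 16)
```

Taking the second ball to have weight zero turns the expression into the power distance of its center from the first ball, so the same comparison also tells whether a point lies inside, on, or outside a ball.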

II.3 Alpha Shapes

Recall that the Delaunay triangulation is the dual of the Voronoi diagram. In this section, we generalize this construction and consider the dual of the Voronoi diagram restricted to within the union of the defining balls.

Dual complex. Observe that the Voronoi cells decompose the union of balls in $B$ into convex cells $B_i \cap V_i$. The dual complex records the non-empty common intersections among these cells. Let $J$ be a subset of the index set. The dual complex is

$$K = \{ \sigma_J \mid \bigcap_{i \in J} (B_i \cap V_i) \neq \emptyset \},$$

where $\sigma_J$ is the convex hull of the centers of the balls with index in $J$. Equivalently, $\sigma_J$ belongs to $K$ iff the common intersection of Voronoi cells has a non-empty intersection with the union of balls: $\bigcup B \cap \bigcap_{i \in J} V_i \neq \emptyset$. Note that this is just a more formal way of explaining the duality transformation we used in the last section to construct the Delaunay triangulation from the Voronoi diagram. Figure II.9 illustrates the definition for the set of disks used in many of the previous figures. The underlying space is the set of points contained in simplices of $K$. In this context, we refer to it as the dual shape of $B$. In the special case in which the balls have non-empty pairwise but no non-empty triple-wise intersections, the dual shape looks like the ball-and-stick diagram common in chemistry and biology. There, each stick represents a covalent bond, while here, it represents the geometric overlap between two balls.

Figure II.9: The dual complex is drawn on top of the Voronoi decomposition of the union of disks. The nine edges correspond to the pairwise intersections and the two triangles to the triplewise intersections of the clipped Voronoi cells.

Independence. Recall that a simplex belongs to the dual complex iff the corresponding clipped balls (the $B_i \cap V_i$) have a non-empty common intersection. This condition has an interesting consequence on how the $B_i$ themselves may intersect. In a nut-shell, there can be at most four balls (one more than the dimension of the space), and they can form only one combinatorially distinct intersection pattern. We first discuss this pattern for general sets that are not necessarily balls. Call a collection $A$ of sets independent if for every subcollection $A' \subseteq A$ there is a point inside every set in $A'$ and outside every set not in $A'$: $\bigcap A' - \bigcup (A - A') \neq \emptyset$. A collection of size $k$ has $2^k$ subcollections. For this collection to be independent, there must be $2^k$ points whose patterns of inclusion in the sets are pairwise different. For each of one, two and three disks there is a (combinatorially) unique independent configuration shown in Figure II.10.

Figure II.10: The independent configurations of one, two, and three disks in the plane.

We use the pigeonhole principle to show that the maximum number of independent disks in the plane is three. Let $r_n$ be the maximum number of regions we can get by drawing $n$ circles in the plane. We have $r_1 = 2$ and $r_{n+1} \le r_n + 2n$ because the $(n+1)$-st circle intersects the other $n$ circles in at most two points each. These points cut the $(n+1)$-st circle into at most $2n$ arcs, and each arc cuts at most one region into two. The number of regions is therefore $r_n \le n^2 - n + 2$. Hence, $r_4 \le 14 < 16 = 2^4$, which implies that at most three disks can be independent. The same argument also works in three dimensions, where it can be used to show that the maximum number of independent balls is four. Again, there is only one possible intersection pattern for four independent balls.
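The pigeonhole argument for disks can be replayed in a few lines. The following Python sketch sums the recurrence for the number of regions and compares it with the number of inclusion patterns that an independent collection requires; the function names are chosen for this example only.

```python
def max_regions(n):
    """Upper bound on the number of regions cut out by n >= 1 circles.

    One circle gives 2 regions; the k-th circle meets the earlier k - 1
    circles in at most 2(k - 1) points, so it adds at most 2(k - 1) regions.
    """
    regions = 2
    for k in range(2, n + 1):
        regions += 2 * (k - 1)
    return regions

def enough_regions(n):
    """Necessary condition for n disks to be independent.

    Independence needs 2**n pairwise different inclusion patterns, and
    different patterns lie in different regions.
    """
    return max_regions(n) >= 2 ** n
```

The bound agrees with the closed form $n^2 - n + 2$, and the condition fails for the first time at four disks, where 14 regions cannot host 16 patterns.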

Independent simplices. Recall that each simplex in the Delaunay triangulation is spanned by the centers of a small collection of balls, for example three for a triangle, four for a tetrahedron, and so on. In discussions of combinatorial properties, we sometimes forget the difference and think of the simplex as this collection of balls. In this spirit, we call the simplex independent if the collection of balls is independent. We will prove shortly that all simplices in the dual complex are independent. This is a fairly strong statement since it limits the balls to a single intersection pattern. The following lemma is the key to proving that all simplices in the dual complex are independent.

INDEPENDENCE LEMMA. A collection of four balls in $\mathbb{R}^3$ is independent iff the (unique) vertex of the corresponding Voronoi diagram is contained in the union.

PROOF. The lemma holds in any dimension, and can be proved by induction over the dimension. To avoid the complications of a discussion for general dimensions, we assume the lemma for disks (or rather for caps on a sphere) and prove it for balls in $\mathbb{R}^3$. Assume first that the Voronoi vertex lies outside the union. The sphere bounding one of the balls intersects the other balls in three caps. The circles bounding these caps lie in the three planes bounding the Voronoi cell of that ball. The three planes meet at the Voronoi vertex, and because the vertex lies outside the sphere, the three caps are not independent. A particular such configuration is illustrated in Figure II.11. So there exists a subset of the three caps not represented by any point on the sphere. It can still be that there is a point outside the sphere with this pattern of inclusion in the other three balls, but then there is no such point inside the sphere. In other words, the collection of four balls is not independent.

Figure II.11: The planes bounding the Voronoi cell intersect the sphere in three circles.

To prove the reverse, we assume that the collection of four balls is not independent. Then the sphere bounding one of the balls intersects the other three balls in three non-independent caps. But this implies that the Voronoi vertex lies outside the sphere, and therefore outside the union, as claimed.

As mentioned above, the Independence Lemma also holds for three disks in the plane. Given three balls, we get three disks of maximum size by intersecting them with the plane that passes through the centers. This plane intersects the Voronoi diagram of the balls in the Voronoi diagram of the disks. But this implies that three balls are independent iff the (unique) line in the corresponding Voronoi diagram has a non-empty intersection with the union of the three balls. Similarly, two balls are independent iff the (unique) plane in the corresponding Voronoi diagram has a non-empty intersection with their union. But this is exactly the criterion for a simplex to belong to the dual complex. It follows that each simplex in $K$ is independent.

Filtration. We return to the idea of growing the balls continuously and watch how the union changes. We let time go from $-\infty$ to $+\infty$ and grow the weight of each ball to $r_i^2 + t$ at time $t$. Each $B_i$ has zero weight at time $t = -r_i^2$ and negative weight and therefore imaginary radius before that time. We need some notation. Let $B(t)$ be the collection of balls and $K(t)$ the dual complex of $B(t)$ at time $t$. Instead of time, we use the square root, $\alpha = \sqrt{t}$, as the index for time varying sets. The main reason for this convention is that for $r_i = 0$, the radius of the ball at time $t = \alpha^2$ is $\alpha$. We refer to the dual complex at that time as the $\alpha$-complex and to its underlying space as the $\alpha$-shape of $B$.

By construction, the Voronoi cells of the balls are unchanged at all times. Furthermore, since the portions of the Voronoi cells covered by the balls can only grow, the dual complexes can also only get larger in time. It follows that the dual complexes that arise throughout time are subcomplexes of one and the same Delaunay triangulation. For small enough (large enough negative) time, all radii are imaginary, and the dual complex is empty. For large enough time, the union covers all Voronoi vertices, and the dual complex is equal to the Delaunay triangulation. There are only finitely many simplices and therefore only finitely many subcomplexes of the Delaunay triangulation that arise as dual complexes during the growth process. We thus have a sequence of complexes that begins with the empty complex and ends with the Delaunay triangulation. We refer to this sequence as a filtration of the Delaunay triangulation. Figure II.12 illustrates the construction by showing three complexes in the filtration generated by eight disks in the plane.

To translate between continuous time and discrete rank, we define a function that maps each time to the rank, in the filtration, of the dual complex at that time.

Figure II.12: Three unions of disks and the corresponding dual complexes. From the first to the third complex, the edges become thinner and the triangles become lighter. The first complex contains all vertices but only two edges and no triangles.

Ordering simplices. We can sort the Delaunay simplices in the order in which they enter the dual complex. Define the birth-time of a simplex $\sigma$ as the minimum time $t$ such that $\sigma$ belongs to the dual complex at all times at or after $t$. The difference between two contiguous complexes in the filtration consists of all simplices whose birth-time coincides with the creation of the second complex. Let the orthosphere of $\sigma$ be the smallest sphere orthogonal to all balls whose centers are vertices of $\sigma$. The time $\sigma$ becomes independent is also the time the orthosphere of $\sigma$ dies or shrinks to a point.

Often two contiguous complexes differ by only one simplex, $\sigma$. In this case, the birth-time of $\sigma$ coincides with the time it becomes independent. Geometrically, this case is characterized by a non-empty common intersection between the affine hull of $\sigma$ and the Voronoi cells of its vertices. Sometimes, however, the difference between two contiguous complexes consists of two or more simplices. In the generic case, that is, in the absence of any degeneracy, all these simplices are faces of a single simplex, $\tau$, that also belongs to the difference. All these simplices are born at the same time, but their orthospheres die at different times, with the orthosphere of $\tau$ dying last, at the common birth-time. Figure II.13 illustrates this case. The triangle connecting all three centers and the edge connecting the centers of the two larger disks are born at the same time, namely when all three disks reach the shared Voronoi vertex. This is also the time when the three disks become independent, but the pair of larger disks became independent earlier.

Figure II.13: The two larger disks are independent, but the dual edge does not belong to the dual complex because their common intersection is disjoint from the corresponding Voronoi edge.

We represent the filtration by sorting the Delaunay simplices by birth-time, and in case of a tie by dimension. Remaining ties are broken arbitrarily. Every dual complex is a prefix of this ordering, and because of the tie breaking rule, every prefix is a complex, even if it does not coincide with a dual complex. This property of the ordering will be crucial for the algorithm in Chapter IV that computes connectivity.

Bibliographic notes. Alpha shapes and alpha complexes have been introduced by Edelsbrunner, Kirkpatrick and Seidel [3] in 1983 for finite sets of points in the plane. About a decade later, the concept has been generalized to three dimensions and made available as a software package with graphical user interface [4]. That generalization benefitted from adopting the language of simplicial complexes, which has been developed decades earlier in the area of combinatorial topology [1, 5]. The unexpected popularity of that software in structural biology triggered the development of further geometric concepts useful in structural biology, some of which are explained in this book. The main reason for the popularity is the duality between space-filling diagrams and alpha shapes as explained in this and the two preceding sections. To fully develop that duality, alpha shapes had to be extended to take into account weights, and this has been described in complete generality in [2].

[1] P. S. Alexandrov. Combinatorial Topology. Dover, New York, 1998 (republication of translation of the original Russian edition from 1947).
[2] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[3] H. Edelsbrunner, D. G. Kirkpatrick and R. Seidel. On the shape of a set of points in the plane. IEEE Trans. Inform. Theory IT-29 (1983), 551–559.
[4] H. Edelsbrunner and E. P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] P. J. Giblin. Graphs, Surfaces and Homology. Second edition, Chapman and Hall, London, 1981.
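The ordering of simplices described in Section II.3 (by birth-time, ties broken by dimension) can be sketched in Python as follows. This is a hypothetical illustration, not the data structure of the Alpha Shape software: simplices are represented as frozensets of vertex labels, the birth-times are invented, and the remaining ties, which the text leaves arbitrary, are broken by label here only to make the result reproducible.

```python
def filtration_order(birth_time):
    """Sort simplices by birth-time, then dimension, then vertex labels.

    birth_time maps each simplex (a frozenset of vertex labels) to a number.
    """
    return sorted(
        birth_time,
        key=lambda s: (birth_time[s], len(s) - 1, tuple(sorted(s))),
    )

def is_complex(simplices):
    """True if every face obtained by dropping one vertex is also present."""
    present = set(simplices)
    return all(s - {v} in present for s in present if len(s) > 1 for v in s)

# Invented birth-times modeled on the three-disk situation of Figure II.13:
# the edge bc and the triangle abc are born at the same time.
birth_time = {
    frozenset("a"): 0.0, frozenset("b"): 0.0, frozenset("c"): 0.0,
    frozenset("ab"): 1.0, frozenset("ac"): 2.0,
    frozenset("bc"): 3.0, frozenset("abc"): 3.0,
}
order = filtration_order(birth_time)
prefixes_are_complexes = all(
    is_complex(order[:k]) for k in range(len(order) + 1)
)
```

Because the edge precedes the triangle in the tie at time 3.0, every prefix of `order` is a complex, including the prefix that ends with the edge and does not coincide with any dual complex.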

II.4 Alpha Shape Software

This section introduces the basic Alpha Shape software and explains how to go from a standard description of protein structures to the visualization of their alpha shapes. The discussion is more descriptive and less analytical than in the previous three sections. Specifically, we take four steps to construct and visualize alpha shapes in an interactive graphical user interface:

> pdb2alf name.pdb name
> delcx name
> mkalf name
> alvis name

The details of the discussion apply to Version 4.1 of the Alpha Shape software executed on an SGI workstation running under the UNIX operating system and may differ for other versions and platforms.

Data format. The main public source for structural protein data is the Protein Data Bank (pdb) mentioned in Chapter I. Only a fraction of the information in a pdb-file is needed to construct alpha shapes. Specifically, for each atom we only need its coordinates in three-dimensional space and its radius. The coordinates are explicitly given in the file, but the radius must be inferred from the atom type. This is done according to published translation tables that map atoms to van der Waals radii. Unfortunately, there is no universally agreed upon table. Some differences are due to different methods used to derive radii, including measurements of closest approach, molecular mechanics calculations, etc. One of the most problematic elements is hydrogen (H), which accounts for almost 50% of the number of atoms found in organic matter. Hydrogen atoms sometimes donate their electrons to complete the shells of other atoms and thus can exist without any shell and radius to speak of. Hydrogen atoms are generally not represented in pdb-files, but can be inferred to some accuracy from the types and relative positions of the other atoms in the protein. In the common unified atom model, the van der Waals radii of larger atoms are adjusted to include the bonded hydrogen atoms.

We can extract the coordinates and the radii using software that is part of the Alpha Shapes distribution. Given a pdb-file, name.pdb, we call

> pdb2alf -r 1.4 name.pdb name

to read name.pdb and create a new file name that contains a line for each atom listing its three coordinates and the van der Waals radius. The -r option allows for the specification of a radius increment that is applied to every atom in the file. In our example, this radius increment is 1.4 Å, which is the most common approximation used for the size of water molecules. The resulting set of balls thus defines the solvent accessible diagram representing the interaction with the surrounding water.

Delaunay triangulation. The first step towards computing alpha shapes is to construct the Delaunay triangulation of the set of balls. This is accomplished by the command

> delcx name

The Delaunay complex program creates a file name.dt that represents the Delaunay triangulation. We briefly mention the algorithmic ingredients used. The basic strategy is incremental, adding one ball at a time to the triangulation. Using an arbitrary ordering of the balls, $b_1, b_2, \ldots, b_n$, we write $B_i$ for the set of the first $i$ balls and $D_i$ for the Delaunay triangulation of $B_i$. With this notation, the algorithm can be written as follows.

for $i = 1$ to $n$ do $D_i$ = INSERT($b_i$, $D_{i-1}$) endfor.

The $i$-th ball is inserted through a sequence of flip operations. The flips are performed depending on the outcomes of only two types of primitive tests needed in the construction of the Delaunay triangulation:

ORTHOGONALITY: decide whether a ball is closer or further than orthogonal to the orthosphere of four other balls, as defined earlier in this chapter.
ORIENTATION: decide whether a ball center is on the positive or negative side of the oriented plane spanned by three other ball centers.

Both tests reduce to the sign of the determinant of a small matrix and can be decided without computing intermediate geometric information. The efficient and robust construction of the Delaunay triangulation in $\mathbb{R}^3$ is not entirely straightforward. The operations are ambiguous if the balls are in non-generic position. To cope with the related robustness problem, we use exact arithmetic and simulated perturbation. Exact arithmetic guarantees the correct execution of flips in all generic and therefore unambiguous cases.
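The ORIENTATION test described above can be sketched as the sign of a 3-by-3 determinant evaluated in exact rational arithmetic. This is only an illustration under stated assumptions: the function names and the sign convention are hypothetical and are not taken from the Alpha Shape software, which also uses simulated perturbation to resolve the degenerate case that this sketch merely reports as 0.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def orientation(p, q, r, s):
    """Return +1, -1 or 0 for the side of s relative to the plane through p, q, r.

    Assumed convention (hypothetical): the sign of det[q - p, r - p, s - p].
    Coordinates are converted to Fractions, so the sign is computed exactly.
    """
    p, q, r, s = ([Fraction(x) for x in pt] for pt in (p, q, r, s))
    rows = [[u[k] - p[k] for k in range(3)] for u in (q, r, s)]
    d = det3(rows)
    return (d > 0) - (d < 0)
```

A result of 0 signals a non-generic configuration (four coplanar centers), which is exactly the ambiguous case that a perturbation scheme would have to resolve.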

Simulated perturbation reduces ambiguous cases in a consistent manner to unambiguous ones.

The use of exact rather than floating-point arithmetic poses a challenge to the efficiency of the code. A common remedy is to use so-called floating-point filters: calculate in floating-point arithmetic, bound the error, and redo the computation in exact arithmetic if the error is too large to guarantee a correct decision. Another challenge to the efficiency of the code is the inherent size of the Delaunay triangulation. As mentioned in Section II.2, the Delaunay triangulation in $\mathbb{R}^3$ can have a number of simplices that is quadratic in $n$. For example, if the centers of the balls lie on the moment curve and all radii are equal, then every pair of vertices forms an edge in the Delaunay triangulation, as shown in Figure II.14.

Figure II.14: Edge-skeleton of the Delaunay triangulation of twenty one points on the moment curve in $\mathbb{R}^3$.

Fortunately, the balls of organic molecules are usually well packed and have Delaunay triangulations of size at most proportional to $n$. The danger remains that one of the intermediate triangulations is large. Then we spend a lot of time constructing that triangulation, only to destroy most of it before arriving at the final triangulation. This danger is quite real as systematic enumerations of the data tend to generate subconfigurations with relatively large Delaunay triangulations. The remedy here is to add the balls in a random sequence. In other words, we apply a random permutation to the input sequence and construct the Delaunay triangulation following this permutation.

Filtration. As explained in Section II.3, dual complexes obtained by growing the square radii form a nested sequence of subcomplexes of the Delaunay triangulation, $\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = D$. This is the filtration of $\alpha$-complexes. We represent the filtration by the sequence of Delaunay simplices ordered by birth-time. The sequence is generated by calling

> mkalf name

The make alpha shape filtration program reads the Delaunay triangulation in name.dt and generates a new file, name.alf, that stores the filtration along with some auxiliary data structures. The software refers to the sorted sequence of simplices as the 'masterlist'. It stores each simplex $\sigma$ several times, marking when $\sigma$ is born, when $\sigma$ becomes a face of another simplex, and when $\sigma$ becomes interior to the alpha complex. Suppose the three events happen at times $\alpha_1 \le \alpha_2 \le \alpha_3$. Then, for a given moment $\alpha$, the simplex $\sigma$ is

not in $K_\alpha$ if $\alpha < \alpha_1$,
singular if $\alpha_1 \le \alpha < \alpha_2$,
regular if $\alpha_2 \le \alpha < \alpha_3$,
interior if $\alpha_3 \le \alpha$.

The combinatorial topology term for being singular is principal and means that $\sigma$ is not a face of any other simplex. The simplex is regular if it belongs to the boundary but is not principal, and it is interior if it is completely surrounded by other simplices. Some of the three events may coincide. For example, a tetrahedron is interior as soon as it is born, so $\alpha_1 = \alpha_2 = \alpha_3$. A simplex in the boundary of $D$ can never become interior, so $\alpha_3 = \infty$. Finally, a simplex whose orthosphere dies strictly before the simplex is born is never singular, so $\alpha_1 = \alpha_2$.

The main reason for recording all this information is to determine how to draw $\sigma$ in the graphical interface. For this purpose, we need quick access to the simplices of the various types in $K_\alpha$, and we store the existence intervals in a number of interval trees. Each such tree stores some number $m$ of intervals in space O($m$), and for a given moment $\alpha$, it enumerates the $k$ simplices whose intervals contain $\alpha$ in time O($\log m + k$). Figure II.15 shows four alpha complexes of the relatively small gramicidin protein. In each case, we only show the singular simplices together with the regular triangles.

Visualization. We finally discuss the visualization interface of the Alpha Shapes software. The necessary support structures are computed and the graphics user interface is opened by executing

> alvis name

The alpha shape visualization program uses both the Delaunay triangulation file, name.dt, and the filtration file, name.alf. The interface consists of a visualization panel, a signature panel, and a scene panel. All alpha complexes are shown in the first, but which complex is shown and how it is shown is decided in the other two panels.
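The classification of a simplex by its three event times can be sketched as follows. This is a minimal illustration, assuming half-open existence intervals and hypothetical parameter names; it is not the data layout of the .alf file, and the software answers such queries with interval trees rather than by direct comparison.

```python
import math


def classify(alpha, born, attached, interior):
    """Classify a simplex at time alpha from its three event times.

    Hypothetical names: `born` is the birth time, `attached` the time the
    simplex becomes a face of another simplex, and `interior` the time it
    becomes interior to the alpha complex (math.inf if that never happens).
    """
    if not (born <= attached <= interior):
        raise ValueError("event times must be sorted")
    if alpha < born:
        return "not in complex"
    if alpha < attached:
        return "singular"
    if alpha < interior:
        return "regular"
    return "interior"
```

For a tetrahedron all three times coincide, so `classify(3.0, 2.0, 2.0, 2.0)` yields `"interior"`, while a triangle on the boundary of the Delaunay triangulation has `interior=math.inf` and stays regular once attached.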

The visualized complex is selected in the signature panel. To support that selection, the panel displays a variety of functions (or signatures) that illustrate how the complexes change with time. Instead of mapping the time $\alpha$ to a property of $K_\alpha$, the signatures map the index $i$ to the property of $K_i$. To facilitate the reconstruction of the map from time, the panel contains a signature that maps the index to time. Specifically, it shows the log-scale graph of that map. Figure II.16 shows the signature panel and the three default signatures for gramicidin. The three default signatures map each index $i$ to the number of singular edges, the area of the boundary, and the volume of the underlying space of $K_i$. All signatures that count rather than measure are displayed in log-scale.

Figure II.15: Four alpha complexes of gramicidin.

Figure II.16: Signature panel of the Alpha Shape visualizer.

A particular index, $i$, is selected by the position of a vertical bar in the signature panel and by clicking the Alpha Shape button in the scene panel, as shown in Figure II.17. Different settings can be used to highlight different aspects of an alpha complex. The matrix on the right hand side can be used to select the types of displayed simplices. By default, only the singular vertices, edges, triangles and the regular triangles are shown. The buttons in the middle of the scene panel provide control over how simplices are drawn: colored, shaded, in wireframe, seamless, or with gaps created through a slow explosion. For example, the 1-skeleton of the Delaunay triangulation shown in Figure II.14 is obtained by drawing all edges of the last alpha complex while suppressing the display of all triangles and tetrahedra.

Figure II.17: Scene panel of the Alpha Shape visualizer.

Bibliographic notes. As mentioned earlier, the largest resource for structural protein data is the Protein Data Bank [1], which can be accessed via the web [8]. A survey of geometric measurements of proteins including a discussion of different tables for van der Waals radius assignment can be found in [5]. The Alpha Shape software was created by Ernst Mücke as part of his doctoral work at Urbana-Champaign. The best documentation of the algorithm and data structures used in the software are still his thesis [6] and the original paper on the topic [4]. After a period of rapid development directed by Ping Fu at the National Center for Supercomputing Applications, the software reached version 4.1 in 1996, which is still the most recent version distributed on the web [7]. The Delaunay triangulation software in the Alpha Shapes distribution is based on a variety of algorithmic techniques described in a recent text by Edelsbrunner [3]. The interval tree used for fast retrieval of simplices is explained in [2].

[1] H. M. BERMAN, J. WESTBROOK, Z. FENG, G. GILLILAND, T. N. BHAT, H. WEISSIG, I. N. SHINDYALOV AND P. E. BOURNE. The Protein Data Bank. Nucleic Acids Res. 28 (2000), 235–242.
[2] H. EDELSBRUNNER. A new approach to rectangle intersections – part I. Internat. J. Comput. Math. 13 (1983), 209–219.
[3] H. EDELSBRUNNER. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.
[4] H. EDELSBRUNNER AND E. P. MÜCKE. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] M. GERSTEIN AND F. M. RICHARDS. Protein geometry: distances, areas, and volumes. Chapter 22 in The International Tables for Crystallography, Vol. F, M. G. Rossmann and E. Arnold (eds.), Kluwer, Dordrecht, the Netherlands, 2001, 531–539.
[6] E. P. MÜCKE. Shapes and Implementations in Three-dimensional Geometry. Rept. UIUCDCS-R-93-1836, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1993.
[7] Alpha Shapes web-site at www.alphashapes.org, see also the software collection in biogeometry.duke.edu.
[8] Protein Data Bank web-site at www.rcsb.org/pdb.

Exercises

1. Independent half-spaces. A half-plane is the set of points on or on one side of a line in $\mathbb{R}^2$. Similarly, a half-space is the set of points on or on one side of a plane in $\mathbb{R}^3$, and a cap is the intersection of a sphere with a half-space. What is the maximum number of independent
(i) half-planes in $\mathbb{R}^2$,
(ii) half-spaces in $\mathbb{R}^3$,
(iii) caps on a sphere in $\mathbb{R}^3$?

2. Tree-like sequences. Given an alphabet of $n$ letters, form a sequence but refrain from placing any letter twice in a row. The sequence is tree-like if there are no two letters that alternate more than twice. In other words, subsequences of the form [...] and [...] are prohibited. Examples of tree-like sequences of four letters are [...] and [...].
(i) Prove that a tree-like sequence over an alphabet of $n$ letters has length at most [...]. Is this bound tight?
(ii) Define a tree-like cyclic sequence by prohibiting cyclic subsequences of the form [...]. Prove that a tree-like cyclic sequence over an alphabet of $n$ letters has length at most [...] unless [...]. Is this bound tight?

3. Number of arcs. Let $B$ be a set of $n$ disks in the plane. The boundary of the union of the disks consists of circular arcs contributed by the circles.
(i) Assuming the boundary of the union is a single closed curve, use tree-like cyclic sequences to prove that it consists of at most [...] (maximal) circular arcs.
(ii) Prove that in general the number of (maximal) circular arcs in the boundary of the union is at most [...]. Is this bound tight?

4. Empty Voronoi cell. Call a disk in a finite collection of disks redundant if its Voronoi cell is empty.
(i) Prove that if there are disks $b_1$, $b_2$, and $b_3$ in the collection such that (a) [...] for the orthocenter of $b_1$, $b_2$, $b_3$, and (b) [...] lies in the triangle [...], then $b$ is redundant.
(ii) Prove that the sufficient conditions given in (i) are also necessary. In other words, prove that if $b$ is redundant then there exist disks $b_1$, $b_2$, and $b_3$ that satisfy Conditions (a) and (b).

5. Binomial coefficients. Let $n$ and $k$ be two positive integers and recall that the binomial coefficient $\binom{n}{k}$ is the number of ways we can choose $k$ elements from a collection of $n$ elements. Recall also that [...].
(i) Show that [...].
(ii) Show that [...].
[We note that the relation in (ii) neatly generalizes the formula [...]. The generalization is not quite as neat if we sum powers rather than binomial coefficients.]

6. Sphere arrangements. Let [...] be the maximum number of cells we get by drawing $n$ spheres in $\mathbb{R}^d$.
(i) Show that [...].
(ii) Give a formula for [...] that works for all positive $d$.
[You might consider answering question (ii) before question (i).]

7. The filtration of water. A water molecule consists of one oxygen and two hydrogens: H$_2$O.
(i) Look up the standard geometric model (determined by radii, bond length and bond angle).
(ii) Describe the Voronoi diagram and the sequence of alpha complexes of the model.

8. Barycentric subdivision. The barycentric subdivision of a simplex is obtained by adding the barycenter of the simplex (also known as the centroid or center of mass) as a new vertex and connecting it to the simplices in the barycentric subdivisions of the faces.
(i) How many vertices, edges, triangles and tetrahedra are in the barycentric subdivision of a tetrahedron?
(ii) Use the Alpha Shape software to create the barycentric subdivision of a regular tetrahedron. [You will need to use weights to make the barycentric subdivision of the tetrahedron the Delaunay triangulation of the points.]

Chapter III

Surface Meshing

Recall the different types of space-filling diagrams we discussed in Chapter II. The van der Waals and the solvent accessible models are both unions of finitely many balls in three-dimensional space and differ only in the radii. We have also discussed the molecular surface model that is obtained by rolling a sphere about the van der Waals model. Corners and crevices are filled up and the surface consists of spheres connected by blending torus patches and inverted sphere patches. In this chapter, we introduce a model that is similar to the molecular surface. We call this the molecular skin model. Its surface consists of spheres connected by blending hyperboloid patches and inverted sphere patches. The surface is piecewise quadratic and has a number of attractive properties not shared by the other space-filling models. One is the continuity of the normal direction, another the continuity of the maximum principal curvature. Both properties are crucial for the construction of good quality meshes, which may be used to support numerical computations over the surface. Another interesting property is an inside-outside symmetry that implies the existence of locally perfectly complementary molecular skin models. In other words, for each cavity we may construct a molecular skin representation whose boundary matches that of the molecule. The molecular skin also lends itself to represent deformations, and some of the possibilities along these lines will be discussed in Chapter VIII.

This chapter is organized in four sections. In Section III.1, we give the geometric definition of the molecular skin and show how it can be decomposed into quadratic patches. In Section III.2, we discuss various notions of curvature of a surface, and we show that the maximal principal curvature is a continuous map over the molecular skin. In Section III.3, we describe the algorithm that constructs a molecular skin in terms of a triangle mesh. Finally, in Section III.4, we present software for constructing molecular skin in two- and three-dimensional space, and we use that software to illustrate some of the properties of these curves and surfaces.

III.1 Molecular Skin
III.2 Curvature
III.3 Adaptive Meshing
III.4 Skin Software
Exercises

III.1 Molecular Skin

Almost everything we will say in this section applies equally well to spheres of any fixed dimension. Even though the case of spheres in $\mathbb{R}^3$ is most relevant for the study of molecules, there is sufficient pedagogical advantage to first talk about circles in $\mathbb{R}^2$.

Circles and paraboloids. Recall that the weighted square distance function of a circle $a = (z_a, r_a)$ with center $z_a$ and radius $r_a$ is the map $\pi_a : \mathbb{R}^2 \to \mathbb{R}$ defined by $\pi_a(x) = \|x - z_a\|^2 - r_a^2$. In other words, the circle is the zero-set of the weighted square distance function. As illustrated in Figure III.1, its graph is a paraboloid of revolution in $\mathbb{R}^3$ that intersects $\mathbb{R}^2$ in the circle.

Figure III.1: A circle in $\mathbb{R}^2$ is the zero-set of its weighted square distance function.

All paraboloids that arise as weighted square distance functions have the form $\pi(x) = \|x\|^2 + u_1 x_1 + u_2 x_2 + u_3$. The three parameters correspond to the three degrees of freedom represented by the center and the radius. Functions form a vector space under the usual notions of scaling and addition. We will use only a subspace of that vector space, namely the one consisting of functions of the above form. Given a collection of such functions $\pi_i$, we can generate another such function by affine combination, $\pi = \sum_i \gamma_i \pi_i$, where the $\gamma_i$ are real numbers with $\sum_i \gamma_i = 1$. The new function is a convex combination of the $\pi_i$ if all $\gamma_i$ are non-negative. We compute the center and radius of the zero-set of $\pi$. We have

$\pi(x) = \sum_i \gamma_i \left( \|x\|^2 - 2 \langle x, z_i \rangle + \|z_i\|^2 - r_i^2 \right) = \|x\|^2 - 2 \langle x, \sum_i \gamma_i z_i \rangle + \sum_i \gamma_i \left( \|z_i\|^2 - r_i^2 \right)$.

The center is therefore $z = \sum_i \gamma_i z_i$ and the square radius is $r^2 = \|z\|^2 - \sum_i \gamma_i ( \|z_i\|^2 - r_i^2 )$.

Pencils. It is possibly easier to develop an intuition for combining circles than for combining paraboloids. Given a collection of circles, the affine hull is the set of zero-sets of affine combinations of the corresponding weighted square distance functions, and similarly the convex hull is the subset of zero-sets of convex combinations. Given two intersecting circles $a$ and $b$, the affine hull consists of all circles that pass through the same two intersection points. Indeed, if $\pi_a(x) = \pi_b(x) = 0$ then $\gamma_a \pi_a(x) + \gamma_b \pi_b(x) = 0$ for all coefficients $\gamma_a$ and $\gamma_b$. The centers of the circles in the affine hull are therefore the points on the line that passes through $z_a$ and $z_b$. We call the resulting family a pencil of circles, like the vertical family sketched in Figure III.2. If instead of the affine hull we take the convex hull, then we get the subset of circles whose centers are the points on the line segment with endpoints $z_a$ and $z_b$. If $a$ and $b$ are disjoint then the affine hull is again a pencil but this time of pairwise disjoint circles.

Figure III.2: Circles sampled from a coaxal system consisting of two orthogonal pencils.

A circle $c$ is orthogonal to $a$ if $\|z_a - z_c\|^2 = r_a^2 + r_c^2$, or equivalently if $\pi_a(z_c) - r_c^2 = 0$. If $c$ is orthogonal to $a$ and to $b$ then it is also orthogonal to every circle in the affine hull of $a$ and $b$. To see this elementary fact, let $\pi = \gamma_a \pi_a + \gamma_b \pi_b$ with $\gamma_a + \gamma_b = 1$ and note that

$\pi(z_c) - r_c^2 = \gamma_a \left( \pi_a(z_c) - r_c^2 \right) + \gamma_b \left( \pi_b(z_c) - r_c^2 \right)$,

and thus vanishes as required.
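The center and square radius of an affine combination of circles can be checked numerically. The sketch below is an illustration only; it assumes circles are stored as a center together with a square radius (which may be negative for imaginary circles), and the function name is hypothetical.

```python
def affine_combination(circles, gammas):
    """Zero-set of sum_i gamma_i * pi_i for circles given as ((x, y), r2).

    Returns the center z = sum_i gamma_i z_i and the square radius
    r^2 = |z|^2 - sum_i gamma_i (|z_i|^2 - r_i^2).
    """
    if abs(sum(gammas) - 1.0) > 1e-12:
        raise ValueError("coefficients of an affine combination must sum to 1")
    zx = sum(g * c[0][0] for g, c in zip(gammas, circles))
    zy = sum(g * c[0][1] for g, c in zip(gammas, circles))
    const = sum(g * (c[0][0] ** 2 + c[0][1] ** 2 - c[1])
                for g, c in zip(gammas, circles))
    return (zx, zy), zx * zx + zy * zy - const


# Two circles through the points (0, 1) and (0, -1).
a = ((-1.0, 0.0), 2.0)
b = ((1.0, 0.0), 2.0)
```

With these inputs, `affine_combination([a, b], [0.5, 0.5])` gives the unit circle centered at the origin, and `affine_combination([a, b], [2.0, -1.0])` gives center (-3, 0) with square radius 10; both circles again pass through (0, 1) and (0, -1), as the pencil property predicts.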

Suppose we are now given two circles $a$ and $b$ and two more circles $c$ and $d$ both orthogonal to $a$ and $b$. Then every circle in the affine hull of $c$ and $d$ is orthogonal to both $a$ and $b$ and thus to every circle in the affine hull of $a$ and $b$. In other words, we have two pencils in which each circle in the first pencil is orthogonal to each circle in the second pencil. Such a configuration is illustrated in Figure III.2 and is referred to as a coaxal system.

Envelopes. We introduce a shrinking operation that reduces small circles less than big ones and this way generates a smooth envelope. Specifically, we define $\sqrt{a} = (z_a, r_a / \sqrt{2})$. Similarly, for a family of circles $A$ we define $\sqrt{A} = \{ \sqrt{a} \mid a \in A \}$. An example can be seen in Figure III.3, which sketches a shrunken pencil of circles.

Figure III.3: The dotted circles belong to the affine hull and the solid circles are reduced.

We are interested in the envelope of a shrunken pencil. Suppose $A$ is a pencil and all its circles pass through the points $(0, 1)$ and $(0, -1)$. We parametrize by the $x_1$-coordinate $t$ of the circle centers. The corresponding radius is $\sqrt{1 + t^2}$. The reduced circle with center $(t, 0)$ is the zero-set of

$f_t(x) = (x_1 - t)^2 + x_2^2 - (1 + t^2)/2$

for fixed value of $t$. The same parametrization of the family of reduced circles gives a function $F(x_1, x_2, t) = f_t(x)$ on $\mathbb{R}^3$. The collection of all reduced circles is the projection of the entire zero-set of $F$. It can be visualized as a leaning hour-glass of circles, as shown in Figure III.4.

Figure III.4: Sections of the zero-set of $F$, viewed from the positive $t$ direction.

The envelope of $\sqrt{A}$ is the projection of the silhouette of $F^{-1}(0)$ as viewed along the $t$ direction. It is the set of points for which $\partial F / \partial t$ vanishes. We have $\partial F / \partial t = -2 (x_1 - t) - t = t - 2 x_1$. From $\partial F / \partial t = 0$ we get $t = 2 x_1$. The envelope is therefore the zero-set of $-x_1^2 + x_2^2 - 1/2$, which is a hyperbola.

Skin and body. More general curves than just hyperbolas can be constructed by taking the convex hull of a finite collection of circles, then shrinking every circle in the family, and finally taking the envelope. Formally, the skin of the collection of circles is the envelope of the reduced circles in the convex hull. The body is the union of disks bounded by circles in the reduced convex hull. It is the region in $\mathbb{R}^2$ bounded by the skin. In other words, the skin is the boundary of the body. The smallest non-trivial example is the skin of two circles. The convex hull of two circles is an infinite family of circles, but the union of their disks is just the union of the two original disks. If these circles intersect in two points then the skin is a dumbbell, as shown in Figure III.5. It consists of two circles connected by a blending hyperbola arc.

Figure III.5: The skin of two intersecting circles is the envelope of a reduced line segment of circles.

The skin of three circles is already more difficult to understand, at least directly. We thus take an indirect approach and first study what happens when orthogonal circles shrink.

Orthogonality and complementarity. Let $a$ and $b$ be two orthogonal circles. We thus have

$\|z_a - z_b\|^2 = r_a^2 + r_b^2 \ge (r_a + r_b)^2 / 2 = \left( r_a/\sqrt{2} + r_b/\sqrt{2} \right)^2$.

Furthermore, we have equality iff $r_a = r_b$. Taking roots left and right implies that the radii of $\sqrt{a}$ and $\sqrt{b}$ add up to at most the distance between the two centers. In other words, the reduced versions of any two orthogonal circles touch if they are of the same size and they are disjoint in all other cases.
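The envelope computation can be checked numerically. The sketch below rests on assumptions that the extracted text no longer states explicitly: the pencil is taken to pass through (0, 1) and (0, -1), and shrinking is taken to scale every radius by 1/sqrt(2). Under these assumptions the reduced circle with center (t, 0) should touch the hyperbola x2^2 - x1^2 = 1/2 at the point with x1 = t/2. All function names are hypothetical.

```python
import math


def reduced_circle(t):
    """Return f_t for the reduced circle with center (t, 0)."""
    r2 = (1.0 + t * t) / 2.0  # half the square radius of the pencil circle
    return lambda x1, x2: (x1 - t) ** 2 + x2 ** 2 - r2


def hyperbola(x1, x2):
    """Candidate envelope: zero-set of x2^2 - x1^2 - 1/2."""
    return x2 * x2 - x1 * x1 - 0.5


def is_tangent(t, tol=1e-9):
    """Check that the reduced circle touches the hyperbola where x1 = t / 2."""
    x1 = t / 2.0
    x2 = math.sqrt(0.5 + x1 * x1)
    on_both = (abs(reduced_circle(t)(x1, x2)) < tol
               and abs(hyperbola(x1, x2)) < tol)
    # Gradients of the two implicit functions must be parallel at the point.
    g_circle = (2.0 * (x1 - t), 2.0 * x2)
    g_hyper = (-2.0 * x1, 2.0 * x2)
    cross = g_circle[0] * g_hyper[1] - g_circle[1] * g_hyper[0]
    return on_both and abs(cross) < tol
```

The check confirms tangency only at the upper branch; the lower branch follows by the symmetry of the configuration in x2.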

We apply this result to the coaxal system consisting of orthogonal pencils $A$ and $B$. Suppose $A$ contains only circles with real radii, or equivalently, $A$ is the affine hull of two intersecting circles. As shown earlier, the envelope of $\sqrt{A}$ is a hyperbola. We claim that the envelope of $\sqrt{B}$ is the exact same hyperbola. To see this, we first note that a circle in $\sqrt{B}$ can at most touch the hyperbola, for if it crossed, we would have two crossing reduced circles contradicting the orthogonality of the two corresponding original circles. Furthermore, every circle in $\sqrt{B}$ for which there is an equally large circle in $\sqrt{A}$ touches the hyperbola because it touches that circle. The two envelopes are therefore the same hyperbola. As shown in Figure III.6, the two asymptotic lines of the hyperbola intersect at a right angle. The smallest separating circle that touches both branches belongs to $\sqrt{B}$ and has the same size as the two osculating circles that both belong to $\sqrt{A}$. These circles touch the hyperbola and have the same curvature as the hyperbola at that point.

Figure III.6: Hyperbola with orthogonal asymptotic lines, smallest separating circle, and two osculating circles.

The complementarity of the bodies extends from the case of two orthogonal pencils to the case in which $A$ consists of a single circle and $B$ contains all circles orthogonal to that circle. The skin of $A$ is trivially a circle. The set $B$ is a two-parameter family spanned by three circles. Every reduced circle of $B$ touches or is disjoint from the reduced circle of $A$, which implies that the skin of $B$ is the same circle.

Decomposition. The skin of any finite set $A$ of circles can be decomposed into simple pieces, each defined by at most three of the circles. A single circle defines a (smaller) circle, a pair of circles defines a hyperbola, and a triplet of circles defines an inverted circle. We thus claim that the skin of $A$ consists of circles, connected to each other by blending hyperbola and inverted circle arcs. We will not prove this claim and instead give an explicit construction of the decomposition, which is facilitated by a complex assembled from Voronoi and Delaunay polyhedra. As usual, we let $X$ be an index set and use it to denote the Voronoi polyhedron $\nu_X$. The corresponding Delaunay simplex is $\delta_X$. The corresponding mixed cell is the Minkowski sum of shrunken copies of both, $\mu_X = (\nu_X + \delta_X)/2$. If $|X| = 1$ the mixed cell is the shrunken and translated copy of a two-dimensional Voronoi cell. If $|X| = 2$ then $\mu_X$ is the Minkowski sum of two orthogonal edges and therefore a rectangle. If $|X| = 3$ then $\mu_X$ is a shrunken and translated copy of a Delaunay triangle. The mixed complex consists of all mixed cells and their faces. Figure III.7 illustrates the construction by showing the mixed complex decomposing the skin into circle and hyperbola arcs.

Figure III.7: The mixed complex and the skin of four circles.

A rather intuitive explanation of the construction can be obtained by drawing the Voronoi diagram and the Delaunay triangulation on two parallel planes in $\mathbb{R}^3$, as sketched in Figure III.8. We decompose the slab between the two planes into pyramids and tetrahedra, which are the convex hulls of corresponding Voronoi polyhedra and Delaunay simplices. The mixed complex is then obtained by intersecting the pyramids and tetrahedra with the plane parallel to and halfway between the other two planes.

Figure III.8: The top, middle, and bottom planes carry the Delaunay triangulation, the mixed complex, and the Voronoi diagram.

Symmetry. Note that the construction of the mixed complex is symmetric in the Voronoi diagram and the Delaunay triangulation. In other words, the mixed complex of $A$ is the same as the mixed complex of the collection $B$ of circles introduced in Section V.1. [The order of the chapters on skin and pockets has changed now, which requires a local rewrite here and in Section III.4.] As explained there, $B$ contains a circle centered at each Voronoi vertex (including those at infinity) with the radius chosen so that it is orthogonal to the circles that define the vertex. The Voronoi diagram of $B$ is then the Delaunay triangulation of $A$. Similarly, the Delaunay triangulation of $B$ is the Voronoi diagram of $A$, and the mixed complexes of $A$ and $B$ are the same. We have seen that the skins of two orthogonal pencils are the same hyperbola. Similarly, the skins of one circle and the affine hull of three orthogonal circles are the same circle. Since the mixed complex decomposes the entire skin of $A$ into such cases, it follows that the skin of $A$ is the same as that of $B$. Note however that the two bodies are not the same but rather complementary.

Bibliographic notes. The material of this section is taken from [3], where skin surfaces are introduced as orientable manifolds in $\mathbb{R}^d$. That paper also proves that the body of a finite collection of spheres has the same homotopy type as the dual complex. There is another interpretation of the vector space of circles exploited in this section. It identifies each circle in $\mathbb{R}^2$ with a point in $\mathbb{R}^3$. Under this interpretation, the convex hull of a set of circles corresponds to the usual convex hull of points in $\mathbb{R}^3$, and the symmetry between $A$ and $B$ can be explained as a polarity between two convex polyhedra. It was discovered in the nineteenth century and published at more or less the same time in three different languages by Clifford [1], Darboux [2], and Frobenius [4]. This interpretation is prominently used in the geometry text by Pedoe [5].

[1] W. K. CLIFFORD. Problem 1748. Mathematical Questions and Solutions from the Educational Times 44 (1865), 144.
[2] M. G. DARBOUX. De points, de cercles et de sphères. Annales de L'École Normale, Series 2 (1872), 323–392.
[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] G. FROBENIUS. Anwendungen der Determinantentheorie auf die Geometrie des Masses. J. Reine Angew. Math. 79 (1875), 185–247.
[5] D. PEDOE. Geometry: a Comprehensive Course. Dover, New York, 1988.

III.2 Curvature

The skin curves introduced in Section III.1 generalize straightforwardly to surfaces in $\mathbb{R}^3$. In this section, we study the curvature of these surfaces. The Curvature Variation Lemma proved at the end of this section will play a major role in the meshing algorithm to be discussed in Section III.3.

Curves. A closed space curve is a map of a circle to three-dimensional space, $\gamma : \mathbb{S}^1 \to \mathbb{R}^3$. It is smooth if the derivatives of all orders exist. Usually we need only a small number of derivatives, and the assumption of the existence of infinitely many is convenient but not necessary. Note that a curve has a parametrization and the counter-clockwise orientation of the circle gives a sense of direction. The velocity vector at the point $\gamma(s)$ is $\dot\gamma(s)$ and the speed is the length of that vector, $\|\dot\gamma(s)\|$. The tangent vector is the normalized velocity vector, $T(s) = \dot\gamma(s) / \|\dot\gamma(s)\|$, which is defined as long as the speed is non-zero. We can think of $T$ as the Gauss map from $\mathbb{S}^1$ to $\mathbb{S}^2$, as illustrated in Figure III.9.

Figure III.9: A closed space curve to the left and its Gauss map to the right.

It is often convenient to assume unit speed, $\|\dot\gamma(s)\| = 1$. In this case $T = \dot\gamma$ and the second derivative, $\ddot\gamma$, is normal to the first. The curvature is the length of that second derivative, $\kappa(s) = \|\ddot\gamma(s)\|$. The normal vector is the normalized second derivative, $N(s) = \ddot\gamma(s) / \kappa(s)$, which is defined as long as $\kappa(s) \ne 0$. Geometrically, the curvature is one over the radius of the osculating circle at $\gamma(s)$, which is the circle in the plane spanned by the tangent vector and the normal vector.

Surfaces. Let $M$ be a smooth surface or 2-manifold in $\mathbb{R}^3$. For a point $x \in M$, we let $U \subseteq M$ be a neighborhood, $V$ an open set in $\mathbb{R}^2$, and $f : V \to U$ a parametrization. Derivatives are taken along curves on the surface. For each curve $\gamma$ in the plane we consider the space curve $f \circ \gamma$. For example, to compute the tangent plane at $x$, we take the tangent vectors of two curves that cross at $x$. They span the tangent plane, as illustrated in Figure III.10.

Figure III.10: Construction of tangent plane from two tangent vectors.

There are several notions of curvature of a surface, and all are obtained by considering the curvature of curves drawn on the surface. The curvature of such a curve consists of a portion forced by how the surface curves in space and another portion accounting for how the curve curves within the surface. The second contribution vanishes for geodesics. A curve is a geodesic at $x$ if its normal agrees with the surface normal at $x$, and if it does we call its curvature the normal curvature of $M$ at $x$ in the direction of the tangent vector. There is a circle of tangent vectors, and for each one we get a normal curvature. The principal curvatures at $x$ are the minimum and maximum normal curvatures, $\kappa_1$ and $\kappa_2$. Let $t_1$ and $t_2$ be the corresponding tangent directions. By a result of Euler, the principal curvatures determine all other normal curvatures at $x$.

EULER'S THEOREM. The directions $t_1$ and $t_2$ are orthogonal, and if the tangent direction $t$ forms an angle $\varphi$ with $t_1$ then the normal curvature in the direction $t$ is $\kappa_1 \cos^2 \varphi + \kappa_2 \sin^2 \varphi$.

This implies that if $\kappa_1 \ne \kappa_2$ then all other normal curvatures are strictly between the two principal curvatures, whose directions are therefore unique. If $\kappa_1 = \kappa_2$ then all normal curvatures are the same and the point is an umbilic point of the surface. Two other common notions of curvature are the mean curvature, $(\kappa_1 + \kappa_2)/2$, and the Gaussian curvature, $\kappa_1 \kappa_2$. In contrast to the other notions, the Gaussian curvature is intrinsic. In other words, it is preserved by isometries, which are transformations that preserve the distance between points measured as lengths of connecting paths. This is a famous result of Gauss.

THEOREMA EGREGIUM. The Gaussian curvature is an isometric invariant.
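Euler's theorem lends itself to a short numerical illustration. The sketch below assumes the standard form of the formula, in which the normal curvature in a direction at angle phi from the first principal direction is k1 cos^2(phi) + k2 sin^2(phi); the function names are hypothetical.

```python
import math


def normal_curvature(k1, k2, phi):
    """Normal curvature at angle phi from the first principal direction."""
    return k1 * math.cos(phi) ** 2 + k2 * math.sin(phi) ** 2


def mean_curvature(k1, k2):
    """Average of the two principal curvatures."""
    return (k1 + k2) / 2.0


def gaussian_curvature(k1, k2):
    """Product of the two principal curvatures."""
    return k1 * k2
```

For principal curvatures 2 and 5, every direction strictly between the two principal directions gives a normal curvature strictly between 2 and 5, while equal principal curvatures (an umbilic point) give the same value in every direction.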

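As a small numerical companion to Euler's Theorem, the following Python sketch evaluates the normal curvature in a direction that forms an angle theta with the first principal direction, together with the mean and Gaussian curvatures. The function names are illustrative assumptions and not part of any software described in this book; the formulas are the standard ones stated above.

```python
import math

def normal_curvature(k1, k2, theta):
    """Normal curvature in the direction at angle theta (radians) from the
    first principal direction, by Euler's formula."""
    return k1 * math.cos(theta) ** 2 + k2 * math.sin(theta) ** 2

def mean_curvature(k1, k2):
    """Average of the two principal curvatures."""
    return 0.5 * (k1 + k2)

def gaussian_curvature(k1, k2):
    """Product of the two principal curvatures."""
    return k1 * k2

def is_umbilic(k1, k2, tol=1e-12):
    """A point is umbilic if the two principal curvatures agree."""
    return abs(k1 - k2) <= tol

if __name__ == "__main__":
    k1, k2 = 1.0, 3.0
    # Sample the circle of tangent directions; every value lies in [k1, k2].
    samples = [normal_curvature(k1, k2, i * math.pi / 8) for i in range(9)]
    print(min(samples), max(samples))
```

Sampling the circle of tangent directions in this way makes it easy to see that all normal curvatures lie between the two principal curvatures.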
Skin surfaces. Recall that the skin defined by a finite set of circles in $\mathbb{R}^2$ is the envelope of the infinite family of circles in the convex hull, each reduced by a factor $1/\sqrt{2}$. Similarly, the skin of a finite set of spheres in $\mathbb{R}^3$ is the envelope of the spheres in the convex hull, each reduced by the same factor. The mixed complex that decomposes the surface consists of the four types of cells illustrated in Figure III.11. Within each mixed cell, we have a sphere or a hyperboloid patch. The cases are summarized in Table III.1.

Figure III.11: Typical mixed cells. From left to right the index set has cardinality 1, 2, 3, and 4.

    cardinality   dim Voronoi   dim Delaunay   mixed cell          skin patch
    1             3             0              convex polyhedron   sphere
    2             2             1              polygonal prism     hyperboloid
    3             1             2              triangular prism    hyperboloid
    4             0             3              tetrahedron         sphere

Table III.1: The cardinality of the index set listed in the first column determines the dimensions of the corresponding Voronoi polyhedron and Delaunay simplex as well as the type of the mixed cell and of the skin patch.

The two sphere cases are symmetric and differ from each other by the surface orientation: for cardinality 1, the body lies locally inside, and for cardinality 4, it lies locally outside the sphere. The hyperboloid can either be one-sheeted (an hour-glass) or two-sheeted. The common limiting case is a double-cone defined by two touching spheres. For cardinality 2, the symmetry axis of the hyperboloid is the affine hull of the Delaunay edge and the (orthogonal) symmetry plane is the affine hull of the Voronoi polygon. We have a one-sheeted hyperboloid if the two spheres intersect in a circle and a two-sheeted one if they are disjoint. Similarly, for cardinality 3, the symmetry plane is the affine hull of the Delaunay triangle and the symmetry axis is the affine hull of the Voronoi edge. Whether the hyperboloid is one-sheeted, a double-cone, or two-sheeted depends on whether the two spheres orthogonal to the three spheres with indices in the index set intersect in a circle, touch in a point, or are disjoint. Either way, the two hyperboloid cases are symmetric and differ from each other by the surface orientation: for cardinality 2, the body is on the side of the infinite ends of the symmetry axis, and for cardinality 3, the body lies on the side of the infinite circle in the symmetry plane.

Figure III.12: The sphere, the one-sheeted hyperboloid, and the two-sheeted hyperboloid.

We can translate and rotate every sphere and hyperboloid to standard form, which we define as

$$x_1^2 + x_2^2 + x_3^2 = R^2,$$
$$x_1^2 + x_2^2 - x_3^2 = \pm R^2.$$

The second equation defines a hyperboloid with the apex at the origin, the symmetry axis along the $x_3$-axis, and the symmetry plane $x_3 = 0$. We have a one-sheeted hyperboloid for $+R^2$ and a two-sheeted one for $-R^2$.

Maximum normal curvature. For the sphere, the normal curvature at every point is $1/R$ in every tangent direction. The situation is more complicated for the hyperboloid. Consider the hyperbola in standard form in $\mathbb{R}^2$, and note that both the one-sheeted and the two-sheeted hyperboloid can be obtained by rotating the hyperbola about a symmetry axis. As shown in Figure III.13, the maximum normal curvature at a point $x$ is one over the radius of the largest sphere that passes through $x$ and touches but does not cross the hyperboloid. In fact, this radius is the same as the distance of $x$ from the origin. In short, for every point $x$ of a sphere or hyperboloid in standard form, the maximum normal curvature, $\kappa(x) = 1/\|x\|$, is simply one over the distance to the center.

Figure III.13: Every point of the hyperbola is sandwiched between two equally large circles.

Curvature variation. We have seen that within a mixed cell, the maximum normal curvature at a point is one over its distance from the center of the cell. The maximum normal curvature varies continuously over the skin because the common radius of the sandwiching spheres varies continuously. We strengthen the result by showing that $1/\kappa$ varies rather slowly.

CURVATURE VARIATION LEMMA. For all points $x, y \in F$ we have $|1/\kappa(x) - 1/\kappa(y)| \le \|x - y\|$.

PROOF. We extend $1/\kappa$ to a function defined on all of $\mathbb{R}^3$ and show that it has Lipschitz constant one. Within a mixed cell, the extension is the distance to the center of the cell. By the definition of the mixed complex, this is a continuous function. Within the mixed cell, the triangle inequality gives the Lipschitz bound. By applying this to the pieces of the line segment from $x$ to $y$ contained in different mixed cells, we obtain the result.

We note that the extension of $1/\kappa$ to a function on $\mathbb{R}^3$ describes the maximum normal curvature function of all skin surfaces in the family defined by the power growth model of the spheres, as introduced in Chapter II.

Bibliographic notes. The books by Bruce and Giblin [1] and by O'Neill [4] are good introductory texts to curves and surfaces and other topics in differential geometry. The skin surfaces in $\mathbb{R}^3$ are obtained by extending the results of Section III.1 by one dimension. A more direct treatment of the general-dimensional case can be found in [3]. The specific results on the curvature and the curvature variation of skin surfaces are taken from [2].

[1] J. W. BRUCE AND P. J. GIBLIN. Curves and Singularities. Second edition, Cambridge Univ. Press, England, 1992.
[2] H.-L. CHENG, T. K. DEY, H. EDELSBRUNNER AND J. SULLIVAN. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] B. O'NEILL. Elementary Differential Geometry. Second edition, Academic Press, San Diego, 1997.

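The relation between maximum normal curvature and distance from the center in Section III.2 can be explored numerically. The sketch below assumes the standard form $x_1^2 + x_2^2 - x_3^2 = \pm R^2$ for the hyperboloid; the function names are hypothetical, and the code only illustrates the statement within a single mixed cell, where the slow variation of $1/\kappa$ is the triangle inequality.

```python
import math

def norm(v):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in v))

def distance(u, v):
    """Euclidean distance between two points."""
    return norm(tuple(a - b for a, b in zip(u, v)))

def hyperboloid_point(R, x3, phi, one_sheeted=True):
    """Point at height x3 and rotation angle phi on the hyperboloid
    x1^2 + x2^2 - x3^2 = +R^2 (one-sheeted) or -R^2 (two-sheeted)."""
    sign = 1.0 if one_sheeted else -1.0
    rho_sq = x3 * x3 + sign * R * R  # equals x1^2 + x2^2
    if rho_sq < 0.0:
        raise ValueError("the two-sheeted hyperboloid has no point at this height")
    rho = math.sqrt(rho_sq)
    return (rho * math.cos(phi), rho * math.sin(phi), x3)

def max_normal_curvature(x):
    """Maximum normal curvature of a patch in standard form: one over the
    distance of the point from the center (the origin)."""
    return 1.0 / norm(x)

def length_scale_gap(x, y):
    """||x - y|| minus |1/kappa(x) - 1/kappa(y)| for two points of the same
    patch in standard form; non-negative by the triangle inequality."""
    return distance(x, y) - abs(norm(x) - norm(y))

if __name__ == "__main__":
    p = hyperboloid_point(2.0, 1.5, 0.7)
    q = hyperboloid_point(2.0, -0.4, 2.1)
    print(max_normal_curvature(p), max_normal_curvature(q), length_scale_gap(p, q))
```

Points on different mixed cells are not covered by this sketch; the lemma handles them by cutting the connecting segment into pieces, one per cell.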
III.3 Adaptive Meshing

In this section, we focus on constructing an explicit representation of a molecular skin surface. We choose a triangle mesh realized in $\mathbb{R}^3$ that is a good approximation of the surface and has good numerical properties.

Triangulations. Recall that a triangulation of a surface $F$ is a simplicial complex whose underlying space is homeomorphic to $F$. Since $F$ is a 2-manifold, it follows that the simplicial complex is the closure of its triangle set, every edge belongs to exactly two triangles, and the star of every vertex forms a disk. Note that the last property implies the first two. We construct a triangulation by first selecting points on $F$ and second connecting these points with edges and triangles. Given the Delaunay triangulation of the set of spheres $B$, we have sufficient information to sample points and to compute their maximum normal curvature values. Specifically, for each Delaunay simplex $\delta_X$ we construct the mixed cell $\mu_X$. The center of this cell is the point at which the affine hull of $\delta_X$ intersects the affine hull of the dual Voronoi polyhedron $\nu_X$. It is also the center of the corresponding sphere or the apex of the corresponding hyperboloid. Next, we translate the mixed cell so its center moves to the origin. Furthermore, if $\delta_X$ or $\nu_X$ is an edge then we rotate it into vertical position. The sphere or hyperboloid defined by $X$ is then in standard form, which can be sampled. For each sampled point we compute the maximum normal curvature from its distance to the origin and we obtain the corresponding point on $F$ by inverting the translation and rotation.

Let $S$ be the set of points sampled on $F$. We use it as the vertex set of the triangulation, which we construct as the dual of a decomposition of $F$. Specifically, for each point $a \in S$, the restricted Voronoi cell is

$$\nu_a \cap F = \{ x \in F \mid \|x - a\| \le \|x - b\| \text{ for all } b \in S \},$$

where distance is measured in $\mathbb{R}^3$, as usual. It is the intersection of $F$ with the Voronoi polyhedron of $a$ in $\mathbb{R}^3$, $\nu_a$. The restricted cells decompose $F$ into closed regions that overlap along common pieces of their boundaries. Locally the picture is rather similar to that of a Voronoi diagram in $\mathbb{R}^2$. The restricted Delaunay triangulation, $D_S$, is the collection of simplices with non-empty common intersection of the corresponding restricted Voronoi cells. The construction is illustrated in Figure III.14. We note that $D_S$ is a subcomplex of the (unrestricted) Delaunay triangulation of $S$ in $\mathbb{R}^3$.

Figure III.14: Local decomposition into restricted Voronoi cells and dotted dual restricted Delaunay triangulation.

Closed ball property. One trouble with the restricted Delaunay triangulation is that it may not be homeomorphic to $F$ and thus not triangulate the surface. Indeed, it is easy to come up with cases where $D_S$ is not even a 2-manifold. A sufficient condition for $D_S$ to triangulate $F$ is what we call the closed ball property. It requires that each common intersection of restricted Voronoi cells is topologically a closed ball of the appropriate dimension. We formulate this condition in terms of the three-dimensional Voronoi polyhedra defined by $S$. Assuming general position, the Voronoi polyhedron $\nu_X$ of a subset $X \subseteq S$ has dimension $4 - \mathrm{card}\,X$, and we require that $F \cap \nu_X$ is either empty or homeomorphic to a closed ball of dimension $3 - \mathrm{card}\,X$. Depending on the cardinality of $X$ we have a closed disk, a closed interval, or a single point.

Proving that the closed ball property implies $D_S$ triangulates $F$ is not difficult. Decompose the restricted Voronoi diagram by adding a point in the middle of each arc and inside each cell and connect each point to the points on the boundary. The star of every point inside a restricted cell is a triangular decomposition of that cell. The star of every restricted Voronoi vertex consists of six triangular regions that can be homeomorphically mapped to the six triangles in the barycentric subdivision of the dual restricted Delaunay triangle. By construction of $D_S$, the triangles in the two barycentric subdivisions are connected the same way so we have a homeomorphism between $F$ and the underlying space of $D_S$, which is illustrated in Figure III.15.

Figure III.15: To the left a barycentric subdivision of a portion of a Voronoi diagram drawn with solid lines. To the right the isomorphic barycentric subdivision of the corresponding portion of the dual Delaunay triangulation drawn with dashed lines.

$\varepsilon$-sampling. The question remains how we sample the points such that the restricted Voronoi diagram has the closed ball property. Since $F$ is smooth, small neighborhoods are fairly flat and the restricted Voronoi diagram behaves locally similar to the (unrestricted) Voronoi diagram of a set of points in the plane. In other words, a dense enough sample of points should have the closed ball property. This intuition can be made precise by formalizing the concept of density. Recall that $\kappa(x)$ is the maximum normal curvature at a point $x \in F$. Around $x$ we spread points at distance roughly proportional to $1/\kappa(x)$. We therefore define $\varrho(x) = 1/\kappa(x)$ and call it the length scale at $x$. The Curvature Variation Lemma of Section III.2 states that for any two points $x, y \in F$, the difference in length scale is at most the distance between them in $\mathbb{R}^3$, $|\varrho(x) - \varrho(y)| \le \|x - y\|$. An $\varepsilon$-sampling is a subset $S \subseteq F$ such that for each point $x \in F$ there exists a point $a \in S$ at distance $\|x - a\| \le \varepsilon \varrho(x)$. Showing that a sufficiently small $\varepsilon$ implies the closed ball property for the restricted Voronoi diagram is rather tedious and we omit the proof.

HOMEOMORPHISM THEOREM. If $S$ is an $\varepsilon$-sampling of $F$ with $\varepsilon \le \varepsilon_0$, then the restricted Delaunay triangulation of $S$ is homeomorphic to $F$.

The precise upper bound, $\varepsilon_0$, is a root of a function which arises in the proof of the Homeomorphism Theorem.

Even sampling. The points of an $\varepsilon$-sampling can locally not be too far apart, but they can be arbitrarily close together. In other words, on a microscopic scale, the points can be placed every way one likes and the mesh can be arbitrarily ugly. To improve the mesh, we impose conditions on the size of edges and triangles that imply both upper and lower bounds on the spacing between sampled points. Let the size of an edge $ab$ be half its length, $R_{ab} = \|a - b\|/2$, and the size of a triangle $abc$ be the radius of its circumcircle, $R_{abc}$. For edges we worry about them getting too short, so we compare size with the larger length scale at the endpoints, $\varrho_{ab} = \max\{\varrho(a), \varrho(b)\}$. For triangles we worry about them getting too large, so we compare size with the minimum length scale at the vertices, $\varrho_{abc} = \min\{\varrho(a), \varrho(b), \varrho(c)\}$. We use two constants, $C$ and $Q$, to express the conditions on the size. The constant $C$ controls how closely the triangulation approximates $F$, and $Q$ controls the quality of the triangles. We refer to the two conditions as the Lower and Upper Size Bounds,

    [L]  $R_{ab}/\varrho_{ab} > C/Q$  for every edge $ab \in D_S$,
    [U]  $R_{abc}/\varrho_{abc} < CQ$  for every triangle $abc \in D_S$.

It is not necessary to bound the edge lengths from above because an edge $ab$ with $R_{ab}/\varrho_{ab} \ge CQ$ would belong to two triangles that both violate [U]. Symmetrically, we do not need to bound the triangle sizes from below because a triangle $abc$ with $R_{abc}/\varrho_{abc} \le C/Q$ would have three edges that violate [L].

Mesh quality. The constants $C$ and $Q$ have to be chosen judiciously. For example, $Q < 1$ would immediately lead to irreconcilable requirements on edge and triangle sizes. Furthermore, $C$ cannot be too large, else we would contradict the $\varepsilon$-sampling condition stated in the Homeomorphism Theorem. Without going into details, we state that feasible choices of $C$ and $Q$ exist. In particular, these constants imply that $S$ is an $\varepsilon$-sampling for a sufficiently small value of $\varepsilon$. More precisely, they imply that $S$ is either an $\varepsilon$-sampling or it grossly violates the condition for $\varepsilon$-sampling. An example of such a gross violation are four points close together on a sphere. The points form a tetrahedron whose edges and triangles may very well satisfy the Size Bounds, but the boundary of the tetrahedron is a miserable approximation of the much larger sphere. Fortunately, such a gross violation of the condition cannot be created from an $\varepsilon$-sampling without the intermediate generation of triangles that grossly violate [U]. The algorithm discussed below is unable to generate such triangles. The two Size Bounds together imply a reasonably large lower bound on the angles inside triangles of the restricted Delaunay triangulation.

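The Lower and Upper Size Bounds can be checked for a single edge or triangle with a few lines of Python. This is a minimal sketch under stated assumptions: the bounds are taken in the ratio form size over length scale, compared with C/Q for edges and CQ for triangles; the length scales at the vertices are supplied by the caller; and the function names and the constants in the usage example are hypothetical rather than the values used by the actual software.

```python
import math

def sub(u, v):
    """Difference of two points in R^3."""
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def edge_size(a, b):
    """Size of an edge: half its length."""
    return 0.5 * norm(sub(a, b))

def circumradius(a, b, c):
    """Size of a triangle: radius of its circumcircle, computed as the
    product of the edge lengths over four times the area.
    Assumes the triangle is not degenerate (non-zero area)."""
    area = 0.5 * norm(cross(sub(b, a), sub(c, a)))
    return norm(sub(a, b)) * norm(sub(b, c)) * norm(sub(c, a)) / (4.0 * area)

def satisfies_lower(a, b, rho_a, rho_b, C, Q):
    """[L]: edge size over the larger endpoint length scale exceeds C/Q."""
    return edge_size(a, b) > (C / Q) * max(rho_a, rho_b)

def satisfies_upper(a, b, c, rho_a, rho_b, rho_c, C, Q):
    """[U]: circumradius over the smallest vertex length scale is below C*Q."""
    return circumradius(a, b, c) < C * Q * min(rho_a, rho_b, rho_c)

if __name__ == "__main__":
    a, b, c = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0)
    C, Q = 0.5, 1.2  # illustrative constants only
    print(satisfies_lower(a, b, 1.0, 1.0, C, Q),
          satisfies_upper(a, b, c, 1.0, 1.0, 1.0, C, Q))
```

For a skin surface, the length scale at a vertex would come from its distance to the center of the mixed cell that contains it, as described in Section III.2.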
MINIMUM ANGLE LEMMA. A triangle that satisfies [U] and whose edges satisfy [L] has minimum angle larger than $\arcsin(1/Q^2)$.

PROOF. Let $abc$ be the triangle and $R_{abc}$ its circumradius. Assuming the angle $\gamma$ at $c$ is the smallest angle, we have $ab$ of length $2R_{ab} = 2R_{abc}\sin\gamma$ as the shortest edge. We have $\varrho_{ab} \ge \varrho_{abc}$ by definition of length scale. Using [L] and [U] we thus get

$$\sin\gamma = \frac{R_{ab}}{R_{abc}} > \frac{(C/Q)\,\varrho_{ab}}{CQ\,\varrho_{abc}} \ge \frac{1}{Q^2}.$$

Hence $\gamma > \arcsin(1/Q^2)$.

For a given $Q$, the minimum angle is thus larger than $\arcsin(1/Q^2)$, and the maximum angle is smaller than $180^\circ - 2\arcsin(1/Q^2)$.

Density modification. Given an $\varepsilon$-sampling, we can enforce the Size Bounds by contracting short edges and inserting points near the circumcenters of large triangles. Given a triangle $abc$ that violates [U], we add the dual restricted Voronoi vertex $z$ as a new point to $S$. The insertion may cause new violations of [U] and thus trigger new point insertions.

    void VERTEXINSERTION:
      while there is a triangle abc violating [U] do
        add the restricted Voronoi vertex z dual to abc to S
      endwhile.

The details of the algorithm that modifies the restricted Delaunay triangulation to reflect the addition of $z$ are omitted. A vertex insertion may cause other vertex insertions, but this cannot go on forever because we will eventually violate the Lower Size Bound. Given an edge $ab$ that violates [L], we contract it by removing one of its endpoints. We are not able to exclude the possibility that the removal creates new violations of [L], and it certainly can create new violations of [U].

    void EDGECONTRACTION:
      while there is an edge ab violating [L] do
        select the endpoint of ab to be removed;
        remove that endpoint from S;
        VERTEXINSERTION
      endwhile.

The details of the algorithm are again omitted. An edge contraction may perhaps cause other edge contractions, but this cannot go on forever because we will eventually violate the Upper Size Bound. It is possible that an edge contraction causes a vertex insertion, but a vertex insertion cannot create edges of size below the allowed threshold. This is what prevents infinite loops in spite of the algorithm's partially conflicting efforts to simultaneously avoid short edges and large triangles. To prove this claim, we consider a triangle $abc$ that causes the addition of its dual restricted Voronoi vertex $z$.

NO-SHORT-EDGE LEMMA. Every edge $zx$ created during the addition of $z$ has ratio $R_{zx}/\varrho_{zx} > C/Q$.

PROOF. We have $R_{abc} \ge CQ\,\varrho_{abc}$ because $abc$ violates [U]. The sphere with center $z$ that passes through $a$, $b$, and $c$ has radius $r \ge R_{abc}$ and contains no vertices in its interior. Every new edge $zx$ has therefore length $2R_{zx} \ge r$. Assume without loss of generality that $\varrho(a) = \varrho_{abc}$. We use the Curvature Variation Lemma to derive upper bounds for the length scales at $z$ and $x$:

$$\varrho(z) \le \varrho(a) + r \le \varrho(a) + 2R_{zx},$$
$$\varrho(x) \le \varrho(z) + 2R_{zx} \le \varrho(a) + 4R_{zx}.$$

Hence $\varrho_{zx} \le \varrho(a) + 4R_{zx}$. Since $\varrho(a) \le R_{abc}/(CQ) \le 2R_{zx}/(CQ)$, we get

$$\frac{R_{zx}}{\varrho_{zx}} \ge \frac{CQ}{2 + 4CQ}.$$

For $Q^2 - 4CQ > 2$ the right-hand side exceeds $C/Q$, and therefore we have $R_{zx}/\varrho_{zx} > C/Q$, as claimed.

Scheduling. [Summarize the results on scheduling edge contractions and vertex insertions described in [5].]

Bibliographic notes. The restricted Delaunay triangulation is a generalization of the dual complex of a ball union. It can be used to triangulate surfaces and other spaces embedded in a Euclidean space. Besides the dual complex literature, there are several other partially dependent roots of the idea, namely the surface meshing method by Chew [3], the neural net work by Martinetz and Schulten [6], the formulation of the closed ball property by Edelsbrunner and Shah [4], and the surface reconstruction algorithm by Amenta and Bern [1]. The last of the four papers also introduces $\varepsilon$-samplings of surfaces, although in a slightly different formulation in which the distance to the medial axis replaces the length scale. All results that are specific to skin surfaces are taken from [2]. The algorithm in that paper is more general than what is explained in this section and maintains the surface mesh while it moves in space.

[1] N. AMENTA AND M. BERN. Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22 (1999), 481–504.
[2] H.-L. CHENG, T. K. DEY, H. EDELSBRUNNER AND J. SULLIVAN. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] L. P. CHEW. Guaranteed-quality mesh generation for curved surfaces. In "Proc. 9th Ann. Sympos. Comput. Geom., 1993", 274–280.
[4] H. EDELSBRUNNER AND N. R. SHAH. Triangulating topological spaces. Internat. J. Comput. Geom. Appl. 7 (1997), 365–378.
[5] H. EDELSBRUNNER AND A. ÜNGÖR. Relaxed scheduling in dynamic skin triangulation. In "Japanese Conf. Comput. Geom., 2002", to appear.
[6] T. MARTINETZ AND K. SCHULTEN. Topology representing networks. Neural Networks 7 (1994), 507–522.

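The angle guarantee of Section III.3 can be made concrete with a short Python sketch. It assumes the Size Bounds in the ratio form with thresholds C/Q and CQ, from which the bound arcsin(1/Q^2) on the smallest angle follows; the function names are hypothetical, and the inequality Q^2 - 4CQ > 2 is one sufficient condition under which a vertex insertion creates no short edge, not necessarily the exact condition used by the software.

```python
import math

def min_angle_bound_deg(Q):
    """Lower bound, in degrees, on the smallest angle of a triangle that
    satisfies [U] and whose edges satisfy [L]: arcsin(1 / Q^2)."""
    if Q < 1.0:
        raise ValueError("Q must be at least 1 for the bounds to be compatible")
    return math.degrees(math.asin(1.0 / (Q * Q)))

def max_angle_bound_deg(Q):
    """Upper bound on the largest angle: the other two angles each exceed
    the minimum angle bound and the three angles sum to 180 degrees."""
    return 180.0 - 2.0 * min_angle_bound_deg(Q)

def insertion_creates_no_short_edge(C, Q):
    """Sufficient condition on the constants under which every edge created
    by a vertex insertion satisfies [L]."""
    return Q * Q - 4.0 * C * Q > 2.0

def triangle_angles_deg(a, b, c):
    """Interior angles of the triangle abc in R^3, in degrees, at a, b, c."""
    def angle_at(p, q, r):
        u = [q[i] - p[i] for i in range(3)]
        v = [r[i] - p[i] for i in range(3)]
        dot = sum(x * y for x, y in zip(u, v))
        lu = math.sqrt(sum(x * x for x in u))
        lv = math.sqrt(sum(x * x for x in v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, dot / (lu * lv)))))
    return (angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b))

if __name__ == "__main__":
    Q = math.sqrt(2.0)  # illustrative value only
    print(min_angle_bound_deg(Q), max_angle_bound_deg(Q))
```

Comparing the output of triangle_angles_deg for every mesh triangle against min_angle_bound_deg gives a simple way to audit a mesh after the density modification has finished.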
III.4 Skin Software

In this section, we use two pieces of software to visualize the various geometric concepts introduced earlier in this chapter.

Skin curves. The Morfi software is two-dimensional and constructs skin curves from finite sets of circles. In Figure III.16 we see seven disks whose union is decomposed into convex regions by the Voronoi diagram. Superimposed on this decomposition is the skin curve with shaded body and the dual complex. Note that the disk union contains the body and the body contains the dual complex. Furthermore, the disk union, the body, and the dual complex all have the same homotopy type. This is always true. The skin shrinks the arcs in the boundary of the disk union and smoothly blends between the shrunken arcs using pieces of hyperbolas and inverted circles. Most striking is the blending for the quadrangular hole roughly in the middle of the figure, which is converted into an almost entirely circular hole in the body.

Figure III.16: Voronoi decomposition of disk union with superimposed skin, body, and dual complex.

Mixed complex. Using the Morfi software, we can visualize concepts that are difficult if not impossible to show in $\mathbb{R}^3$. An example is the mixed complex illustrated in Figure III.17. It decomposes the skin into circular and hyperbolic arcs. As explained in Section III.1, it consists of shrunken Voronoi polygons, rectangles, and shrunken Delaunay polygons. The collection of circles generating the diagram in Figure III.17 is degenerate, which can be seen from the fact that there are three shrunken Delaunay triangles but also two shrunken Delaunay quadrangles. One of the quadrangles contains most of the hole in the body. The portion of the hole boundary inside that quadrangle is circular while the portions outside the quadrangle are hyperbolic. Observe also that the five Delaunay polygons visible within the mixed complex apparently have eight vertices (not double-counting the shared ones). We see only seven of them in Figure III.16 because one of the eight radii is imaginary. Where is its center in Figure III.16?

Figure III.17: Decomposition of the skin and body by the mixed complex.

Simulated smoothing. We return to an issue left open in Section V.1, where we considered the minimum weighted square distance function, $f$, of a collection of circles $B$. Following the notation in Chapter II, we think of the growth parameter as time and denote the collection of circles at time $t$ by $B_t$. The zero-set of $f$ is the envelope of the circles in $B$, and the preimage of any real value $t$ is the envelope of the circles in $B_t$. In Section V.1 we claimed that there is an infinite family of smooth approximations of $f$ that all have the same critical points, namely the points where dually corresponding Voronoi and Delaunay polyhedra intersect. One function in this family is the trajectory of the skin curve, which maps each point $x$ to the moment in time at which $x$ belongs to the skin of $B_t$. We generalize this construction to any $s$ by letting $f_s$ be the trajectory of the modified skin curves. Specifically, the $s$-skin is the envelope of the circles in the convex hull that are reduced by a factor $\sqrt{s}$. For $s = 1/2$, the $s$-skin is the skin as defined in Section III.1, which is also an envelope of orthogonal circles as defined in Section III.1. Note that the $1$-skin is the envelope of the original disks. Figure III.18 illustrates the construction by showing the modified skins for several values of $s$. Observe that the bodies bounded by the $s$-skins are nested. The function $f_s$ maps every point $x$ to the moment in time at which $x$ belongs to the $s$-skin of $B_t$. We construct the family such that $f_{1/2}$ is the trajectory of the skin curve and $f_s$ approaches $f$ as $s$ goes to 1. As it turns out, the height function is differentiable and assuming non-degeneracy of the input circles, it is twice differentiable at the critical points. This is sufficient to justify the Morse theoretic reasoning about the non-smooth function used in Section V.1 to define pockets.

Figure III.18: From inside out the sequence of $s$-skins for several values of $s$.

Meshed skin surfaces. Next, we compute triangulated skin surfaces using the Skin Meshing software. It takes as input a set of spheres $B$ and constructs a mesh by maintaining a triangulation of the skin of the set of spheres $B_t$, with the time $t$ continuously increasing from minus infinity to zero. Figure III.19 shows a portion of this mesh for a small molecule. The image is created by slicing the surface with a plane and removing the front portion of the surface. Only the edges of the mesh and the cut boundary are shown. The complete surface has genus one, and the slicing plane is chosen to cut right through the narrow part of the tunnel. The image of the mesh in Figure III.19 should be compared with the rendering of the same surface in Figure III.20. The apparent smoothness is an illusion created by Gouraud shading, which is a graphics technique that interpolates between normal directions to generate the smooth impression.

Figure III.19: Cut-away view of the mesh of a small molecule of about forty atoms.

Figure III.20: Smoothly shaded rendering of the mesh in Figure III.19.

Growing the mesh. As mentioned earlier, the mesh is constructed by maintaining it while growing the spheres. At the beginning, all spheres are imaginary, the skin is the empty surface, and the mesh is the empty complex. The growth of the spheres implies a deformation of the surface, which is facilitated by a motion of the mesh vertices in $\mathbb{R}^3$. As time increases, the surface moves and the software updates the mesh accordingly. At time $t = 0$ we have the mesh of the skin of $B$. The algorithm thus reduces to executing a sequence of elementary operations. We classify the operations according to the adaptation purpose they serve.

Shape adaptation. The algorithm moves vertices normal to the surface, along the integral lines of the skin trajectory. We use edge flips to maintain the mesh as the restricted Delaunay triangulation of the moving vertices.

Curvature adaptation. Recall that the conditions [L] and [U] given in Section III.3 guarantee that the mesh adapts its local density to the maximum normal curvature. We use edge contractions to eliminate edges that violate [L] and vertex insertions to eliminate triangles that violate [U]. Note that highly curved areas detectable in Figure III.20 correspond to high density regions in Figure III.19.

Topology adaptation. There are four types of topological changes that occur, and they correspond to the four types of generic critical points of three-dimensional Morse functions. A component is born at a minimum, a handle is created at an index-1 saddle, a tunnel is closed at an index-2 saddle, and a void is filled at a maximum. We use metamorphoses to change the mesh connectivity accordingly. Two of the four types of metamorphoses can be seen at work in Figure III.21. From the first snapshot to the second, we see two new handles appear. Each handle creates a tunnel in the complement. By closing a tunnel we also remove the handle that forms it. From the second snapshot to the third, we see both tunnels disappear again. Observe that the surface around a handle is the same as that around a tunnel, namely a two-sheeted hyperboloid that flips over to a one-sheeted hyperboloid, or vice versa. The only difference is the reversal of inside and outside.

Figure III.21: Three snap-shots of the deforming triangulation of a molecular skin defined by continuously growing spheres. From left to center, we note two metamorphoses that each add a handle in the front. From center to right, we note a metamorphosis that closes a tunnel on the left.

Quantification. The Skin Meshing software comes with a quantification panel that displays parameters used in the meshing algorithm, provides various measurements of mesh quality, and indicates the number of operations executed during the construction. Figure III.22 shows the panel after the construction of a mesh. [This panel needs to be updated to fit the text.] The two most important parameters are $C$, which controls the numerical approximation of the surface, and $Q$, which controls the size of the angles. The three other parameters shown in the panel control how the metamorphoses are performed. The correctness of the algorithm is guaranteed only if the inequalities referred to as Conditions (I) to (V) are all satisfied. The software permits other parameter settings since a violation of the inequalities does not necessarily imply a failure of the algorithm. In our experience, the software works fine for small violations but breaks down for moderate ones. The panel displays measurements of mesh quality, including size versus length scale ratios of edges and triangles and the angles inside and between triangles. As proved in Section III.3, the algorithm guarantees that the smallest angle inside any (non-special) triangle in the mesh is larger than $\arcsin(1/Q^2)$. Note that in Figure III.22, the ratios all lie inside the allowed interval, which is from $C/Q$ to $CQ$, and the smallest angle observed in the mesh is indeed larger than the guaranteed bound. The quality measures do not include the special edges and triangles that facilitate topological changes and purposely violate some of the properties required for the rest of the mesh.

Figure III.22: The quantification panel of the Skin Meshing software.

Bibliographic notes. The two-dimensional Morfi software has been developed by Ka-Po (Patrick) Lam, and is described in his master thesis [4]. The software has been used in [2] to explain two-dimensional skin geometry and its application to deforming two-dimensional shapes into each other. The three-dimensional Skin Meshing software has been developed by Ho-Lun Cheng [1, 5]. Computer graphics techniques used in displaying shapes, including Gouraud shading, can be found in [3].

[1] H.-L. CHENG. Dynamic and Adaptive Surface Meshing under Motion. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 2001.
[2] S.-W. CHENG, H. EDELSBRUNNER, P. FU AND K. P. LAM. Design and analysis of planar shape deformation. Internat. J. Comput. Geom. Appl. 19 (2001), 205–218.
[3] J. FOLEY, A. VAN DAM, S. FEINER AND J. HUGHES. Computer Graphics. Principles and Practice. Second edition, Addison-Wesley, Reading, Massachusetts, 1990.
[4] K. P. LAM. Two-dimensional geometric morphing. Master thesis, Dept. Comput. Sci., Hong Kong University of Science and Technology, 1996.
[5] Molecular Skin web-site in the software collection at biogeometry.duke.edu.

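The quality measurements mentioned for the quantification panel of Section III.4 include angles between triangles and size versus length scale ratios. The following Python sketch shows one way such numbers could be computed for a pair of adjacent triangles; it is an illustration under the assumption that the allowed interval for the ratios runs from C/Q to CQ, and the function names are hypothetical rather than taken from the Skin Meshing software.

```python
import math

def sub(u, v):
    """Difference of two points in R^3."""
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(v):
    """Vector scaled to unit length; assumes a non-zero vector."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def angle_between_triangles_deg(a, b, c, d):
    """Angle, in degrees, between the normals of the consistently oriented
    triangles abc and bad, which share the edge ab. The angle is 0 when the
    two triangles lie in a common plane on opposite sides of ab."""
    n1 = unit(cross(sub(b, a), sub(c, a)))
    n2 = unit(cross(sub(a, b), sub(d, b)))
    dot = sum(x * y for x, y in zip(n1, n2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def ratio_in_allowed_interval(ratio, C, Q):
    """True if a size versus length scale ratio lies strictly between
    C/Q and C*Q."""
    return C / Q < ratio < C * Q

if __name__ == "__main__":
    a, b = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
    print(angle_between_triangles_deg(a, b, (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)))
```

A small angle between neighboring normals indicates that the mesh is locally flat relative to its triangle size, which is what the Upper Size Bound is meant to ensure.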
Exercises

1. Something about triangles. Let $abc$ be a triangle in the plane. We write $h_a$ for the height of $a$ defined as the distance of $a$ from the closest point on the line passing through $b$ and $c$. Similarly, we write $h_b$ and $h_c$ for the heights of $b$ and $c$. Relate the radius of the circumcircle to the heights $h_a$, $h_b$, and $h_c$.

2. Curvature in the plane. Note that the curvature $\kappa$ of a molecular skin curve in $\mathbb{R}^2$ is not continuous.
(i) Give an example illustrating that $\kappa$ is not continuous.
(ii) Introduce a new function (perhaps similar to $\kappa$) that is continuous over the skin curve.

3. Total curvature. Define the total curvature of a surface $F$ as the integral of the maximum principal curvature, $\int_{x \in F} \kappa(x) \, dx$.
(i) Calculate the total curvature for a sphere.
(ii) Calculate the total curvature for the portion of a double-cone within a unit-sphere around its apex.

4. Total square curvature. Define the total square curvature of a surface $F$ as the integral of the maximum principal curvature squared, $\int_{x \in F} \kappa(x)^2 \, dx$.
(i) Calculate the total square curvature for a sphere.
(ii) Let $F$ be the portion of a hyperboloid of revolution within a unit sphere around the apex. Show that the total square curvature goes to infinity as the hyperboloid approaches its asymptotic double-cone.
(iii) Prove that the number of points in a minimal $\varepsilon$-sampling of $F$ (as defined in Section III.3) is proportional to the total square curvature.

5. Pencils of spheres. Let us extend the concept of a coaxal system of circles to three dimensions. For this purpose assume $A$ and $B$ are two spheres that are both orthogonal to the spheres $X$, $Y$, and $Z$.
(i) Prove that every affine combination of $A$ and $B$ is orthogonal to $X$, $Y$, and $Z$.
(ii) Prove that every affine combination of $X$, $Y$, and $Z$ is orthogonal to $A$ and $B$.
(iii) In the light of (i) and (ii), what is the analog of a coaxal system in $\mathbb{R}^3$?


Chapter IV

Connectivity

Given a shape or a space, we can ask whether or how it is connected. It might not be immediately obvious what this question means. However, we can draw from precise definitions developed in topology to answer the question. In doing so, we need to be aware that there are perfectly well-defined and reasonable but different precise notions that correspond to the intuitive idea of connectivity. For example, for two spaces $\mathbb{X}$ and $\mathbb{Y}$ to be "connected the same way" could mean they are topologically equivalent ($\mathbb{X} \approx \mathbb{Y}$), they are homotopy equivalent ($\mathbb{X} \simeq \mathbb{Y}$), or they have isomorphic homology groups ($H_k(\mathbb{X}) \cong H_k(\mathbb{Y})$ for all $k$). The three notions are progressively weaker:

$\mathbb{X} \approx \mathbb{Y} \implies \mathbb{X} \simeq \mathbb{Y} \implies H_k(\mathbb{X}) \cong H_k(\mathbb{Y})$ for all $k$.

In words, the classification of spaces by homology groups is coarser than that by homotopy equivalence, which in turn is coarser than that defined by topological equivalence. In spite of the apparent weakness, homology is the most important tool to study connectivity. [We should stress that homology in this topological context has a precise algebraic meaning, which is in sharp contrast to how the term is used in biology (e.g. homology modeling of proteins), where it indicates a vague notion of similarity.] Given two triangulated spaces, there is a polynomial-time algorithm that computes and compares their homology groups. If the groups are not isomorphic then we know that the two spaces are different, meaning they are neither homotopy equivalent nor topologically equivalent. However, if their homology groups are isomorphic then we still do not know whether the two spaces are the same also under the two stricter definitions of sameness.

In this chapter, we focus on algorithms computing the homology groups of molecules represented by space-filling diagrams. In Section IV.1, we prove that space-filling diagrams are homotopy equivalent to their dual alpha shapes, which implies the two have isomorphic homology groups. In Section IV.2, we formally define homology groups and their ranks, the Betti numbers. In Section IV.3, we describe an incremental algorithm for Betti numbers, which is fast but limited to complexes in three dimensions. In Section IV.4, we present the classic matrix algorithm for Betti numbers, which is significantly slower but not limited to three-dimensional space.

IV.1 Equivalence of Spaces
IV.2 Homology Groups
IV.3 Incremental Algorithm
IV.4 Matrix Algorithm
Exercises

IV.1 Equivalence of Spaces

The space-filling diagram of a molecule is a subset of $\mathbb{R}^3$, and with induced subspace topology it is a topological space. We study the connectivity of this space by considering equivalence classes defined by continuous maps between spaces.

Topological spaces. Let $\mathbb{R}^3$ be the three-dimensional Euclidean space. An open ball is the set of points at distance less than some $\varepsilon > 0$ from a fixed point, and an open set is a union of open balls. Recall that a map $f$ between two sets is continuous if for every point $x$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that every point $y$ at distance less than $\delta$ from $x$ has its image $f(y)$ at distance less than $\varepsilon$ from $f(x)$. To check whether or not $f$ is continuous, we thus have to be able to measure the distance between points in both sets. According to a more general definition, $f: \mathbb{X} \to \mathbb{Y}$ is continuous if the preimage of every open set in $\mathbb{Y}$ is open in $\mathbb{X}$. Here we only need to distinguish between open and non-open sets. This distinction is the motivation for the following definition. A topological space is a set $\mathbb{X}$ together with a system $\mathcal{U}$ of subsets of $\mathbb{X}$ such that

(i) $\emptyset \in \mathcal{U}$ and $\mathbb{X} \in \mathcal{U}$,
(ii) $\bigcup \mathcal{Z} \in \mathcal{U}$ for every subsystem $\mathcal{Z} \subseteq \mathcal{U}$, and
(iii) $\bigcap \mathcal{Z} \in \mathcal{U}$ for every finite subsystem $\mathcal{Z} \subseteq \mathcal{U}$.

The system $\mathcal{U}$ is called the topology of $\mathbb{X}$ and the sets in $\mathcal{U}$ are the open sets of $\mathbb{X}$. Note that the common intersection of finitely many open sets is again open, but this is not necessarily true for infinitely many open sets. For example, the common intersection of the open balls of points at distance less than $1/i$ from the origin, for $i = 1, 2, \ldots$, is just the origin itself, which is not an open set. We thus see that the restriction to finite subsystems in condition (iii) is necessary.

If $\mathbb{Y} \subseteq \mathbb{X}$, we can induce the subspace topology, which is the system $\{\mathbb{Y} \cap U \mid U \in \mathcal{U}\}$. The space $\mathbb{Y}$ together with this system is a topological subspace of the pair $(\mathbb{X}, \mathcal{U})$. To get comfortable with these abstract ideas requires a number of concrete examples. Here is one. The two-dimensional sphere, $\mathbb{S}^2$, is a subset of $\mathbb{R}^3$, and if we choose its intersections with open sets in $\mathbb{R}^3$ as the open sets in its topology, then it is a topological subspace of $\mathbb{R}^3$. Another topological subspace of $\mathbb{R}^3$ is the two-dimensional Euclidean plane, $\mathbb{R}^2$.

Topological equivalence. Now that we know what a topological space is, we can define when two are the same. A homeomorphism is a bijective map that is continuous and whose inverse is continuous. We write $\mathbb{X} \approx \mathbb{Y}$ if a homeomorphism $f: \mathbb{X} \to \mathbb{Y}$ exists and say that $\mathbb{X}$ and $\mathbb{Y}$ are homeomorphic, topologically equivalent, and that they have the same topological type. Note that the identity is a homeomorphism, the inverse of a homeomorphism is a homeomorphism, and the composition of two homeomorphisms is a homeomorphism. In other words, being homeomorphic is reflexive, symmetric and transitive, so $\approx$ is indeed an equivalence relation for topological spaces. As suggested by Figure IV.1, there are spaces that have the same topological type and look vastly different, and there are spaces that look quite similar and do not have the same topological type.

Figure IV.1: The circle on the left is topologically equivalent to the trefoil knot in the middle, but both are not topologically equivalent to the annulus on the right.

An interesting example of a pair of non-homeomorphic spaces are the sphere and the plane. After embedding both in $\mathbb{R}^3$, we can map points from the sphere to the plane by stereographic projection from the north-pole, $N$, as illustrated in Figure IV.2. The lower hemisphere maps to the shaded disk and the upper hemisphere to the complement of that disk. This map between $\mathbb{S}^2 - \{N\}$ and $\mathbb{R}^2$ is indeed a homeomorphism, but there is no homeomorphism between $\mathbb{S}^2$ and $\mathbb{R}^2$.

Figure IV.2: The stereographic projection maps the sphere (minus the north-pole) to the plane.
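The stereographic projection can be written down explicitly. The following is a minimal sketch under assumptions of our own: the sphere is the unit sphere centered at the origin, the north-pole is $N = (0, 0, 1)$, the plane is $z = 0$, and the function names `project` and `lift` are purely illustrative. Both maps are given by continuous formulas and are inverse to each other, which is what makes the projection a homeomorphism between the sphere minus the north-pole and the plane.

```python
import math

def project(p):
    """Stereographic projection of a point p != N on the unit sphere
    to the plane z = 0, projecting from the north-pole N = (0, 0, 1)."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the north-pole has no image in the plane")
    s = 1.0 / (1.0 - z)
    return (s * x, s * y)

def lift(q):
    """Inverse map: from the plane z = 0 back to the unit sphere minus N."""
    u, v = q
    d = u * u + v * v
    return (2.0 * u / (d + 1.0), 2.0 * v / (d + 1.0), (d - 1.0) / (d + 1.0))
```

With these coordinates, a point of the lower hemisphere ($z < 0$) lands inside the unit disk of the plane, and a point of the upper hemisphere lands outside of it.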

Homotopy equivalence. Next we introduce an equivalence relation that is less sensitive to the local dimension of spaces than topological equivalence. We begin by comparing maps between the same spaces. Two continuous maps $h, k: \mathbb{X} \to \mathbb{Y}$ are homotopic if there is a continuous map $H: \mathbb{X} \times [0,1] \to \mathbb{Y}$ with $H(x, 0) = h(x)$ and $H(x, 1) = k(x)$ for all $x \in \mathbb{X}$. We write $h \simeq k$ and call $H$ a homotopy between $h$ and $k$. This definition is illustrated in Figure IV.3. We may think of the parameter $t \in [0,1]$ as time and sweep out the image of $H$ by the images of the maps $h_t(x) = H(x, t)$. The only requirements $H$ has to satisfy are that it starts with $h_0 = h$, ends with $h_1 = k$, and that it is a map. Furthermore, $H$ is not required to be injective, which is the same as saying that the image of $H$ may be self-intersecting.

Figure IV.3: In this example, $h$ and $k$ both map the circle into three-dimensional space, and $H$ maps the circle times $[0,1]$ to the cylinder connecting the two images of the circle.

Two spaces $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent if there are continuous maps $f: \mathbb{X} \to \mathbb{Y}$ and $g: \mathbb{Y} \to \mathbb{X}$ such that $g \circ f$ is homotopic to the identity on $\mathbb{X}$ and $f \circ g$ is homotopic to the identity on $\mathbb{Y}$. We write $\mathbb{X} \simeq \mathbb{Y}$ and say that the two spaces have the same homotopy type. Note that $\simeq$ is reflexive, symmetric, and transitive and is therefore indeed an equivalence relation for topological spaces. It is easy to show that two topologically equivalent spaces are also homotopy equivalent. To see that the reverse is not true we note that the annulus in Figure IV.1 is homotopy equivalent to the circle, but the two are not topologically equivalent. The simplest homotopy type is that of a point. A space is contractible if it is homotopy equivalent to a point. For example, a disk is contractible but a circle is not. Similarly, a ball is contractible but a sphere is not.

Deformation retraction. If $\mathbb{Y}$ is a topological subspace of $\mathbb{X}$ then we may prove that the two spaces are homotopy equivalent by constructing a map that retracts $\mathbb{X}$ to $\mathbb{Y}$. A deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ is a continuous map $R: \mathbb{X} \times [0,1] \to \mathbb{X}$ with

(i) $R(x, 0) = x$ for all $x \in \mathbb{X}$,
(ii) $R(x, 1) \in \mathbb{Y}$ for all $x \in \mathbb{X}$, and
(iii) $R(y, t) = y$ for all $y \in \mathbb{Y}$ and all $t \in [0,1]$.

Note that $R$ is a homotopy between the identity on $\mathbb{X}$ and the map $x \mapsto R(x, 1)$, which maps $\mathbb{X}$ to $\mathbb{Y}$. As illustrated in Figure IV.4, there is a deformation retraction from the double annulus to the figure-8 curve, but there is no deformation retraction to the circle. (Why not?)

Figure IV.4: The arrows indicate a deformation retraction from the double annulus to the figure-8 curve.

DEFORMATION RETRACTION LEMMA. If $R$ is a deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ then $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent.

PROOF. We construct maps $f: \mathbb{X} \to \mathbb{Y}$ and $g: \mathbb{Y} \to \mathbb{X}$ with the required properties. Define $f(x) = R(x, 1)$ and $g(y) = y$. Then $g \circ f$ is homotopic to the identity on $\mathbb{X}$ because $R$ is a homotopy between the two maps. Furthermore, $f \circ g$ is equal to the identity on $\mathbb{Y}$, by condition (iii), and therefore certainly homotopic to it.

Decomposition into joins. We construct a deformation retraction between a union of balls and its dual complex using a decomposition into joins. In general, a join between two sets $A$ and $B$ in some Euclidean space is the union of closed line segments that connect points in $A$ with points in $B$, and it is defined iff any two such line segments are either disjoint or meet at a common endpoint. Figure IV.5 uses two kinds of joins to decompose the difference between the union and the dual complex of a set of disks, namely triangles and disk sectors. A triangle is the join between a point and an edge and a sector is the join between a circular arc and a vertex.

Figure IV.5: The union of disks is decomposed into the underlying space of the dual complex and two types of joins connecting that complex to the boundary of the union.

Let $B$ be a finite collection of closed balls in $\mathbb{R}^3$. We assume general position and construct a deformation retraction from the union, $\bigcup B$, to the underlying space $|K|$ of the dual complex. Recall that the boundary of $\bigcup B$ consists of sphere patches separated by circular arcs connecting corners. To be specific, we define a patch as the contribution of the sphere bounding a ball $b \in B$ to the boundary of $\bigcup B$. It does not have to be connected or simply connected. Similarly, we define an arc and a corner as the contribution of the intersection of two and of three spheres to the boundary of $\bigcup B$. An arc may be a full circle, or any number of intervals along the circle. A corner may be empty, a point, or a pair of points. The decomposition is constructed by forming the join between every patch, arc, and corner and its dual vertex, edge, and triangle, which belongs to the dual complex. Figure IV.5 illustrates the construction in the plane. There are four corners that are point pairs, and they correspond to the four principal edges of the dual complex. (As defined in Section II.3, an edge is principal if it is not a face of any other simplex in the complex. In the Alpha Shape software, such an edge is referred to as singular.) There are also four arcs that consist of more than one component each, and they correspond to the vertices on the boundary of the dual complex that are exposed to the outside in more than one interval of directions.

Shrinking joins. We get a deformation retraction from $\bigcup B$ to $|K|$ by shrinking joins from outside in. Each join is the union of line segments $xy$ with $x$ on the boundary of $\bigcup B$ and $y$ on the boundary of $|K|$. We shrink each line segment towards $y$ by defining $R(z, t) = (1-t)z + ty$ for every point $z$ on the line segment $xy$. The deformation retraction is obtained by shrinking all joins simultaneously. It is illustrated in Figure IV.6, which shows the image of the retraction at time $t = 1/2$. A disk sector shrinks from its outer arc towards its center. It maintains its shape while getting smaller until it reaches the size of a point, which is a vertex of the dual complex. A triangle in the decomposition shrinks from its outer vertex towards the opposite edge. It turns into a trapezium whose height decreases and reaches zero at time $t = 1$.

Figure IV.6: The decomposition after shrinking the joins half way to zero.

There is a technical problem at the very beginning of the shrinking process that arises already in two dimensions. Specifically, the outer vertex of each triangle join belongs to more than one line segment and thus retracts towards more than one point of the dual complex. To finesse this difficulty, we choose $\varepsilon > 0$ and move the points differently in the time interval $[0, \varepsilon]$. In the assumed case in which $B$ is in general position, this initial motion needs to bridge the non-zero gap between the boundary of $\bigcup B$ and the boundary of the image of $\bigcup B$ at time $\varepsilon$. By choosing $\varepsilon$ small, we can make the gap arbitrarily small and easy to bridge. Figure IV.7 shows an entire sequence of shapes during the deformation retraction visualized for the model of gramicidin also shown in Figure II.4.

Figure IV.7: Six snap-shots of the deformation retraction from the union of balls representation of gramicidin to the dual complex.

Bibliographic notes. Homeomorphisms, homotopies, and deformation retractions are covered in most texts of algebraic topology, including Seifert and Threlfall [6] and Munkres [5]. Subtleties of the definitions of a topology and of a topological space are discussed in texts on general topology, including Kelley [2] and Munkres [4]. The particular deformation retraction used to prove the homotopy equivalence between a union of balls and its dual complex is taken from Edelsbrunner [1]. That equivalence can also be derived from general theorems about coverings. The Nerve Lemma says that a space is homotopy equivalent to the nerve of a finite open cover whose sets have either empty or contractible common intersections. We can turn the Voronoi cells of a union of balls into such a cover and get the homotopy equivalence result from that lemma. The history of the Nerve Lemma is complicated because different versions have been discovered independently by different people. Maybe the paper by Leray [3] is the first publication on that topic.

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] J. L. Kelley. General Topology. Springer-Verlag, New York, 1955.
[3] J. Leray. Sur la forme des espaces topologiques et sur les points fixes des représentations. J. Math. Pures Appl. 24 (1945), 95–167.
[4] J. R. Munkres. Topology. A First Course. Prentice Hall, Englewood Cliffs, New Jersey, 1975.
[5] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[6] H. Seifert and W. Threlfall. A Textbook of Topology. Academic Press, San Diego, 1980.

IV.2 Homology Groups

This section introduces homology groups as an algebraic means to characterize the connectivity of a topological space. To keep the discussion reasonably elementary, we restrict it to triangulated spaces and to addition modulo 2.

Triangulations. In the preceding chapters, we have talked about triangulations in an intuitive geometric sense. In topology, the term has a precise meaning, which we now develop. A simplex $\sigma$ is the convex hull of an affinely independent point set $S$. If $S$ has cardinality $k+1$ then $\sigma$ has dimension $k$ and is also referred to as a $k$-simplex. A face of $\sigma$ is the convex hull $\tau$ of a subset $T \subseteq S$, and we write $\tau \leq \sigma$. Since $S$ has $2^{k+1}$ subsets, $\sigma$ has the same number of faces, including the empty set and $\sigma$ as its two improper faces. A simplicial complex is a finite collection $K$ of simplices with pairwise proper intersections that is closed under the face relation, that is,

(i) if $\sigma \in K$ and $\tau \leq \sigma$ then $\tau \in K$, and
(ii) if $\sigma, \sigma' \in K$ then $\sigma \cap \sigma'$ is either empty or a face of both.

Recall that the underlying space of $K$ is the union of all simplices, $|K| = \bigcup_{\sigma \in K} \sigma$. A simplicial complex can be used to represent a topological space, and we have seen an example in Section II.3, where the dual complex of a space-filling diagram was used to represent a molecule. We proved in Section IV.1 that the underlying space of the dual complex is homotopy equivalent to the space-filling diagram. A topologically more accurate representation would have a homeomorphic underlying space. We thus define a triangulation of a topological space $\mathbb{X}$ as a simplicial complex $K$ whose underlying space is topologically equivalent, $|K| \approx \mathbb{X}$. The remainder of this section introduces the algebraic concepts we will use to define homology groups of triangulated spaces.

Abelian groups. A group is a set $G$ together with an associative operation $+$ for which there is a zero and an inverse for every group element. The group is abelian if the operation is commutative. Examples are the infinite group of integers with addition, and the finite cyclic group of $n$ elements, $\mathbb{Z}_n$, with addition modulo $n$. For the case $n = 2$ we have $1 + 1 = 0$. A subset $H \subseteq G$ forms a subgroup if $(H, +)$ is a group.

Suppose $G$ is abelian and $H$ is a subgroup. The coset of an element $x \in G$ is $x + H = \{x + h \mid h \in H\}$. Observe that $y \in x + H$ implies $y + H = x + H$. So if $z \in x + H$ and $z \in y + H$ then $x + H = y + H$. In words, two cosets are either disjoint or the same. Because $x + h = x + h'$ implies $h = h'$, there is a bijection between $H$ and each coset. If $G$ is finite this implies that all cosets have the same cardinality and $|G|$ is a multiple of $|H|$. The quotient of $G$ divided by $H$, denoted as $G/H$, is the collection of cosets. Addition in the quotient group is defined by $(x + H) + (y + H) = (x + y) + H$. We note that it does not matter which representatives we choose in computing the sum of the two cosets. The resulting coset is always the same, so addition is indeed well defined.

Figure IV.8: Partition of $G$ into cosets defined by $H$, in which $H$ contains a quarter of the elements.

A homomorphism between groups $G$ and $G'$ is a function $f: G \to G'$ that commutes with addition, that is, $f(x + y) = f(x) + f(y)$. The kernel of $f$ is the subset of $G$ whose elements map to $0 \in G'$, and the image is the subset of $G'$ whose elements have preimages in $G$:

$\ker f = \{x \in G \mid f(x) = 0\}$,
$\mathrm{im}\, f = \{y \in G' \mid \text{there is an } x \in G \text{ with } f(x) = y\}$.

An isomorphism is a bijective homomorphism. Its kernel is the zero element of $G$ and its image is the entire $G'$.

Chain complex. Let $K$ be a simplicial complex. We construct groups by defining what it means to add sets of simplices. Call a set of $k$-simplices a $k$-chain. The sum of two $k$-chains $c$ and $d$ is $c + d = (c \cup d) - (c \cap d)$. In words, the sum of two $k$-chains is the symmetric difference of the two sets. This is like adding modulo 2 where $1 + 1 = 0$, since a $k$-simplex is missing from $c + d$ iff it belongs to neither or to both chains. We write $C_k$ for the set of $k$-chains, and $(C_k, +)$ is the group of $k$-chains. The zero of this chain group is the empty set. We connect chain groups of different dimensions by homomorphisms that map chains to their boundary.

For this purpose we define the boundary of a $k$-simplex $\sigma$ as the set of its $(k-1)$-dimensional faces, $\partial_k \sigma = \{\tau \leq \sigma \mid \dim \tau = k-1\}$. Note that $\partial_k \sigma \in C_{k-1}$ for every $k$-simplex $\sigma \in K$. The boundary of a chain is the sum of boundaries of its simplices, $\partial_k c = \sum_{\sigma \in c} \partial_k \sigma$. Observe that the boundary of the sum of two chains is the sum of their boundaries, $\partial_k (c + d) = \partial_k c + \partial_k d$. This assumes of course that $c$ and $d$ have the same dimension, else $c + d$ would not be defined. We thus have a boundary homomorphism $\partial_k: C_k \to C_{k-1}$, for every $k$. The sequence of chain groups connected by boundary homomorphisms is the chain complex of $K$:

$\ldots \to C_{k+1} \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \to \ldots$

Figure IV.9 illustrates the sequence but contains information about subgroups that will be introduced shortly.

Figure IV.9: The chain complex and the groups of cycles and boundaries contained in the chain groups.

Cycles and boundaries. There are two types of chains that are particularly important to us: the ones without boundary and the ones that bound. A $k$-cycle is a $k$-chain $c$ with $\partial_k c = \emptyset$. The set $Z_k$ of $k$-cycles is the kernel of the $k$-th boundary homomorphism, $Z_k = \ker \partial_k$. Two $k$-cycles add up to another $k$-cycle, which implies that $Z_k$ is a subgroup of $C_k$. A $k$-boundary is a $k$-chain $c$ for which there exists a $(k+1)$-chain $d$ with $\partial_{k+1} d = c$. The set $B_k$ of $k$-boundaries is the image of the $(k+1)$-st boundary homomorphism, $B_k = \mathrm{im}\, \partial_{k+1}$. Two $k$-boundaries add up to another $k$-boundary, which implies that $B_k$ is a subgroup of $C_k$.

FUNDAMENTAL LEMMA OF HOMOLOGY. $\partial_k \partial_{k+1} d = \emptyset$ for every $(k+1)$-chain $d$.

PROOF. We first consider a single $(k+1)$-simplex $\sigma$ and show $\partial_k \partial_{k+1} \sigma = \emptyset$. This is because every $(k-1)$-dimensional face of $\sigma$ belongs to exactly two $k$-dimensional faces of $\sigma$. The rest follows because taking boundary commutes with adding: $\partial_k \partial_{k+1} d = \sum_{\sigma \in d} \partial_k \partial_{k+1} \sigma$, which is the empty set, as required.

In other words, the boundary of every boundary is empty. Equivalently, every $k$-boundary is a $k$-cycle, which implies that $B_k$ is a subgroup of $Z_k$. We can therefore draw the relationship between the sets of chains, cycles, and boundaries as sketched in Figure IV.9.

Homology groups. The $k$-th homology group is the quotient of the $k$-th cycle group divided by the $k$-th boundary group, $H_k = Z_k / B_k$. The cosets are the elements of $H_k$ and are referred to as homology classes. The size of $H_k$ is a measure of how many $k$-cycles are not $k$-boundaries. If $Z_k = B_k$ then $H_k$ is the trivial group consisting only of one element.

As an example consider a triangulated torus, as sketched in Figure IV.10. All 0-chains are 0-cycles and half of them are 0-boundaries, namely the ones with even cardinality. Hence $H_0 \cong \mathbb{Z}_2$. The two non-bounding 1-cycles labeled $a$ and $b$ generate a first homology group of four elements, as shown in Figure IV.10. It is isomorphic to $\mathbb{Z}_2^2$, which is the group of pairs of elements of $\mathbb{Z}_2$ with component-wise addition modulo 2. There is only one non-empty 2-cycle, namely the set of all triangles, and no non-empty 2-boundary. Hence $H_2 \cong \mathbb{Z}_2$.

Figure IV.10: The curves $a$ and $b$ represent the homology classes $a + B_1$ and $b + B_1$, which generate the homology group $H_1$. Its addition table is:

  +   | 0    a    b    a+b
  0   | 0    a    b    a+b
  a   | a    0    a+b  b
  b   | b    a+b  0    a
  a+b | a+b  b    a    0

An important property of homology groups is that they are the same for triangulations of homeomorphic and of homotopy equivalent spaces. In particular, we get the same homology groups for different triangulations of a topological space. In other words, the homology groups are properties of the space and not artifacts of the complexes used to represent that space. Proving that this is really the case is beyond the scope of this book. Similarly, the homology groups of (any triangulation of) a union of balls are the same as the homology groups of the dual complex.
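The definitions of chain addition and boundary translate almost literally into code. The sketch below is our own illustration and not part of the text: it assumes that a simplex is represented as a `frozenset` of vertex labels and a chain as a Python `set` of simplices, and the function names are hypothetical. It shows the sum of chains as a symmetric difference and lets us check the Fundamental Lemma on a small example.

```python
from itertools import combinations

def boundary_of_simplex(simplex):
    """Set of (k-1)-faces of a k-simplex given as a frozenset of vertices."""
    if len(simplex) <= 1:
        return set()  # a vertex has empty boundary
    return {frozenset(f) for f in combinations(simplex, len(simplex) - 1)}

def add(c, d):
    """Sum of two chains modulo 2: the symmetric difference of the two sets."""
    return c ^ d

def boundary(chain):
    """Boundary of a chain: the sum modulo 2 of the boundaries of its simplices."""
    result = set()
    for simplex in chain:
        result = add(result, boundary_of_simplex(simplex))
    return result

# The four triangles of a tetrahedron form a 2-cycle (a triangulated sphere).
tetrahedron = frozenset("abcd")
triangles = boundary_of_simplex(tetrahedron)
one_triangle = {frozenset("abc")}
```

Here `boundary(triangles)` is the empty chain, because every edge of the tetrahedron belongs to exactly two of its triangles, and `boundary(boundary(one_triangle))` is empty for the same reason one dimension lower.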

Betti numbers. The most useful aspects of homology groups are their ranks, which have intuitive interpretations in terms of the connectivity of the space. The concept of a rank applies equally well to chain, cycle, boundary and homology groups. All these groups are idempotent, that is, $x + x = 0$ for every element $x$. Given a subset $S$ of such a group $G$, we can form all sums of elements in $S$ and thus generate a subgroup, consisting of all $\sum_{x \in T} x$ with $T \subseteq S$. This operation can also be expressed in the terminology of linear algebra, where the subgroup is known as the linear hull. The subset $S$ is a basis if it is minimal and generates the entire group. Even though there is no unique basis, all bases have the same size. By definition, the rank of $G$ is the size of a basis: $\mathrm{rank}\, G = |S|$. In this case, that size is the binary logarithm of the number of group elements, $\mathrm{rank}\, G = \log_2 |G|$. If the group is the $k$-th homology group of a space, the rank is known as the $k$-th Betti number of that space: $\beta_k = \mathrm{rank}\, H_k$.

Note that if $f: G \to G'$ is a homomorphism, then the rank of $G$ is equal to the sum of ranks of the kernel and the image, $\mathrm{rank}\, G = \mathrm{rank} \ker f + \mathrm{rank}\, \mathrm{im}\, f$. Since $\partial_k: C_k \to C_{k-1}$ is a homomorphism, we have $\mathrm{rank}\, C_k = \mathrm{rank}\, Z_k + \mathrm{rank}\, B_{k-1}$. Since $H_k = Z_k / B_k$, we have $\mathrm{rank}\, H_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k$. Using corresponding lowercase letters for ranks, we rewrite these relations as $c_k = z_k + b_{k-1}$ and $\beta_k = z_k - b_k$.

Revisiting the example above, we see that the Betti numbers of the torus are $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$. The homology groups of dimensions $k > 2$ are all trivial and the corresponding Betti numbers are all zero. In general, the 0-th Betti number is the number of connected components. To see this remember that a 0-cycle bounds iff it contains an even number of vertices in each component. Note also that exactly half of the subsets of a finite set have even cardinality. If there are $c$ components and $n$ vertices then $z_0 = n$ and $b_0 = n - c$, and therefore $\beta_0 = c$. Similarly, the 1-st and 2-nd Betti numbers have intuitive interpretations as the number of independent non-bounding loops and the number of independent non-bounding shells. For example, the closed disk has one component, no non-bounding loop, and no shell. For the closed disk we thus have $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Similarly for the two-dimensional sphere we have $\beta_0 = 1$, $\beta_1 = 0$, and $\beta_2 = 1$; all other Betti numbers vanish. Note that this implies that the disk, the sphere and the torus are pairwise non-homeomorphic. This is hardly surprising but not easy to prove with elementary means.

Euler characteristic. Consider a simplicial complex $K$ and let $s_k$ be the number of its $k$-simplices. By definition, the Euler characteristic is the alternating sum of these numbers: $\chi = \sum_k (-1)^k s_k$. We show that $\chi$ is also the alternating sum of Betti numbers. The number of $k$-simplices in the complex is also the rank of the $k$-th chain group, $s_k = c_k$. Earlier we derived $c_k = z_k + b_{k-1}$ and $\beta_k = z_k - b_k$, hence

$\chi = \sum_k (-1)^k (z_k + b_{k-1}) = \sum_k (-1)^k (z_k - b_k) = \sum_k (-1)^k \beta_k$.

We state this result because it is important and so we can use it for later reference.

EULER-POINCARÉ THEOREM. $\chi = \sum_k (-1)^k \beta_k$.

This relation can often be used to quickly find the Euler characteristic of a space without constructing a triangulation and counting simplices. For example, the Euler characteristic of the two-dimensional sphere is $1 - 0 + 1 = 2$ and that of the torus is $1 - 2 + 1 = 0$. Indeed, two spaces with different Euler characteristics have homology groups that are different in at least one dimension, that is, the spaces are neither homotopy equivalent nor topologically equivalent.

Bibliographic notes. Homology groups have been developed at the end of the nineteenth and the beginning of the twentieth centuries. The French mathematician Henri Poincaré is usually credited with the conception of the idea [4]. He named the ranks of the homology groups after the Italian mathematician Enrico Betti, who introduced a slightly different version of the numbers years earlier. The beginning of the twentieth century witnessed parallel developments of homology groups that differed in the elements they added (simplices, cubes, general cells, ...) and the coefficient groups they used. Eventually, all this work was unified by axiomatizing the assumptions under which homology groups exist [1]. Today, homology is a general method within algebraic topology. We refer to Giblin [2] for an intuitive introduction to that area and to Munkres [3] and Rotman [5] for more comprehensive sources.

[1] S. Eilenberg and N. Steenrod. Foundations of Algebraic Topology. Princeton Univ. Press, New Jersey, 1952.
[2] P. J. Giblin. Graphs, Surfaces and Homology. Chapman and Hall, London, 1981.
[3] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[4] H. Poincaré. Complément à l'analysis situs. Rendiconti del Circolo Matematico di Palermo 13 (1899), 285–343.
[5] J. J. Rotman. An Introduction to Algebraic Topology. Springer-Verlag, New York, 1988.
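The relations $c_k = z_k + b_{k-1}$ and $\beta_k = z_k - b_k$ of this section suggest a direct, if slow, way to check the Euler-Poincaré Theorem on small examples: compute the ranks of the boundary homomorphisms over $\mathbb{Z}_2$ by Gaussian elimination. The sketch below is our own illustration under stated assumptions and is not the algorithm of Section IV.3 or IV.4. A complex is given by its maximal simplices as tuples of vertex labels, rows of a boundary matrix are stored as Python integers used as bit vectors, and all function names are hypothetical.

```python
from itertools import combinations

def closure(top_simplices):
    """All non-empty faces of the given simplices, grouped by dimension."""
    by_dim = {}
    for s in top_simplices:
        for size in range(1, len(s) + 1):
            for face in combinations(sorted(s), size):
                by_dim.setdefault(size - 1, set()).add(face)
    return {k: sorted(v) for k, v in by_dim.items()}

def rank_mod2(rows):
    """Rank over Z_2 of a matrix whose rows are ints used as bit vectors."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(top_simplices):
    """Return (betti, euler) with betti[k] = z_k - b_k and euler = sum (-1)^k s_k."""
    simplices = closure(top_simplices)
    top = max(simplices)
    # b[k] is the rank of B_k, the image of the (k+1)-st boundary homomorphism
    b = {k: 0 for k in range(-1, top + 1)}
    for k in range(1, top + 1):
        index = {face: i for i, face in enumerate(simplices[k - 1])}
        rows = []
        for s in simplices[k]:
            row = 0
            for face in combinations(s, k):  # the (k-1)-faces of s
                row |= 1 << index[face]
            rows.append(row)
        b[k - 1] = rank_mod2(rows)
    betti = []
    for k in range(top + 1):
        z_k = len(simplices[k]) - b[k - 1]  # c_k = z_k + b_{k-1}
        betti.append(z_k - b[k])            # beta_k = z_k - b_k
    euler = sum((-1) ** k * len(simplices[k]) for k in range(top + 1))
    return betti, euler

# Boundary of a tetrahedron: a triangulated sphere.
sphere = list(combinations("abcd", 3))
# A 7-vertex triangulation of the torus (vertex labels 0..6, arithmetic mod 7).
torus = [t for i in range(7)
         for t in ((i, (i + 1) % 7, (i + 3) % 7), (i, (i + 2) % 7, (i + 3) % 7))]
```

For the sphere this gives Betti numbers `[1, 0, 1]` and Euler characteristic 2, and for the torus `[1, 2, 1]` and 0, in agreement with the values derived above. The 7-vertex torus is a standard small triangulation chosen here only for convenience.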

IV.3 Incremental Algorithm

The Betti numbers of a simplicial complex can be computed incrementally, by adding one simplex at a time. In this section, we describe the details of this algorithm, which is particularly well-suited for filtrations.

Adding a simplex. We analyze what happens to the Betti numbers when we add a simplex $\sigma$ to a complex $K$. Let $K' = K \cup \{\sigma\}$ and assume that all proper faces of $\sigma$ belong to $K$, so $K'$ is also a complex. By observing how $\sigma$ fits into $K$, we can determine the Betti numbers $\beta_k'$ of $K'$ from the Betti numbers $\beta_k$ of $K$. In the case analysis, we mention only the Betti numbers that change.

Case $\sigma$ is a vertex. Being a vertex, $\sigma$ cannot connect to $K$ and thus forms a component by itself. Hence, $\beta_0' = \beta_0 + 1$.

Case $\sigma$ is an edge. There are two sub-cases depending on whether the endpoints of $\sigma$ belong to the same component or to two different components. Both cases are illustrated in Figure IV.11. In the first case, we have $\beta_1' = \beta_1 + 1$, and in the second case we have $\beta_0' = \beta_0 - 1$.

Case $\sigma$ is a triangle. Again we have two sub-cases, both illustrated in Figure IV.12. If $\sigma$ completes a 2-cycle then $\beta_2' = \beta_2 + 1$. Otherwise, $\sigma$ closes a tunnel and we have $\beta_1' = \beta_1 - 1$.

Case $\sigma$ is a tetrahedron. Assuming $K$ is a complex in $\mathbb{R}^3$, it cannot have any 3-cycle. Adding $\sigma$ can therefore only turn a non-bounding 2-cycle (its boundary) into a 2-boundary. Therefore, $\beta_2' = \beta_2 - 1$.

Figure IV.11: The edge $uv$ closes a loop on the left and connects two components on the right.

Figure IV.12: To the left, the triangle completes a surface, while to the right, it just closes a tunnel formed by the surface holes.

Observe that the four cases follow one and the same rule: if $\sigma$ belongs to a non-bounding cycle in $K'$ then we increment the Betti number of the dimension of $\sigma$ and, otherwise, we decrement the Betti number of dimension one less than that of $\sigma$. This is justified by the equation $c_k = z_k + b_{k-1}$ developed in Section IV.2: adding a $k$-simplex always increments the rank of the $k$-th chain group, and it does this by either incrementing the rank of the $k$-th cycle group or that of the $(k-1)$-st boundary group.

Algorithm. To compute the Betti numbers of a complex $K$, we form a filtration that ends with that complex:

$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = K$.

All $K_j$ are complexes, and it is convenient to assume that any two contiguous complexes differ by only one simplex: $K_j = K_{j-1} \cup \{\sigma_j\}$. For example, we may sort the simplices in non-decreasing order of dimension and take all prefixes of that sequence. Alternatively, we may use the filtration of a Delaunay triangulation introduced in Section II.3. In the latter case, the filtration contains all alpha complexes and we get the Betti numbers of all of them in one sweep. The algorithm is but a simple scan along the filtration.

    integer^3 BETTI:
      beta_0 = beta_1 = beta_2 = 0;
      for j = 1 to m do
        k = dim sigma_j;
        if sigma_j belongs to a k-cycle in K_j
          then beta_k ++ else beta_{k-1} --
        endif
      endfor;
      return (beta_0, beta_1, beta_2).

The only difficult part of the algorithm is deciding whether or not $\sigma_j$ belongs to a $k$-cycle. We study this problem after illustrating the algorithm for a small example.

Betti numbers of the dunce cap. The dunce cap is best created from a triangular piece of soft cloth. As illustrated in Figure IV.13, all three sides are equally long and are glued to each other with matching orientations. To run our algorithm, we need a triangulation of the dunce cap. It is not difficult to construct one, but we have to avoid pitfalls such as creating edges that share more than one endpoint and triangles that share more than one edge. A valid triangulation is shown in Figure IV.14. When we run our algorithm, we first add all vertices, then all edges, and finally all triangles.


Classifying vertices and edges. We now return to the problem of deciding whether the addition of a simplex increases the rank of a cycle group or that of a boundary group. In the former case, we say the simplex creates, and in the latter case it destroys. All vertices create, but edges can create or destroy. For example, the edge $uv$ in Figure IV.11 creates on the left and destroys on the right. To distinguish between the two cases, we maintain the components of the complex throughout the filtration using a union-find data structure, which represents a system of pairwise disjoint sets: the elements are the vertices and the sets are the components of the complex at any moment in time. The data structure supports three types of operations: ADD, FIND and UNION.

Figure IV.13: In the first step, we glue two sides of the triangle, thus forming a cone with a seam. In the second step, we glue the seam along the rim of the cone (not shown).
Figure IV.14: A triangulation of the dunce cap.

The algorithm scans the filtration from left to right and classifies each vertex and each edge as either creating or destroying:

  for $j = 1$ to $m$ do
    case $\sigma_j$ is a vertex $u$: $\sigma_j$ creates; ADD($u$);
    case $\sigma_j$ is an edge $uv$: $A$ = FIND($u$); $B$ = FIND($v$);
      if $A = B$ then $\sigma_j$ creates else $\sigma_j$ destroys; UNION($A, B$) endif
  endfor.
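The forward scan can be sketched in a few lines of Python. This is a minimal illustration, not the Alpha Shape implementation: the function name `classify_edges` and the dictionary-based union-find (without union by size) are assumptions made here for brevity. The vertex labels and the lexicographic edge order are the ones listed in Table IV.1 for the dunce cap.

```python
def classify_edges(vertices, edges):
    """Forward pass: mark each edge as creating (it closes a loop) or
    destroying (it merges two components), tracking beta_0 and beta_1."""
    parent = {}

    def find(u):
        # FIND with path halving; returns the representative of u's set
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    beta0 = beta1 = 0
    for u in vertices:
        parent[u] = u          # ADD: every vertex creates a new component
        beta0 += 1

    history = []
    for u, v in edges:
        a, b = find(u), find(v)
        if a == b:
            beta1 += 1         # the edge creates a 1-cycle
            creates = True
        else:
            parent[a] = b      # UNION: the edge destroys a component
            beta0 -= 1
            creates = False
        history.append((u + v, creates, beta0, beta1))
    return history


# Vertices and edges of the dunce cap triangulation, as listed in Table IV.1.
VERTICES = "123456789ABCD"
EDGES = ("12 13 16 17 19 1A 1C 1D 23 25 28 29 2A 2B 2D 35 36 37 38 3B "
         "3C 45 46 47 48 49 4A 4B 4C 4D 56 5D 67 78 89 9A AB BC CD").split()

history = classify_edges(VERTICES, EDGES)
for name, creates, b0, b1 in history:
    print(name, "creates" if creates else "destroys", b0, b1)
```

Each printed row should agree with the corresponding entry of Table IV.1. Twelve edges destroy and twenty-seven create, which leaves exactly the 27 tunnels that the triangles close afterwards.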

Classifying triangles and tetrahedra. In three-dimensional Euclidean space, every tetrahedron destroys but triangles can destroy or create. Deciding whether or not a triangle belongs to a cycle is not quite as straightforward


triangulation, each closing a tunnel and thus decrementing $\beta_1$. Indeed, no collection of triangles has zero boundary, which can be proved by observing that three edges belong to three triangles each and all other edges belong to two triangles each. The final result is therefore $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Indeed, the dunce cap is connected, all its closed curves bound, and the surface formed by the triangles does not enclose any volume in $\mathbb{R}^3$.
Table IV.1: Evolution of $\beta_0$ and $\beta_1$ while adding the edges of the triangulation in Figure IV.14.

  edge $\beta_0$ $\beta_1$ | edge $\beta_0$ $\beta_1$ | edge $\beta_0$ $\beta_1$ | edge $\beta_0$ $\beta_1$
  12   12   0  | 28   3    1  | 3C   2   10  | 56   1   19
  13   11   0  | 29   3    2  | 45   1   10  | 5D   1   20
  16   10   0  | 2A   3    3  | 46   1   11  | 67   1   21
  17    9   0  | 2B   2    3  | 47   1   12  | 78   1   22
  19    8   0  | 2D   2    4  | 48   1   13  | 89   1   23
  1A    7   0  | 35   2    5  | 49   1   14  | 9A   1   24
  1C    6   0  | 36   2    6  | 4A   1   15  | AB   1   25
  1D    5   0  | 37   2    7  | 4B   1   16  | BC   1   26
  23    5   1  | 38   2    8  | 4C   1   17  | CD   1   27
  25    4   1  | 3B   2    9  | 4D   1   18  |

Standard implementations of the union-find data structure take barely more than constant time per operation. To be more precise, let $A(n)$ be the extremely fast growing Ackermann function. Its inverse, $\alpha(n)$, grows extremely slowly. To get a faint idea of how slowly the inverse grows, we note that $\alpha(n)$ cannot be bounded from above by any constant, but it stays below a small constant unless $n$ is larger than the estimated number of electrons in the universe. Any sequence of $m$ operations takes time at most proportional to $m \alpha(m)$. For all practical purposes, this means that each operation takes only constant time.

and finally all triangles. After adding the thirteen vertices, we have $\beta_0 = 13$ and $\beta_1 = \beta_2 = 0$. The evolution of the Betti numbers while adding the edges in lexicographic order is shown in Table IV.1. There are 27 triangles in the

  ADD($u$): add $\{u\}$ as a new singleton set to the system.
  FIND($u$): return the set that contains vertex $u$.
  UNION($A, B$): substitute $A \cup B$ for the sets $A$ and $B$ in the system.
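A common way to realize the three operations is a forest of parent pointers with union by size and path compression. The sketch below is one such textbook-style realization; the class name `UnionFind` and its method names are chosen here for illustration and are not taken from the Alpha Shape software.

```python
class UnionFind:
    """System of pairwise disjoint sets with ADD, FIND and UNION."""

    def __init__(self):
        self.parent = {}   # parent pointer of every element
        self.size = {}     # number of elements below each root

    def add(self, u):
        # ADD: a new singleton set {u}
        self.parent[u] = u
        self.size[u] = 1

    def find(self, u):
        # FIND: locate the root, then compress the path leading to it
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[u] != root:
            self.parent[u], u = root, self.parent[u]
        return root

    def union(self, a, b):
        # UNION: hang the smaller tree below the root of the larger one
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        return a


uf = UnionFind()
for v in "uvwx":
    uf.add(v)
uf.union("u", "v")
uf.union("w", "x")
print(uf.find("u") == uf.find("v"))   # same component
print(uf.find("u") == uf.find("w"))   # different components
```

Here the sets are identified by their root elements, so FIND returns a representative rather than the set itself. Two elements lie in the same set exactly when their representatives agree.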

as it is for an edge. However, with an extra assumption on the filtration, we can use the dual graph of the complement to classify triangles and tetrahedra the same way as we classified edges and vertices. The most convenient version of this assumption is that the last complex in the filtration, $K_m$, is a triangulation of $\mathbb{S}^3$. Think of $\mathbb{S}^3$ as the one-point compactification of $\mathbb{R}^3$. Given a Delaunay triangulation in $\mathbb{R}^3$, we can construct such a triangulation by adding a dummy vertex and connecting it to all boundary simplices of the Delaunay triangulation. In $\mathbb{R}^3$ and also in $\mathbb{S}^3$, every closed surface bounds a volume. In other words, a triangle completes a 2-cycle iff it decomposes a component of the complement into two. We keep track of the connectivity of the complement through its dual graph, whose nodes are the tetrahedra and whose arcs are the triangles. Figure IV.15 illustrates this construction in two dimensions. Adding a triangle to the

tetrahedra, but this is exactly what the compactification does for us when it adds tetrahedra outside the boundary triangles of the Delaunay triangulation. The running time for classifying all triangles and tetrahedra is again proportional to $m \alpha(m)$.

Summary. The entire algorithm consists of three passes over the filtration:
  1. a forward pass to classify all vertices and edges,
  2. a backward pass to classify all triangles and tetrahedra,
  3. a forward pass to compute the Betti numbers.
Figure IV.16 illustrates the result of the algorithm. In the first two passes, we maintain a union-find data structure, which takes time proportional to $m \alpha(m)$. The third pass does only a constant amount of work per step, namely incrementing or decrementing a counter. The total running time is therefore at most proportional to $m \alpha(m)$.
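The third pass is simple enough to spell out. The sketch below assumes that the first two passes have already marked every simplex as creating or destroying; the function name `betti_evolution` and the input format, a list of (dimension, creates) pairs in filtration order, are hypothetical choices made for this illustration. The toy filtration is a triangle built from three vertices, three edges and finally the triangle itself.

```python
def betti_evolution(marks):
    """Third pass: turn creating/destroying marks into Betti numbers.
    marks is a list of (dimension, creates) pairs in filtration order."""
    beta = [0, 0, 0, 0]                # beta_0 .. beta_3
    evolution = []
    for dim, creates in marks:
        if creates:
            beta[dim] += 1             # a new dim-cycle appears
        else:
            beta[dim - 1] -= 1         # a (dim-1)-cycle becomes a boundary
        evolution.append(tuple(beta))
    return evolution


marks = [(0, True), (0, True), (0, True),      # three vertices create
         (1, False), (1, False), (1, True),    # two edges merge, one closes a loop
         (2, False)]                           # the triangle fills the loop
evolution = betti_evolution(marks)
for step in evolution:
    print(step)
```

The loop does a constant amount of work per simplex, which is why this pass does not contribute to the $m \alpha(m)$ term.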

Figure IV.15: A subcomplex of the Delaunay triangulation and the dual graph of the complement. The region outside the Delaunay triangulation is represented by a single node.

complex effectively removes an arc from the dual graph of the complement. Deciding whether removing an arc splits a component is more difficult than deciding whether adding an arc connects two components. We therefore scan the filtration backward, from right to left:
 

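  for $j = m$ downto $1$ do
    case $\sigma_j$ is a tetrahedron: $\sigma_j$ destroys, unless $j = m$, in which case it creates; ADD($\sigma_j$);
    case $\sigma_j$ is a triangle: let $\tau$ and $\tau'$ be the tetrahedra that share $\sigma_j$; $A$ = FIND($\tau$); $B$ = FIND($\tau'$);
      if $A = B$ then $\sigma_j$ destroys else $\sigma_j$ creates; UNION($A, B$) endif
  endfor.

The algorithm requires that each triangle is shared by two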
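The backward scan can be tried on the smallest triangulation of $\mathbb{S}^3$, the boundary of a 4-simplex. The following sketch is an illustration under stated assumptions: simplices are represented as `frozenset`s of vertex labels, the function name `classify_backward` is made up here, and the two tetrahedra sharing a triangle are found by a linear search rather than through an adjacency structure, which would be too slow for real data.

```python
from itertools import combinations


def classify_backward(filtration):
    """Backward pass over a filtration whose last complex triangulates S^3.
    Returns a dict mapping each triangle and tetrahedron to True (creates)
    or False (destroys)."""
    tetrahedra = [s for s in filtration if len(s) == 4]
    parent = {}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    creates = {}
    m = len(filtration)
    for j in range(m - 1, -1, -1):
        s = filtration[j]
        if len(s) == 4:
            creates[s] = (j == m - 1)      # only the last tetrahedron creates
            parent[s] = s                  # ADD a node of the dual graph
        elif len(s) == 3:
            t1, t2 = [t for t in tetrahedra if s < t]   # the two cofaces
            a, b = find(t1), find(t2)
            if a == b:
                creates[s] = False         # complement stays connected
            else:
                creates[s] = True          # the triangle separates the complement
                parent[a] = b              # UNION
    return creates


# Boundary of a 4-simplex: all proper non-empty subsets of five vertices,
# sorted by dimension (5 vertices, 10 edges, 10 triangles, 5 tetrahedra).
filtration = [frozenset(c) for k in range(1, 5)
              for c in combinations(range(5), k)]
marks = classify_backward(filtration)
creating_triangles = sum(marks[s] for s in marks if len(s) == 3)
creating_tetrahedra = sum(marks[s] for s in marks if len(s) == 4)
print(creating_triangles, creating_tetrahedra)
```

In this example the dual graph has five nodes and ten arcs, so four arcs merge components and the remaining six do not. Accordingly four triangles create and six destroy.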

Figure IV.16: The evolution of the Betti number (the number of tunnels) in the filtration of gramicidin, which is shown in Figures II.3 and II.15.

Bibliographic notes. The incremental algorithm for computing Betti numbers described in this section is taken from [2]. It exploits the fact that the connectivity of the complex determines the connectivity of the complement. This relation is a manifestation of Alexander duality, which is studied in algebraic topology [3, Chapter 3]. This algorithm has been implemented as part of the Alpha Shape software, which computes the Betti numbers of


 

typically thousands of complexes in the filtration of a protein structure in less than a second. The key to achieving this performance is a fast implementation of the union-find data structure, namely one with running time proportional to $m \alpha(m)$ for $m$ operations. The details of such an implementation can be found in most algorithm texts, including [1, Chapter 22]. A proof that the running time cannot be improved from $m \alpha(m)$ to $m$ has been given by Tarjan [4].

[1] T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, Massachusetts, 1990.
[2] C. J. A. Delfinado and H. Edelsbrunner. An incremental algorithm for Betti numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Design 12 (1995), 771–784.
[3] A. Hatcher. Algebraic Topology. Cambridge Univ. Press, England, 2002.
[4] R. E. Tarjan. A class of algorithms which require nonlinear time to maintain disjoint sets. J. Comput. System Sci. 18 (1979), 110–127.

IV.4 Matrix Algorithm

In this section, we develop the linear algebra view of homology and formulate a matrix algorithm for computing Betti numbers. After explaining the algorithm for addition modulo two, we extend it to integer addition.

Incidence matrices. Let $K$ be a simplicial complex with $k$-simplices $\sigma_i$ and $(k-1)$-simplices $\tau_j$. The $k$-th incidence matrix is $[a_i^j]$, where $a_i^j = 1$ iff $\tau_j$ is a face of $\sigma_i$. Using this notation, we can write the $k$-th boundary homomorphism in matrix form: $\partial_k \sigma_i = \sum_j a_i^j \tau_j$. Recall that the $\sigma_i$ form a basis of the $k$-th chain group, $C_k$, and similarly the $\tau_j$ form a basis of $C_{k-1}$. The above formula thus expresses the boundary of every basis element of $C_k$ as a sum of basis elements of $C_{k-1}$. To make this interpretation of the incidence matrix useful for computing Betti numbers, we need to consider more general bases. These can be generated by performing elementary row and column operations:

  exchange row $r$ with row $i$;
  exchange column $j$ with column $s$;
  add row $r$ to row $i$;
  add column $j$ to column $s$.

After a few elementary row and column operations, the matrix is no longer the $k$-th incidence matrix, but it still describes a correspondence between bases of $C_k$ and $C_{k-1}$. Write $g_i$ for the elements of the basis of $C_k$ and $h_j$ for those of the basis of $C_{k-1}$. Exchanging two rows or columns is equivalent to reindexing the $g_i$ or $h_j$. As illustrated in Figure IV.17, adding row $r$ to row $i$ has the effect of replacing $g_i$ by $g_i + g_r$. Adding column $j$ to column $s$ has the effect of replacing $h_j$ by $h_j - h_s$. (Since we deal with idempotent groups, subtraction is the same as addition.) Note that the effect is not symmetric: the basis of $C_k$ changes at the modified row, while the basis of $C_{k-1}$ changes at the modifying column.

Figure IV.17: The effect of elementary row and column operations on the bases of $C_k$ and $C_{k-1}$.

Normal form algorithm. The matrix is in normal form if its non-zero entries are lined up along an initial segment of the main diagonal, as illustrated in Figure IV.18. We can use Gaussian elimination to transform the incidence matrix into normal form. The algorithm uses a boolean function NONZERO that makes sure that during the $\ell$-th iteration the $\ell$-th diagonal entry, $a_\ell^\ell$, is non-zero. It does this by exchanging rows and columns.

  boolean NONZERO($\ell$):
    if $a_\ell^\ell = 0$ and there is a non-zero entry $a_i^j$ with $i, j \geq \ell$ then
      exchange row $\ell$ with row $i$; exchange column $\ell$ with column $j$
    endif;
    return $a_\ell^\ell \neq 0$.

The function fails to make $a_\ell^\ell$ non-zero iff all entries in the remaining sub-matrix are zero.

  for $\ell = 1$ to $\min\{c_k, c_{k-1}\}$ do
    if NONZERO($\ell$) then
      forall rows $i > \ell$ do if $a_i^\ell = 1$ then row $i$ = row $i$ + row $\ell$ endif endfor;
      forall columns $j > \ell$ do if $a_\ell^j = 1$ then col $j$ = col $j$ + col $\ell$ endif endfor
    endif
  endfor.

The algorithm consists of three nested loops, so its running time is at most cubic in the size of the matrix. We use the phrase "assume without loss of generality" (w.l.o.g.) as a short-form for expressing that there is another case that can be handled symmetrically.

Deriving the Betti numbers. Suppose we have transformed all incidence matrices of $K$ into normal form. Write $c_k$, $z_k$ and $b_k$ for the ranks of the chain group $C_k$, the cycle group $Z_k$ and the boundary group $B_k$. As illustrated in Figure IV.18, the $k$-th matrix has $c_k$ rows and $c_{k-1}$ columns. The zero-rows correspond to $k$-cycles, of which we have $z_k$ many. It follows that the number of non-zero entries along the main diagonal is $b_{k-1} = c_k - z_k$. The $k$-th Betti number is the rank of the $k$-th cycle group minus the rank of the $k$-th boundary group: $\beta_k = z_k - b_k$. We can thus derive the Betti numbers from the sizes and numbers of non-zero entries in the normal form matrices. We note that the ranks of the incidence matrices suffice for computing the Betti numbers and it is not necessary to go all the way to normal form. Either way, the running time of the algorithm is cubic in the number of simplices in the complex.

Figure IV.18: The normal form of the $k$-th incidence matrix. It has $c_k$ rows and $c_{k-1}$ columns, $b_{k-1}$ ones along the main diagonal, and $z_k$ zero-rows.

Integer coefficients. The matrix algorithm can be extended to coefficients in $\mathbb{Z}$ instead of $\mathbb{Z}_2$. Before discussing the necessary modifications, we talk about what this means in terms of adding simplices and chains. We start at the beginning. An ordered $k$-simplex is an ordering of the vertices of a $k$-simplex. Two ordered simplices have the same orientation if their orderings differ by an even number of transpositions. Each simplex has two orientations, except if it is a vertex, in which case it has only one. To set the stage, we give each simplex in $K$ an arbitrary but fixed orientation, and for a given oriented simplex $\sigma$, we write $-\sigma$ for the other orientation of the otherwise same simplex.

A $k$-chain is a function from the $k$-simplices to the integers. It is convenient to write this function as a formal polynomial, $c = \sum_i a_i \sigma_i$, where $a_i$ is the function value of $\sigma_i$. We add two chains componentwise, by adding the coefficients of like simplices: $\sum_i a_i \sigma_i + \sum_i b_i \sigma_i = \sum_i (a_i + b_i) \sigma_i$. By definition, the boundary of $\sigma = [u_0, u_1, \ldots, u_k]$ is the alternating sum of ordered $(k-1)$-simplices obtained by dropping one vertex at a time: $\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [u_0, \ldots, \hat{u}_i, \ldots, u_k]$, where the hat marks the deleted vertex. We can check that the boundary is independent of the ordering, as long as it belongs to the same orientation, and that it is the negative boundary for an ordering of the opposite orientation: $\partial_k (-\sigma) = -\partial_k \sigma$. Similarly, we can check that the Fundamental Lemma of Homology still holds: $\partial_{k-1} \partial_k c = 0$ for every $k$-chain $c$. As before, we define the group of $k$-chains, $C_k$, the group of $k$-cycles, $Z_k$, and the group of $k$-boundaries, $B_k$. The $k$-th homology group is again $H_k = Z_k / B_k$, and the $k$-th Betti number is the rank of that homology group: $\beta_k = \mathrm{rank}\, H_k$.

Torsion. A curious new phenomenon that arises with the use of integer addition is algebraic torsion. It does not occur for spaces that can be embedded in $\mathbb{R}^3$, so it is not part of people's immediate experience. Maybe the simplest topological space whose homology groups have torsion is the Klein bottle. It can be constructed from a rectangular piece of paper by gluing opposite sides as shown in Figure IV.19.

Figure IV.19: A triangulated rectangular piece of paper glued to form a Klein bottle.
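For addition modulo two, the normal form reduction and the derivation $\beta_k = z_k - b_k$ fit in a short Python sketch. The function names `normal_form_mod2` and `betti_numbers`, and the convention that rows index $k$-simplices while columns index $(k-1)$-simplices, are assumptions of this illustration. The example is a hollow triangle, followed by the same triangle filled in.

```python
def normal_form_mod2(a):
    """Reduce a 0/1 matrix (list of rows) to normal form in place, using
    addition modulo two. Returns the number of ones on the diagonal."""
    rows = len(a)
    cols = len(a[0]) if rows else 0
    rank = 0
    for l in range(min(rows, cols)):
        # NONZERO: find a 1 in the remaining sub-matrix and move it to (l, l)
        pivot = next(((i, j) for i in range(l, rows)
                      for j in range(l, cols) if a[i][j]), None)
        if pivot is None:
            break                              # remaining sub-matrix is zero
        i, j = pivot
        a[l], a[i] = a[i], a[l]                # exchange rows l and i
        for row in a:                          # exchange columns l and j
            row[l], row[j] = row[j], row[l]
        for i in range(l + 1, rows):           # zero out column l below (l, l)
            if a[i][l]:
                a[i] = [x ^ y for x, y in zip(a[i], a[l])]
        for j in range(l + 1, cols):           # zero out row l right of (l, l)
            if a[l][j]:
                for row in a:
                    row[j] ^= row[l]
        rank += 1
    return rank


def betti_numbers(counts, matrices):
    """counts[k] is the number of k-simplices; matrices[k] is the k-th
    incidence matrix (k >= 1), with rows indexed by the k-simplices."""
    ranks = {k: normal_form_mod2([row[:] for row in m])
             for k, m in matrices.items()}
    betti = []
    for k, c in enumerate(counts):
        z = c - ranks.get(k, 0)        # zero-rows of the k-th matrix
        b = ranks.get(k + 1, 0)        # diagonal ones of the (k+1)-st matrix
        betti.append(z - b)
    return betti


# Edges ab, bc, ca (rows) against vertices a, b, c (columns).
edges_by_vertices = [[1, 1, 0],
                     [0, 1, 1],
                     [1, 0, 1]]
# The triangle abc (row) against edges ab, bc, ca (columns).
triangle_by_edges = [[1, 1, 1]]

hollow = betti_numbers([3, 3], {1: edges_by_vertices})
filled = betti_numbers([3, 3, 1], {1: edges_by_vertices, 2: triangle_by_edges})
print(hollow, filled)
```

The hollow triangle has one component and one loop, and filling it removes the loop. The sketch searches the whole remaining sub-matrix for a pivot, which keeps the code short at the price of the cubic running time mentioned in the text.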

The 1-cycle marked around the neck of the bottle does not bound, but twice that 1-cycle bounds. This is what causes torsion. Since it has torsion, we know that the Klein bottle cannot be embedded in $\mathbb{R}^3$, and when we draw it, we have to allow for a self-intersection.

To describe the phenomenon more generally, we need the fact that every finitely generated abelian group is isomorphic to a direct sum (Cartesian product) of copies of $\mathbb{Z}$ and of cyclic groups: $G \cong \mathbb{Z}^{\beta} \oplus \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \ldots \oplus \mathbb{Z}_{t_q}$. Furthermore, we may require that all $t_i$ are larger than one and that $t_i$ divides $t_{i+1}$ for each $i$. This extra condition fixes $\beta$ and the indices $t_i$. The abelian group is thus the direct sum of a free subgroup, namely $\mathbb{Z}^{\beta}$, and the rest, which is referred to as its torsion subgroup. The rank of the group is the number of copies of $\mathbb{Z}$, which is $\beta$. The $t_i$ are the torsion coefficients. For the Klein bottle, we have $H_0 \cong \mathbb{Z}$, $H_1 \cong \mathbb{Z} \oplus \mathbb{Z}_2$, and $H_2 = 0$. We thus get different Betti numbers for addition modulo 2 and for integer addition, namely $\beta_0 = 1$, $\beta_1 = 2$ and $\beta_2 = 1$ for addition modulo 2, and $\beta_0 = 1$, $\beta_1 = 1$ and $\beta_2 = 0$ for integer addition, but their alternating sums are both equal to the Euler characteristic: $\chi = 1 - 2 + 1 = 1 - 1 + 0 = 0$. Indeed, the Euler-Poincaré Theorem is true independent of the type of coefficients we choose to define homology groups and Betti numbers.

Algorithm revisited. We modify the above algorithm to transform the incidence matrix into normal form. First we extend the elementary row and column operations by allowing the multiplication of entire rows or columns by non-zero integers. A more substantial modification is needed within the function NONZERO, which now attempts to turn the next diagonal entry, $a_\ell^\ell$, into the smallest positive entry achievable by row and column operations. Unless the entire remaining sub-matrix is zero, this attempt will be successful and $a_\ell^\ell$ will divide every entry in the sub-matrix. To see this property, assume there is an entry $a_i^j$, with $i, j \geq \ell$, that is not an integer multiple of $a_\ell^\ell$. If $i = \ell$, we get a positive integer smaller than $a_\ell^\ell$ in a single column operation. Symmetrically, if $j = \ell$, we get such a positive integer in a single row operation. Otherwise, we may assume that $a_\ell^\ell$ divides both $a_i^\ell$ and $a_\ell^j$, and we can make $a_i^\ell$ zero with a row operation. By adding row $i$ to row $\ell$ we keep $a_\ell^\ell$ unchanged and we change $a_\ell^j$ to $a_\ell^j + a_i^j$, which is not an integer multiple of $a_\ell^\ell$. Now we get a positive integer smaller than $a_\ell^\ell$ in a single column operation.

The normal form of a bases transition matrix is the same as before, except that we now allow entries in the main diagonal that are neither zero nor one. Specifically, the initial sequence of ones is followed by integers $t_1, t_2, \ldots$, all larger than one, such that $t_i$ divides $t_{i+1}$ for each $i$. Indeed, the algorithm generates the torsion coefficients with the required properties: since $a_\ell^\ell$ divides every entry in the remaining sub-matrix, it will also divide the future non-zero diagonal entries. As for $\mathbb{Z}_2$ coefficients, we can determine the homology groups directly from the normal forms of all incidence matrices. We get the rank of the $k$-th homology group from the $k$-th and the $(k+1)$-st normal form matrices: $\beta_k = z_k - b_k$. We get the torsion coefficients from the $(k+1)$-st normal form matrix: they are the diagonal entries that exceed one.

The running time of the algorithm is no longer guaranteed to be at most cubic in the number of simplices. Indeed, the sequence of operations is sensitive to the size of the integers that arise, and it is not even clear whether or not it is polynomial in the input size.

Bibliographic notes. The matrix algorithm presented in this section is taken from [2, Chapter 1]. The normal form it uses is sometimes referred to as the Smith normal form [3]. Hence, the algorithm is sometimes called the Smith normal form algorithm. For integer coefficients, it is unclear whether or not its running time is polynomial in the input size. However, it is possible to modify the algorithm to guarantee polynomial running time [1, 4]. The Betti numbers obtained for $\mathbb{Z}_2$ and $\mathbb{Z}$ (or other coefficient groups) are not necessarily the same, but their differences are predictable and described by the Universal Coefficient Theorem of Homology [2, Chapter 7].

[1] R. Kannan and A. Bachem. Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8 (1979), 499–507.
[2] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[3] H. J. S. Smith. On systems of indeterminate equations and congruences. Philos. Trans. 151 (1861), 293–326.
[4] A. Storjohann. Near optimal algorithm for computing Smith normal forms of integer matrices. In "Proc. Internat. Sympos. Symbol. Algebraic Comput., 1997", 267–274.

Exercises

1. Simple graphs. A simple graph is a simplicial complex that consists of vertices and edges but has no triangles or higher-dimensional simplices. Let $n$ be the number of vertices and $m$ the number of edges. Use the language of homology groups to re-confirm the following formulas, which are well-known for simple graphs:
  (i) […] if the graph is a tree,
  (ii) […] if the graph is connected,
  (iii) […] in general.

2. Amino acids. Take the graphs drawn in Figures I.8 and I.9 as definitions of the amino acids as (one-dimensional) topological spaces. Here an atom is a vertex and a bond is an edge, no matter whether or not it has (partial) double bond character.
  (i) Are there any two amino acids with isomorphic graphs? If yes, which ones?
  (ii) Calculate the Betti numbers and Euler characteristics of the graphs.
  (iii) Partition the collection of graphs into classes of the same homotopy type.

3. Equivalence classes. Consider the following topological spaces: a circle, a Möbius strip, a trefoil knot, a sphere with north-pole and south-pole removed, and a plane with origin removed.
  (i) Partition the collection into classes of the same topological type.
  (ii) Partition the collection into classes of the same homotopy type.

4. Torus and projective plane. Take a rectangular piece of paper and orient the left and right sides from top to bottom and the top and bottom sides from left to right. You get a torus if you glue the left side to the right side and the top side to the bottom side, each time with matching orientations. You get a projective plane if you glue again the left to the right and the top to the bottom sides but now with opposing orientations.
  (i) Triangulate the rectangle such that you get a valid triangulation for both ways of gluing its sides.
  (ii) Compute the Betti numbers of the torus and the projective plane by running either the incremental or the matrix algorithm (by hand) on your triangulations.

5. Protein structure. Download a protein structure from the pdb database and use the Alpha Shape software to compute the Betti numbers of its van der Waals and its solvent accessible diagrams.

6. Joins and simplices. A tetrahedron can be defined as the join of two skew line segments in space. The halfway plane is parallel to both line segments and lies exactly halfway between them. Since the line segments are skew, the halfway plane separates the two line segments.
  (i) Show that the halfway plane intersects the tetrahedron in a parallelogram.
  (ii) Decomposing the line segments into $p$ and $q$ pieces implies a decomposition of the tetrahedron into $pq$ joins, which are smaller tetrahedra. Draw the decomposition and highlight the intersection with the halfway plane.

7. Stars and links. Let $K$ be the dual complex of a finite collection of balls in $\mathbb{R}^3$. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the link as the collection of faces of simplices in the star that do not belong to the star: $\mathrm{St}\, u = \{\sigma \in K \mid u \in \sigma\}$ and $\mathrm{Lk}\, u = \{\tau \in K \mid \tau \leq \sigma \in \mathrm{St}\, u, \tau \notin \mathrm{St}\, u\}$.
  (i) Show that $\mathrm{Lk}\, u$ is a complex, that is, every face of a simplex in the link also belongs to the link.
  (ii) Assume $u$ is the center of the ball $b$. The sphere bounding $b$ intersects all other balls in caps. Show that $\mathrm{Lk}\, u$ is isomorphic to the dual complex of that collection of caps.


Chapter V

Shape Features

The topological analysis of spaces, as discussed in Chapter IV, is an important first step, but by itself is insufficient to appropriately characterize the shape of protein structures. To decide what is appropriate, we need to have a purpose. The goal we have in mind is understanding how proteins interact with each other and with other molecules. It appears that organic life is based on computations performed by dynamically matching the (changing) pieces of a three-dimensional puzzle. There is overwhelming evidence that interesting events in such interactions happen preferably in cavities, which are partially protected regions in the protein or molecular assembly, and that local shape complementarity plays a significant role in making such events happen. A statement like this needs to be accompanied by a series of disclaimers: not every interaction is based on shape complementarity, interactions that are based on shape complementarity are not entirely so, and the relevant shape complementarity is local and imperfect. In other words, the situation is hopelessly complicated.

Our goal in this chapter is to introduce mathematical and computational methods that allow us to start talking about the real problem in more precise terms. We do this by introducing three essentially new concepts. In Section V.1, we make an attempt to give a precise meaning to cavities in proteins. The main idea here is to combine the topological concept of a hole with a minimum amount of geometric information, and this information is the evolution of the shape under growth. In Section V.2, we return to homology groups and introduce the concept of topological persistence. It is a measure of how important a topological feature is during the evolution. We see this as a tool to cope with imperfections as it permits us to distinguish topological features from topological noise. In Section V.3, we make an attempt to give a precise meaning to interfaces between interacting molecules. We define an interface as a two-dimensional sheet separating the molecules. While this idea seems simple enough, the details are tricky and require that we use what we learned about pockets and topological persistence. Finally, in Section V.4, we illustrate the concepts using the Alpha Shape software and extensions.

V.1 Pockets
V.2 Topological Persistence
V.3 Molecular Interfaces
V.4 Software for Shape Features
Exercises

V.1 Pockets

In this section, we formalize the idea of a cavity in a protein by introducing the concept of a pocket in a space-filling diagram. All we require is that a pocket be wider on the inside than at possible entrances from the outside. To make this idea concrete, we grow the space-filling diagram and observe how it changes: the relatively narrow entrances close before the inside disappears. In other words, a pocket is a maximal portion of space outside the space-filling diagram that turns into a void before it is subsumed by the growing diagram.

Voids. The simplest type of pocket is a void, which we define as a bounded connected component of the complement. Suppose, for example, that $B$ is a finite collection of closed balls in $\mathbb{R}^3$ and $F = \bigcup B$ is the space-filling representation of a molecule. Since $B$ is finite, the balls cannot cover the entire space, which implies that the complement, $\mathbb{R}^3 - F$, consists of one or more connected components. Exactly one component is unbounded (infinitely large), which we refer to as the outside, and all other components are voids. See Figure V.1 for an illustration of the definition in two dimensions.

Figure V.1: The union of disks has a single (shaded) void. The corresponding void in the dual complex consists of five triangles.

Recall that in Chapter II, we described a deformation retraction from the space-filling diagram, $F$, to the dual complex, $K$. The plain existence of that retraction implies that for each void in $F$ we have a void in $K$ that contains the void in $F$. Indeed, we can reverse the deformation retraction to show that the two voids have the same homotopy type. Since the dual complex is a subcomplex of the Delaunay triangulation, $D$, we may think of each void in $K$ as a collection of tetrahedra. The boundary is a collection of triangles in $K$. This collection bounds in $D$ but not in $K$. It follows that it represents a homology class in the second homology group of $K$. Indeed, the boundaries of the voids form a basis of that homology group. Hence, $\beta_2$ is the number of voids in $K$, which is the same as the number of voids in $F$.

Definition of pockets. A pocket generalizes the concept of a void by relaxing the requirement that it be disconnected from infinity. To formalize this intuition, we need to settle on a growth model. It is convenient to use the one that gave rise to the sequence of alpha complexes, but we should keep in mind that this choice does affect what we do and do not call a pocket. According to this model, the center of the ball $b_i$ in $B$ remains fixed and the radius at time $t$ is equal to the square root of $r_i^2 + t$. We may think of the growth as pushing the points on the boundary of the space-filling diagram outwards, in the direction normal to the surface. Figure V.2 illustrates this view in two dimensions. In the interior of the Voronoi cells, the vector field is defined by the sweeping spheres. We extend it to the rest of space by using the circles that sweep out the Voronoi polygons and the intervals that sweep out the Voronoi edges. Starting at a point outside the space-filling diagram, we follow vectors and thus form a path that may or may not go to infinity. The points that flow to infinity form a single component. We define a pocket as a connected component of the set of points whose paths do not go to infinity. Each pocket is open where it borders the space-filling diagram and closed where it borders the outside. The latter set of points may formally be defined as the intersection of the pocket with the closure of the outside. Its connected components are open two-dimensional sets, which we refer to as the mouths of the pocket. Note that voids are pockets without mouths.

Figure V.2: The growing disks push the points on the boundary outwards, in normal direction. Following the vectors, the points in the shaded region have paths that end at Voronoi vertices.

We recall that is the point at which the affine hull of intersects the affine hull of its dual in the Voronoi diagram. Case C : . lies outside and sees one edge. In the first case. From left to right. Here we have two sub-cases depending on whether sees one or two edges from the outside.4: The thin solid lines represent polygons that meet along a common edge in space. it lies on ones side of the polygon. Case is a tetrahedron. which marks the orthocenter of the triangle. namely in              M1 C1 Case is a triangle and lies in the interior of the corresponding Voronoi edge. At the moment they touch.1 Pockets Evolution of dual complex.3. Metamorphoses and collapses.4.   ¢ Figure V. the orthocenter lies inside the triangle. This cell is encountered at time . Here we have three sub-cases depending on whether sees one. the balls touch the edge at the same moment they encounter the two polygons and one cell dual to the two visible edges and the vertex they share. On the left.3: The vertical lines are side views of polygons in space. The four balls completely surround the Voronoi vertex before they reach it.       ¨ ¨    ¢   Case M : . ¨ ¡  ¡ ¨ ¡  ¡ £ ¨ ¢¡        ¨ ¡ ¨ ¨ . polygon or cell of the Voronoi diagram. and the relative position of its orthocenter. Its orthocenter is necessarily the corresponding Voronoi vertex. sees a vertex of from the outside. The two balls approach the polygon from the same side. all illustrated in Figure V. the three balls touch the Voronoi edge at the same moment they encounter the Voronoi polygon dual to the visible edge. The three balls completely surround the Voronoi edge before they touch at . polygons and cells that correspond to the triangles. we may associate a pocket of the space-filling diagram with a pocket of the dual complex. The dual complex changes only at discrete moment. In Case C and in the last sub-case each of Cases C and C . 
Case M : is a vertex and the orthocenter lies in the interior of the corresponding Voronoi cell. . That edge appears as a solid dot. In the second case. Case C : .     73 Case C : . two or three triangles from the outside. edge. The two balls approach the Voronoi polygon from both sides. namely when the space-filling diagram encounters a new vertex. which is the moment when the -th ball changes from imaginary to real radius. Case is an edge and lies in the interior of the corresponding Voronoi polygon. . both illustrated in Figure V. This is unlikely to happen for molecular data and usually indicates a measurement or modeling mistake. The four balls touch the Voronoi vertex at the same moment they touch the Voronoi edges. lies outside and sees two edges and their shared vertex. There are two generic sub-cases. edges and vertices visible from . . Case M : . In four of the ten cases. eventually touching it at . Similar to voids. ¨ ¢¡ ¨ ¡  ¡ ¨ ¢¡  ¢ ¨  # ¨   ¨ C2 M2 C2    0 ¥   D ¨   Figure V. again by observing how the space-filling diagram changes as it grows. while on the right.V. The solid dot marks the orthocenter of the Delaunay edge. this edge intersects its dual Voronoi polygon. Case M : . There are ten cases distinguished by the dimension of the dual Delaunay simplex. There are three generic subcases. the smaller ball breaks through the outer sphere and starts sweeping out the Voronoi cell on the other side of the polygon. The latter is defined combinatorially. Assuming lies outside the space-filling diagram. this is only possible if the ball centered at that vertex is contained inside the union of the balls centered at the other vertices of . only one simplex is added to the dual complex.

The four cases that add a single simplex are Cases M0, M1, M2 and M3. Consistent with the discussion in Chapter III, we call these operations metamorphoses, since they change the homotopy type. We will see shortly that the remaining six cases do not affect the homotopy type. They can be understood as inverses of the six types of collapses illustrated in Figure V.5.

Figure V.5: From left to right, top to bottom: collapsing a tetrahedron from a triangle, an edge and a vertex (23-, 13- and 03-collapse), collapsing a triangle from an edge and a vertex (12- and 02-collapse), and collapsing an edge from a vertex (01-collapse). In each case, the collapse removes the tetrahedron, the transparent triangles, the dashed edges, and the dotted vertices.

Recall that a principal simplex is not a face of any other simplex in the complex. A proper face $\tau$ of a principal simplex $\sigma$ is free if all simplices that contain $\tau$ are faces of $\sigma$. Such a pair defines a collapse, which is the operation that removes all simplices between and including $\tau$ and $\sigma$. Formally, the complex obtained from $K$ by collapsing the pair is $K - \{\upsilon \mid \tau \le \upsilon \le \sigma\}$, where $\le$ denotes the face relation. It is convenient to specify the type using the dimensions $i = \dim \tau$ and $j = \dim \sigma$ and to talk about $ij$-collapses. Each collapse can be realized as a deformation retraction that pushes a portion of $\sigma$'s boundary through $\sigma$ toward the remaining portion of the boundary. In the process, the retraction removes $\sigma$ and all faces of $\sigma$ that contain $\tau$. Being a deformation retraction, the operation does not affect the homotopy type of the complex, and neither does its inverse. With this notation, the changes in the dual complex described in Case C$j$ are caused by inverses of $ij$-collapses, for $i < j$. Together with the metamorphoses, they define the evolution of the dual complex.

Partial order. Using the classification into ten different operations, we may introduce a partial order on the Delaunay simplices, which we think of as a discretization of the flow along normal vectors. We are only interested in tetrahedra. As noted in Case C3, if the orthocenter $z$ of a Delaunay tetrahedron $\sigma$ lies outside $\sigma$ then it sees either one, two or three of the triangles. For each triangle visible from $z$, we define $\sigma \prec \tau$, where $\tau$ is the tetrahedron on the other side of the shared triangle. To cover the case in which the triangle lies on the boundary of the Delaunay triangulation, we introduce a dummy tetrahedron, $\omega$, that represents the space outside the triangulation. By definition, its orthocenter is at infinity, so $\omega$ can only be a successor but not a predecessor of other tetrahedra. This is what we call a sink of the relation. The other sinks are the tetrahedra that contain their orthocenters.

Note that $\sigma \prec \tau$ implies that the square radius of the orthosphere of $\sigma$ is less than that of the orthosphere of $\tau$. If $\sigma$ and $\tau$ are both (finite) Delaunay tetrahedra, this is true because their orthocenters are Voronoi vertices that lie on the same side of the plane separating $\sigma$ and $\tau$. As illustrated in Figure V.6, the two orthospheres intersect in a circle that lies in the separating plane and the orthocenter of $\tau$ is further from that plane than the orthocenter of $\sigma$. If $\tau = \omega$, this is true because the orthoradius of $\omega$ is infinity, by definition. This implies that the square radius increases along every chain of the relation. Hence, $\prec$ is acyclic and its transitive closure is a partial order.

Figure V.6: Think of the triangles as projections of tetrahedra and the circles as projections of spheres. The centers of both (dotted) orthospheres lie on the right of the separating plane.

Pockets of dual complex. We are now ready to define and compute the pockets of the dual complex using the partial order over the tetrahedra. The ancestor set of a tetrahedron $\tau$ contains $\tau$, its predecessors, the predecessors of the predecessors, and so on. In other words, it consists of all tetrahedra from which $\tau$ can be reached along a chain of the relation.
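The ancestor set just defined can be collected by a backward search along the pairs of the relation. The following is a minimal sketch in Python, assuming the relation is given as a list of (predecessor, successor) pairs; the function name `ancestor_set` and the string labels for tetrahedra are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def ancestor_set(relation, target):
    """Return `target` together with every tetrahedron that reaches it.

    relation: iterable of (predecessor, successor) pairs (assumed input format).
    """
    predecessors = defaultdict(list)
    for pred, succ in relation:
        predecessors[succ].append(pred)

    seen = {target}
    stack = [target]
    while stack:  # backward depth-first search from the target
        current = stack.pop()
        for pred in predecessors[current]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen


# Toy relation: a -> b -> omega, c -> omega, d -> e (e is a finite sink).
relation = [("a", "b"), ("b", "omega"), ("c", "omega"), ("d", "e")]
print(sorted(ancestor_set(relation, "omega")))
```

Because the relation is acyclic, the search visits each pair at most once, so the running time is linear in the number of pairs.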

We have seen that a tetrahedron can have more than one successor. It is also possible that it belongs to more than one ancestor set, although this is not the common case. The pockets in the dual complex are defined by the tetrahedra that neither belong to the dual complex, $K$, nor to the ancestor set of $\omega$. Note that this is more conservative than collecting all tetrahedra outside $K$ that belong to ancestor sets of finite sinks. See Figure V.8 for a two-dimensional illustration. We compute the pockets in two steps:

Step 1. Collect the tetrahedra that belong neither to $K$ nor to the ancestor set of $\omega$.
Step 2. Partition this collection into components.

To collect the tetrahedra, we assume the Delaunay simplices are given in a list ordered by birth-time. As illustrated in Figure V.7, the relation over the tetrahedra is acyclic and goes monotonically from left to right. In Step 1, we mark the tetrahedra in the ancestor set of $\omega$ by searching backward from $\omega$ along the pairs of the relation. Next, we mark the tetrahedra in the dual complex, which form a prefix of the sub-list of tetrahedra. To complete Step 1, we now collect all unmarked tetrahedra in a single scan through the list. The resulting collection contains the tetrahedra of all pockets. In Step 2, we call two tetrahedra in this collection adjacent if they share a triangle that is not in the dual complex. Based on this adjacency information, we can compute the connected components using standard graph algorithms, such as depth-first search or union-find.

Figure V.7: Ordered list of simplices, with the dual complex $K$ as a prefix and $\omega$ at the end, and with the relation over the tetrahedra indicated by arrows.

Figure V.8: The eight disks form one pocket, which connects to the outside along one mouth. The corresponding pocket in the dual complex consists of four triangles and a single mouth edge.

Computing mouths is similar to computing pockets, only one dimension lower. We may do the computation for individual pockets or for all pockets at once.

Step 1. Collect the boundary triangles not in $K$.
Step 2. Partition this collection into components.

In Step 1, we collect the triangles outside $K$ that belong to exactly one pocket tetrahedron. In Step 2, we call two triangles adjacent if they share an edge that does not belong to $K$. Finally, we use the same standard graph algorithms to compute components.

Bibliographic notes. In everyday language we barely make any difference between pockets and other holes, such as the ones counted by the Betti numbers. This has also been noticed by the philosophers Casati and Varzi [1], who introduce a concept they call a hollow which is similar at least in spirit to our formal notion of a normal pocket. The importance of cavities in drug design and discovery has been known for a while [4]. The definition of a pocket is not purely topological and requires a crucial geometric component, namely the growth model of the input balls. This growth model forms the basis of the partial order over the Delaunay tetrahedra. The formalization as pockets introduced in this section has been described in [3] and implemented as part of the Alpha Shapes software. An extension to include simplices of all dimensions has been used for reconstructing the surface of scanned point sets [2] and might have further applications in the analysis of protein shape.

[1] R. Casati and A. C. Varzi. Holes and Other Superficialities. MIT Press, Cambridge, Massachusetts, 1994.
[2] H. Edelsbrunner. Surface reconstruction by wrapping finite sets in space. In Discrete and Computational Geometry: The Goodman-Pollack Festschrift, eds. B. Aronov, S. Basu, J. Pach and M. Sharir, Springer-Verlag, Berlin, to appear.
[3] H. Edelsbrunner, M. A. Facello and J. Liang. On the definition and the construction of pockets in macromolecules. Discrete Appl. Math. 88 (1998), 83–102.
[4] I. D. Kuntz. Structure-based strategies for drug design and discovery. Science 257 (1992), 1078–1082.
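Returning to the two-step pocket computation of Section V.1, the procedure can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Alpha Shapes implementation: tetrahedra are given as hashable labels, the relation as (predecessor, successor) pairs, and adjacency as pairs of tetrahedra that share a triangle outside the dual complex. The names `pockets`, `find` and `shared_triangles` are hypothetical.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def pockets(tetrahedra, dual_complex, relation, shared_triangles, omega="omega"):
    # Step 1a: mark the ancestor set of omega by searching backward.
    preds = {}
    for pred, succ in relation:
        preds.setdefault(succ, []).append(pred)
    marked = {omega}
    stack = [omega]
    while stack:
        for pred in preds.get(stack.pop(), []):
            if pred not in marked:
                marked.add(pred)
                stack.append(pred)

    # Step 1b: mark the dual complex, then collect the unmarked tetrahedra.
    marked |= set(dual_complex)
    pocket_tets = [t for t in tetrahedra if t not in marked]

    # Step 2: union tetrahedra that share a triangle outside the dual complex.
    parent = {t: t for t in pocket_tets}
    for s, t in shared_triangles:
        if s in parent and t in parent:
            parent[find(parent, s)] = find(parent, t)

    components = {}
    for t in pocket_tets:
        components.setdefault(find(parent, t), []).append(t)
    return list(components.values())


# Toy input: k1 is in the dual complex, x1 flows to omega,
# p1 -> p2 ends in a finite sink, q1 is an isolated finite sink.
result = pockets(
    tetrahedra=["k1", "p1", "p2", "q1", "x1"],
    dual_complex={"k1"},
    relation=[("p1", "p2"), ("x1", "omega")],
    shared_triangles=[("p1", "p2"), ("p2", "x1")],
)
print(result)
```

Mouths could be computed with the same union-find routine, applied to triangles and shared edges instead of tetrahedra and shared triangles.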

V.2 Topological Persistence

In this section, we measure the life-time or persistence of a topological feature in an evolving topological space. A prime example of an evolving topological space is a space-filling diagram that grows in the way discussed in the preceding section. The measure can be used to distinguish between pockets with relatively wide and narrow entrances, and it is essential in the definition of molecular interfaces discussed in the next section.

The intuition. The idea of creation and destruction is the same as in Section IV.3 and depends on the effect on the Betti numbers: a $k$-simplex creates if its addition increases $\beta_k$ and it destroys if its addition decreases $\beta_{k-1}$. Consider the evolving two-dimensional space illustrated in Figure V.9 as an example. There are three events at which homology classes are created, namely when the two components get born at the points labeled M0 and when the components merge the second time at the upper point labeled M1. When the components merge the first time, a component gets destroyed, and when the hole gets filled, a 1-cycle gets destroyed. It should be clear that M2 destroys what the upper M1 created, and that the lower M1 destroys what the right M0 created. Nobody destroys the component created by the left M0.

Figure V.9: The region grows from two vertices, the two components merge twice, and the second merge creates a void that eventually disappears. The labels (M0, M0, M1, M1, M2) indicate the types of metamorphoses that correspond to the topological changes.

As before, we write $K_0 \subset K_1 \subset \ldots \subset K_m$ for the corresponding filtration. The $K_j$ are the complexes that arise during the evolution and, in the generic case, any two contiguous complexes differ either by a metamorphosis or an anti-collapse. Each anti-collapse may be viewed as a sequence of metamorphoses in which the later simplices destroy the topological features created by the earlier simplices. For example, the inverse of a 23-collapse consists of a triangle creating a void and a tetrahedron filling the same. The life-time of this void is zero because the triangle and the tetrahedron are added at the same moment. We will see that even if a triangle and a tetrahedron are added at different moments, it is possible to decide in an unambiguous manner whether or not the tetrahedron destroys what the triangle created. If it does, then we are talking about a void with positive life-time, and we may interpret that life-time as a measure of significance of the void. We may also interpret it as a shape measure of the corresponding pocket.

Incremental algorithm revisited. We will formalize the idea of pairing creations with destructions by revisiting the incremental algorithm for Betti numbers presented in Section IV.3. Recall that a single step in that algorithm computes the Betti numbers of a complex $K_j = K_{j-1} \cup \{\sigma\}$ from the Betti numbers of $K_{j-1}$. We study the algorithm in terms of matrices of boundary homomorphisms, which are displayed in Figure V.10. Let the dimension of $\sigma$ be $k$. The only matrices affected by adding $\sigma$ to the complex are the ones of $\partial_k$ and of $\partial_{k+1}$. The new column of the matrix of $\partial_{k+1}$ is zero because $\sigma$ is not a face of any simplex in $K_j$. Hence, the rank of the $k$-th boundary group is the same for $K_j$ as it is for $K_{j-1}$. On the other hand, the rank of the $(k-1)$-st boundary group may remain the same or it may increase. We can thus write the Betti numbers of $K_j$ in terms of the ranks of various groups defined for $K_{j-1}$, using $\beta_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k$, as follows.

Figure V.10: The addition of $\sigma$ to the complex appends a column to the matrix of $\partial_{k+1}$ (from $C_{k+1}$ to $C_k$) and a row to the matrix of $\partial_k$ (from $C_k$ to $C_{k-1}$).

Case $\sigma$ creates. Then $\sigma$ belongs to a $k$-cycle, which implies that its row in the matrix of $\partial_k$ can be zeroed out. Hence the rank of the $k$-th cycle group increases by one while the rank of the $(k-1)$-st boundary group stays the same:

$$\beta_{k-1}(K_j) = \beta_{k-1}(K_{j-1}), \qquad \beta_k(K_j) = \beta_k(K_{j-1}) + 1.$$

In words, when $\sigma$ creates, the $(k-1)$-st Betti number remains unchanged and the $k$-th Betti number increases by one.

Case $\sigma$ destroys. Then $\sigma$ does not belong to a $k$-cycle. Its row in the matrix of $\partial_k$ can therefore not be zeroed out and we get a new non-zero entry in the normal form of that matrix. Hence

$$\beta_{k-1}(K_j) = \beta_{k-1}(K_{j-1}) - 1, \qquad \beta_k(K_j) = \beta_k(K_{j-1}).$$

In words, the $(k-1)$-st Betti number decreases by one and the $k$-th Betti number remains unchanged.

The case analysis confirms that the incremental algorithm as described in Section IV.3 computes the Betti numbers correctly.

Recognizing creations. Besides re-proving the correctness of the incremental algorithm, the above analysis points the way to an alternative procedure for distinguishing creating from destroying simplices. Instead of a union-find data structure, we use elementary row operations, which are slower but more general. Since we only use row operations, columns in the matrix of $\partial_k$ correspond to individual $(k-1)$-simplices and rows represent cycles. To explain the algorithm, we call the column of the rightmost non-zero entry in a row its last column. Clearly, each row has at most one last column. Conversely, we maintain inductively that each column is last for at most one row. For example, this property is satisfied by the matrix in Figure V.11 before the shaded last row is added. After that addition, we use row operations to reinstate the property before adding the next row.

Figure V.11: The shaded rightmost non-zero entries identify last columns of rows.

To describe how this is done, we let $i$ be the index of the row that corresponds to the new simplex $\sigma$, and we assume a function LASTCOL that returns the index of the last column of that row; it returns zero if that last column does not exist. Given a column $c$, we also assume a function ROW that returns the index of the row, among the first $i - 1$ rows, for which $c$ is the last column. It returns zero if the row is not defined. When we add $\sigma$, we attempt to zero out its row from right to left.

    boolean DOESCREATE(int i)
      while c = LASTCOL(i) is non-zero do
        if r = ROW(c) is non-zero then row i = row i + row r
        else return FALSE
        endif
      endwhile;
      return TRUE.

After running Function DOESCREATE for the $i$-th row, that row is either zero, in which case the corresponding simplex creates, or it has a unique last column, in which case the simplex destroys. We argue below that Function DOESCREATE computes more than just Betti numbers: it also determines how long a homological feature lasts along the filtration.

Persistent homology. To make this precise, we return to the situation in which the filtration represents meaningful information, such as scale in the case of alpha shapes. In general, we define persistence so it depends on the time when simplices are added to the complex in the filtration, but to simplify matters here, we re-define time equal to the index. In other words, we say the $j$-th simplex is added at time $j$. Keeping this convention in mind, we now define the $p$-persistent $k$-th homology group of $K_j$ as the cycle group divided by the boundary group $p$ positions later in the filtration:

$$H_k^{j,p} = Z_k^j / (B_k^{j+p} \cap Z_k^j).$$

Taking the intersection of the boundary group with the cycle group is necessary for technical reasons to define the quotient group. Figure V.12 illustrates the difference between the $p$-persistent homology group and the usual or 0-persistent homology group.

Figure V.12: The cycle group $Z^j$ and its decompositions into solid $p$-persistent homology classes (cosets of $B^{j+p}$) and dotted 0-persistent homology classes (cosets of $B^j$).
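Function DOESCREATE described above can be sketched in Python as follows. This is a minimal sketch under assumptions that are not from the text: arithmetic is modulo 2, each row is stored as a set of column indices with non-zero entries, and a dictionary (here called `owner_of_last_col`) plays the role of Function ROW. As an implementation choice, a row that cannot be zeroed out is registered in the dictionary right away so that each column stays last for at most one row.

```python
def does_create(row, owner_of_last_col):
    """Try to zero out `row` from right to left using earlier rows.

    row: set of column indices holding non-zero entries (mod 2); modified in place.
    owner_of_last_col: dict mapping a column to the earlier row whose last
        column it is (hypothetical stand-in for Function ROW).
    Returns True if the row becomes zero, that is, the simplex creates.
    """
    while row:
        last = max(row)                      # LASTCOL
        other = owner_of_last_col.get(last)  # ROW
        if other is None:
            owner_of_last_col[last] = row    # row keeps a unique last column
            return False                     # the simplex destroys
        row ^= other                         # row addition modulo 2
    return True


# Boundary rows of the three edges of a triangle with vertex columns 1, 2, 3.
owner = {}
results = [does_create(set(r), owner) for r in ({1, 2}, {2, 3}, {1, 3})]
print(results)
```

In the toy input, the first two edges each destroy a component and the third edge closes a 1-cycle, so only the third call reports a creation.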

The $p$-persistent $k$-th Betti number is the rank of the $p$-persistent $k$-th homology group:

$$\beta_k^{j,p} = \mathrm{rank}\, H_k^{j,p}.$$

Pairing. We develop an intuitive picture of persistence using the distinction between creating and destroying simplices. Note that the number of creating $k$-simplices until position $j$ in the filtration is the rank of the cycle group, $\mathrm{rank}\, Z_k^j$. Similarly, the number of destroying $(k+1)$-simplices is the rank of the boundary group, $\mathrm{rank}\, B_k^j$. The Betti number is the surplus of creating versus destroying simplices: $\beta_k^j = \mathrm{rank}\, Z_k^j - \mathrm{rank}\, B_k^j$. Because Betti numbers are non-negative, every prefix contains at least as many creating $k$-simplices as destroying $(k+1)$-simplices. In particular, the creating $k$-simplices and destroying $(k+1)$-simplices are arranged like opening and closing parentheses in an expression, except that some closing parentheses may be missing at the end. We can therefore pair them up and form vertex-disjoint intervals, each starting at the position of a creating $k$-simplex and ending at the position of a destroying $(k+1)$-simplex (or extending to infinity if there are no destroying simplices left). We use intervals that are closed to the left and open to the right. The Betti number at position $j$ is then the number of intervals that contain $j$. Any arbitrary pairing creating vertex-disjoint intervals has this property for Betti numbers. (Can you prove that?) In contrast, there is exactly one pairing that has the following stronger property for persistent Betti numbers:

INTERVAL PROPERTY. The $p$-persistent $k$-th Betti number at position $j$ is the number of intervals that simultaneously contain $j$ and $j + p$.

Interval property of persistence. We illustrate this property by drawing a right-angled isosceles triangle below every interval, as shown in Figure V.13. The $p$-persistent $k$-th Betti number of $K_j$ is represented by the point $(j, p)$ in the index-persistence plane. According to the Interval Property, it is the number of right-angled isosceles triangles that contain this point.

Figure V.13: Each right-angled isosceles triangle in the index-persistence plane represents a non-bounding cycle that persists over the complexes covered by its interval. Each triangle is closed along the top and left edges but open along the hypotenuse. (Axes: index, persistence.)

The pairing of simplices to obtain intervals satisfying the Interval Property is done using Function DOESCREATE explained above. Specifically, each destroying $(k+1)$-simplex corresponds to a non-zero row in the matrix of $\partial_{k+1}$ and is paired with the $k$-simplex that corresponds to the last column in that row. Note that this simplex indeed creates, as witnessed by the cycle represented by the row. The persistence of a pair is the time-lag between the additions of the two simplices to the complex in the filtration. In the assumed simplified case in which the $j$-th simplex is added at time $j$, the persistence is the difference between the indices of the two simplices. This is the convention we used to generate Figure IV.16. Figure V.14 shows the persistent first Betti numbers of the space-filling diagram modeling the gramicidin protein. Observe the large triangular plateau, which corresponds to the dominant tunnel that passes through gramicidin.

Figure V.14: Graph of the number of tunnels in log-scale for gramicidin. The index in the filtration varies from left to right and the persistence from back to front.

The running time of the pairing algorithm is roughly the same as that of the normal form algorithm described in Section IV.4, namely cubic in the number of simplices. Indeed, Function DOESCREATE spends fewer row operations per simplex than there are simplices, each taking time at most proportional to the length of a row, which is bounded by the number of simplices.
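The Interval Property translates directly into a counting rule. As a small illustration, the following Python sketch counts the half-open intervals that contain both $j$ and $j + p$; the function name `persistent_betti` and the sample intervals are hypothetical, and unpaired creating simplices are represented by intervals that extend to infinity.

```python
import math


def persistent_betti(intervals, j, p=0):
    """Number of half-open intervals [start, end) containing both j and j + p."""
    return sum(1 for start, end in intervals if start <= j and j + p < end)


# Three intervals for one dimension; the second creator is never destroyed.
intervals = [(1, 4), (2, math.inf), (3, 5)]

print(persistent_betti(intervals, 3))       # ordinary Betti number at position 3
print(persistent_betti(intervals, 3, p=1))  # classes that persist one step longer
print(persistent_betti(intervals, 3, p=2))
```

Setting `p=0` recovers the ordinary Betti number at position `j`, in line with the statement that the Betti number is the number of intervals containing `j`.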

Bibliographic notes. The material for this section is taken from [1], where we find the definition of persistent Betti numbers, the algorithm and its correctness proof. The algorithm has been implemented and experimental results suggest it is considerably faster than the obvious cubic time bound. We should note, however, that the implementation in [1] differs in two possibly significant aspects from the algorithm described in this section. First, the implementation uses a union-find data structure to classify simplices as creating or destroying, and second, it uses a sparse matrix representation that permits row operations in time proportional to the number of non-zero entries. Persistent Betti numbers have been defined independently by Robins [3], who uses them to study the fractal nature of two-dimensional point patterns. Persistent homology groups are embedded in spectral sequences, which are special tables of related homology groups [2]. It might be interesting to explore the other groups in that table and to find meaningful interpretations in the context of alpha complexes.

[1] H. Edelsbrunner, D. Letscher and A. Zomorodian. Topological persistence and simplification. Discrete Comput. Geom. 28 (2002), 511–533.
[2] J. McCleary. A User's Guide to Spectral Sequences. Second edition, Cambridge Univ. Press, England, 2001.
[3] V. Robins. Toward computing homology from finite approximations. Topology Proceedings 24 (1999), 503–532.

V.3 Molecular Interfaces

The interface between two or more interacting molecules is the location of that interaction. In this section, we present a proposal for a surface or complex of surfaces that geometrically represents that interface. One of its applications is to display functions defined over the interface.

Interfaces without boundary. Our definition of a molecular interface is a formalization of two intuitions, namely that the best separation of two or more molecules is part of the Voronoi diagram and that the interesting portion of that separation is protected by a relatively tight seal. We will come back to the second intuition later and formalize the first intuition now. Consider an assembly of $k$ molecules, each represented by a collection of balls in $\mathbb{R}^3$, and let $B$ be the collection of all balls. We use colors to keep track of the correspondence between balls and molecules. Specifically, if a ball belongs to the $i$-th molecule then we say the ball and its Voronoi cell have the color $i$. Recall that the Voronoi diagram of $B$ consists of a polyhedral cell for each ball and of the polygons, edges and vertices shared by the cells. While all cells are mono-chromatic, a polygon can be mono-chromatic or bi-chromatic depending on whether the two cells that share the polygon have the same or different colors. Similarly, edges and vertices get their colors from the cells they belong to. In the generic case, every edge belongs to three and every vertex to four Voronoi cells. This implies that, for any number of colors, an edge has at most three and a vertex at most four colors. The interface between the $k$ colors is the subcomplex of the Voronoi diagram consisting of all bi-chromatic polygons and their edges and vertices. Figure V.15 illustrates the definition by showing the interface of two collections of disks in the plane.

Figure V.15: The solid bi-chromatic edges form the interface of the two collections of disks. The dotted mono-chromatic edges show the rest of the Voronoi diagram.

Local structure. For $k = 2$ colors, the interface has a particularly simple local geometric structure. An interface edge belongs to two cells of one and to one cell of the other color, and exactly two of the three polygons sharing the edge are bi-chromatic and thus belong to the interface. There are two types of interface vertices: those that belong to three cells of one and one cell of the other color and those that belong to two cells of each color. As illustrated in Figure V.16, the local neighborhood of both types of vertices is a topological disk. We conclude that in the generic case the interface for $k = 2$ colors is a 2-manifold, which is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$. By construction, that 2-manifold is orientable, with the cells of one color on one side and the cells of the other color on the other side.

Figure V.16: The shaded polygons and their edges belong to the interface. On the left, we have three cells of one and one cell of the other color. On the right, we have two cells of each color.

For $k \ge 3$ colors, the local structure of the interface can be more complicated because we may have tri-chromatic edges and tri- and four-chromatic vertices. For any two colors, we get a 2-manifold, but now these 2-manifolds meet along curves formed by tri-chromatic edges. In other words, the interface is a two-dimensional complex of sheets, curves and vertices. Every sheet is a maximal component consisting of bi-chromatic polygons, edges and vertices of a given color pair. Similarly, every curve is a maximal component consisting of tri-chromatic edges and vertices of a given color triplet. Finally, every interface vertex is a four-chromatic vertex in the Voronoi diagram. Together, the sheets, curves and vertices form a complex in the sense that the boundary of every sheet consists of finitely many pairwise disjoint curves and vertices, and the boundary of every curve consists of finitely many interface vertices.
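The coloring rules above are simple to state in code. The following Python sketch assumes, purely for illustration, that each Voronoi polygon is identified by the pair of balls whose cells share it and that colors are stored in a dictionary; the names `chromaticity` and `interface_polygons` are hypothetical and not taken from any existing software.

```python
def chromaticity(cells, color):
    """Number of distinct colors among the Voronoi cells sharing an element."""
    return len({color[c] for c in cells})


def interface_polygons(polygons, color):
    """Keep the bi-chromatic polygons, each given as a pair of ball indices."""
    return [pair for pair in polygons if chromaticity(pair, color) == 2]


# Four balls from three molecules (colors) and four Voronoi polygons.
color = {0: "red", 1: "red", 2: "blue", 3: "green"}
polygons = [(0, 1), (1, 2), (2, 3), (0, 3)]

print(interface_polygons(polygons, color))
print(chromaticity((0, 1, 2), color))  # an edge shared by three cells
print(chromaticity((1, 2, 3), color))  # a tri-chromatic edge
```

Grouping the surviving polygons into sheets would additionally require a connected-components pass per color pair, which is omitted here.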

Retraction. As defined above, the interface may go to infinity, which is sometimes a disadvantage. Our goal here is to shrink the interface back to where the molecules are sufficiently close to interact. It seems natural to do this with a distance threshold, but this would most certainly lead to the deletion of interior portions and produce fractured surfaces. We therefore shrink from outside in and use relative rather than absolute distance measurements to decide where to stop the process. To describe the shrinking process, we consider the Delaunay triangulation of the collection of balls. The interface as defined above is dual to the subset of multi-chromatic simplices in this triangulation. We have mono-chromatic vertices and mono- as well as multi-chromatic edges, triangles and tetrahedra. In the first step, we retract the interface back to the multi-chromatic dual of the dual complex and its pockets. In the second step, we use topological persistence to shrink the interface even further. We will return to the second step later.

Note that the first step of the shrinking process is equivalent to removing all tetrahedra outside the dual complex that belong to the ancestor set of the dummy tetrahedron, $\omega$, which represents the space outside the Delaunay triangulation. We use 23-collapses to remove these tetrahedra. In this context, we consider a pair $(\tau, \sigma)$ collapsible if the pair is part of an anti-collapse in the construction of the filtration and the collapse of $\tau$ and $\sigma$ renders the other simplices in this anti-collapse principal. We simplify the algorithm by ignoring principal triangles, edges and vertices. In other words, we delete principal triangles, edges and vertices as soon as they arise. This is equivalent to saying that the effect of the 23-collapse is the inverse of that anti-collapse. We define a retraction as a maximal sequence of collapses; in other words, we collapse as long as we can. In the implementation of this operation, we maintain a stack of candidate pairs. Initially, this stack contains all boundary triangles of the Delaunay triangulation together with their incident tetrahedra. During the process, we take pairs from the stack and add new pairs whenever we create new boundary triangles by collapsing.

    Complex RETRACT:
      while the stack is non-empty do
        (tau, sigma) = POP; COLLAPSE(tau, sigma)
      endwhile.

    void COLLAPSE(tau, sigma):
      if (tau, sigma) is collapsible then
        forall faces upsilon of sigma that contain tau do delete upsilon endfor
      endif.

We may think of a retraction as successively removing sinks from an acyclic directed graph. It follows that the result of the operation is independent of the sequence in which the collapses are performed. Let $K$ denote the dual complex. The result of the retraction is the collection of tetrahedra in the dual complex together with the tetrahedra in the pockets. We further remove all mono-chromatic tetrahedra and let $T$ denote the remaining collection of multi-chromatic tetrahedra. The interface is now obtained as the dual of $T$.

Clipping. More specifically, for each bi-chromatic edge of the tetrahedra in $T$, we add the dual polygon to the interface. There are, however, complications because such a bi-chromatic edge may either be completely or only partially surrounded by tetrahedra in $T$. In the latter case, we clip the polygon before adding it to the interface. Figure V.17 illustrates this idea in two dimensions, but we should keep in mind that the situation in three dimensions is more complicated. A partially surrounded bi-chromatic edge corresponds to a polygon with two types of vertices: those dual to tetrahedra in $T$ and the others. We clip the polygon by cutting each edge connecting vertices of different types with the plane of the corresponding boundary triangle. If that plane does not intersect the dual Voronoi edge, which happens in rare cases, we clip at the endpoint that is closer to the plane. Finally, we connect the cut points in contiguous pairs and retain the portions of the polygon with vertices of the first type.

Figure V.17: The triangles drawn with solid edges are the bi-chromatic triangles constructed by the retraction algorithm. The boldface interface is dual to and clipped at the boundary of this collection.

Further retraction. We now take the shrinking process beyond the retraction from the dummy tetrahedron. Recall that the topological persistence algorithm of Section V.2 generates simplex pairs $(\tau, \sigma)$ with the property that $\sigma$ destroys what $\tau$ created. The dimension of $\sigma$ is one larger than that of $\tau$, but we are only interested in the case in which $\tau$ is a triangle and $\sigma$ is a tetrahedron. We think of the operation that removes $\tau$ and $\sigma$ as a generalization of a 23-collapse, but it is more complicated because $\tau$ is generally not a face of $\sigma$, although it can be. We do the operation only if $\tau$ is a boundary triangle of the current set of tetrahedra and does not belong to the dual complex. We first delete $\tau$ and then retract from $\tau$. If the retraction from $\tau$ reaches far enough, $\sigma$ gets deleted just because it becomes principal. However, it can happen that the retraction does not reach all the way, in which case we recurse for other pairs of simplices before deleting $\sigma$.

To decide whether or not to remove $\tau$ and $\sigma$ in the first place, we compare their persistence with a constant threshold and remove the pair only if it passes this test. The test depends on the moments when $\tau$ and $\sigma$ are born. Note that a pair whose birth times are nested between those of $\tau$ and $\sigma$ passes the test whenever $(\tau, \sigma)$ does; we refer to this monotonicity property as (V.1). This monotonicity property is important for the correctness of the algorithm because if the retraction from $\tau$ does not reach $\sigma$ then this can only be because there is a triangle between $\tau$ and $\sigma$ that split the void created by $\tau$ before it was destroyed by $\sigma$. But then the other part of the void must have been destroyed by a tetrahedron preceding $\sigma$ in the filtration. The monotonicity guarantees that the simplices between $\tau$ and $\sigma$ are removed by recursive deletions so that $\sigma$ can eventually be deleted.

We note that it is possible to use other functions that satisfy the monotonicity property (V.1). For example, we may bias the shrinking process against large triangles and tetrahedra by using a function that is normalized by the birth times. A second potential advantage of such a function over the inverse of the persistence is that it is dimensionless and thus amenable to the use of universally meaningful constant thresholds.

We now restate the algorithm and simplify its description by declaring a 23-collapse as a special case of a removal. Because of our policy to delete principal triangles, edges and vertices as soon as they get created, all other collapses can be ignored. The algorithm maintains a stack of triangle-tetrahedron pairs formed by the topological persistence algorithm. For a fixed threshold, the stack contains all pairs whose triangle lies on the boundary of the current set.

    Complex RETRACTMORE:
      while the stack is non-empty do
        (tau, sigma) = POP;
        if (tau, sigma) passes the threshold test then REMOVE(tau, sigma) endif
      endwhile.

    void REMOVE(tau, sigma):
      if tau is a boundary triangle of the current set and not in the dual complex then
        delete tau;
        forall triangles newly exposed on the boundary do PUSH endfor;
        RETRACT
      endif.

As before, we remove principal triangles, edges and vertices. This is done implicitly during the retraction. Finally, we get the interface by duality from the computed collection of tetrahedra. The running time is dominated by the topological persistence algorithm, which takes cubic time to form the triangle-tetrahedron pairs. With some care, we can implement the rest of the algorithm so it takes only constant time per simplex in the Delaunay triangulation.

Note that we may get different interfaces for different values of the threshold. Since a smaller threshold permits as many or more removals than a larger threshold, the interface shrinks with decreasing threshold. In other words, we get a filtration that is parametrized in a way similar to the sequence of alpha shapes. For a sufficiently large threshold, the interface is the original surface or complex defined by the set of bi-chromatic Voronoi polygons. For a sufficiently small threshold, the interface is empty, unless the dual complex contains bi-chromatic triangles, which would remain. We can further decrease the interface by starting with the set of all Delaunay tetrahedra, but we have to modify the retraction to allow for collapses of simplices in the dual complex. In this case, the interface is guaranteed to be empty for a sufficiently small threshold.

Global structure. For a fixed threshold, the interface is a two-dimensional complex. Its two-dimensional elements are sheets defined by bi-chromatic Voronoi polygons. There are two kinds of one-dimensional elements: the original tri-chromatic curves and the new bi-chromatic curves outlining the sheet boundary created by shrinking. Finally, there are two kinds of zero-dimensional elements, namely the original four-chromatic vertices and the new tri-chromatic vertices forming the curve boundary created by shrinking. We take all sheets and curves as open sets so the complex is a collection of pairwise disjoint open elements.

Note that the elements are not necessarily simply connected. To explore this further, we excise thin strips along the curves to turn each sheet into a connected 2-manifold with boundary. Each component of the boundary is a closed curve outlining a hole in the 2-manifold. A classic result in topology says that two orientable 2-manifolds with boundary are homeomorphic if and only if they have the same genus and the same number of holes. We may think of this manifold as obtained by punching $h$ holes into a $g$-fold torus. Furthermore, the Euler characteristic of a 2-manifold with genus $g$ and $h$ holes is

$$\chi = v - e + t = 2 - 2g - h,$$

where $v$, $e$, and $t$ are the numbers of vertices, edges and triangles of an arbitrary triangulation of the 2-manifold. Given a sheet, it is easy to compute its Euler characteristic and to determine its number of holes. We then get the genus as $g = (2 - h - \chi)/2$.

Bibliographic notes. The material in this section is taken from the recent manuscript by Ban et al. [1]. There is evidence that the geometric interfaces shed new light on the hot-spot theory of protein-protein interaction [4]. A competing proposal for a geometric definition of molecular interfaces can be found in [3], where two independent real parameters are used to define the interface as a portion of the molecular surfaces of the two or more molecules. In topology, 2-manifolds with and without boundary have been studied for more than a century. The fact that the topological type of a connected orientable 2-manifold is determined by the genus and the number of holes can be found in a number of texts, including [2].

[1] Y.-E. A. Ban, H. Edelsbrunner and J. Rudolph. A definition of interfaces for protein oligomers. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[2] W. S. Massey. Algebraic Topology: an Introduction. Springer-Verlag, New York, 1967.
[3] A. Varshney, F. P. Brooks, Jr., D. C. Richardson, W. V. Wright and D. Manocha. Defining, computing and visualizing molecular interfaces. In "Proc. IEEE Visualization, 1995", 36–43.
[4] J. A. Wells. Binding in the growth hormone receptor complex. Proc. Natl. Acad. Sci. 93 (1996), 1–6.
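The relation between Euler characteristic, genus and number of holes used for the sheets of an interface can be checked on small examples. The following is a minimal sketch, assuming a connected orientable sheet given only by its simplex counts; the function names are hypothetical and not part of any software described in this section.

```python
def euler_characteristic(num_vertices, num_edges, num_triangles):
    """Alternating sum v - e + t of a triangulated 2-manifold."""
    return num_vertices - num_edges + num_triangles


def genus(chi, num_holes):
    """Solve chi = 2 - 2g - h for g (connected, orientable 2-manifold with h holes)."""
    twice_g = 2 - num_holes - chi
    if twice_g < 0 or twice_g % 2 != 0:
        raise ValueError("chi and hole count are inconsistent for an orientable surface")
    return twice_g // 2


# A single triangle is a disk, that is, a sphere (genus 0) with one hole.
chi_disk = euler_characteristic(3, 3, 1)

# An annulus between two nested triangles: 6 vertices, 12 edges, 6 triangles, two holes.
chi_annulus = euler_characteristic(6, 12, 6)

print(chi_disk, genus(chi_disk, 1))
print(chi_annulus, genus(chi_annulus, 2))
```

In practice the number of holes would be obtained by counting the closed boundary curves of the sheet; here it is simply passed in as a parameter.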

V.4 Software for Shape Features

In this section, we explore extensions of the Alpha Shape software that are concerned with connectivity information and shape features. We begin with signatures, then proceed to pockets, and finally look at molecular interfaces.

Betti number signatures. As explained in Section IV.2, the components, tunnels and voids of a complex in $\mathbb{R}^3$ are counted by the Betti numbers $\beta_0$, $\beta_1$, and $\beta_2$. They are computed by the algorithm explained in Section IV.3 and displayed to the right of the correspondingly labeled buttons in the signature panel shown in Figure V.19. To the left of each button we can toggle the display of the evolution of the number as a function of the index in the filtration. We refer to these functions as signatures of the data set.

Figure V.18: Three axis-parallel views of the 2,354-th dual complex in the filtration of a periodic zeolite molecule consisting of 1,296 atoms.

Figure V.19: The signature panel with the tunnel signature displayed in log-scale.

As an example consider the zeolite data shown in Figure V.18. Two of the three views are taken along tunnel systems that intersect orthogonally and give rise to a rather complicated cave system. Note that the tunnels shown in the second view are smaller in diameter than those shown in the third view. It follows that there are complexes in the filtration that have the tunnels in the first system closed while the tunnels in the second system are still open. The two systems can be detected in the tunnel signature shown in Figure V.19. The index 2,354 belongs to the higher of the two plateaus, which implies that both tunnel systems are open in the displayed complex.

The persistence of the tunnels is formally defined in Section V.2. Figure V.20 shows the two-dimensional tunnel signature with filtration index increasing from left to right and persistence increasing from back to front. The two persistent tunnel systems are visible as plateaus that escape the noise removal the longest.

Figure V.20: The graph of the two-dimensional tunnel signature, the number of tunnels in log-scale, of the zeolite data. The noise in the signature decreases from back to front.

Displaying pockets. Prior to developing and implementing pockets, we have experimented with other and more simple-minded ideas aimed at getting a handle on cavities in molecular data. One such idea was to display the difference between the Delaunay triangulation and the dual complex, or more generally the difference

$K_j - K_i$ of two dual complexes in the filtration, with $i < j$. This difference between two dual complexes can be computed in the Alpha Shape software by first selecting $i$ and $j$ and second pushing the 'Difference' button in the scene panel. The results are not encouraging because a typically large number of inessential simplices clutters the view of important cavities. In contrast, the dual set of a pocket usually gives a clear indication of the cavity, as in Figure V.21. We should keep in mind that the pocket in the dual complex is geometrically considerably larger than the pocket in the corresponding space-filling diagram. This effect is the reverse of that for the molecule, whose dual complex is considerably smaller than a corresponding space-filling diagram, as shown in Figure II.17.

Figure V.21: All pockets in the dual complex of the zeolite data for index 2,926.

Remember that pockets in the dual complex are not closed under the face relation, which may lead to confusion. For example, two pockets may appear connected but are not because of missing shared triangles. We observe the same phenomenon for the mouths of a pocket. Two boundary triangles that share a common edge may or may not belong to the same mouth depending on which shared edges belong to the pocket. It is possible to visually inspect the connectivity by turning on the display of simplices of all dimensions in the scene panel, and using the explosion function to separate all simplices. The software indicates the presence or absence of boundary triangles by the choice of color. The mouth regions are therefore visually easily identifiable. However, the internal connectivity of the pockets is not immediately visible.

Figure V.22: Side view of the largest pocket of the collection shown in Figure V.21.

Pocket panel. Pockets can be computed without opening the pocket panel, but a more detailed exploration requires interaction with the software, which is facilitated by that panel. The main design of the pocket panel, shown in Figure V.23, is similar to that of the signature panel. It contains a window for its own signatures, which start after the index of the first chosen complex. The second index, $j$, can be chosen anywhere between $i$ and the maximum. It is used to eliminate ancestor sets of tetrahedra whose indices are larger than or equal to $j$. In other words, all tetrahedra with index at least $j$ are treated like the dummy tetrahedron in the computation of pockets. This elimination of large pockets helps in the exploration of detailed structures, such as side pockets of larger pockets.

Figure V.23: Pocket panel of the Alpha Shape software.

The interface also supports the display of individual pockets. The panel provides a means to step through the sequence of individual pockets and to select pockets by their number of mouths. For example, Figure V.22 shows the largest of the pockets in Figure V.21 from a different angle. A useful feature is the 'Shapewire' button, which can be used to display the edge skeleton of the dual complex together with the pockets. The skeleton does not block the view and helps positioning the pockets relative to the complex. An example is shown in Figure V.24, which shows the pockets filling the system of narrow tunnels visible in the second view in Figure V.18, but with the second index set such that the system of wider tunnels visible in the third view of Figure V.18 is still open. Both systems are shown as holes in Figure V.18. The pockets thus only fill the remains of the narrow tunnels, and as can be seen in the first view, these remains are not connected.

Figure V.24: Three axis-parallel views of the pockets representing the narrow tunnel system decomposed into pieces by opening up the wide tunnel system.

Displaying interfaces. The interface software has been developed by Yih-En (Andrew) Ban but is not yet complete. It is currently not part of the Alpha Shape software. It is built on top of the Alpha Shapes software but requires a variety of additional features to be useful to biologists. Some of these features can be seen in visualizations of interfaces presented in this section.

A human growth hormone example. [The input is a complexed collection of proteins.] [Mention the issue of water molecules, which we remove for simplicity.] [Say a few words about the particular two proteins.] [Show the sequence of figures illustrating the interface filtration.] [Talk about the weighted square distance function over the interface.] [Show one figure with iso-lines of that function.]

Bibliographic notes. The pocket software has been developed by Michael Facello and is described in his dissertation [1]. Using this software, Liang and collaborators [3] studied fifty-one proteins and their cavity structure. The most interesting outcome of that study is perhaps that in about 80% of the cases, the pocket with the largest volume is also the biologically active site of the molecule. In many instances, the largest pocket is assisted in its function by smaller auxiliary pockets in the vicinity. In another application, Liang and Dill [2] provide numerical evidence that proteins are packed tighter in the core than near the outside. The persistence software has been developed by Afra Zomorodian and is described in his dissertation [4].

[1] M. A. Facello. Geometric Techniques for Molecular Shape Analysis. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.
[2] J. Liang and K. A. Dill. Are proteins well-packed? Biophysics J. 81 (2001), 751–766.
[3] J. Liang, H. Edelsbrunner and C. Woodward. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand binding. Protein Science 7 (1998), 1884–1897.
[4] A. Zomorodian. Analyzing and Comprehending the Topology of Spaces and Morse Functions. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 2001.

Exercises

1. Paired parentheses. Consider a sequence of parentheses of a well-formed expression. Each parenthesis has an integer position in the sequence. A pairing is a perfect matching between the opening and closing parentheses such that the opening parenthesis precedes the closing parenthesis in every pair, and the length of a pair is the position of the closing minus the position of the opening parenthesis.
(i) Given a pairing, let $L$ be the sum of lengths of the pairs. Prove that …
(ii) Prove that $L$ depends on the given sequence but not on the pairing.

2. Collapsible complexes. We call a simplicial complex $K$ collapsible if there is a sequence of collapses that reduces it to a single vertex. Recall that a contractible topological space has the homotopy type of a point. Clearly, if $K$ is collapsible then its underlying space is contractible.
(i) Prove that if $K$ is embedded in $\mathbb{R}^2$ then $K$ is collapsible iff its underlying space is contractible.
(ii) Give an example of a simplicial complex embedded in $\mathbb{R}^3$ that is not collapsible but whose underlying space is contractible.

3. Barycentric subdivision. Let $K$ be a simplicial complex and let $\mathrm{Sd}\,K$ denote its barycentric subdivision.
(i) Show that each $k$-simplex in $K$ gives rise to $(k+1)!$ $k$-simplices in $\mathrm{Sd}\,K$.
(ii) Prove that the Euler characteristic of $K$ and $\mathrm{Sd}\,K$ are the same.

4. 2-manifolds. Recall that a 2-manifold is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$.
(i) Show that a two-dimensional simplicial complex in which every edge belongs to exactly two triangles is not necessarily a 2-manifold.
(ii) Show that a simplicial complex in which the closed star of every vertex is the triangulation of a disk is necessarily a 2-manifold.

5. Connectivity of voids. A void of a space-filling diagram is by definition connected but can have handles and islands.
(i) How would you define the Betti numbers of a void?
(ii) Following your definition, can the Euler characteristic of a void be any integer or are there restrictions?

6. Ancestor sets in the plane. Consider the Delaunay triangulation of a finite point set in $\mathbb{R}^2$. Write $\sigma \prec \tau$ if the two Delaunay triangles share an edge and both orthocenters lie on $\tau$'s side of that edge.
(i) Prove that $\prec$ is a partial order.
(ii) Prove that the ancestor sets of any two different sinks in the order are disjoint.

7. Gabriel graph. Let $S$ be a finite set of points in $\mathbb{R}^2$. The Gabriel graph of $S$ consists of all edges $ab$ for which $\|a - p\|^2 + \|b - p\|^2 \geq \|a - b\|^2$ for all points $p \in S$.
(i) Prove that all edges in the Gabriel graph belong to the Delaunay triangulation of $S$.
(ii) Prove that the Gabriel graph is connected.
(iii) Explain how the Gabriel graph relates to the ancestor sets of the sinks.

8. Sperner's Lemma. Let $abc$ be a triangle and $K$ a triangulation of $abc$. The label of a vertex in $K$ that lies on the edge $ab$ is either $a$ or $b$, and similarly for the other two edges, and the label of a vertex in the interior of $abc$ is either $a$, $b$, or $c$.
(i) Prove that there exists at least one triangle in $K$ whose vertices have three different labels.
(ii) Strengthen the result in (i) by proving that the number of triangles with three different labels is odd.
(iii) What would be a natural generalization of these results from a triangle to a tetrahedron?


Chapter VI

Density Maps

Morse theory grew out of the study of the variational methods in analysis. The initial interest focused on high- and possibly infinite-dimensional settings. In this chapter, we introduce Morse theory with an emphasis on the two- and three-dimensional cases. While Morse theory requires differentiable spaces and thus seems to be built on rather specialized assumptions, we will see that many themes are familiar from Chapter IV. In some ways, Morse theory is but a different language or framework to talk about connectivity. The differentiability assumption allows the introduction of otherwise undefined concepts. Together with suitable non-degeneracy assumptions, it brings order into the complicated world of geometric form. Possibly the best known result in Morse theory is the relation between the critical points of a smooth real-valued function over a manifold and the Euler characteristic of that manifold. Because of this relation, Morse theory is sometimes also referred to as critical point theory.

We use two sections to introduce the basic setting of Morse theory and one to explain the concept of molecular pockets in Morse theoretic terms. In the second section, we make an effort to relate the Morse theoretic concepts with the discussion on connectivity.

[The material will have to be partially rearranged according to the following plan of sections:]

VI.1 Morse Functions
VI.2 Critical Points
VI.3 Morse-Smale Complexes
VI.4 Jacobian Submanifolds
Exercises

VI.1 Smooth vs. Piecewise Linear

Morse theory talks about manifolds and smooth functions over these manifolds. The primary goal is to find out about the topological type of the manifolds through a differential analysis of the functions. A Morse function is a smooth real-valued map over a manifold that satisfies certain non-degeneracy assumptions. This section introduces Morse functions as a crucial piece in the basic mathematical framework of Morse theory.

Sweeping a torus. The standard introductory example is the torus $M$ embedded in upright position in $\mathbb{R}^3$ and the height function $h : M \to \mathbb{R}$ this embedding defines. Formally, $h$ is defined by mapping each point to its distance from the horizontal coordinate plane. For each $a \in \mathbb{R}$, we consider the set $M_a$ of points with height less than or equal to $a$. It is instructive to look at the evolution of the homotopy type of $M_a$. As illustrated in Figure VI.1, $M_a$ changes its topology only at certain critical values of $a$. Each time the homotopy type of $M_a$ changes, we can interpret this event as attaching a cell of some dimension.

Figure VI.1: Evolution of the torus in the sweep from bottom to top and the corresponding construction by attaching a 0-cell, two 1-cells, and a 2-cell.

A $k$-cell, $e_k$, is a space homeomorphic to the $k$-dimensional ball. To define what attaching a cell exactly means, note that the boundary of $e_k$ is a $(k-1)$-sphere. The attachment of $e_k$ to a space $X$ requires a continuous map $g : \partial e_k \to X$, which we refer to as the gluing map. Then $X$ with $e_k$ attached by $g$ is the space obtained by identifying every point $x \in \partial e_k$ with $g(x)$. All interior points of the cell do not belong to $X$. For $k = 0$ we have empty boundary, so attaching a point or 0-cell is the same as taking the disjoint union. The evolution of the torus during the sweep and the interpretation of attaching cells is illustrated in Figure VI.1.

Smooth manifolds. In order to relate the topological type to differential properties, we need to restrict ourselves to sets for which such properties are defined. We need some basic definitions from differential geometry to express these restrictions. A map $f : U \to V$ from an open set $U \subseteq \mathbb{R}^k$ to another open set $V \subseteq \mathbb{R}^l$ is smooth if the partial derivatives of all orders exist and are continuous. For general and not necessarily open sets $X \subseteq \mathbb{R}^k$ and $Y \subseteq \mathbb{R}^l$, the map $f : X \to Y$ is smooth if for every $x \in X$ there exists an open set $U$ containing $x$ and a smooth map $F : U \to \mathbb{R}^l$ that coincides with $f$ throughout $U \cap X$. Note that the composition of two smooth maps is smooth. A diffeomorphism is a smooth homeomorphism whose inverse is also smooth, and two spaces are diffeomorphic if there is a diffeomorphism between them. A subset $M \subseteq \mathbb{R}^n$ is a smooth manifold of dimension $d$ if each $x \in M$ has a neighborhood that is diffeomorphic to an open subset of $\mathbb{R}^d$. A particular diffeomorphism from such an open subset to the neighborhood is called a parametrization of the neighborhood, and its inverse is called a coordinate system on the neighborhood.

As an example we may consider the 2-sphere $\mathbb{S}^2$. We can cover $\mathbb{S}^2$ with six open hemispheres defined by $x_i > 0$ and $x_i < 0$, for $i = 1, 2, 3$. As shown in Figure VI.2, each hemisphere can be parametrized by orthogonal projection to one of the coordinate planes.

Figure VI.2: The upper open hemisphere is parametrized by projection to the $x_1 x_2$-plane.

For a point $x \in M$, we can construct a $d$-dimensional hyperplane in $\mathbb{R}^n$ that best approximates $M$ near $x$. The tangent space at $x$, denoted $TM_x$, is the $d$-dimensional hyperplane through the origin of $\mathbb{R}^n$ that is parallel to this best approximating hyperplane. The elements of the vector space $TM_x$ are called tangent vectors to $M$ at $x$.

Note that for every smooth curve passing through $x$, the tangent vector of the curve at $x$ is a tangent vector to $M$ and thus an element of $TM_x$.

Critical points. Assuming a local coordinate system in a neighborhood, a point $x \in M$ is a critical point of $f$ if all derivatives vanish. If $x$ is a critical point then $f(x)$ is a critical value. Non-critical points and non-critical values are also referred to as regular points and regular values. The homotopy type of the partial torus $M_a$ changes when $a$ passes the height value of the points $p$, $q$, $r$, and $s$ marked in Figure VI.1. These are the points with horizontal tangent planes.

Just like the first derivative can be used to compute the best linear approximation to $f$, the second derivative can be used to compute the best quadratic approximation. Specifically, the Hessian of $f$ at $x$ is the matrix of second derivatives. A critical point is non-degenerate if the Hessian is non-singular. We call $f$ a Morse function if all critical points are non-degenerate. Non-degenerate critical points are isolated, which means there is an open neighborhood without other critical points. This fact is also expressed in the lemma of Morse.

MORSE LEMMA. Let $p$ be a non-degenerate critical point with index $\lambda$ of $f$. There is a neighborhood $U$ of $p$ and a local coordinate system in $U$ with $x_i(p) = 0$ for all $i$ and

$$f = f(p) - x_1^2 - \ldots - x_\lambda^2 + x_{\lambda+1}^2 + \ldots + x_d^2$$

throughout $U$, where $d$ is the dimension of the manifold $M$.

Index. The Hessian is symmetric and we can compute its eigenvalues. Assuming the Hessian is non-singular, all eigenvalues are non-zero. The index of $f$ at a non-degenerate critical point is the number of negative eigenvalues. Recall that the eigenvectors define an orthogonal coordinate system in the tangent space of $M$. The index is then the number of eigenvector directions along which $f$ decreases. For example, the indices of the critical points $p$, $q$, $r$, and $s$ in Figure VI.1 are 0, 1, 1, and 2. Note that the dimensions of the cells attached to the evolving torus in Figure VI.1 are equal to the indices of the corresponding critical points. This is generally the case because a critical point with index $\lambda$ connects to the past along $\lambda$ directions. These directions span a $\lambda$-dimensional cell needed to realize the connections.

A quadratic function in two variables has only three types of critical points, maxima, saddles, and minima. Consider the height function defined by $h(x_1, x_2) = \pm x_1^2 \pm x_2^2$. The origin is a critical point for every possible assignment of signs to the two terms, and it is a maximum for $-x_1^2 - x_2^2$, a saddle for $x_1^2 - x_2^2$ or $-x_1^2 + x_2^2$, and a minimum for $x_1^2 + x_2^2$. The saddle is the most interesting case of the three because a circle drawn around it has two peaks alternating with two pits. In contrast, a circle drawn around a regular point has only one peak and one pit. Critical points with small circles that oscillate more often than twice are necessarily degenerate.

Degenerate critical points. A 1-dimensional manifold is a closed curve, which is homeomorphic to $\mathbb{S}^1$. A connected open subset is an open interval, which is homeomorphic to $\mathbb{R}$. As an example consider $f(x) = x^3$. The derivative vanishes at 0. The second derivative vanishes too, which identifies 0 as a degenerate critical point. Figure VI.3 illustrates the instability of the degenerate critical point. Geometrically, the degeneracy is manifested by the fact that an arbitrarily small perturbation can remove the critical point or turn it into two non-degenerate ones, a maximum and a minimum.

Figure VI.3: Graphs of three functions. Critical points are marked. The middle function has a degenerate critical point at 0, which is unfolded in different ways by the other two functions.

A similar degenerate critical point exists for the monkey saddle shown in Figure VI.4. It may be specified as the graph

of the function $f(x_1, x_2) = x_1^3 - 3 x_1 x_2^2$, which is the real part of $(x_1 + i x_2)^3$. As we go around a circle centered at the origin, the function has three peaks at angles $0°$, $120°$, and $240°$, and three pits at angles $60°$, $180°$, and $300°$. The only critical point is the origin. The matrix of second derivatives at that point is

$$\begin{pmatrix} 6 x_1 & -6 x_2 \\ -6 x_2 & -6 x_1 \end{pmatrix},$$

which is zero at 0.

Figure VI.4: Monkey saddle with degenerate critical point.

All critical points in the above examples are isolated, but there are others that are not. For example, if we lay down the torus on its side, the height function has a circle of minima and another circle of maxima. Similarly, for $f(x_1, x_2) = x_1^2$ the entire $x_2$-axis is critical, but none of its points are isolated.

Euler characteristic. Let $M$ be a compact and smooth manifold without boundary and $f : M \to \mathbb{R}$ a Morse function. We will see in Section VI.2 that we can construct a $k$-cell for each index-$k$ critical point so that $M$ can be constructed by successive attachment of these cells. Let $c_k$ be the number of critical points of index $k$. As always, the Euler characteristic is the alternating sum of cells, which is also the alternating sum of critical points,

$$\chi = \sum_k (-1)^k c_k .$$

For example for the torus we get $\chi = 0$, no matter what Morse function we use. In words, for every minimum and maximum we get exactly one (non-degenerate) saddle point. For the sphere we get $\chi = 2$. This implies that every Morse function of the sphere has at least two (non-degenerate) critical points. A minimum example is the ordinary height function, which has a minimum at the south-pole and a maximum at the north-pole.

Bibliographic notes. The original development of Morse theory from its variational background is described by Morse [3] and by Seifert and Threlfall [4]. Milnor's later book [2] emphasizes the topological analysis of manifolds and has since become a standard reference in Morse theory. Good introductory texts to the related subject of differential topology are the books by Guillemin and Pollack [1] and by Wallace [6]. A good introduction to linear algebra including an intuitive discussion of eigenvalues and eigenvectors is the book by Strang [5].

[1] V. Guillemin and A. Pollack. Differential Topology. Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
[2] J. Milnor. Morse Theory. Princeton Univ. Press, New Jersey, 1963.
[3] M. Morse. The Calculus of Variations in the Large. Amer. Math. Soc., New York, 1934.
[4] H. Seifert and W. Threlfall. Variationsrechnung im Großen. Published in the United States by Chelsea, New York, 1951.
[5] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
[6] A. Wallace. Differential Topology. First Steps. Benjamin, New York, 1968.
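The notions of index, degeneracy and the alternating sum of critical points from Section VI.1 can be illustrated with a few lines of code. The following is a minimal sketch for functions of two variables, assuming the Hessian entries at the critical point are supplied by the caller; the function names and the tolerance are hypothetical choices made for illustration.

```python
import math


def classify_critical_point_2d(hxx, hxy, hyy, tol=1e-12):
    """Classify a critical point from its symmetric Hessian [[hxx, hxy], [hxy, hyy]].

    Returns (type, index); the index is the number of negative eigenvalues,
    or None if the Hessian is singular (degenerate critical point).
    """
    mean = 0.5 * (hxx + hyy)
    radius = math.hypot(0.5 * (hxx - hyy), hxy)
    eigenvalues = (mean - radius, mean + radius)
    if any(abs(ev) <= tol for ev in eigenvalues):
        return "degenerate", None
    index = sum(1 for ev in eigenvalues if ev < 0)
    return ("minimum", "saddle", "maximum")[index], index


def euler_from_critical_points(counts):
    """Alternating sum of critical point counts; counts[k] is the number with index k."""
    return sum((-1) ** k * c for k, c in enumerate(counts))


print(classify_critical_point_2d(2.0, 0.0, -2.0))   # Hessian of x^2 - y^2 at the origin
print(classify_critical_point_2d(0.0, 0.0, 0.0))    # Hessian of the monkey saddle at the origin
print(euler_from_critical_points([1, 2, 1]))         # upright torus: one min, two saddles, one max
print(euler_from_critical_points([1, 0, 1]))         # sphere with the ordinary height function
```

The closed form for the eigenvalues is specific to symmetric 2-by-2 matrices; for higher dimensions one would use a general symmetric eigenvalue routine instead.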

VI.2 Morse-Smale Complexes

In this section, we introduce the gradient of a Morse function and use it to construct the $k$-cells whose inductive attachment reproduces the evolution of the homotopy type of $M_a$, for continuously increasing real threshold $a$.

Gradient flow. A vector field maps every point $x \in M$ to a tangent vector in $TM_x$. The gradient of a linear map is the vector of its partial derivatives. The same concept can also be defined for a Morse function $f$. Assuming an orthonormal local coordinate system at $x$, the gradient of $f$ is

$$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_d}(x) \right),$$

same as for linear maps. We can define it also without reference to a coordinate system. The gradient is the particular vector field that satisfies $\langle v, \nabla f \rangle = v[f]$, for every vector field $v$, where $v[f]$ is the directional derivative of $f$ along $v$. For example, if we have a smooth curve $\gamma$ with velocity vector $\dot\gamma$ then the derivative of $f \circ \gamma$ can be computed using the gradient as $\langle \dot\gamma, \nabla f \rangle$. It is the projection of a normal vector of the graph of $f$ and points in the direction of the steepest ascent. The gradient vanishes precisely at all critical points of $f$.

If we start at a regular point and follow the gradient we trace out a path. This path is called an integral line, which is a solution to the ordinary differential equation $\partial \gamma / \partial s = \nabla f(\gamma(s))$. It depends smoothly on the initial condition, which is its regular starting point. Two integral lines can therefore not cross. Neither can an integral line fork, and because we can reverse the gradient vector field by considering $-f$, two integral lines can also not merge. Every maximal integral line is open at both ends and thus a map of an open interval or, equivalently, of the real line. It approaches two critical points, which we refer to as its origin and destination. It is convenient to consider each critical point as an integral line by itself so that the collection of integral lines partitions $M$. Every regular point belongs to an integral line, and two maximal integral lines are either disjoint or the same. The patterns of integral lines in the neighborhoods of a regular and several critical points on a smooth 2-manifold are shown in Figure VI.5.

Figure VI.5: From left to right, the flow in the neighborhoods of a regular point, a minimum, a saddle, and a maximum.

Stable manifolds. The stable manifold of a critical point $p$ is the union of integral lines with destination $p$ and, symmetrically, the unstable manifold is the union of integral lines with origin $p$. Note that the dimension of each stable manifold is the index of the critical point that defines it. In a 2-manifold, the stable manifold of a minimum is the minimum itself. The stable manifold of a saddle is an open curve, which is the union of two integral lines and the saddle itself. The stable manifold of a maximum is an open disk, which is the union of a circle of integral lines and the maximum itself. All three cases are illustrated in Figure VI.6.

Figure VI.6: From left to right, the stable manifold of a minimum, a saddle, and a maximum of a two-dimensional Morse function.

Each stable manifold is the injective image of an open ball. However, the closure of a stable manifold is not necessarily homeomorphic to a closed ball, as indicated by the examples in Figure VI.6. Nevertheless, the closure of each stable manifold is the union of (open) stable manifolds. The collection of stable manifolds thus satisfies the two conditions of an open complex: its cells partition $M$ and the boundary of every cell is a union of other cells. By symmetry, everything we said about stable manifolds is also true for unstable manifolds. The dimension of the unstable manifold of a critical point is the co-dimension of the stable manifold.
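An integral line can be approximated numerically by following the gradient in small steps. The following is a minimal sketch using forward Euler steps in the plane; the example function, the step size and the stopping tolerance are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def trace_integral_line(grad, start, step=0.01, tol=1e-8, max_steps=100000):
    """Follow the gradient from a regular starting point with forward Euler steps.

    Stops when the gradient (nearly) vanishes, that is, close to a critical
    point that serves as the destination of the integral line.
    """
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < tol:
            break
        x, y = x + step * gx, y + step * gy
        path.append((x, y))
    return path


def f(x, y):
    """Example function with a single maximum at the origin (assumed for illustration)."""
    return -(x * x) - 2.0 * (y * y)


def grad_f(x, y):
    """Gradient of the example function."""
    return (-2.0 * x, -4.0 * y)


path = trace_integral_line(grad_f, (1.0, 0.5))
print(len(path), path[-1])
```

Because the gradient points in the direction of steepest ascent, the value of the function does not decrease along the computed path, and the path ends near the maximum. Running the same code on the negated gradient traces the line backwards toward its origin, where one exists.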

Morse-Smale functions. We may refine the complexes of stable and unstable manifolds by forming unions of integral lines that agree on both limiting critical points. This amounts to overlaying the two complexes. In doing so, it is convenient to assume that the stable and unstable manifolds intersect in a generic manner. To explain what this means, we consider a point $x$ common to a stable manifold and an unstable manifold. The intersection is transversal at $x$ if the tangent spaces of the two manifolds at $x$ span the tangent space $TM_x$. Equivalently, the dimension of the intersection of the two tangent spaces is the sum of their dimensions minus the dimension of $M$. A Morse-Smale function is a Morse function whose stable and unstable manifolds intersect only transversally. For example, the height function of the upright torus in Figure VI.1 is Morse but not Morse-Smale because the stable 1-manifold of the upper saddle, $r$, meets the unstable 1-manifold of the lower saddle, $q$, along entire one-dimensional integral lines. In the case of the upright torus, it suffices to tilt it ever so slightly sideways in order to get transversality. Morse-Smale functions are again dense in the set of maps from $M$ to $\mathbb{R}$.

Assuming a Morse-Smale function, we define the Morse-Smale complex as the collection of connected components of intersections of stable and unstable manifolds. We can see in Figure VI.7 that it is indeed necessary to take components. The two bold 2-cells share the same origin and destination.

Figure VI.7: Solid stable and dashed unstable 1-manifolds with overlaid dotted iso-lines of a rectangular portion of a Morse-Smale function.

Shape of Morse-Smale cells. Note that all 2-cells in Figure VI.7 have four sides.

QUADRANGLE LEMMA. Every 2-cell of a two-dimensional Morse-Smale complex is a quadrangle.

PROOF. The vertices of a 2-cell alternate between saddles and other critical points, and the non-saddles alternate between minima and maxima. Any such cyclic sequence has length $4k$, for $k \geq 1$. We take two copies of a $4k$-gon and glue them together along the shared boundary. In doing so, minima remain minima, maxima remain maxima, and saddles become regular points. The result is a topological 2-sphere with $k$ minima and $k$ maxima. The Euler characteristic of the 2-sphere is 2, which implies $k = 1$.

In other words, all two-dimensional Morse-Smale cells are quadrangles, provided we count an arc twice if it bounds the cell on both sides.

The 3-cells of a Morse-Smale complex may have the structure of a cube, but they can also assume more general shapes with arbitrarily many saddles alternating between index-1 and index-2 separating the minimum from the maximum. A few examples of 3-cells are shown in Figure VI.8. From left to right they have one, two, and three index-1 saddles and the same number of index-2 saddles. The common features of all 3-cells are that they have one minimum and one maximum, and all 2-cells in the boundary are quadrangles.

Figure VI.8: Three 3-cells of a three-dimensional Morse-Smale complex.

Piecewise linear height functions. Height functions over manifolds occur in many practical problems, but they are never smooth in the mathematical sense of the word. An example is a surface of a molecule model and the electrostatic potential on this surface. The surface would typically be given as a triangulating simplicial complex $K$, and the function would be specified by its values at the vertices. Using linear interpolation, we can extend these values to a continuous function over the entire surface. We need some definitions to explain the linear interpolation. Each point $x$ of a triangle $abc$ is a convex combination of the three vertices, $x = \alpha a + \beta b + \gamma c$ with $\alpha + \beta + \gamma = 1$ and $\alpha, \beta, \gamma \geq 0$.

The three parameters are unique and referred to as the barycentric coordinates of $x$. The value at $x$ is now defined as the analogous combination of values at the vertices, $f(x) = \alpha f(a) + \beta f(b) + \gamma f(c)$. Note that the barycentric coordinates of the vertex $a$ of $abc$ are $\alpha = 1$ and $\beta = \gamma = 0$, which implies that the linearly interpolated $f$ agrees with the value specified at $a$. Furthermore, for points $x$ along the edge $ab$ we have $\gamma = 0$. The values computed for $x$ within the two triangles that share $ab$ thus agree, which implies that $f$ is continuous. The height function is continuous but not smooth. It still shares many characteristics with Morse functions.

Figure VI.9: Portion of a triangulated surface of a molecule.

Lower stars. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the lower star as the subset for which $u$ is the highest vertex. It is convenient to assume pairwise different height values at all vertices so that each simplex belongs to exactly one lower star. With this assumption, the lower stars partition the complex $K$. Figure VI.10 illustrates the definitions by showing the lower stars of vertices that behave like regular points, minima, saddles, and maxima. More complicated lower stars are possible, and we cannot remove them just by perturbing the height values. Instead, we may consider a vertex whose circle of neighbors alternates $2(k+1)$ times between lower and higher values of $f$ as a $k$-fold saddle.

Figure VI.10: The star of every vertex in the triangulation of a 2-manifold is an open disk. The shaded portions are lower stars. (Panels: regular, minimum, saddle, maximum.)

Another similarity between smooth and piecewise linear height functions arises when we sweep in the direction of increasing height. Assuming all height values are different, we sort the vertices in the order of increasing height. Indexing the vertices accordingly, we define $K_j$ as the union of the first $j$ lower stars and note that $K_j$ is a simplicial complex, for $0 \leq j \leq n$. The sequence of complexes $\emptyset = K_0 \subset K_1 \subset \dots \subset K_n = K$ is a filtration and a discrete version of the evolution of the sublevel set of $f$ during the sweep. Adding the lower star of a regular point does not change the homotopy type of $K_j$, and adding the lower star of a critical point is similar to attaching a cell in the smooth case. The alternating sums of simplices in the lower stars of a regular point, minimum, saddle, maximum, and $k$-fold saddle are $0$, $1$, $-1$, $1$, and $-k$. It follows immediately that the Euler characteristic of $K$ is the number of minima and maxima minus the number of saddles counted with multiplicity. This interpretation is consistent with the result that the alternating sum of critical points is equal to the Euler characteristic of $\mathbb{M}$.

Bibliographic notes. The transversality condition for stable and unstable manifolds has its origin in dynamical systems and is named after Steve Smale [4]. The Morse-Smale complex has been introduced recently in [2] along with algorithms for piecewise linear height functions over 2-manifolds. The idea of writing a triangulated manifold as the disjoint union of lower stars goes back to Banchoff [1]. The gradient and related concepts from vector calculus are intuitively described in the booklet by Schey [3].

[1] T. F. Banchoff. Critical points and curvature for embedded polyhedra. J. Differential Geometry 1 (1967), 245–256.
[2] H. Edelsbrunner, J. Harer and A. Zomorodian. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.

[3] H. M. Schey. Div, Grad, Curl and All That. An Informal Text on Vector Calculus. Second edition, Norton, New York, 1992.
[4] S. Smale. The Mathematics of Time. Essays on Dynamical Systems, Economic Processes, and Related Topics. Springer-Verlag, New York, 1980.
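The lower-star decomposition of Section VI.2 can be sketched in a few lines. The following is a minimal illustration, assuming a surface given as a list of vertex triples and pairwise different heights; the function name `lower_star_sums` and the tetrahedron example are hypothetical and not taken from the text. Each simplex is assigned to the lower star of its highest vertex, and the alternating sums add up to the Euler characteristic.

```python
from itertools import combinations

def lower_star_sums(triangles, height):
    """Map each vertex to the alternating sum of simplices in its lower star."""
    # Collect all vertices, edges, and triangles of the complex.
    simplices = set()
    for tri in triangles:
        for k in (1, 2, 3):
            for face in combinations(sorted(tri), k):
                simplices.add(face)
    sums = {v: 0 for v in height}
    for s in simplices:
        # A simplex belongs to the lower star of its highest vertex.
        top = max(s, key=lambda v: height[v])
        sums[top] += (-1) ** (len(s) - 1)
    return sums

# Boundary of a tetrahedron: a triangulated 2-sphere (illustrative data).
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
height = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
sums = lower_star_sums(triangles, height)
euler = sum(sums.values())
```

In this example the lowest vertex behaves like a minimum (sum 1), the highest like a maximum (sum 1), and the two middle vertices like regular points (sum 0), so the total is the Euler characteristic of the 2-sphere.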

VI.3 Construction and Simplification

[Explain the sweep construction for two-dimensional Morse-Smale complexes using the simulation of differentiability.] [The most important part of the algorithm is maybe the handle slide, which is the only restructuring operation necessary to go between different complexes.] [That operation has been used in early work on Morse theory, maybe the first time by Smale(?).] [Build a hierarchy through prioritized cancellation.] [We can describe the cancellation as a combinatorial restructuring operation and we only need this one to go up the hierarchy.] [Again, there should be reference to the early mathematics literature on the topic of cancellation.]

Bibliographic notes.

[1] C. L. Bajaj, V. Pascucci and D. R. Schikore. Visualization of scalar topology for structural enhancement. In "Proc. 9th Ann. IEEE Conf. Visualization, 1998", 18–23.
[2] H. Edelsbrunner, J. Harer and A. Zomorodian. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.
[3] H. Edelsbrunner, J. Harer, V. Natarajan and V. Pascucci. Hierarchy of Morse-Smale complexes for piecewise linear 3-manifolds. Manuscript, Dept. Comput. Sci., Duke Univ., Durham, North Carolina, 2001.
[4] M. van Kreveld, R. van Oostrum, C. L. Bajaj, V. Pascucci and D. R. Schikore. Contour trees and small seed sets for iso-surface traversal. In "Proc. 13th Ann. Sympos. Comput. Geom., 1997", 212–220.

VI.4 Simultaneous Critical Points

[Explain the work with John on the topic and mention papers by Hassler Whitney and books in Catastrophe Theory.]

Bibliographic notes.

[1] V. I. Arnol'd. Catastrophe Theory. Third edition, Springer-Verlag, Berlin, Germany, 1992.
[2] H. Edelsbrunner and J. Harer. Jacobian submanifolds of multiple Morse functions. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[3] T. Poston and I. Stewart. Catastrophe Theory and Its Applications. Dover, Mineola, New York, 1978.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation. (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $L$ be a line that avoids all points. Prove that $L$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.


Chapter VII

Match and Fit

As a general theme in biology, questions are almost always about populations and rarely about individuals. This is particularly true on the molecular level. The molecules that participate in the mechanism of life tend to be large and composed of small molecules. Minor variations in the type or arrangement of the components are frequently inessential and do not alter the role of a molecule within the larger organization. But then again, there are seemingly small variations that do have significant consequences. The underlying question is one of definition: when do we call two molecules the same or of the same type, and how do we quantify and assess that notion of sameness. There are various approaches to the question applied to proteins, including the comparison of amino acid sequences, space curves modeling backbones, and shapes formed by space-filling diagrams. Instead of asking how similar two shapes are, we may also ask the related question of how well two shapes fit side by side. It really makes sense only for space-filling diagrams and does not seem to apply to information expressed in terms of sequences and space curves. The complementarity question is a similarity question between one shape and (a portion of) the complement of another shape. The similarity question is at the core of human understanding, which crucially relies on classification to simplify and create order. The complementarity question, on the other hand, is at the root of natural and other re-production processes and it takes part in protein interaction, which forms the basis of functioning life.

As always in this book, we focus on mathematical and algorithmic methods that shed light on the broader biological issues. In Section VII.1, we explore rigid motions in three-dimensional Euclidean space and introduce quaternions as a tool to specify and compute with rotations. In Section VII.2, we study the problem of finding the best rigid motion for matching one point set with another. The measure of choice is the root mean square distance between the two sets. In Section VII.3, we look at the related problems of sampling a rigid motion and of covering the space of such motions with small neighborhoods. In Section VII.4, we apply the methods to questions of similarity and complementarity. In particular, we look at the problem of identifying matching subsequences with minimum root mean square distance and at score functions that assess the shape complementarity of two space-filling diagrams.

VII.1 Rigid Motions
VII.2 Optimum Motion
VII.3 Sampling and Covering
VII.4 Alignment
Exercises

VII.1 Rigid Motions

A motion in three-dimensional Euclidean space can be decomposed into a rotation and a translation. In this section we consider different ways to mathematically represent rotations, and we focus on quaternions, which provide a particularly elegant mathematical framework.

Rotation and translation. A rigid motion in $\mathbb{R}^3$ is an orientation-preserving isometry of three-dimensional Euclidean space. More formally, it is an orientation-preserving map $\mu: \mathbb{R}^3 \to \mathbb{R}^3$ such that $\|\mu(x) - \mu(y)\| = \|x - y\|$ for every pair $x, y \in \mathbb{R}^3$. For example, a rotation is a rigid motion that preserves the origin, and a translation is a rigid motion that preserves difference vectors, as illustrated in Figure VII.1. Every rigid motion can be written as the composition of a rotation and a translation. Using matrix notation, we can write $\mu(x) = Rx + t$, where $R$ is an orthonormal 3-by-3 matrix with unit determinant and $t$ is a 3-vector. The rotation matrix moves the unit coordinate vectors to the vectors $r_1$, $r_2$, and $r_3$ that make up the columns of $R$. The composition of any two rotations is another rotation. In other words, the rotations form the so-called special orthogonal group of 3-by-3 matrices, abbreviated as $SO(3)$. Note, however, that this group is not abelian because the multiplication of matrices and therefore the composition of rotations is not commutative.

Figure VII.1: The translation of the boldface original coordinate system preserves the directions of the axes while the rotation preserves their anchor point.

Leonhard Euler proved that any rotation can be obtained by a sequence of three rotations about coordinate axes. A rotation about a coordinate axis has a comparatively simple rotation matrix. For example, rotating by an angle $\varphi$ about the $x_3$-axis gives

$$R = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

The angle of rotation about a coordinate axis is referred to as an Euler angle. It is important to specify the Euler angles in a fixed sequence as other sequences of the same angles usually specify different rotations. It is mostly true that two different triplets of angles specify different rotations, but there are exceptions: for some sequences of three axis rotations we get the same composite rotation after switching two of the angles. Indeed, the map from triplets of Euler angles to $SO(3)$ is not injective. This suggests that the Cartesian product of three circles is not an appropriate model and we will indeed see shortly that $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1$ is not homeomorphic to the space of rotations.

Quaternions. As an alternative to orthonormal 3-by-3 matrices, we may use quaternions to represent rotations. Quaternions can be viewed as a generalization of complex numbers:

$$q = q_0 + q_1 I + q_2 J + q_3 K,$$

where $q_0$, $q_1$, $q_2$, and $q_3$ are real numbers and I, J and K are three different imaginary units. In preparation of an operation that multiplies two quaternions, we specify how to multiply the imaginary units:

$$II = JJ = KK = -1, \quad IJ = K, \quad JI = -K, \quad JK = I, \quad KJ = -I, \quad KI = J, \quad IK = -J.$$

Note that reversing two different imaginary units changes the sign of the result. If $p = p_0 + p_1 I + p_2 J + p_3 K$ is another quaternion then the product of $p$ and $q$ is

$$pq = (p_0q_0 - p_1q_1 - p_2q_2 - p_3q_3) + (p_0q_1 + p_1q_0 + p_2q_3 - p_3q_2)\,I + (p_0q_2 - p_1q_3 + p_2q_0 + p_3q_1)\,J + (p_0q_3 + p_1q_2 - p_2q_1 + p_3q_0)\,K.$$

The product $qp$ has a similar form but six of the terms have their signs changed. In general, $pq \neq qp$. Sometimes it is more convenient to think of a quaternion as a vector in $\mathbb{R}^4$.
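As a small illustration of the multiplication rules for the imaginary units, the following sketch stores a quaternion as a 4-tuple $(q_0, q_1, q_2, q_3)$. The function name `qmul` and the constants are assumptions made for this example, not notation from the text.

```python
def qmul(p, q):
    """Product of two quaternions given as tuples (scalar, I, J, K coefficients)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,   # real part
        p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,   # coefficient of I
        p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,   # coefficient of J
        p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0,   # coefficient of K
    )

# The three imaginary units as quaternions.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

ij = qmul(I, J)   # equals K
ji = qmul(J, I)   # equals -K, so the product is not commutative
```

Swapping the two factors flips the sign of the six mixed terms, which is why `ij` and `ji` differ.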

We can express the product of two quaternions in terms of an orthogonal 4-by-4 matrix and a vector. This can be done by expanding either the first or the second quaternion to a matrix:

$$pq = Pq = \begin{pmatrix} p_0 & -p_1 & -p_2 & -p_3 \\ p_1 & p_0 & -p_3 & p_2 \\ p_2 & p_3 & p_0 & -p_1 \\ p_3 & -p_2 & p_1 & p_0 \end{pmatrix} \begin{pmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \end{pmatrix}, \qquad pq = \bar{Q}p = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & q_3 & -q_2 \\ q_2 & -q_3 & q_0 & q_1 \\ q_3 & q_2 & -q_1 & q_0 \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix}.$$

The matrix $\bar{Q}$ differs from the matrix $Q$, which expands $q$ as a first factor, by having the lower right 3-by-3 submatrix transposed. Take a moment to verify that the matrices $P$ and $\bar{Q}$ are indeed orthogonal.

Similar to complex numbers, the conjugate of a quaternion is obtained by negating the imaginary parts: $q^* = q_0 - q_1 I - q_2 J - q_3 K$. Observe that the matrices associated with $q^*$ are the transposes of those associated with $q$. While the product of two quaternions is another quaternion, the scalar product is a real number: $p \cdot q = p_0q_0 + p_1q_1 + p_2q_2 + p_3q_3$. As usual, we can use the scalar product to define the length of a vector: $\|q\| = \sqrt{q \cdot q}$. Since the matrices are orthogonal, the products with their transposes are diagonal: $QQ^T = (q \cdot q)\,E$, where $E$ is the 4-by-4 identity matrix. Similarly, the imaginary parts vanish when we multiply a quaternion with its conjugate: $qq^* = q \cdot q$. This implies that every non-zero quaternion has an inverse, namely $q^{-1} = q^* / (q \cdot q)$. In the special case when $q$ has unit length, both $Q$ and $\bar{Q}$ are orthonormal.

Representing rotations. We use purely imaginary quaternions to represent vectors in $\mathbb{R}^3$ and compound multiplication with unit quaternions to represent rotations. We start with a few properties. First, the scalar product is preserved if we multiply with a unit quaternion $q$. This is true from either side and we show it for multiplication from the left:

$$(qp) \cdot (qr) = (Qp) \cdot (Qr) = p^T Q^T Q\, r = p \cdot r,$$

because $Q^TQ = E$. This implies in particular that multiplying with $q$ also preserves length: $\|qp\| = \|p\|$. Hence, multiplication with a unit quaternion neither changes the angle nor the length. However, we cannot use simple multiplications to represent rotations because the product of a unit quaternion and a purely imaginary quaternion is not in general purely imaginary. Instead, we use the composite product $r' = qrq^*$. In matrix form, $r' = (\bar{Q}^TQ)\,r$. We expand the product of the two matrices in Table VII.1 and see that $r'$ is purely imaginary, as required. Furthermore, both $Q$ and $\bar{Q}$ are orthonormal, and so is their product. It follows that the lower right 3-by-3 submatrix of $\bar{Q}^TQ$ is also orthonormal. This 3-by-3 matrix is the familiar rotation matrix that takes $r$ to $r'$. The expansion of $\bar{Q}^TQ$ given in Table VII.1 provides an explicit method for computing the orthonormal rotation matrix from the unit quaternion.

The justification for $qrq^*$ to represent a rotation is not yet complete. Another possibility is that it represents a reflection, which also preserves scalar products. However, a reflection reverses the orientation of a sequence of three vectors, and we can check that composite multiplication does not. To do this, we think of a quaternion as composed of a scalar and a vector, $q = q_0 + \mathbf{q}$. The rules for computing the product can be rewritten as

$$pq = (p_0q_0 - \mathbf{p} \cdot \mathbf{q}) + (p_0\mathbf{q} + q_0\mathbf{p} + \mathbf{p} \times \mathbf{q}).$$

When $p$ and $r$ are purely imaginary then these results simplify to the real part $-p \cdot r$ and the imaginary part $p \times r$. If we now apply the composite product with a unit quaternion $q$, we get

$$q(pr)q^* = (qpq^*)(qrq^*) = p'r',$$

since $q^*q = 1$. The real part of this equation gives $p' \cdot r' = p \cdot r$, and the imaginary part gives $p' \times r' = q(p \times r)q^*$, which shows that the composite product preserves cross-products.

Axis and angle. In the reverse direction, we show that the rotation by an angle $\theta$ about the axis defined by the unit vector $u = (u_1, u_2, u_3)$ can be represented by the unit quaternion

$$q = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,(u_1 I + u_2 J + u_3 K).$$

As illustrated in Figure VII.2, an observer who looks against the direction of the axis sees the vector rotate in a counterclockwise order.

Table VII.1: Product of matrices in the representation of a rotation by composite multiplication with unit quaternions, where $Q$ and $\bar{Q}$ are the 4-by-4 matrices that correspond to $q$ as the first and the second factor of a product.

$$\bar{Q}^TQ = \begin{pmatrix} q \cdot q & 0 & 0 & 0 \\ 0 & q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1q_2 - q_0q_3) & 2(q_1q_3 + q_0q_2) \\ 0 & 2(q_2q_1 + q_0q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2q_3 - q_0q_1) \\ 0 & 2(q_3q_1 - q_0q_2) & 2(q_3q_2 + q_0q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2 \end{pmatrix}$$

Figure VII.2: The rotation of the vector $r$ by an angle of $\theta$ about the line spanned by $u$. The three dotted vectors correspond to the terms in the formula of Rodrigues.

To prove the claimed correspondence, we write the vector $r'$ obtained by rotating $r$ by $\theta$ about the axis defined by $u$ using the formula of Rodrigues,

$$r' = \cos\theta\; r + \sin\theta\; (u \times r) + (1 - \cos\theta)\,(u \cdot r)\,u,$$

which can be seen from Figure VII.2. We show that this can also be written in the form $r' = qrq^*$, where $q = q_0 + \mathbf{q}$ is a unit quaternion with scalar part $q_0$ and vector part $\mathbf{q}$. Tedious but straightforward calculations show

$$qrq^* = (q_0^2 - \mathbf{q} \cdot \mathbf{q})\,r + 2q_0\,(\mathbf{q} \times r) + 2(\mathbf{q} \cdot r)\,\mathbf{q}.$$

If we substitute $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2}\,u$ and use the identities $\sin\theta = 2\sin\frac{\theta}{2}\cos\frac{\theta}{2}$ and $\cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = 1 - 2\sin^2\frac{\theta}{2}$ then we obtain the formula of Rodrigues.

The above relationships provide a convenient conversion between unit quaternions and axis-angle pairs. The imaginary part gives the direction of the rotation axis, and the real part determines the angle of the rotation. Note that $-q$ represents the same rotation as $q$ and that non-antipodal pairs of unit quaternions represent different rotations. In other words, the unit sphere $\mathbb{S}^3$ in $\mathbb{R}^4$ is a double cover of the space of rotations in $\mathbb{R}^3$. The space obtained by identifying antipodal points of $\mathbb{S}^3$ is usually referred to as the real projective three-dimensional space, or $\mathbb{P}^3$ for short. It is a good model of the set of rotations in $\mathbb{R}^3$, although we usually prefer $\mathbb{S}^3$ because it is easier to imagine. Figure VII.3 illustrates the correspondence with a picture in one lower dimension.

Figure VII.3: The north- and south-poles correspond to the identity, and points on the equator correspond to rotations by $\pi$. The dashed great-circle through the two poles represents the set of rotations about a fixed axis.

Composing rotations. The composition of two rotations represented by the unit quaternions $q$ and $p$ is

$$p(qrq^*)p^* = (pq)\,r\,(pq)^*,$$

and from the product $pq$ it is easy to again get the axis and the angle. Thus, composition of rotations corresponds to multiplication of quaternions. A more direct geometric description of the composition of two rotations uses the fact that every rotation can be written as the composition of two reflections. The axis of the corresponding rotation is the line common to the two planes, and the angle of rotation is twice the angle enclosed by the planes. The two planes defining the reflections are not unique; they just need to pass through the axis of rotation and enclose half the angle of rotation. To compose two rotations, we write each as the composition of two reflections, making sure that the second plane of the first rotation is also the first plane of the second rotation, as illustrated in Figure VII.4. The middle two reflections cancel and we are left with two reflections.
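The correspondence between axis-angle pairs, unit quaternions, and the formula of Rodrigues can be checked numerically. The sketch below is only an illustration under the conventions of this section; all function names (`qmul`, `conj`, `from_axis_angle`, `rotate`, `rodrigues`) are hypothetical helpers introduced for the example.

```python
import math

def qmul(p, q):
    """Quaternion product for tuples (scalar, I, J, K coefficients)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0*q0 - p1*q1 - p2*q2 - p3*q3,
            p0*q1 + p1*q0 + p2*q3 - p3*q2,
            p0*q2 - p1*q3 + p2*q0 + p3*q1,
            p0*q3 + p1*q2 - p2*q1 + p3*q0)

def conj(q):
    """Conjugate: negate the imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])

def from_axis_angle(u, theta):
    """Unit quaternion for the rotation by theta about the unit vector u."""
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * u[0], s * u[1], s * u[2])

def rotate(q, r):
    """Composite product q r q*, with r embedded as a purely imaginary quaternion."""
    w = qmul(qmul(q, (0.0, r[0], r[1], r[2])), conj(q))
    return w[1:]

def rodrigues(u, theta, r):
    """Formula of Rodrigues for rotating r by theta about the unit vector u."""
    c, s = math.cos(theta), math.sin(theta)
    dot = sum(a * b for a, b in zip(u, r))
    cross = (u[1]*r[2] - u[2]*r[1], u[2]*r[0] - u[0]*r[2], u[0]*r[1] - u[1]*r[0])
    return tuple(c * r[i] + s * cross[i] + (1.0 - c) * dot * u[i] for i in range(3))

def close(a, b, eps=1e-12):
    return all(eps > abs(x - y) for x, y in zip(a, b))

q = from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)   # quarter turn about the x3-axis
p = from_axis_angle((1.0, 0.0, 0.0), 0.7)           # a second, arbitrary rotation
r = (1.0, 2.0, 3.0)
```

With these definitions, `rotate(q, r)` agrees with `rodrigues((0, 0, 1), pi/2, r)`, and rotating first by `q` and then by `p` agrees with rotating once by the product `qmul(p, q)`.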

Figure VII.4: We see three rotations defined by the axis-angle pairs $(u, \varphi)$, $(v, \psi)$, and $(w, \rho)$. Each rotation is the composition of two reflections illustrated by the great-circles at which their planes meet the sphere.

Bibliographic notes. It is commonly acknowledged that quaternions have been discovered by Hamilton in 1844 [1]. It is less well known that a few years earlier, Rodrigues studied the composition of rotations in space and gave a purely geometric explanation that is equivalent to Hamilton's algebra [5]. Even earlier, Gauss recorded his discovery of quaternions in his unpublished notebook in 1819. The exposition of quaternions and their connection to rotations chosen for this section follows [2]. We recommend the primer by Kuipers [3] for background on rotations and the text by Needham [4] for background on the more general context provided by complex analysis.

[1] W. R. Hamilton. On a new species of imaginary quantities connected with the theory of quaternions. Proc. R. Irish Acad. 2 (1844), 424–434.
[2] B. K. P. Horn. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[3] J. B. Kuipers. Quaternions and Rotation Sequences. Princeton Univ. Press, New Jersey, 1999.
[4] T. Needham. Visual Complex Analysis. Clarendon Press, Oxford, England, 1997.
[5] O. Rodrigues. Des lois géométriques qui régissent les déplacements d'un système solide dans l'espace, et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. J. Math. Pures Appl. 5 (1840), 380–440.

VII.2 Optimum Motion

In this section, we study an optimization problem that arises when one attempts to match two molecular structures or to fit two structures snug next to each other. After formulating the optimization problem, we solve it using quaternions representing rotations in three-dimensional space.

Problem specification. Suppose we are given two finite collections of points in $\mathbb{R}^3$ and a bijection between them. While entertaining the possibility that the two collections are structurally the same or at least similar, we are interested in moving one collection so it best matches the other. We need some notation to make this precise. Let $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_n\}$ be the two collections and assume that $a_i$ corresponds to $b_i$, for each $i$. We use the root mean square or RMS distance to assess how similar the two collections are. This measure is the square root of the average square distance:

$$\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \|a_i - b_i\|^2}.$$

Given a rigid motion $\mu$, we may apply it to the first collection and recompute the root mean square distance. We are interested in finding the rigid motion that minimizes the root mean square distance between $\mu(A)$ and $B$. Recall also that every rigid motion can be decomposed into a rotation followed by a translation. The space of rigid motions is therefore six-dimensional, and it might seem that computing the particular rigid motion that minimizes the distance would be hopeless or at least difficult. Quite the opposite is true, and the main reason for this is the convenience provided by quadratic functions. Note that minimizing the root mean square distance is equivalent to minimizing the sum of square distances.

Optimum translation. Since every rigid motion can be written as a rotation followed by a translation, we consider rotations and translations separately. We begin by showing that the best translation moves the centroid of $A$ to the centroid of $B$. Recall that the centroid of a collection of points is the average of the points. More formally, the centroids of $A$ and $B$ are $\bar{a} = \frac{1}{n}\sum_i a_i$ and $\bar{b} = \frac{1}{n}\sum_i b_i$. Note that rotating and taking the centroid commute. In other words, the centroid of the rotated collection $R(A)$ is $R(\bar{a})$. A crucial insight used in proving the claim is that the centroid is the only point for which the sum of the vectors to the points in the collection vanishes: $\sum_i (a_i - \bar{a}) = 0$.

We are now ready to prove that the best translation is the one that moves $\bar{a}$ to $\bar{b}$. Let us move every point $b_i$ to the origin and move the translated copy of $a_i$ with it to $c_i = a_i + t - b_i$, where $t$ is the translation vector. This operation is illustrated in Figure VII.5. Then the sum of square distances between the corresponding points, $\sum_i \|a_i + t - b_i\|^2$, is also the sum of square distances of the points $c_i$ from the origin. As a function of $t$, it is a quadratic function with a unique minimum. That minimum is characterized by a vanishing gradient:

$$2 \sum_{i=1}^{n} (a_i + t - b_i) = 0.$$

The translation minimizes the sum iff the origin is the centroid of the points $c_i$. In other words, the translation that minimizes the root mean square distance between $A$ and $B$ is defined by $t = \bar{b} - \bar{a}$. This implies that the best translation moves $\bar{a}$ to $\bar{b}$, as claimed. Indeed, the motion $\mu$ can be optimal only if it translates the centroid of the rotated copy of $A$ to the centroid of $B$.

Figure VII.5: After moving the shaded points to the origin, the (solid) difference vectors all radiate out from the origin.

Optimum rotation. We may therefore simplify our problem by translating $A$ and independently translating $B$ such that both centroids lie at the origin.
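The optimum translation can be illustrated with a short sketch. It assumes points stored as 3-tuples and a bijection given by list position; the function names and the sample coordinates are made up for this example.

```python
import math

def centroid(P):
    """Average of a collection of points in three dimensions."""
    n = len(P)
    return tuple(sum(p[k] for p in P) / n for k in range(3))

def translate(P, t):
    return [tuple(p[k] + t[k] for k in range(3)) for p in P]

def rmsd(A, B):
    """Root mean square distance between corresponding points of A and B."""
    total = sum(sum((a[k] - b[k]) ** 2 for k in range(3)) for a, b in zip(A, B))
    return math.sqrt(total / len(A))

def best_translation(A, B):
    """Translation vector that moves the centroid of A to the centroid of B."""
    ca, cb = centroid(A), centroid(B)
    return tuple(cb[k] - ca[k] for k in range(3))

# Illustrative data: B is A shifted by roughly (1, 1, 1) with one perturbed point.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
B = [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (1.0, 3.0, 1.3)]
t = best_translation(A, B)
best = rmsd(translate(A, t), B)
```

Because the sum of square distances is a strictly convex quadratic function of the translation vector, nudging `t` in any coordinate direction can only increase the root mean square distance.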

Using quaternions, we can express the rotation of a point $a_i$ as $qa_iq^*$, where $q$ is a unit quaternion and $a_i$ is the pure imaginary quaternion that corresponds to the point, as explained in Section VII.1. The sum of the square distances after the rotation is

$$\sum_{i=1}^{n} \|qa_iq^* - b_i\|^2 = \sum_{i=1}^{n} \|a_i\|^2 - 2 \sum_{i=1}^{n} (qa_iq^*) \cdot b_i + \sum_{i=1}^{n} \|b_i\|^2.$$

The sums of the $\|a_i\|^2$ and the $\|b_i\|^2$ are not affected by the rotation, so minimizing the sum of square distances is equivalent to maximizing the sum of the $(qa_iq^*) \cdot b_i$. Since multiplication with a unit quaternion preserves scalar products, we have $(qa_iq^*) \cdot b_i = (qa_i) \cdot (b_iq)$. Recall from the previous section that $qa_i = \bar{A}_i q$ and $b_iq = B_i q$, where $\bar{A}_i$ and $B_i$ are the 4-by-4 matrices that correspond to $a_i$ as a second factor and to $b_i$ as a first factor. The two matrices are skew symmetric as well as orthogonal. The sum that we have to maximize can now be rewritten as

$$\sum_{i=1}^{n} (\bar{A}_i q) \cdot (B_i q) = q^T \Big( \sum_{i=1}^{n} \bar{A}_i^T B_i \Big) q = q^T N q.$$

Take a moment to verify that each matrix in this sum is symmetric. Since the sum of symmetric matrices is again symmetric, $N$ is symmetric. We can interpret $q^TNq$ geometrically as a quadratic function over four-dimensional Euclidean space, and because we are only interested in unit quaternions, we restrict it to $\mathbb{S}^3$. Short of being able to draw the graph of this function in $\mathbb{R}^5$, we illustrate the idea in Figure VII.6, which drops two of the dimensions. Our goal is to find a point $q \in \mathbb{S}^3$ for which the quadratic function gives a maximum. We can compute such a $q$ with a modest amount of linear algebra.

Figure VII.6: The plane represents $\mathbb{R}^4$, the partially dotted circle represents $\mathbb{S}^3$, the surface represents the graph of the quadratic function over $\mathbb{R}^4$, the dashed lines represent the zero-set and the boldface curve represents the graph of the restriction of that function to $\mathbb{S}^3$.

Eigenvalues and -vectors. Recall that the eigenvalues of a square matrix $N$ are the complex numbers $\lambda$ for which the determinant of $N - \lambda E$ vanishes, where $E$ is the identity matrix. In our case, we have four eigenvalues, and because $N$ is symmetric, the eigenvalues are all real. It is convenient to order them as $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$. The corresponding eigenvectors are the unit vectors $e_j$ such that $Ne_j = \lambda_j e_j$. The corresponding eigenvectors are pairwise orthogonal and therefore span $\mathbb{R}^4$. We can thus write any quaternion as a linear combination of the eigenvectors, $q = \sum_j \alpha_j e_j$. Hence $q^TNq = \sum_j \alpha_j^2 \lambda_j$, where $\sum_j \alpha_j^2 = 1$ because we may assume $\|q\| = 1$. By the assumed ordering of the eigenvalues, we have $q^TNq \leq \lambda_1$, and this maximum is attained for $q = e_1$. In other words, the optimum rotation is defined by the unit eigenvector that corresponds to the largest eigenvalue.

Without bijection. If there is no bijection specified between the two sets then the problem of finding the best rigid motion seems significantly more difficult. Assuming $A$ and $B$ contain $n$ points each, we could of course try all $n!$ bijections, but that would take a long time. A more effective algorithm alternates between improving the root mean square distance by changing the bijection and by changing the motion.
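The eigenvector characterization of the optimum rotation can be sketched as follows. The entries of the symmetric 4-by-4 matrix are written out in the closed form given by Horn [3]; computing the top eigenvector by shifted power iteration is a choice made for this sketch (any symmetric eigenvalue routine would do), and all function names and sample points are hypothetical. Both point sets are assumed to be in correspondence by list position.

```python
import math

def horn_matrix(A, B):
    """Symmetric 4-by-4 matrix N whose top eigenvector is the optimum unit quaternion."""
    S = [[sum(a[i] * b[j] for a, b in zip(A, B)) for j in range(3)] for i in range(3)]
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    return [
        [Sxx + Syy + Szz, Syz - Szy,       Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz, Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,       Syz + Szy,        -Sxx - Syy + Szz],
    ]

def top_eigenvector(N, iterations=500):
    """Power iteration on N + shift*E; the shift makes the largest eigenvalue dominant."""
    shift = math.sqrt(sum(x * x for row in N for x in row))  # Frobenius norm
    q = [1.0, 1.0, 1.0, 1.0]  # start vector, assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(N[i][j] * q[j] for j in range(4)) + shift * q[i] for i in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        q = [x / norm for x in w]
    return q

def rotation_matrix(q):
    """Lower right 3-by-3 block of Table VII.1 for a unit quaternion q."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q2*q1 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q3*q1 - q0*q2), 2*(q3*q2 + q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]

def apply(R, p):
    return tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))

# Illustrative data: B is A rotated by a quarter turn about the x3-axis.
A = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
B = [(-y, x, z) for x, y, z in A]
R = rotation_matrix(top_eigenvector(horn_matrix(A, B)))
```

For this noise-free input the recovered matrix `R` maps every point of `A` to its partner in `B`; with noisy data it would give the least-squares rotation instead.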

The iterative approach described below attempts to decrease the root mean square distance by changing the bijection and by changing the motion. Note that independent of the bijection, the best translation always moves the centroid of $A$ to the centroid of $B$. So we may again assume that both centroids are at the origin and restrict ourselves to rotations. We use three subroutines to describe the iterative algorithm. Given a permutation, ROTATE returns the rotation that minimizes the root mean square distance under this permutation. Given a rotation, MATCH returns the permutation that minimizes the root mean square distance between $A$ and the rotated copy of $B$. Finally, given a permutation and a rotation, RMSD returns the root mean square distance.

    π = identity;
    loop
      R = ROTATE(π); π' = MATCH(R);
      if RMSD(π', R) < RMSD(π, R) then π = π' else exit endif
    forever.

After each iteration, the root mean square distance decreases. This implies that no permutation is tried twice. Since there are only finitely many permutations, it follows that the algorithm halts. Note however that we neither have a polynomial bound on the number of iterations nor a guarantee that the algorithm finds the globally optimal solution.

A popular version of the above algorithm uses injections from $A$ to $B$ instead of bijections. Sometimes this change is motivated by the purpose of the computation, at other times by the fact that finding the best bijection is not entirely straightforward. For a given rotation, we may use a subroutine ASSOCIATE, which determines for each $a_i$ the point of $B$ closest to $a_i$. In the algorithm, we replace MATCH by ASSOCIATE and do the remaining operations as before, except that $B$ is replaced by the multi-set of points in $B$ that are closest to some point in $A$.

Bibliographic notes. The problem of finding the rotation that minimizes the root mean square distance between two point sets with given bijection in $\mathbb{R}^3$ has been studied in various fields, including x-ray crystallography [4] and computer vision [2]. In this section, we follow the exposition of the solution given by Horn [3]. For background on linear algebra and how to compute the eigenvalues and eigenvectors of a symmetric matrix, we refer to Strang [5]. The algorithm that attempts to minimize the root mean square distance between two point sets without specified bijection has also been described in several fields. In computer vision, the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1].

[1] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-14 (1992), 239–256.
[2] O. D. Faugeras and M. Hebert. The representation, recognition, and locating of 3-D objects. Int. J. Robotics Res. 5 (1986), 27–52.
[3] B. K. P. Horn. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[4] W. Kabsch. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A 34 (1978), 827–828.
[5] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
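The closest-point variant of the iteration can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in [1] or [3]: the function names (`horn_rotation`, `associate`, `iterate`) are hypothetical, the point sets are assumed to have their centroids at the origin, the first argument is the set that gets rotated, and the largest eigenvector of Horn's 4-by-4 matrix is found by shifted power iteration rather than by a library eigensolver.

```python
import math

def horn_rotation(moving, fixed):
    """Unit quaternion (w, x, y, z) of the rotation R that maximizes sum <R m_i, f_i>."""
    s = [[sum(p[r] * q[c] for p, q in zip(moving, fixed)) for c in range(3)]
         for r in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n = [  # symmetric 4-by-4 matrix of Horn's closed-form solution
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Assumption of this sketch: a Gershgorin shift makes all eigenvalues
    # positive, so plain power iteration converges to the largest one.
    shift = max(sum(abs(v) for v in row) for row in n) + 1.0
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(2000):
        q = [sum(n[r][c] * q[c] for c in range(4)) + shift * q[r] for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    return q

def rotate(q, p):
    """Apply the rotation represented by the unit quaternion q to the 3-vector p."""
    w, x, y, z = q
    px, py, pz = p
    return (
        (1 - 2 * (y * y + z * z)) * px + 2 * (x * y - w * z) * py + 2 * (x * z + w * y) * pz,
        2 * (x * y + w * z) * px + (1 - 2 * (x * x + z * z)) * py + 2 * (y * z - w * x) * pz,
        2 * (x * z - w * y) * px + 2 * (y * z + w * x) * py + (1 - 2 * (x * x + y * y)) * pz,
    )

def rmsd(a, b):
    """Root mean square distance between corresponding points."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def associate(a, b):
    """For each point of a, the closest point of b (repetitions allowed)."""
    return [min(b, key=lambda q: math.dist(p, q)) for p in a]

def iterate(moving, fixed, max_rounds=100):
    """Alternate ASSOCIATE and ROTATE until the RMSD stops decreasing."""
    q, current, best = [1.0, 0.0, 0.0, 0.0], list(moving), float("inf")
    for _ in range(max_rounds):
        targets = associate(current, fixed)
        q_new = horn_rotation(moving, targets)
        rotated = [rotate(q_new, p) for p in moving]
        err = rmsd(rotated, targets)
        if err >= best - 1e-12:
            break
        q, current, best = q_new, rotated, err
    return q, best
```

As in the text, the loop carries no guarantee of reaching the global optimum; it only stops when an iteration fails to decrease the root mean square distance.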

VII.3 Sampling and Covering

In this section, we study two questions on rigid motions, namely how to sample uniformly at random and how to cover the space of motions most economically. We treat translations and rotations separately and spend most of our time on the more complicated case of rotations.

The size of a sphere. We prepare the discussion of sampling rotations by measuring the unit 2-sphere and the unit 3-sphere. For $S^2$ embedded in $\mathbb{R}^3$, we sweep a plane normal to the $x_3$-axis and compute the area by integrating infinitesimal slices. The perimeter of the circle in which the plane $x_3 = h$ cuts the sphere is $2\pi r$, with the square radius equal to $r^2 = 1 - h^2$. The area of a slice is $2\pi r \, \mathrm{d}s$ with $\mathrm{d}s = \mathrm{d}h / r$, which is $2\pi \, \mathrm{d}h$. The total area of the 2-sphere is therefore

    $\mathrm{Area}(S^2) = \int_{-1}^{1} 2\pi \, \mathrm{d}h = 4\pi$.

But note that the derivation shows more, namely that the area of the slice between two parallel planes at a constant distance is the same for all such planes, as long as both intersect the sphere. This fact has been known already to Archimedes and is often expressed by saying that the axial projection from the sphere to an enclosing cylinder preserves area. This projection is illustrated in Figure VII.7.

Figure VII.7: Illustration of Archimedes' theorem implying that the sphere and the enclosing truncated cylinder have the same area.

We use the same method to compute the volume of $S^3$ embedded in $\mathbb{R}^4$. Sweeping a three-dimensional hyperplane normal to the $x_4$-axis, we get the volume by integrating the infinitesimal slices. Each slice is a 2-sphere of radius $r$, so its volume is $4\pi r^2 \, \mathrm{d}h / r = 4\pi r \, \mathrm{d}h$, and

    $\mathrm{Vol}(S^3) = \int_{-1}^{1} 4\pi \sqrt{1 - h^2} \, \mathrm{d}h = 4\pi \int_{-\pi/2}^{\pi/2} \cos^2\varphi \, \mathrm{d}\varphi = 2\pi^2$,

which we get by substituting $h = \sin\varphi$. The total volume of the 3-sphere is therefore $2\pi^2$. Note also that Archimedes' theorem does not extend to the 3-sphere, at least not in the straightforward manner from sections between parallel planes to sections between parallel hyperplanes.

Uniform sampling. Archimedes' theorem can be used to pick a point uniformly at random on $S^2$. Specifically,

    Step 1. Pick $z$ uniformly at random in $[-1, 1]$.
    Step 2. Pick $\varphi$ uniformly at random in $[0, 2\pi)$. Define $r = \sqrt{1 - z^2}$. Return $u = (r\cos\varphi, r\sin\varphi, z)$.

The method may be viewed as picking a point on the enclosing cylinder and projecting it back to the 2-sphere. We now extend this method to $S^3$ and thus to an algorithm for picking a rotation uniformly at random. Think of $u$ as the axis of rotation, so we just need to pick the angle of rotation about this axis. It would not be correct to pick an angle uniformly at random since this would favor small dislocations of $\mathbb{R}^3$. In other words, in $S^3$ the quaternions near the identity would be more likely than those far away from the identity. The angle of rotation about the axis is twice the angular distance $\theta$ from the identity on $S^3$. To pick the angle correctly, we return to what we learned from the above volume computation. We need to pick the angle from $[0, \pi]$, as before, not uniformly but from a density that favors angles near the middle of the interval. Indeed, the density is $f(\theta) = \frac{2}{\pi}\sin^2\theta$, normalized to have unit total integral. The corresponding distribution function is

    $F(\theta) = \frac{2}{\pi}\int_0^{\theta} \sin^2\vartheta \, \mathrm{d}\vartheta = \frac{1}{\pi}(\theta - \sin\theta\cos\theta)$,

which monotonically increases and reaches 1 at $\theta = \pi$.
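Steps 1 and 2, together with the density and distribution function just derived, can be sketched as follows. This is a minimal sketch, assuming the distribution function $F(\theta) = (\theta - \sin\theta\cos\theta)/\pi$ and assuming that the sampled axis $u$ and angle $\theta$ are combined into the unit quaternion $(\cos\theta, u\sin\theta)$. The function names are hypothetical, and inverting $F$ by bisection is a choice made here for simplicity, not a method prescribed by the text.

```python
import math
import random

def sample_sphere2(rng=random):
    """Steps 1 and 2: uniform point on the 2-sphere via the enclosing cylinder."""
    z = rng.uniform(-1.0, 1.0)               # uniform height (Archimedes)
    phi = rng.uniform(0.0, 2.0 * math.pi)    # uniform angle around the axis
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def distribution(theta):
    """F(theta) = (theta - sin(theta) cos(theta)) / pi for theta in [0, pi]."""
    return (theta - math.sin(theta) * math.cos(theta)) / math.pi

def sample_angle(rng=random):
    """Draw theta with density (2/pi) sin^2(theta) by inverting F with bisection."""
    t = rng.random()
    lo, hi = 0.0, math.pi
    for _ in range(60):                      # F is increasing, so bisection applies
        mid = 0.5 * (lo + hi)
        if distribution(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_rotation(rng=random):
    """Unit quaternion (w, x, y, z) with axis u and angular distance theta."""
    u = sample_sphere2(rng)
    theta = sample_angle(rng)
    s = math.sin(theta)
    return (math.cos(theta), s * u[0], s * u[1], s * u[2])
```

Passing a seeded `random.Random` instance as `rng` makes the samples reproducible, which is convenient when comparing covering and sampling strategies.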

To pick an angle, we pick a number $t$ uniformly at random in $[0, 1]$, and we compute its preimage under the distribution function: $\theta = F^{-1}(t)$. To get a point uniformly at random on $S^3$, we append Steps 1 and 2 above with

    Step 3. Pick $t$ uniformly at random in $[0, 1]$. Let $\theta = F^{-1}(t)$. Return $(\cos\theta, u\sin\theta)$.

We get a random rotation by using the returned point as a unit quaternion. Alternatively, a random rotation can be obtained by sampling Euler angles.

Covering the spaces of translations and rotations. We turn our attention to selecting a collection of rigid motions such that every possible motion has a selected motion nearby. It is convenient to measure the distance between translations and between rotations using the Euclidean metric. We will later analyze how these notions of distance relate to the effect of the motion on the root mean square distance between two sets in $\mathbb{R}^3$. The idea of guaranteeing that every possible motion has a nearby selected motion can be expressed by covering the space of motions with neighborhoods. Consider first translations, which we represent by 3-vectors or points in $\mathbb{R}^3$. Let $\varepsilon > 0$ and let $C$ be a collection of closed balls, all of radius $\varepsilon$. We call $C$ a covering if the union of its balls is $\mathbb{R}^3$, and we call $\varepsilon$ the covering radius. If we use the centers of the covering balls as selected translations, we are guaranteed that every translation $t$ has a selected translation at a distance at most $\varepsilon$ from $t$. We need infinitely many balls because $\mathbb{R}^3$ has infinite volume, but we are usually only interested in bounded portions of space.

We study three lattices of points in some detail. The cube lattice consists of all integer points, the FCC or face-centered cube lattice adds all centers of cube faces, and the BCC or body-centered cube lattice adds all cube centers to the cube lattice. Figure VII.8 shows the portion of each lattice inside a cube of unit side-length and Table VII.2 lists some of their pertinent properties. By counting fractions, we note that the FCC lattice has four times and the BCC lattice has twice as many points as the cube lattice. The packing radius is the largest radius we can assign to the points to get non-overlapping balls, and the volume is the fraction of the space covered by the packed balls. The covering radius is the smallest radius we can assign and still have the balls cover $\mathbb{R}^3$, and the volume is the total volume of the balls divided by the volume of the space they inhabit. For the cube, the FCC and the BCC lattices, the points with maximum distance to the lattice points are the cube centers, the edge centers and the midpoints between the face and the edge centers, respectively. We see that the FCC lattice leads to an effective packing while the BCC lattice leads to an effective covering. Indeed, both are known to be the respective best packing and covering lattices.

Figure VII.8: From left to right: the cube, the FCC and the BCC lattices.

                                  CUBE     FCC      BCC
    points per cube               1        4        2
    packing radius                0.500    0.353    0.433
    packing volume (fraction)     0.523    0.740    0.680
    covering radius               0.866    0.500    0.559
    covering volume (fraction)    2.720    2.094    1.463

Table VII.2: Numerical assessment of how well the cube, the FCC and the BCC lattices pack and cover.

As an exercise we may estimate the number of balls we need to cover the unit 3-sphere. Recall that its volume is $2\pi^2$. Assuming $\varepsilon$ is very small, the volume of a ball with radius $\varepsilon$ in $S^3$ is about $\frac{4}{3}\pi\varepsilon^3$. If we believe that we cannot cover more economically than the BCC lattice in $\mathbb{R}^3$, we can use a straightforward volume argument to show that we need at least $1.463 \cdot 2\pi^2 / (\frac{4}{3}\pi\varepsilon^3) \approx 6.9/\varepsilon^3$ balls to cover the 3-sphere.

Sensitivity to small translations. Next, we address how translations affect the root mean square distance between two point sets. As in Section VII.2, let $A$ and $B$ be two collections of $n$ points in $\mathbb{R}^3$, with a bijection that maps $a_i$ to $b_i$, for each $i$. Recall that the root mean square distance between $A$ and $B$ is the square root of the average square distance between corresponding points. To simplify the analysis, we assume that the centroids of the two collections are both at the origin: $\sum_i a_i = \sum_i b_i = 0$. This implies that the vectors $a_i - b_i$ add up to 0, implying that the sum of scalar products with any vector $t$ vanishes: $\sum_i \langle a_i - b_i, t \rangle = 0$.

After translating $B$ along $t$, the root mean square distance is

    $\mathrm{RMSD}(t) = \sqrt{\tfrac{1}{n}\sum_i \|a_i - b_i - t\|^2} = \sqrt{R_0^2 + \|t\|^2}$,

where $R_0 = \mathrm{RMSD}(0)$ is the root mean square distance before the translation; the mixed term vanishes because the vectors $a_i - b_i$ add up to 0. To measure how fast the root mean square distance changes with varying translation vector, we compute the gradient:

    $\nabla \mathrm{RMSD}(t) = \frac{t}{\sqrt{R_0^2 + \|t\|^2}}$.

The gradient is defined everywhere except at $t = 0$ in the case $R_0 = 0$, and its length is $\|t\| / \sqrt{R_0^2 + \|t\|^2}$, which never exceeds 1. The length is 1 if and only if $a_i = b_i$ for all $i$. Since the length of the gradient never exceeds 1, the root mean square distance as a function over the three-dimensional space of translations satisfies a Lipschitz condition with constant 1. In words, the difference between the root mean square distances for two translations is bounded from above by the norm of the difference vector:

    $|\mathrm{RMSD}(t) - \mathrm{RMSD}(t')| \leq \|t - t'\|$.

Figure VII.9 illustrates this result by comparing the graphs obtained for equal and for non-equal corresponding points.

Figure VII.9: The hyperboloid approaches the graph of the norm function at plus and minus infinity.

Sensitivity to small rotations. We repeat the analysis for rotations. Call the root mean square distance from the centroid the radius of gyration. Let $q$ be a unit quaternion. The root mean square distance between $A$ and the rotated copy of $B$ can be expressed in terms of the radii of gyration of $A$ and $B$ and of $q^T N q$, whose extreme values are the eigenvalues of the matrix $N$ defined in the previous section. For the purpose of computing the gradient and its length, we consider this expression as a function over $\mathbb{R}^4$. As for translations, the length of the gradient is maximized if $a_i = b_i$ for all $i$. It is geometrically obvious that the total distance increases the fastest when each point moves in a direction straight away from its corresponding point. This is possible in the limit and characterized by the velocity vector of each point being parallel to the vector to its corresponding point. Since we assume $a_i = b_i$, the radii of gyration of $A$ and $B$ are the same. The effect of the rotation represented by $q$ is best viewed in the direction opposite to the rotation axis: writing the rotation angle as $2\theta$, the root mean square distance between $A$ and the rotated copy of $A$ is $2\rho\,|\sin\theta|$, where $\rho$ is the radius of gyration of $A$ projected into the plane in $\mathbb{R}^3$ that is normal to the rotation axis. We see that the rotations satisfy a Lipschitz condition that is similar to that for translations, except that the constant now depends on the collection of points, in particular on their radii of gyration. As for translations, the difference between the root mean square distances for two rotations is no more than that multiple of the norm of the difference vector.
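The Lipschitz bound for translations is easy to check numerically. The following is a minimal sketch, assuming that both point sets are centered at the origin and that the second set is the one being translated; the helper names `centered` and `rmsd_after_translation` are hypothetical.

```python
import math

def centered(points):
    """Translate a point set so that its centroid is at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in points]

def rmsd_after_translation(a, b, t):
    """Root mean square distance between a and the copy of b translated by t."""
    total = 0.0
    for p, q in zip(a, b):
        moved = tuple(q[k] + t[k] for k in range(3))
        total += math.dist(p, moved) ** 2
    return math.sqrt(total / len(a))

# Two small point sets with a given bijection (same index = corresponding points).
a = centered([(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)])
b = centered([(0.1, 0, 0), (1, 0.2, 0), (0, 2, 0.3), (0.5, 0, 3)])

r0 = rmsd_after_translation(a, b, (0.0, 0.0, 0.0))
t1, t2 = (0.3, -0.4, 1.2), (-1.0, 0.5, 0.2)
d1 = rmsd_after_translation(a, b, t1)
d2 = rmsd_after_translation(a, b, t2)

# RMSD(t)^2 = R0^2 + |t|^2, and |RMSD(t1) - RMSD(t2)| <= |t1 - t2|.
print(d1 ** 2, r0 ** 2 + sum(x * x for x in t1))
print(abs(d1 - d2), math.dist(t1, t2))
```

The first printed pair should agree up to rounding, and the first number of the second pair should not exceed the second one.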

Bibliographic notes. The problem of sampling motions has been studied in various fields, including statistics, crystallography and molecular modeling. Various methods for picking a rotation uniformly at random have been published but not all are correct. In particular, it is important to notice that first picking a rotation axis and second a rotation angle favors quaternions close to the identity if we pick the angle uniformly at random. A popular method that is correct and different from the one described in this section is due to Marsaglia [4] and is reproduced in the exercise section of this chapter. Packing and covering problems have been studied within mathematics and have generated a large body of literature [2, 3]. Surprisingly, many of the main questions in this area are still open. For example, it is not known whether or not the BCC lattice is the most economical covering of $\mathbb{R}^3$ with congruent balls. Very little is known about optimal packings and coverings in non-Euclidean spaces. The problem is challenging even in the relatively simple case of the 2-sphere, and for most numbers of points (or caps) only approximate solutions are known [1].

[1] J. Berman and K. Hanes. Optimizing the arrangement of points on the unit sphere. Math. Comput. 31 (1977), 1006–1008.
[2] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, New York, 1988.
[3] L. Fejes Tóth. Lagerungen in der Ebene, auf der Kugel und im Raum. Second edition, Springer-Verlag, New York, 1972.
[4] G. Marsaglia. Choosing a point from the surface of a sphere. Ann. Math. Stat. 43 (1972), 645–646.

VII.4 Alignment

In this section, we briefly discuss the two problems of match and fit for protein structures. We begin by studying how to match proteins and develop an algorithm that measures the similarity between two chains of atoms. Thereafter, we consider the related problem of docking a protein with its substrate.

Sequence alignment. Consider first the combinatorial (as opposed to geometric) version of the sequence alignment problem. We model a protein as a string over the alphabet of twenty amino acids: $X = x_1 x_2 \ldots x_m$ and $Y = y_1 y_2 \ldots y_n$. An alignment maps the $x_i$ to the $y_j$ in sequence, but it permits spaces on both sides. As illustrated in Table VII.3, we represent an alignment by a matrix consisting of two rows and $(m + n + k)/2$ columns, where $k$ is the total number of spaces. Columns with two spaces are disallowed. A match is a column of two equal non-space characters, and a mismatch is a column with two different non-space characters. An insertion is a column with a space at the top and a deletion is a column with a space at the bottom.

Table VII.3: The alignment uses [...] matches, [...] spaces to achieve [...].

We can think of every alignment as a directed path in the so-called edit graph of the two strings, which we illustrate in Figure VII.10. The path starts at the source in the upper left corner, takes vertical, horizontal and diagonal edges, and ends at the sink in the lower right corner.

Figure VII.10: The edit graph for the strings in the above example and the path that corresponds to the given alignment. The edge types are labeled deletion, insertion, match and mismatch.

Longest common subsequence. For the moment, we restrict ourselves to alignments without mismatches. The common subsequence between two strings consists of all matches, and its length is the number of matches. Let $\ell$ be the length of the longest common subsequence of $X$ and $Y$. Then $m + n - 2\ell$ is the minimum number of insertions and deletions needed to transform $X$ to $Y$. We compute $\ell$ by dynamic programming. Let $\ell(i, j)$ be the length of the longest common subsequence of $x_1 \ldots x_i$ and $y_1 \ldots y_j$, and define $\ell(i, 0) = \ell(0, j) = 0$ for all $i$ and $j$. Then

    $\ell(i, j) = \ell(i-1, j-1) + 1$ if $x_i = y_j$,
    $\ell(i, j) = \max\{\ell(i-1, j), \ell(i, j-1)\}$ if $x_i \neq y_j$.

To verify the recurrence relation note that every alignment ends with an insertion, a deletion or a match. In each case, removing the last column leaves an optimal alignment of shorter strings. In the third case, we need to show that the length of the common subsequence cannot increase if we do not use the match between $x_i$ and $y_j$. Indeed, without using that match we end with an insertion or a deletion, and we may move the last match to the end without decreasing the length. We turn the recurrence relation into an algorithm:

    integer LCS:
      for i = 1 to m do
        for j = 1 to n do
          if x_i = y_j then ℓ[i, j] = ℓ[i−1, j−1] + 1
          else ℓ[i, j] = max{ℓ[i−1, j], ℓ[i, j−1]}
          endif
        endfor
      endfor;
      return ℓ[m, n].

This algorithm is a typical example of the dynamic programming paradigm, which constructs an optimal solution from pre-computed optimal solutions to sub-problems. To store the solutions, the algorithm uses an array of $(m+1)(n+1)$ entries. Each entry takes constant time, which implies that the total running time is proportional to $mn$. Using a second array of the same size, we may keep track of the decisions made by the algorithm, and with this extra information, we can reconstruct the longest common subsequence itself, and not just compute its length.

The general alignment problem permits mismatches and assesses the score by rewarding each match and penalizing each mismatch. Assuming $\mathrm{score}(x, y)$ gives the score for having $x$ and $y$ in a single column, with $\sqcup$ denoting a space, we get

    $S(i, j) = \max\{ S(i-1, j-1) + \mathrm{score}(x_i, y_j),\ S(i-1, j) + \mathrm{score}(x_i, \sqcup),\ S(i, j-1) + \mathrm{score}(\sqcup, y_j) \}$.
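The longest common subsequence recurrence translates directly into code. The sketch below is one possible rendering in Python, not the author's implementation; the function name `lcs` is hypothetical, and instead of the second array of decisions mentioned in the text it retraces the choices from the table of lengths, which is an equivalent but slightly more compact design.

```python
def lcs(x, y):
    """Return (length, one longest common subsequence) of the strings x and y."""
    m, n = len(x), len(y)
    # ell[i][j] is the LCS length of the prefixes x[:i] and y[:j];
    # row 0 and column 0 hold the boundary values 0.
    ell = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                ell[i][j] = ell[i - 1][j - 1] + 1
            else:
                ell[i][j] = max(ell[i - 1][j], ell[i][j - 1])
    # Walk back from the sink of the edit graph to the source,
    # collecting the characters of the matches on the way.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif ell[i - 1][j] >= ell[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ell[m][n], "".join(reversed(out))
```

For two strings of lengths $m$ and $n$, the returned length $\ell$ also gives the minimum number $m + n - 2\ell$ of insertions and deletions needed to transform one into the other.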

The dynamic programming algorithm can still be used to identify the best in a collection of exponentially many alignments.

A gap in the alignment is a sequence of contiguous insertions or of contiguous deletions. It is common to penalize a gap separately for its existence and an additional amount that depends on its length. This may be done by penalizing an insertion or deletion an amount $\alpha$ when it starts a gap and an amount $\beta$ when it continues a gap. This gives rise to the following recurrence relations:

    $S(i, j) = \max\{ S(i-1, j-1) + \mathrm{score}(x_i, y_j),\ I(i, j),\ D(i, j) \}$,
    $I(i, j) = \max\{ S(i, j-1) - \alpha,\ I(i, j-1) - \beta \}$,
    $D(i, j) = \max\{ S(i-1, j) - \alpha,\ D(i-1, j) - \beta \}$,

where $I(i, j)$ is the score of the best alignment that ends with an insertion and $D(i, j)$ is the score of the best alignment that ends with a deletion. The score of the best alignment between $X$ and $Y$ is then $S(m, n)$. Using three arrays, we can again compute the best alignment with dynamic programming in time proportional to $mn$.

Chains of atoms. We can use the same algorithmic ideas to compute alignments between two sequences of atoms. Let the $a_i$ and the $b_j$ be the centers of the $\alpha$-carbon atoms along the backbones of two proteins. For now, we assume a fixed embedding in $\mathbb{R}^3$ and consider the alignment problem without applying any rigid motion. Using the root mean square distance between two sub-chains is problematic for two reasons. First, it does not lend itself to the dynamic programming algorithm and, second, it prefers shorter over longer sub-chains. Instead, we need a score function that balances the contributions of length and distance. One such function is obtained by combining square distances with gap penalties as follows. Letting $C_1$ and $C_2$ be positive constants, we reward a match between $a_i$ and $b_j$ by adding

    $\frac{C_1}{C_2 + \|a_i - b_j\|^2}$    (VII.1)

to the score, and we penalize for gaps as before.

Next, we permit a rigid motion be applied to one of the chains. For each alignment $\Gamma$ between $A$ and $B$, we get a function that maps a rigid motion $\mu$ to the score between $A$ and $\mu(B)$. Consider the function defined as the motion-wise maximum of the score functions of all alignments. This construction is illustrated in Figure VII.11.

Figure VII.11: The horizontal axis represents the six-dimensional space of rigid motions. The upper envelope of the graphs is the motion-wise maximum of the score functions.

Let $\mu_0$ be the motion that maximizes this function; the best alignment is the one whose score function attains that maximum. Instead of computing the best motion for each alignment, we compute the best alignment for each of a dense sample of motions. The idea of the algorithm is to sample the space of motions dense enough to guarantee an alignment with a score at least the optimum minus $\varepsilon$, for some $\varepsilon > 0$. We thus aim at computing an approximately best alignment, but we may decrease $\varepsilon$ and thus get arbitrarily close to the optimum. This strategy makes sense in practice since in any case the locations of atoms are only known up to some precision. Improving the approximation by decreasing $\varepsilon$ comes with a cost, namely higher running time because we evaluate the score for more rigid motions. We quantify the dependence by analyzing the running time depending on $\varepsilon$.

Running time. We need some notation to formalize this idea. The other parameters entering the analysis are the lengths of the chains, $m$ and $n$, the radii of the smallest spheres enclosing $A$ and $B$ and the radii of gyration of the two sets. Proteins tend to have globular shapes packing their atoms around their centroids. We may therefore assume that the radii of the smallest enclosing spheres are both roughly equal to $n^{1/3}$ and the radii of gyration are both roughly equal to $n^{1/3}$. We further simplify the discussion by assuming $m = n$. To decide how dense we have to cover the space of rigid motions, we determine the sensitivity of the score function to small motions. Ignoring penalties for gaps, the score is a sum of $\ell$ terms of the form (VII.1), where $\ell$ is the length of the alignment and the points are re-indexed so that $\Gamma$ maps $a_i$ to $b_i$, for $1 \leq i \leq \ell$. The norm of the gradient of a single term in this sum is bounded by a constant, say $c$, and hence the norm of the gradient of the sum is at most $c\ell \leq cn$. We first consider translations.

We cover the space of translations with balls of radius proportional to $\varepsilon/n$. It follows that having a translation that is not quite the optimum contributes at most a fixed fraction of $\varepsilon$ to the error. By assumption on the shape of the protein, the volume of translations we need to cover is proportional to $n$, so we need a constant times $n^4/\varepsilon^3$ balls. The sensitivity of the score to small rotations depends on the radii of gyration, and we get a gradient whose norm is at most proportional to $n \cdot n^{1/3}$. By covering the space of rotations with balls of radius proportional to $\varepsilon/n^{4/3}$, we get again a contribution of at most a fixed fraction of $\varepsilon$ to the error, and the volume of the rotations is $2\pi^2$. In each case, we need a constant times $n^4/\varepsilon^3$ balls. We cover the space of rigid motions by cross-products of these balls and thus get a constant times $n^8/\varepsilon^6$ rigid motions. Multiplying this with the running time of the dynamic programming algorithm gives a total running time of $O(n^{10}/\varepsilon^6)$. This is of course not practical and we need faster alternatives. Improvements of the running time are possible, some of which will be mentioned at the end of this section.

Protein docking. In protein docking, the basic question is how well a protein and its substrate fit to each other. The substrate could be another protein or a small ligand. This question makes sense if we use space-filling representations of the protein and the ligand, but not if we represent them combinatorially or as chains of points in space. We interpret this question as asking how similar the substrate is to a portion of the complement of the protein. This idea is illustrated in Figure VII.12. For protein-protein interactions, the region of local complementarity is frequently fairly large. The geometric fit between the two proteins thus becomes a significant factor in making the interaction possible or, more accurately, in not making that interaction impossible.

Figure VII.12: The shaded local complement of the left shape is similar to the shaded portion of the right shape.

Protein re-docking. Instead of protein docking, we consider the simpler re-docking problem. Here we are given the complexed form of a protein and its substrate and we attempt to reconstruct that form while suppressing any knowledge of the solution. We need some notation to lay out the rules for this problem. Let $A$ and $B$ represent the protein and the substrate in complexed form, and let $A'$ be the protein after applying a random rigid motion. The input to the reconstruction algorithm consists of $A'$ and $B$, and not knowing the solution means we cannot use any information on $A$ and on the random motion. The goal is to find a rigid motion $\mu$ such that $\mu(A')$ and $B$ fit well. After $\mu$ is computed, we can test how well we did by comparing $\mu(A')$ with $A$, which can be done directly or by computing root mean square distances.

We cannot use the root mean square distance to guide our reconstruction of the complexed form and thus need a score function that assesses how well a motion does in generating a good fit. There are many possibilities, and one is the approximation of the van der Waals potential by counting the pairs of spheres at small distance from each other. As mentioned in Section I.4, the van der Waals force is weakly attractive within small distances of maybe up to four Angstrom, and it is strongly repulsive for colliding van der Waals spheres. We think of the $a_i$ and $b_j$ as the centers and write $r_i$ and $s_j$ for the van der Waals radii of the spheres in $A$ and $B$. The collections of colliding and of close pairs are

    $\mathrm{Coll} = \{ (i, j) : \|a_i - b_j\| < r_i + s_j \}$,
    $\mathrm{Close} = \{ (i, j) : r_i + s_j \leq \|a_i - b_j\| \leq r_i + s_j + \delta \}$,

where $\delta$ is a small positive constant. We thus define the score as the number of close pairs if there are no colliding pairs, and as 0 if there are colliding pairs. Experiments show that this score function is a good indicator of good fit, but one weakness is its sensitivity to collisions. Actual proteins are flexible and can avoid minor collisions by small deformations. We may account for this fact by allowing a few collisions in the definition of the score, but to get a good approximation of the reality, we will need to build knowledge about flexibility into the score function.

Analysis. The general algorithm for re-docking is similar to the one for geometric alignment: we explore the space of rigid motions and evaluate the score function at the centers of the balls used to cover the space. Given a rigid motion $\mu$, we compute the score by comparing all pairs of spheres in time proportional to $mn$.
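The pair-counting score can be sketched as follows. This is a minimal sketch under stated assumptions rather than the score used in [1]: the function name `docking_score`, the default value of `delta`, and the `max_collisions` parameter (which models the idea of tolerating a few collisions) are hypothetical, and the exact thresholds that separate colliding from close pairs are my reading of the definition.

```python
import math

def docking_score(centers_a, radii_a, centers_b, radii_b,
                  delta=1.0, max_collisions=0):
    """Count close pairs of spheres; return 0 if too many pairs collide.

    A pair collides if the two spheres overlap, and it is close if the gap
    between the two spheres is between 0 and delta (all in the same unit).
    """
    collisions = 0
    close = 0
    for a, r in zip(centers_a, radii_a):
        for b, s in zip(centers_b, radii_b):
            gap = math.dist(a, b) - (r + s)   # negative gap means overlap
            if gap < 0:
                collisions += 1
            elif gap <= delta:
                close += 1
    return close if collisions <= max_collisions else 0
```

The double loop makes the quadratic cost of one evaluation explicit; a spatial grid or neighbor list would reduce it in practice, but that refinement is outside this sketch.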

By choosing the balls in the cover small enough, we can guarantee that the root mean square distance between the reconstructed and the complexed form is less than some threshold $\varepsilon$. Note that this does not necessarily imply that the score is large. Indeed, it could be zero because motions with high score value tend to be right next to motions that generate collisions. In other words, whether or not the algorithm recognizes a motion as close to the solution depends on the shape of the score function in this neighborhood. We can design cases in which the score function has arbitrarily narrow high spikes and our algorithm has little chance to ever recover the complexed form. There is, however, experimental evidence that such configurations do either not exist or are rare for actual proteins.

Let us return to the question how to cover the space of motions to guarantee a root mean square distance of at most $\varepsilon$. As before, we simplify the analysis by setting $m = n$ and assuming that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to $n^{1/3}$. According to the sensitivity analysis in the previous section, we may cover the space of translations with balls of radius $\varepsilon$ and the space of rotations with balls of radius $\varepsilon/\rho$, where $\rho$ is the radius of gyration of either $A$ or $B$. For the translations, we need to cover a volume of about $n$ requiring about $n/\varepsilon^3$ balls. For the rotations, we need to cover a constant volume also requiring about $n/\varepsilon^3$ balls. The total number of rigid motions to be explored is thus proportional to $n^2/\varepsilon^6$, and multiplying this with the quadratic running time for evaluating the score function, we get a total running time proportional to $n^4/\varepsilon^6$. Since $n$ is typically in the thousands, even this is not practical and we need faster alternatives. An improvement is possible if we compute the score for all translations composed with a single rotation in one sweep.

Bibliographic notes. The structural alignment problem refers to comparing the backbones modeled as curves or chains of spheres in three-dimensional space. Its importance within structural molecular biology derives from the observation that evolution preserves structure better than amino acid sequences. Among other things, research on this problem has led to the creation of structural databases [6, 8]. There are two main computational approaches to structural alignment: one represents a chain by its matrix of internal distances [5] and the other uses rigid motions to align the chains embedded in space [9]. In this section, we have followed the second approach and presented the work of Kolodny and Linial [7], who explore rigid motions in the outer loop and optimal alignments using dynamic programming [3] in the inner loop of their algorithm. The particular score function given in Equation (VII.1) was suggested in [9]. It should be mentioned that the presented algorithm is significantly slower than the currently most commonly used DALI software [5], but it is the only algorithm that guarantees a good approximation of the optimal alignment in polynomial time.

The goal of protein docking is the prediction of whether, where and how proteins interact with each other and with other molecules. In many cases, the surface area of the interface during the interaction is substantial, and in these cases the geometric fit is an important factor. However, there are cases with smaller interaction area in which forces unrelated to geometric shape outweigh the importance of shape [2]. We refer to [4] for a recent survey of the extensive literature on computational approaches to protein docking. The material in this section is based on the work described in [1].

[1] S. Bespamyatnikh, V. Choi, H. Edelsbrunner and J. Rudolph. Protein docking by exhaustive search. Manuscript, Duke Univ., Durham, North Carolina, 2003.
[2] A. H. Elcock, D. Sept and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys. Chem. B 105 (2001), 1504–1518.
[3] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge Univ. Press, England, 1997.
[4] I. Halperin, B. Ma, H. Wolfson and R. Nussinov. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47 (2002), 409–443.
[5] L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233 (1993), 123–138.
[6] L. Holm and C. Sander. The FSSP database of structurally aligned protein fold families. Nucleic Acid Res. 22 (1994), 3600–3609.
[7] R. Kolodny and N. Linial. Approximate protein structural alignment in polynomial time. Manuscript, Stanford Univ., Stanford, California, 2002.
[8] A. G. Murzin, S. E. Brenner, T. Hubbard and C. Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247 (1995), 536–540.
[9] S. Subbiah, D. V. Laurents and M. Levitt. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Current Biol. 3 (1993), 141–148.

Exercises

1. Reflections. The reflection through a plane maps every point $x$ to the point $x'$ such that the plane crosses the line segment $xx'$ orthogonally at its midpoint. The central reflection maps every point $x$ to its antipodal point $-x$.
   (i) Show that every rigid motion is the composition of two plane reflections.
   (ii) How many plane reflections do you need to represent the central reflection?

2. Sum of square distances. Consider a collection of $n$ points $p_i$ in $\mathbb{R}^3$ and let $c$ be its centroid.
   (i) Prove that for every point $x$ in space, the root mean square distance to the $p_i$ is the root of the square distance to the centroid plus a constant: $\sqrt{\tfrac{1}{n}\sum_i \|x - p_i\|^2} = \sqrt{\|x - c\|^2 + \mathrm{const}}$. What exactly is the constant?
   (ii) Extend the construction to a collection of planes in $\mathbb{R}^3$. In other words, prove that there are three planes for which a similar formula gives the sum of square distances to the planes.
   (iii) Further extend the construction to a collection of lines in $\mathbb{R}^3$.

3. Square distance from planes. The square distance from a point $x$ to a point $z$ is also the sum of square distances of $x$ from the three planes parallel to the coordinate planes that pass through $z$.
   (i) Show that the above claim holds for any three planes that pass through $z$ and pairwise enclose a right angle.
   (ii) Are there triplets of planes enclosing non-right angles for which the square distance from $x$ to $z$ is equal to the sum of square distances from $x$ to the three planes?

4. Sizes of spheres. The $d$-dimensional unit sphere consists of all points at unit distance from the origin of the $(d+1)$-dimensional Euclidean space: $S^d = \{ x \in \mathbb{R}^{d+1} : \|x\| = 1 \}$. We know that the perimeter of $S^1$ is $2\pi$, the area of $S^2$ is $4\pi$ and the volume of $S^3$ is $2\pi^2$. What is the $d$-dimensional volume of $S^d$?

5. Sampling the 3-sphere. Prove that the following method picks a point uniformly at random on $S^3$:
   (i) Pick numbers $x_1, x_2, x_3, x_4$ uniformly at random in $[-1, 1]$.
   (ii) If $S_1 = x_1^2 + x_2^2 \geq 1$ or $S_2 = x_3^2 + x_4^2 \geq 1$ then repeat Step (i), else let $r = \sqrt{(1 - S_1)/S_2}$ and return $(x_1, x_2, r x_3, r x_4)$.

6. Random rotation. Let us mark a point on the unit 2-sphere. For a rotation, let $x$ be the image of the marked point under that rotation. Any density function over the space of rotations implies a density function over the 2-sphere. Prove that the uniform density of quaternions over $S^3$ implies the uniform density of points over the 2-sphere.

7. Biased probability. Suppose Function UNIFORM picks a real number uniformly at random in $[0, 1]$.
   (i) Show that the minimum of two numbers picked by Function UNIFORM is distributed according to the triangle density function $f(x) = 2(1 - x)$.
   (ii) How are the minimum, the median and the maximum of three numbers picked by Function UNIFORM distributed?

8. Number of alignments. Recall that an alignment between two chains of $m$ and $n$ $\alpha$-carbon atoms that uses $k$ spaces can be represented by a matrix with two rows and $(m + n + k)/2$ columns. Assuming $n \geq m$, we define $d = n - m$ and note that we need $d$ insertions just to make up for the difference in length. The remaining spaces are distributed over equally many insertions and deletions, so we define $j = (k - d)/2$.
   (i) Show that $j$ being an integer with $0 \leq j \leq m$ is a necessary and sufficient condition for $k$ to be the number of spaces in an alignment of the two chains.
   (ii) What is the number of different alignments with a fixed number of spaces?
   (iii) What is the total number of different alignments?


Chapter VIII

Deformation

VIII.1 Molecular Dynamics
VIII.2 Spheres in Motion
VIII.3 Rigidity
VIII.4 Shape Space
Exercises

VIII.1 Molecular Dynamics

[Newton's second law. Numerical integration, Taylor expansion, different numerical methods (Euler, Verlet, leap-frog, Beeman, predictor-corrector).]

[Close neighbor lists. Kinetic data structures. Delaunay triangulation or dual complex (forward pointer to Section VIII.2 and IX).]

[Hydrophobic surface area. Weighted area and derivative (forward pointer to Chapter IX).]
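The integration methods listed in these notes can be compared on a toy problem. The following Python sketch is an illustration and not part of the original notes: it integrates Newton's second law for a single particle on a harmonic spring, once with the explicit Euler method and once with the velocity form of the Verlet method. The function names, the spring force, the unit mass, and the step size are all assumptions chosen for the example.

```python
import math

def explicit_euler(x, v, force, mass, dt, steps):
    """Advance position and velocity with the explicit Euler method."""
    for _ in range(steps):
        a = force(x) / mass            # Newton's second law: a = F / m
        x, v = x + dt * v, v + dt * a  # both updates use the old state
    return x, v

def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position and velocity with the velocity form of the Verlet method."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + dt * v + 0.5 * dt * dt * a   # second-order Taylor step for the position
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * dt * (a + a_new)       # velocity uses the averaged acceleration
        a = a_new
    return x, v

def spring_force(x, k=1.0):
    """Hypothetical test force: a harmonic spring with stiffness k."""
    return -k * x

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

# Start at rest at x = 1 and integrate up to time 100 with step 0.01.
x_e, v_e = explicit_euler(1.0, 0.0, spring_force, 1.0, 0.01, 10000)
x_v, v_v = velocity_verlet(1.0, 0.0, spring_force, 1.0, 0.01, 10000)
print("Euler energy: ", total_energy(x_e, v_e))
print("Verlet energy:", total_energy(x_v, v_v))
```

The exact energy of this system is 0.5. In this experiment the Euler energy drifts upward with every step, while the Verlet energy stays close to the exact value, which is one reason Verlet-type schemes are popular in molecular dynamics.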

VIII.2 Spheres in Motion

[Explain the slack in the Pie Volume Formula (with a forward pointer to Chapter IX).]

[Define cross-sections of the complex of independent simplices and prove that each cross-section gives a different pie formula but the same measurement.]

[This topic relates to the possibility of drawing non-straight Voronoi-like decompositions [2].]

[Predict collisions of spheres.]

[Dynamic Delaunay triangulations [3]. Linear motion in $\mathbb{R}^4$ instead of $\mathbb{R}^3$.]

Bibliographic notes.

[1] J. BASCH, L. J. GUIBAS AND L. ZHANG. Proximity problems on moving points. In "Proc. 13th Ann. Sympos. Comput. Geom., 1997", 344–351.
[2] H. EDELSBRUNNER AND E. A. RAMOS. Inclusion-exclusion complexes for pseudodisk collections. Discrete Comput. Geom. 17 (1997), 287–306.
[3] M. A. FACELLO. Geometric techniques for molecular shape analysis. Ph. D. thesis, Report UIUCDCS-R-96-1967, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.

VIII.3 Rigidity

[Discuss the pebble algorithm that analyzes the rigidity of a graph in three dimensions.]

Bibliographic notes.

VIII.4 Shape Space

The main functionality of the Morfi software is that it can smoothly morph from one skin curve to another. In other words, it deforms the skin of one set of circles to the skin of another. Figure VIII.1 shows the deformation of a skin curve defined by four circles into one defined by three circles. The details of this deformation will be explained later in this chapter, where we discuss notions of similarity between two molecular skins. In this section, we merely illustrate the deformation and mention some of its features in passing.

For each snapshot, we show the skin curve together with the dual complex. Recall that the homotopy types of the body and the dual complex are always the same, which implies that they change their type the same way and at the same time. We note that any two contiguous bodies, except the last three in the sequence, differ by at least one change in homotopy type. For the complex we observe two types of changes caused by adding an edge or a triangle. The corresponding changes in the body are caused by creating a handle or filling a hole. There is a third type of change not seen in Figure VIII.1, which in the complex is caused by adding a vertex and in the body by creating a component.

We note that these deformations are similar to but also different from the image morphs studied in computer graphics [3]. The goal there is photo realism, and possibly the most difficult problem towards achieving it is the construction of a one-to-one correspondence between features of the initial and the final images. The Morfi software creates a few-to-few correspondence through geometric considerations rather than working towards a one-to-one correspondence, which often does not exist.

Similar to two dimensions, we can deform skin surfaces into each other by continuously changing the defining spheres. A canonical such method is explained in [1]. That method can be used to mix $k+1$ skin surfaces and thus create a shape space that encompasses $k$-variate deformations. The problems of (1) finding a good basis and (2) finding the best approximation within the spanned space are both difficult. They are similar to fundamental questions on function representation, which are probably discussed in the approximation theory literature.

[Explain the mixing of two or more shapes as a generalization of 1-parametrized deformation.]

Bibliographic notes. The Morfi software has been used in [2] to explain two-dimensional skin geometry and to illustrate its use in deforming two-dimensional shapes into each other.

[1] H.-L. CHENG, P. FU AND H. EDELSBRUNNER. Shape space from deformation. Comput. Geom. Theory Appl. 19 (2001), 191–204.
[2] S.-W. CHENG, H. EDELSBRUNNER, P. FU AND K.-P. LAM. Design and analysis of planar shape deformation. Comput. Geom. Theory Appl. 19 (2001), 205–218.
[3] G. WOLBERG. Recent advances in image morphing. In "Proc. Comput. Graphics Internat., 1996", 64–71.

Figure VIII.1: Ten snapshots of a deformation with skin and dual complex displayed. The skin in the fifth snapshot is the same as in the figures above.

Figure VIII.2: From left to right and top to bottom: the shapes at successive times of the deformation. The sequence is defined by a set of seven spheres forming a question mark at the initial time and a set of eight spheres forming a human-like figure at the final time.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.

Chapter IX

Measures

There are various reasons why biologists want to measure the size of molecules. Volume is important in the calculation of free energy and in estimates of populations given a bound on the available space. Surface area is a resource consumed by molecular interactions and is probably even more relevant to research in structural biology than volume. This chapter will study three aspects of size: volume, surface area, and arc length for space-filling diagrams. Our general approach to measuring the size begins with indicator functions for convex polyhedra in $\mathbb{R}^d$. From these we will derive short inclusion-exclusion formulas for size measurements.

IX.1 Indicator functions
IX.2 Volume and surface area
IX.3 Void formulas
IX.4 Measuring Software
Exercises

Particularly. is the most important dimenkeeping in mind that sion since polyhedra in relate to molecules in . we define    (IX. is a -face of itself and the facets are the -faces. Let be the finite collection of half-spaces such that . we only keep the terms that correspond to faces of . it extends to infinity. Note that . to the left and     ¡   ¦    ¢¡ B  ©  $ $ F ¢ ¥¤   ©       ¦    ¥   ©   ¢¡ $  ©    ¢ £¡      $  ¡ ¦ ¦         ¦ §   ¦ §¡  $ ¡      ©     ¦   £ $   ¥ ¢ ¡  ¢  ¡ ¦ ¦ £  $   ¢   ¦ ¡   $ ¢   ¥  ¡   ¦    ¡    ¡ ¦   ¦ ¡ ¦ ¦ ¡ ¦ ¦ ¦ ¡ 0   ¥    ¥           ¦  ¥ ¤ ¡          ¥        ¡ ¦ ¦ ¦ 0 ¡ ¦     ¤ ¡ ¡ 0 $ . which is the alternating sum of subsets of . In the unbounded case. Assuming general position. Convex polyhedra. Most of the terms in the exponentially long formula (IX. The Euler characteristic of is the alternating sum of faces. The straightforward way of doing this is called the principle of inclusion-exclusion. The boundary is decomposed into faces of various dimensions.1 Indicator Functions The Euler relation for convex polyhedra is a special case of the Euler-Poincar´ theorem for complexes. For we get and is an indicator function for . To see this define and . including the empty set for which for all points .1) are redundant and can be removed. Inclusion-exclusion. It is either bounded or unbounded. and . There are e elementary proofs for this special case. Each face is the intersection of the polyhedron with a subset of the hyperplanes bounding half-spaces in .1) Truncation.    ¢ ©  ©  © "   ¦ if if   is bounded is unbounded       § ¡  ¦    ¦ In the bounded case. In words. the boundary is a -dimensional topological sphere whose only non-zero Betti numbers are . the boundary is an open -dimensional topological ball whose only non-zero Betti number is .1: A bounded convex polyhedron in an unbounded one to the right. .     
£¥ § ¢ £ ¨     £          ¦    ¢¡   ¢ §¡   ¨ ¡        ¡   ¥        7£  £ 5   ¥  ¡         ¡  ¦ Let be a convex polyhedron in and assume it has non-empty interior. which comes from the empty set. and in the second. the polyhedron is the convex hull of finitely many points. Specifically. We study polyhedra in -dimensional space. The Euler relation will follow from elementary proofs of properties of these indicator functions. Let be the number of faces.128 IX M EASURES Below we will construct indicator functions of from Euler characteristics of subcomplexes of the boundary complex. as we will see later. the dual of the boundary complex is a simplicial complex and the Euler-Poincar´ Theorem stated in Sece tion IV.1. that leads to We form an alternating sum of the an indicator function for the convex polyhedron. if otherwise. A convex polyhedron is the intersection of finitely many closed half-spaces. This sum is    © Figure IX. Namely if then it sees a facet from for the singleton the outside and we have set containing the half-space whose bounding hyperplane contains that facet.   Note that is outside iff for at least one nonzero subset .3 implies the Euler relation for convex polyhedra:   ¥ ¦  ¦ ¡   if if          ¦  ¦ ¦ ¢¡   ¡    £           ¦ ¦ ¡ provided . A face of is the intersection with a supporting hyperplane. In the first case. and this section presents one that is inductive. For example. which are usually prefixed for clarity. We show that the non-zero terms cancel unless there is only one non-zero contribution to the sum. For a subset and a point we define  ¦ IX. and both cases are illustrated in Figure IX. A hyperplane supports if it intersects the boundary but not the interior. The sum ranges over all subsets of .
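The principle of inclusion-exclusion described in this section can be checked by brute force on a small example. The following Python sketch is an illustration under stated assumptions, not the construction of the text itself: half-spaces are modeled as half-planes in two dimensions, each given by a hypothetical triple (a, b, c) standing for the points with a*x + b*y <= c, and the alternating sum runs over all subsets without any truncation.

```python
from itertools import combinations

def contains(halfplane, p):
    """True if the point p lies in the closed half-plane a*x + b*y <= c."""
    a, b, c = halfplane
    return a * p[0] + b * p[1] <= c

def outside_all(subset, p):
    """1 if p lies outside every half-plane of the subset, else 0.

    The empty subset imposes no condition and therefore yields 1.
    """
    return int(all(not contains(h, p) for h in subset))

def indicator(halfplanes, p):
    """Alternating sum over all subsets of the half-planes, evaluated at p."""
    total = 0
    for k in range(len(halfplanes) + 1):
        for subset in combinations(halfplanes, k):
            total += (-1) ** k * outside_all(subset, p)
    return total

# The unit square as the intersection of four half-planes.
square = [(-1, 0, 0), (1, 0, 1), (0, -1, 0), (0, 1, 1)]

for point in [(0.5, 0.5), (2.0, 0.5), (2.0, 2.0), (-1.0, -1.0)]:
    print(point, indicator(square, point))
```

The sum has exponentially many terms, which is exactly the cost that the truncation to faces of the polyhedron is meant to avoid.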

and rewrite the formula in the Pie Theorem  ¡   ¡¡ ¦   $  £    © ¡  $  ¦   ¤ ¥  ¤ ¦ ¥ ©    ¤ ¥  ¦    ¦ ¡    ¡¡      ©  $ ¢¦  ¡      ¡¡ ¤ ¦ ¢¦ ¦ ¡  ¦¡ ¦   ¡    ¡   §¦ ¦ £ ¡     P ROOF. . . let . which in this context means that there are no two subsets of that define the same face.2) © 129 ¦ Note that . Notice that according to this definition. and define as the closed complement of . Let be the system of subsets that define non-empty faces. The corresponding systems are and . The restriction of the inclusion-exclusion formula (IX. it is still an indicator function of . we fix a point outside all half-spaces in . . the faces on the silhouette are not visible. in which case and . To see this. as required. We have for all . Both and have one less half-space not containing than does. ¡ Figure IX. The corresponding systems form the partition .IX.2: The half-spaces and share the hyperplane and are complementary to each other. The third term vanishes because iff . . Assume . For sets there is an intuitive interpretation of . ¡¡ © ¦ )   ¡ §¡ ¦  ¦  §¡  ¢¢  ¦ ¡¡ ¦  &¡ © and &  ¢¢   ¡ " ¦ ¦ )¡ )   "¢ ¢¢   ¡ "  &  " ¢¢ E£ ¡  ¦ ¡¡ ¦ ¦ ¦ ¦ ¡ £ £ © © © ©  ¡¡  ¡ ¡ ¡ ¦ ) ¡ ¡ ) ¡ ¡ ) ¡ ¡     ¡ ¡¡ E£ ¡ ¦   ¦ ¦   ¡  ¦ ¡ ) ¦       ¡¡ ¡ ¡ ¡ ¡ ¦ ¦   © ¡ ¦ ¡ ¦   " ) ¡    © " ¦         © ¦ ©  $ "  $   ¢  " ¢   ¢¡¡   $ F ¢ ¥¤   © © © ¡ ¦ © £ ¡ ¢ ¡      ¡ ¦ ¡     ¡ ¡)    ¦ ¥  ¡    ) ¦        ¢  ¦ ©    ¢  ¡         ¦ ©   ¦   "        ¡ ¦ ¦       ) ¦      ¦     ¡¡ ¡   ¦   ¡¦ $  ) ¨  ) ¥                  ones crossing the hyperplane shared by and . We distinguish £  ¢   ¡ ¡¡ ¦ ¡¡ ¡¡ ¦ £  ¦     £  ) £ ¦   ¡ ¡ ¦ )  ¡     if if   and hence . the ones contained in . which is a half-space that contains . The second term vanishes because all sets in contain . The union of and is . Therefore because the values cancel pairwise. It is convenient to assume general position. Consider visible from if sees all facets around from outside . 
¤   ¤     ¢ ¦  ¤ ¢ ¦   ¤¤ ¢ ¦  ¤ ¥§¦ ¥ ¨§¦    ¡¦  ¤ ¦ ¤  ¤¤ ¢ ¦  ¤ ¢¡¦   ¤ ¥  ¦ ¡  ¦ ¢ ¦¦  ¤¤ ¢ ¦¦  ¤¤ ¡¦¦   ¥    ¦ ¦  ¢  ¦   ¤  ¦ ¤ ¡ ¦ We claim that even though is much shorter than . We argue that all three terms on the right side of the equation for vanish. By assumption. which implies that iff and therefore . which we consider an imFor proper face but still a face of . and the ones contained in . We can therefore write their values as sums of values of the subsystems.1) to the system is (IX. This claim is sufficiently important to warrant a complete proof. where  ¡ ¡¡ ¡ ¡     ¦   ¦  ¦ ¦ . The basis of the induction is covered by . We use induction over the cardinality of the set .3. . as in Figure IX. where .  The introduced systems partition .2. the A in terms of face numbers ¢  Figure IX. The induction hypothesis thus applies. Then iff is visible from . The Pie Theorem A implies the Euler relation for unbounded polyhedra. which is again defined as the collection of half-spaces that do not contain . . .  ¦   P ’’ P y ¦ ¡ three types of faces of . and .3: The point lies in the intersection of the complements of the half-spaces. and therefore . and the faces of are defined by sets in . The convex polyhedron is obtained by removing the constraint . By assumption of general    _ g g Unbounded convex polyhedra. Define sets of half-spaces and .1 Indicator Functions we get . The faces of are defined by sets in . where ¡ P IE T HEOREM A. as shown in Figure IX.

We return to the computation of the Euler characteristic. As illustrated in Figure IX. is a closed interval with . The projection of the silhouette onto a hyperplane normal to the line is a bounded convex polyhedron of dimension . we have for all and therefore . and the same edges and vertex intersect the interior of .5. For we have   ¥  ¡     ¡   ¥  ¡  ¡    ¦  ¡ ¦   ¥    ¡   ¥  ¡  ¡  ¦  By choice of .4: Three edges and one vertex of intersects the interior of . Hence if . ¢       P IE T HEOREM B. Furthermore. by the Pie Theorem B. We get  ¡      ¢ ¦     ¥  ¡   ¦ PA      ¦   ¥  ¦  ¤ A    ¤   ¢    ¦    ¦        P     ¢  ¥  ¡ ¦   ¥     ¦ ¡ P ROOF. We choose a line not parallel to any face of and points and sufficiently far in opposite directions on the line.4. We first weaken the theorem by restricting the points to lie within a convex body . We can now argue inductively that the Euler characteristic of is . using the respective other convex polyhedron as the restricting convex body . Define and let be the corresponding sum of values. We construct a convex polyhedron that contains and approximates in the sense that . every point is contained in all half-spaces of . as in Figure IX. and the silhouette is indicated by the two hollow vertices. ¢   ¦ ¦     ¦   ¢    ¥  ¤      ¤ ¡ ¡ ¤¤ ¡         ¥ ¦ ¦  © ¡ $ "¢ ) ¡ ¡ ¤ $¡ ) h © ¢¡¡ ¨ ©       © ¦      $       ¡¦ ¡   ¡    ¦ ¡   ¢¦     ¥    ¦        ¤   ¤    £  £   ¡     ¡¦     ¤ ¤ ¤ ¦   ¨ ¤   ¤   ¤ ¢ Bounded convex polyhedra. Define   ¥ ¢     ¥ ¦      £   £     )   £ ¤¦ ¦ ¥ ¡ £ £¢¦ £ ¢¦ ¡ ¤     if if   titions into the set of half-spaces that do not contain and the set of half-spaces that do not contain . and define symmetrically. . is the number of sets . Hence for all points and therefore also for all points . ¦ Z y Figure IX.5: The boundary of is dotted. Observe that this sum counts the -face the same number of times on both sides. 
We need a slightly stronger version of the Pie Theorem A to prove the Euler relation for bounded convex polyhedra. Let be the number of -faces of that have nonempty intersection with the interior of . This implies the Euler relation for unbounded convex polyhedra. We show that for points .130 with cardinality position. IX M EASURES For . Each proper face of either belongs to or to or to the silhouette as seen in a view parallel to the chosen line. and then strengthen it by further reducing the set system. is an indicator function for . this par- ¤ £    £ ¡ ¦  £   ¥    £ )¥ ¡    ¡¦ ¤ ¦ §£ ¦  ¥   ¤ )  Y z  . this time for a bounded convex polyhedron . which establishes the induction basis. By the choice of . same as on the left side. ¦     Restricting body. that of is solid. The system contains exactly all sets for which . Let be the number of -faces in the silhouette. ¤   £  ¢  ¦ ¡ £    £¡   ¦ ) ¡  ¡  ¤ ¤   ¡ ¡     if if        ¢   ¥  ¡ ¦          ¥  ¡ ¦  and use the Pie Theorem A to get ¦ ¦   ¥     ¡ ¦ ¡ ¥ ¦ ¢  ¦   ¦   ¥  Figure IX. On the right side it is counted times.

by induction hypothesis. Adding the alternating sums of the and implies , as required.

Bibliographic notes. Most of the material in this section is taken from [2], where the inclusion-exclusion approach to measuring the union of balls is laid out. As demonstrated, this principle also yields the Euler relation for convex polyhedra. The discovery of that relation for convex polyhedra in three dimensions is usually attributed to Leonhard Euler [3, 4], although there is evidence that René Descartes knew about it a century earlier. There are many proofs of that relation, and the historically first one for the general $d$-dimensional case goes back to the work of Ludwig Schläfli [7] in the middle of the nineteenth century. He implicitly assumes that the boundary complex of every convex polyhedron is shellable, which was not established until 1972 by Bruggesser and Mani [1], who thus filled the gap left in Schläfli's proof. Indeed, finding elementary proofs of the Euler relation for convex polyhedra seems to be a favorite topic for Swiss mathematicians [5, 6]. We note that all authors of papers referenced in this section are Swiss, except for one who has a Swiss grandmother.

[1] H. BRUGGESSER AND P. MANI. Shellable decompositions of cells and spheres. Math. Scand. 29 (1972), 197–205.
[2] H. EDELSBRUNNER. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[3] L. EULER. Elementa doctrinae solidorum. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 109–140.
[4] L. EULER. Demonstratio nonnullarum insignium proprietatum, quibus solida hedris planis inclusa sunt praedita. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 140–160.
[5] H. HADWIGER. Eulers Charakteristik und kombinatorische Geometrie. J. Reine Angew. Math. 194 (1955), 101–110.
[6] W. NEF. Zur Einführung der Eulerschen Charakteristik. Monatsh. Math. 92 (1981), 41–46.
[7] L. SCHLÄFLI. Theorie der vielfachen Kontinuität. Written 1850–52 and published in Denkschrift der Schweizerischen naturforschenden Gesellschaft 38 (1901), 1–237.

IX.2 Volume and Surface Area

In this section, we use the indicator functions developed in Section IX.1 to derive inclusion-exclusion formulas for the volume, area, and total arc length of a space-filling diagram.

Volume by integration. By definition, the indicator function of a geometric set is 1 inside and 0 outside the set. We can therefore compute its volume by integration. Consider for example a bounded convex body $C$ and a convex polyhedron $P = \bigcap H$. Let $\mathcal{F}$ be the system of subsets of $H$ that appears in the statement of the Pie Theorem B in the last section. The volume of the intersection of the two convex bodies is

  $\mathrm{vol}(C \cap P) = \sum_{A \in \mathcal{F}} (-1)^{\mathrm{card}\, A} \, \mathrm{vol}\bigl(C \cap \bigcap_{h \in A} \bar{h}\bigr)$,

where $\bar{h}$ is the closed complement of the half-space $h$. In $d$ dimensions, assuming general position, the sets in $\mathcal{F}$ contain $d$ or fewer half-spaces each. For measuring molecules, we are mostly interested in the case $d = 4$, in which the volume is a sum of terms each involving four or fewer half-spaces.

If applied to all points of a ball in $\mathbb{R}^3$, the above formula gives a proof of the area formula for spherical triangles. Recall that $\mathbb{S}^2$ is the unit sphere centered at the origin 0. Let $H$ be a set of three half-spaces whose bounding planes pass through 0. The half-spaces intersect in an unbounded triangular cone, and the intersection with the ball bounded by $\mathbb{S}^2$ is a pyramid whose base is a spherical triangle, as shown in Figure IX.6. Let $\alpha$, $\beta$, and $\gamma$ be the dihedral angles between the planes, or equivalently, the angles of the spherical triangle. The volume of the pyramid can now be computed by taking the ball, subtracting three half-balls, adding three sectors, and subtracting the reflected pyramid:

  $\mathrm{vol} = \frac{4\pi}{3} - 3 \cdot \frac{2\pi}{3} + \frac{2}{3}(\alpha + \beta + \gamma) - \mathrm{vol}$.

It follows that the volume is $\mathrm{vol} = \frac{1}{3}(\alpha + \beta + \gamma - \pi)$. The area of the spherical triangle is three times the volume divided by the radius of the sphere. That radius is one, which implies that the area of the spherical triangle is $\alpha + \beta + \gamma - \pi$.

Figure IX.6: A pyramid cut out of a ball by three half-spaces.

Union of balls. We now turn to the problem of measuring the union of a finite set $B$ of balls in $\mathbb{R}^3$. Instead of computing the volume of $\bigcup B$ directly, we transform the question into one about half-spaces in $\mathbb{R}^4$.

Stereographic projection. Let $\mathbb{S}^3$ be the unit 3-sphere with center at the origin and identify $\mathbb{R}^3$ with the hyperplane $x_4 = 0$. Call $N = (0, 0, 0, 1)$ the north-pole of $\mathbb{S}^3$. The stereographic projection maps a point $x \in \mathbb{S}^3 - \{N\}$ to the point of $\mathbb{R}^3$ collinear with $N$ and $x$. The map is bijective and therefore has an inverse. The preimage of a ball in $\mathbb{R}^3$ is a cap, which is the intersection of the 3-sphere with a half-space. This is illustrated in Figure IX.7. For each ball, let $h$ be the closed complement of that half-space, and let $H$ be the collection of these half-spaces. They all contain the north-pole. Let $P = \bigcap H$; since every half-space in $H$ contains $N$, so does $P$ contain $N$. Then $\bigcup B$ is the stereographic projection of the portion of $\mathbb{S}^3$ that is not contained in the interior of $P$.

Figure IX.7: Stereographic projection from $\mathbb{S}^3$ to $\mathbb{R}^3$. The half-space that cuts out the cap lies on the side of its hyperplane that does not contain the north-pole.
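The correspondence between balls in three dimensions and half-spaces in four dimensions can be made concrete. The following Python sketch is a minimal illustration under stated assumptions: the unit 3-sphere is centered at the origin, the north-pole is (0, 0, 0, 1), three-dimensional space is the hyperplane with fourth coordinate 0, and the function names are hypothetical. For a ball with center z and radius r it computes a half-space of the form n . x >= c whose intersection with the 3-sphere is the preimage of the ball; the coefficients follow from substituting the projection into the inequality |p - z|^2 <= r^2.

```python
import math

def inverse_stereographic(p):
    """Map a point p of R^3 to the unit 3-sphere, projecting from the north-pole."""
    s = sum(c * c for c in p)
    return tuple(2.0 * c / (s + 1.0) for c in p) + ((s - 1.0) / (s + 1.0),)

def stereographic(x):
    """Map a point x of the unit 3-sphere (other than the north-pole) to R^3."""
    return tuple(c / (1.0 - x[3]) for c in x[:3])

def cap_halfspace(center, radius):
    """Return (normal, offset) of the half-space normal . x >= offset in R^4
    whose intersection with the 3-sphere is the preimage of the given ball."""
    w = sum(c * c for c in center) - radius * radius
    normal = tuple(2.0 * c for c in center) + (w - 1.0,)
    return normal, 1.0 + w

def in_halfspace(halfspace, x):
    """True if the point x of R^4 lies in the closed half-space."""
    normal, offset = halfspace
    return sum(n * c for n, c in zip(normal, x)) >= offset

# A hypothetical ball and a few test points, none of them on its boundary.
center, radius = (1.0, 0.5, -0.25), 0.75
halfspace = cap_halfspace(center, radius)
points = [(1.0, 0.5, -0.25), (0.0, 0.0, 0.0), (1.5, 0.5, -0.25),
          (3.0, 3.0, 3.0), (1.0, 1.2, -0.25), (1.0, 1.3, -0.25)]
for p in points:
    inside_ball = math.dist(p, center) <= radius
    print(p, inside_ball, in_halfspace(halfspace, inverse_stereographic(p)))
```

For every test point the two Boolean columns agree, and the north-pole itself lies outside the computed half-space, so its closed complement contains the north-pole.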

For we get and therefore a zero contribution to the area. Use to project the boundary complex of to . which contains the north-pole in its interior. and the intersection of the half-spaces is a convex polyhedron .  The sets with one or no half-space are redundant because in these cases. Hence. we get a Pie Area Formula for the surface area of . Figure IX. For each ball we get a half-space . The volume of the portion of outside the polyhedron is 133 complex of and do inclusion-exclusion with a term for every simplex in the dual complex. the area of is the area of minus the alternating sum of the areas of cap intersections. Instead of the system of half-spaces we now use a system of balls obtained by substituting for . A subset belongs to iff its corresponding face of has non-empty intersection with the ball bounded by . namely for the system of balls and for a generic set in . Letting be the sphere and the set of caps. Instead of proving this algebraically. we use the Pie Volume Formula on the set of caps defined by intersecting balls.8: The area of the union is the sum of eight disk areas minus the sum of nine pairwise intersection areas. We observe that the index system in the Pie Volume Formula is an abstraction of the dual complex of . a non-empty set of half-spaces is in iff the corresponding set of balls defines a simplex in the dual complex. This is illustrated in Figure IX.2 Volume and Surface Area . For each set of caps in the system . The volume of the union of a finite set of balls is  Similarly. P IE VOLUME F ORMULA . The proof of the formula is similar to the one for area. Similar to volume. To prove this formula. Area and length. we can get a Pie Length Formula that measures the total length of the circular arcs in the boundary of the union of balls.7.    ¢   §   £ ¡ ¥ £ © ££ F   ¢   ¥  ¢ ¡     ©  ©        !  £ ¡ ¥ £ We could now get a formula for by scaling the volume by the distortion factor of . 
Let be the 4-ball bounded by and the system of subsets of that appears in the Pie Theorem B.      ¢     ¦ ¤    ¢   ¢  § a© £ £ F    ©    ©     ¢ ¢ ©    ¢ ©  § ¡ ¢    a© £ £ F ¢ ¡©   ¢        © $ ££ F   $ F ¢ ¥¤   © © $ © ¢ ¡ ¦ ¡ ¢ £   ¢   §¢   ¥   ¢         ¥  ¥  © © ¤ ££     ¡     i¡ ¡    ¦   ¡ ¢   ©  © F   ¤  © ¦ © ¤ ¡    ¥     ¥  ¦ ¥   ¥   ©   ©    ¢ ¡       ¤ ¤ ¢   £¡ ¢ ¡   ¢ ¡   ©   ©    © © © © ©  £ £ £   ¤ ¢ ¤        ¦ ¦   ! ©   ¦ © ¢    ¥ "   ¢             ¢ © "     ¢   ¤ ¡ ¤¡ ¦ ¦  ¢ ¤ ©   ¤ ¢ ¡ ¤  . plus the sum of two triple-wise intersection areas. For a single sphere. where is the abstraction of the dual complex of . except that the summation is done over all circles that are intersections of two   § §¦¡ § £ £ F  ¢ ©       ¥    ©  ©       ! §   §¦¡ §  ¦ §¡ ¢ ¢ Start with and embedded in as suggested in Figure IX. we use the same notation. For convenience. revisited.8. We have arrived at a simple interpretation of the Pie Volume Formula: construct the dual ¤ ¥    ¥  £ ¡ ¦£  £ £ F ¤   ¥  ¥   ¤ ©   ¦       ¤ Dual complex. A more straightforward derivation of a formula for the ball union translates the inclusion-exclusion formula from to . Since the caps are two-dimensional.IX. This is the weighted Voronoi diagram of . we have the corresponding set of balls together with the ball of in the system of . we get the Pie Area Formula given above. But this is also the condition for the projection of to have non-empty intersection with the interior of . the volume formula becomes an area formula. we explain the connection in geometric pictures. we add the contributions of individual spheres. . By summing over all balls.

For each such circle. 20 (1992). In “Proc. H ADWIGER . NAIMAN AND H. Ann. Princeton Univ. New Jersey. Edelsbrunner generalized the formula to allow for different size balls and strengthened it by using the dual complex as the index system [1]. Press. Statist. 1995”. That projection is conformal (preserves angles) and has a number of other nice properties. Discrete Comput. T HURSTON . Naiman and Wynn proved that the volume of a finite union of congruent balls can be expressed by an inclusion-exclusion formula whose terms correspond to the simplices in the Delaunay triangulation of the centers [4]. E DELSBRUNNER . Q. W YNN . For each triple in we have a three-sided spindle with two vertices. Vorlesungen uber Inhalt. Found. Comput.. [2] H. The material in this section is taken from that paper. Bibliographic notes. Sci. many of which can be found in the book by Thurston [5]. [5] W. The union of balls and its dual shape. In 1992. Geom. For two or fewer balls we have no vertices. IX M EASURES ¦ ¡ ¢ £¡ ¢ £¡ ¤   §¢   ¢ ¦ ¡ ¤ . Levy. Three-Dimensional Geometry and Topology. It follows that in the generic case. Algebraic decomposition of nonconvex polyhedra. IEEE Sympos. E DELSBRUNNER . 36th Ann. [3] H. 1997. the number of vertices of is twice the number of triangles minus four times the number of tetrahedra in the dual complex. The inclusion-exclusion formula suggests that this number is the alternating sum of vertex numbers of common intersections of balls.134 spheres forming a pair in . [4] D. [1] H. 1957. 43–76. Inclusion-exclusion formulas for such polyhedra can be found in [2]. Berlin. 415–440. a union of intersections of balls corresponds to a union of intersections of half-spaces. 13 (1995). Inclusion-exclusion Bonferroni identities and inequalities for discrete tube-like problems via Euler characteristics. P. We might even go one step further and consider the number of vertices of . P. Oberfl¨ che und ¨ a Isoperimetrie. 
Just as a union of balls in corresponds to a convex polyhedron in . Edited by S. we apply the (one-dimensional) Pie Volume Formula and thus get an expression whose terms correspond to the simplices in the star of the pair. The proof of the volume formula uses the inverse of the stereographic projection to transform balls in to half-spaces in . Springer. Volume 1. The latter is Hadwiger’s notion of a not necessarily convex polyhedron [3]. and for each quadruple we have a rounded tetrahedron with four vertices. 248–257.
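The inclusion-exclusion idea behind the Pie Volume Formula of this section can be tried out in the two-dimensional setting of Figure IX.8, where areas of disks replace volumes of balls. The following Python sketch is a simplified illustration and not a dual complex implementation: it assumes that no point lies in three disks at once, so that only single disks and intersecting pairs contribute. The function names and the sample disks are hypothetical, and a crude grid count serves as an independent check.

```python
import math
from itertools import combinations

def lens_area(r1, r2, d):
    """Area of the intersection of two disks with radii r1, r2 and center distance d."""
    if d >= r1 + r2:
        return 0.0                                  # disjoint disks
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2           # one disk contains the other
    a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2.0 * d * r1))
    a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2.0 * d * r2))
    kite = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - kite

def union_area_pairwise(disks):
    """Sum of disk areas minus sum of pairwise intersection areas.

    Assumption: no three disks have a common point; each disk is (x, y, r).
    """
    total = sum(math.pi * r * r for _, _, r in disks)
    for (x1, y1, r1), (x2, y2, r2) in combinations(disks, 2):
        total -= lens_area(r1, r2, math.hypot(x2 - x1, y2 - y1))
    return total

def grid_area(disks, h=0.01):
    """Estimate the area of the union by counting grid cell centers inside it."""
    xmin = min(x - r for x, _, r in disks)
    xmax = max(x + r for x, _, r in disks)
    ymin = min(y - r for _, y, r in disks)
    ymax = max(y + r for _, y, r in disks)
    count = 0
    for i in range(int((xmax - xmin) / h) + 1):
        px = xmin + (i + 0.5) * h
        for j in range(int((ymax - ymin) / h) + 1):
            py = ymin + (j + 0.5) * h
            if any((px - x) ** 2 + (py - y) ** 2 <= r * r for x, y, r in disks):
                count += 1
    return count * h * h

# A chain of three unit disks in which only neighbors overlap.
chain = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(union_area_pairwise(chain), grid_area(chain))
```

When three or more disks do share points, terms for triangles of the dual complex have to be added, as in the caption of Figure IX.8; this sketch deliberately leaves that case out.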

The left drawing suggests that the area of the triangle is . This definition can be used in any dimension .  ¢       ¢       IX. To simplify the notation. We use similar conventions for triangles. the 0-sphere is a pair of point with possible subsets the empty set. This condition is equivalent to the three circles decomposing into eight regions in the way shown in Figure IX. we get sums that evaluate to zero if we replace volume by area or length. ¡R R R R ¡ R  ¡   R  ¡ R  $  ¡  s ¡ R ¡ R  R  ¡ R   ¡   R  ¡ #R  $  ¡  s   ¥ 0     R ¨ ¢ © ¡ Angles of revolution. The new collection leads to formulas for voids. and the one-dimensional angle at an edge as a dihedral angle. The only zero-dimensional angles are therefore 0.3 Void Formulas  ¥ ¤ ¨ ¢ £¡ c b ¢ ¡  ¨     ¥ ¡ ¦ ¤   © ¦¦ ¤ ©               ¤ ¨  ¨ ¤ . Equivalently. and we will see shortly that this convention makes perfect sense when we compute volume using angles. but the right drawing in Figure IX. For example. The zero-dimensional angle of a triangle is always . we let denote an independent set of four balls and. Specifically. is the volume fraction of a sufficiently small ball centered at an interior point of that lies inside the tetrahedron. A two-dimensional angle is the area of a piece of the unit 2-sphere and can assume any value between 0 and . for the area of the intersection of the disks with centers and . Let . I NDEPENDENT VOLUME F ORMULA . If we change the meaning from area to perimeter we get . which are bounded components of the space outside the union.10: Both triangles are spanned by the centers of three independent disks.     ¥ Figure IX. Similar to the two-dimensional case. Recall that a collection of three disks in is independent if for ev  § ¦ ¡ § ¤ ¥ ££ F  ¥  ! £ § ¡ ¥ £ ¤ ¥ £ £ F ¢       ¥       ¥      dimensional angle at a vertex as a solid angle. we also define the angles of the improper faces of as and . . Figure IX. at the same time. 
a single point.9 illustrates the definition.9: The solid angle at a vertex. edges.3 Void Formulas 135 . and arc length of a union of balls in .IX. the dihedral angle at an edge. and vertices. .10 indicates that there are cases where the formulas are not as obvious as to the left. and . It is convenient to normalize so that in both cases the full angle is 1 and every angle is a fraction of the full angle. there is a point inside every disk ery subset in the subset and outside every disk not in the subset.   § a© ¢   ¤ ¥   ££ ¥ F   ¥  ¨ 0    ¨    Consider for example a tetrahedron . . and 1. where we write for the area of the disk with center . or both points. the tetrahedron spanned by the four ball centers. In we refer to the two  vertices . The volume of an independent tetrahedron is ¥  ¥    ¥      ¥  Independent triangles and tetrahedra. Both formulas hold whenever the three disks are independent. For convenience. and the zero-dimensional angle of a triangle. For each face .  The proof of the formula is somewhat technical and omitted. and be the angles at the c We generalize the formulas for independent triangles to independent tetrahedra. $ 0    ¡ ¥ ¥ This section derives another collection of inclusionexclusion formulas that express the volume.10. a b a Figure IX. we drop the distinction between abstract and geometric simplices. surface area. and so on. we define the angle as the fraction of directions around along which we enter . . A (one-dimensional) angle is by definition the length of a unit circle arc and can assume any value between 0 and .

Angle weights. We derive a new volume formula for a union of balls by combining the Pie Volume and the Independent Volume Formulas. We first make the Pie Volume Formula more complicated and then simplify by cancelling terms. We need some notation to continue. Let $T(K)$ denote the set of tetrahedra in a simplicial complex $K$. Furthermore, for a subcomplex $K \subseteq D$, we consider the pairs $(\sigma, \tau)$ with $\sigma \in K$ and $\tau \in T(D)$ a tetrahedron that has $\sigma$ as a face. We write $\varphi_{\sigma,\tau}$ for the angle of $\tau$ at $\sigma$. It is convenient to cover the portion of $\mathbb{R}^3$ outside the Delaunay triangulation with tetrahedra. This can be done by adding four points viewed as degenerate balls to the set $B$.

We start with the Pie Volume Formula,

$$\mathrm{vol} \bigcup B = \sum_{\sigma \in K} (-1)^{\dim \sigma} \, \mathrm{vol} \bigcap \sigma ,$$

and decompose each term into the parts defined by the tetrahedra that contain $\sigma$ as a face. For example for a tetrahedron $\sigma$, the only coface in $T(D)$ is $\sigma$ itself, and the contributed term is $- \mathrm{vol} \bigcap \sigma$. For triangles, edges, and vertices, the contribution is split up into as many pieces as there are angles around $\sigma$. With this notation we can rewrite the Pie Volume Formula as

$$\mathrm{vol} \bigcup B = \sum_{\sigma \in K} \; \sum_{\tau \in T(D),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma .$$

Whenever $\tau$ is a tetrahedron in $K$, we use the Independent Volume Formula to make a substitution. This results in the new volume formula.

ANGLE-WEIGHTED PIE VOLUME FORMULA. The volume of the union of a finite set $B$ of balls in $\mathbb{R}^3$ is

$$\mathrm{vol} \bigcup B = \sum_{\tau \in T(K)} \mathrm{vol}\,\tau \; + \sum_{\sigma \in K} \; \sum_{\tau \in T(D) - T(K),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma ,$$

where $K$ is the dual complex and $D$ is the Delaunay triangulation of $B$.

The new formula suggests we compute volume in two steps. First we compute the volume of the underlying space of $K$ itself, and second we add the volume of the fringe. Observe that not all pieces considered in the second sum are subsets of the fringe; some might reach into the interior of the underlying space of $K$. Since the sums in the independent formulas evaluate to zero for area and length, we get the same formulas for area and length, except that the first sum vanishes:

$$\mathrm{area} \bigcup B = \sum_{\sigma \in K} \; \sum_{\tau \in T(D) - T(K),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{area} \bigcap \sigma ,$$

$$\mathrm{length} \bigcup B = \sum_{\sigma \in K} \; \sum_{\tau \in T(D) - T(K),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{length} \bigcap \sigma .$$

Voids. As defined earlier, a void of a union of balls is a bounded component of the complement space. Figure IX.11 illustrates the fact that every void $V$ of $\bigcup B$ is contained in a void of the underlying space of $K$. The corresponding void in the underlying space of $K$ is triangulated by a subset $U$ of the Delaunay triangulation, missing the simplices that bound the void in $K$. Strictly speaking, $U$ is not a triangulation because it is not even a complex. Nevertheless, we call $U$ the dual set of the void.

Figure IX.11: Both voids in the union of disks are contained in a corresponding void of the dual complex.

From a point inside the void, the union of balls looks a lot like it does from a point outside all balls and voids. It is therefore not surprising that we can rewrite the Angle-weighted Pie Volume Formula to get an expression for the volume of a void of $\bigcup B$. The most straightforward translation of the angle-weighted formula suggests we compute the volume of $V$ by first computing the volume of the corresponding void in the underlying space of $K$ and then subtracting the volume of the fringe that reaches into that void. We get

VOID VOLUME FORMULA. The volume of a void $V$ with dual set $U$ is

$$\mathrm{vol}\, V = \sum_{\tau \in T(U)} \mathrm{vol}\,\tau \; - \sum_{\sigma \in K} \; \sum_{\tau \in T(U),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma .$$
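The two-step reading of the angle-weighted formula (first the dual complex, then the fringe) is easiest to see in a one-dimensional analog, which is an illustration added here and not part of the text. Balls become intervals given as (center, radius) pairs, the dual complex contains an edge between two consecutive centers whenever their intervals overlap, and the only angle of a vertex within an incident edge is 1/2. The sketch assumes that no interval is contained in another; the function names are hypothetical.

```python
def pie_length(intervals):
    """1D Pie formula: sum over vertices minus sum over edges of the dual complex."""
    balls = sorted(intervals)
    total = sum(2.0 * r for _, r in balls)
    for (c0, r0), (c1, r1) in zip(balls, balls[1:]):
        overlap = (c0 + r0) - (c1 - r1)
        if overlap > 0.0:  # the edge between the two centers belongs to the dual complex
            total -= overlap
    return total

def angle_weighted_length(intervals):
    """1D angle-weighted formula: length of the dual complex plus the fringe."""
    balls = sorted(intervals)
    # The two unbounded edges are never in the dual complex; each contributes
    # angle 1/2 times the interval length 2r, which is r.
    total = balls[0][1] + balls[-1][1]
    for (c0, r0), (c1, r1) in zip(balls, balls[1:]):
        if (c0 + r0) - (c1 - r1) > 0.0:
            total += c1 - c0   # edge in the dual complex: add its length
        else:
            total += r0 + r1   # edge outside the dual complex: two fringe pieces
    return total

if __name__ == "__main__":
    example = [(0.0, 1.0), (1.5, 1.0), (5.0, 0.5)]
    print(pie_length(example), angle_weighted_length(example))
```

Both functions return the length of the union, here 4.5; the second one never evaluates an intersection and only adds the edge lengths of the dual complex and the radii that stick out of it.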

Similarly, we get formulas for the area and the total arc length of $V$ by substituting $T(U)$ for $T(D) - T(K)$ in the corresponding formulas of $\bigcup B$, where $U$ is the dual set of the void:

$$\mathrm{area}\, V = \sum_{\sigma \in K} \; \sum_{\tau \in T(U),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{area} \bigcap \sigma ,$$

$$\mathrm{length}\, V = \sum_{\sigma \in K} \; \sum_{\tau \in T(U),\, \sigma \subseteq \tau} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{length} \bigcap \sigma .$$

Proof of void volume formula. The main idea in the proof is to cover the void with small balls and measure the difference between the new and the old union. Let $C$ be the set of balls we add, and consider $B \cup C$. We require that

(i) $C$ be finite,
(ii) the dual complex $K$ of $B$ be a subcomplex of the dual complex of $B \cup C$,
(iii) $\bigcup (B \cup C) = \bigcup B \cup V$.

Assuming these three conditions, we have $\mathrm{vol}\, V = \mathrm{vol} \bigcup (B \cup C) - \mathrm{vol} \bigcup B$. The Angle-weighted Pie Volume Formula applies to both unions, and the difference gives the Void Volume Formula.

Finally, we construct $C$ so that (i), (ii), and (iii) are satisfied. Assuming general position, there exists a positive $\varepsilon$ such that $B$ and $B_\varepsilon$ have the same dual complex, where $B_\varepsilon$ is obtained from $B$ by reducing every ball with radius $r$ to radius $\sqrt{r^2 - \varepsilon^2}$. $B$ and $B_\varepsilon$ have the same Voronoi diagrams and Delaunay triangulations by the way we changed the radii, and they have the same dual complexes by the choice of $\varepsilon$. Let $C$ be a finite set of balls of radii $\varepsilon$ with centers in the void that covers $V$. By choice of $\varepsilon$, the balls in $C$ are contained in $\bigcup B \cup V$ and thus cannot contribute to the union of balls in any other way than covering $V$. In other words, $V \subseteq \bigcup C \subseteq \bigcup B \cup V$, where the second containment follows because $B$ is obtained from $B_\varepsilon$ by growing every ball of radius $\sqrt{r^2 - \varepsilon^2}$ to radius $r$. Hence $\bigcup (B \cup C) = \bigcup B \cup V$, as required by (iii). Let $Z$ be the set of centers of the balls in $C$ and note that the dual complex of $B_\varepsilon \cup Z$ is just $K$ together with finitely many isolated vertices. Growing all balls simultaneously gives a sequence of dual complexes. The first complex in the sequence is that of $B_\varepsilon \cup Z$ and the last is that of $B \cup C$, hence $K$ is a subcomplex of the latter, as required by (ii). The part of the last complex that is not in $K$ has as its underlying space the void in the underlying space of $K$ that corresponds to the void $V$ in $\bigcup B$.

Bibliographic notes. The material of this section is taken from [1], which also contains a proof of the $d$-dimensional version of the Independent Volume Formula. The implementation of the formulas is part of the Alpha Shapes software and their use in structural biology has been described in [2]. The Angle-weighted Pie Volume Formula is related to Gram's angle sum formula, which states that the alternating sum of angles in a bounded convex polyhedron always vanishes. In $\mathbb{R}^2$, the angle is $\frac{1}{2}$ for the edges and 1 for the $n$-gon, so this implies that the sum of angles at the vertices of a convex $n$-gon is $\frac{n}{2}$ minus 1. Expressed in radians, this is $(n-2)\pi$. In $\mathbb{R}^3$, the sum of angles at the vertices is no longer determined by the combinatorial structure of the polyhedron, but the sum of solid angles minus the sum of dihedral angles is. A treatment of Gram's angle sum formulas can be found in Grünbaum [3, chapter 14].

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. Edelsbrunner, M. Facello, P. Fu and J. Liang. Measuring proteins and voids in proteins. In "Proc. 28th Ann. Hawaii Internat. Conf. System Sciences, 1995", vol. V: Biotechnology Computing, 256–264.
[3] B. Grünbaum. Convex Polytopes. Interscience, Wiley, London, England, 1967.
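Gram's angle sum relation can be verified for a tetrahedron with a few lines of code. The sketch below is an illustration, not part of the Alpha Shapes software: all function names are made up here, solid angles are computed with the Van Oosterom and Strackee formula, and every angle is normalized to a fraction of the full angle as in this section. With angle 1/2 for each of the four triangles and 1 for the tetrahedron itself, the alternating sum is the sum of solid angles, minus the sum of dihedral angles, plus 2, minus 1.

```python
import math
from itertools import combinations

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def solid_angle(p, a, b, c):
    """Normalized solid angle at p spanned by a, b, c (fraction of the 2-sphere)."""
    u, v, w = sub(a, p), sub(b, p), sub(c, p)
    lu, lv, lw = norm(u), norm(v), norm(w)
    numerator = abs(dot(u, cross(v, w)))
    denominator = lu * lv * lw + dot(u, v) * lw + dot(u, w) * lv + dot(v, w) * lu
    return 2.0 * math.atan2(numerator, denominator) / (4.0 * math.pi)

def dihedral_angle(p, q, r, s):
    """Normalized dihedral angle at edge pq between the triangles pqr and pqs."""
    e = sub(q, p)
    def perpendicular(x):
        d = sub(x, p)
        t = dot(d, e) / dot(e, e)
        return tuple(di - t * ei for di, ei in zip(d, e))
    u, v = perpendicular(r), perpendicular(s)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c))) / (2.0 * math.pi)

def gram_sum(tet):
    """Alternating sum of normalized angles over all faces of a tetrahedron."""
    idx = range(4)
    solid = sum(solid_angle(tet[i], *[tet[j] for j in idx if j != i]) for i in idx)
    dihedral = 0.0
    for i, j in combinations(idx, 2):
        k, l = [m for m in idx if m not in (i, j)]
        dihedral += dihedral_angle(tet[i], tet[j], tet[k], tet[l])
    return solid - dihedral + 4 * 0.5 - 1.0

if __name__ == "__main__":
    print(gram_sum([(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]))
```

The individual solid and dihedral angles change when the tetrahedron is deformed, but the returned sum stays at zero up to rounding.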

IX.4 Measuring Software

[Should we add a short discussion of Patrice's new software that also computes derivatives?]

Volbl stands for the volume of a union of balls. It is part of the Alpha Shapes software and can be used to compute the volume, surface area, and total arc length of a ball union and its voids. Before exploring any of the other options in volbl, we take a brief look at the algorithms used and the data structures these algorithms require.

Algorithms and data structures. To measure a union of balls using the Pie Volume, Area, and Length Formulas, we need a list of the simplices in the dual complex $K$ of $B$. We simplify the actual situation insignificantly by assuming that the simplices in $K$ are stored in an array $\sigma_1, \sigma_2, \ldots, \sigma_\ell$. This list is a prefix of the masterlist mentioned in Section II.4. The following pseudo-code is then a direct implementation of the Pie Volume Formula of Section IX.2.

    vol = 0;
    for i = 1 to $\ell$ do
        vol = vol + $(-1)^{\dim \sigma_i} \, \mathrm{vol} \bigcap \sigma_i$
    endfor.

The implementation of the Area and Length Formulas is similarly straightforward. The Angle-weighted Pie and Void Volume Formulas use the masterlist and in addition require a representation of the voids. We use a partition of the Delaunay tetrahedra into the dual complex, the various voids, and the unbounded component of the complement.

Running volbl. It is not necessary but a good idea to execute volbl in parallel with visualizing the alpha shapes of the same data, which we do by typing

    > alvis name &
    > volbl name

on the command line. The software uses the files generated by delcx and by mkalf that represent the Delaunay triangulation and its filtration, as explained in Sections II.3 and II.4. The software will start with a dialogue narrowing down the options of what to compute. After entering the index of the $\alpha$-complex, which we get from alvis, volbl outputs the measurements of all voids.

As an example consider the measurements of voids in cdk2, which is an enzyme involved in the control of the growth process of a body cell. The voids shown in Figure IX.12 occur for the solvent accessible diagram defined for a fixed solvent radius $\rho$, measured in Å. In other words, we look at the wireframe of the dual complex defined by the balls with radii $r_i + \rho$, where $r_i$ is the van der Waals radius of the $i$-th ball. Since the same complex arises for a whole interval of $\alpha$-values, we pick the middle of the corresponding interval. Measuring the voids takes on the order of seconds on the author's SGI Indigo II. The output for the largest void in this example is

    measurements of void, index 845:
    number of tetrahedra: 26
    tetra volume:         2.504511e+02
    void volume:          1.009809e+01
    surface area:         3.880316e+01
    arc length:           5.776804e+01
    number of corners:    34

The index of the void is a unique but fairly arbitrary integer assigned during the process of collecting the tetrahedra in the dual set. The measurements are in Å$^3$, Å$^2$, and Å, as appropriate. While the largest void is more than ten times as large as any of the others (in volume), it is still only of the order of one van der Waals ball. The corresponding void in the dual complex is more than twenty times as large, which confirms our intuition about the size difference between the two representations. While measuring the voids, the software calculates for each ball its contribution to the void area and outputs the result in a new file, name.contrib.

Figure IX.12: There are eight voids in the $\alpha$-complex of cdk2. Some of the voids have (open) dual sets that seem connected in the image but are not because of missing triangles.
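The Pie Volume loop can be sketched in runnable form. This is not the Volbl implementation: the function names and the input layout (balls as (center, radius) pairs, simplices as tuples of ball indices) are assumptions made for this illustration, and the sketch handles only vertices and edges, because the volume of the common intersection of three or four balls needs longer formulas that are not reproduced here. The lens volume is the standard closed form for two intersecting balls.

```python
import math

def ball_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3

def lens_volume(d, r1, r2):
    """Volume of the intersection of two balls with radii r1, r2 at center distance d."""
    if d >= r1 + r2:
        return 0.0                       # disjoint balls
    if d <= abs(r1 - r2):
        return ball_volume(min(r1, r2))  # one ball contains the other
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2) / (12.0 * d))

def pie_volume(balls, simplices):
    """Alternating sum over the simplices of the dual complex (vertices and edges only)."""
    vol = 0.0
    for sigma in simplices:
        dim = len(sigma) - 1
        if dim == 0:
            term = ball_volume(balls[sigma[0]][1])
        elif dim == 1:
            (c0, r0), (c1, r1) = balls[sigma[0]], balls[sigma[1]]
            term = lens_volume(math.dist(c0, c1), r0, r1)
        else:
            raise NotImplementedError("triangles and tetrahedra are omitted in this sketch")
        vol += (-1) ** dim * term
    return vol

if __name__ == "__main__":
    balls = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0), ((2.5, 0.0, 0.0), 1.0)]
    dual_complex = [(0,), (1,), (2,), (0, 1), (1, 2)]
    print(pie_volume(balls, dual_complex))
```

In the example the first and the last ball are disjoint, so the dual complex has no triangle and the restricted loop already gives the exact volume of the union.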

In the checking option.915391e+04 Lof = 1. area. which is apparently rather small. Options. downto do . Figure IX.3. which we refer to as corners. The following pseudo-code is a direct implementation of the Void Volume Formula of Section IX. The software computes the volume.1 lists the main measurements made. It does this for the spacefilling diagram .100959e+04 Lsf = 1. We have voids.1: Cumulative measurements made by the Volbl software. We compute the lists by maintaining a union-find data structure while scanning the masterlist from back to front.IX.100959e+04 Aof = 3.1 and prints a summary of the results.13: The dual complex of the van der Waals diagram of cdk2. the sum of volumes of the space-filling diagram and its voids should be equal to the volume of the envelope. The surface area. The complex has vertices and no voids. The software also checks a few linear relations that should vanish provided the computations are correct. As an example consider the van der Waals diagram of cdk2.962563e+04 ¡   ¡!)    © forall faces if then do ¨ ¢   ¥   ¤ ¥  £   ¨ ¤  ¢    £ ¤£   ¡§   £     ££ F   ¤£ Y ¡§ ¢ ¡¢ ¨  ¤ ¨ ©    ¦£   ¢ ¥§    ¤£   ¢ ¡§ £   £   ¡ ¨   ¨   ¥      ¥ ¨       ¨    ¥   ¢       ¤ space-filling diagram voids outside fringe envelope dual complex dual sets of voids ¢¢ ¤ ¢¢ ¨ ¨  £ £    ¢  ¢  £  ¦£   ¢ ¥§ £       £ Table IX. Asf = 3. The specific relations checked by the software are Vsf + Vtv . length. total arc length. and the outside fringe.0 0 The implementation of the Void Area and Length Formulas is similarly straightforward.Vtiv Asf Lsf Csf Vsh Atv Ltv Ctv Vof Aof Lof Cof = = = = 0. each represented by a linear list of tetrahedra. it reports that there are no voids and it prints the sizes of the space-filling diagram and the outside fringe as Vsf = 3. U NION F IND F IND endfor. In the considered example. . The difference is the volume of the dual complex. case .034036e+04 Vof = 2. For example. the voids in the dual complex. its voids.0 0. 
and also the number of vertices in the boundary.0 0. A DD . Table IX. and the envelope (defined as the space-filling diagram union all voids). We fix this problem by adding a dummy tetrawhenever is hedron to the system and setting a triangle on the boundary of the Delaunay triangulation. and number of corners are of course the same for both. .915391e+04 Csf = 6388 Cof = 6388 Note that the volume of the space-filling diagram is insignificantly higher than that of the outside fringe. The only trouble with this algorithm is that tetrahedra in the unbounded component may be scattered in more than one list. which in turn should be equal to the sum of volumes of the dual complex. let be the first and the second Delaunay tetrahedron that has as a face.13. forall tetrahedra  139 vol Vsf Vtv Vof Ve Vsh Vtiv area Asf Atv Aof Ae lgth Lsf Ltv Lof Le crns Csf Ctv Cof Ce do ¢ . the software computes all terms in Table IX. endif endfor endfor. for case .4 Measuring Software the set of tetrahedra in the unbounded component of the complement of . the outside fringe (defined as the portion of the unbounded component of the complement of that is covered by the balls). whose dual complex is shown in Figure IX.
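The back-to-front scan with a union-find structure can be sketched as follows. The representation is an assumption made for this illustration and is not the Volbl data structure: simplices are sorted tuples of vertex indices, the masterlist is a Python list in filtration order, `ell` is the length of the prefix that forms the dual complex, and the dummy tetrahedron is the string `OUTSIDE`. The class and function names are hypothetical.

```python
from itertools import combinations

OUTSIDE = "outside"  # dummy tetrahedron standing for the unbounded component

class UnionFind:
    def __init__(self):
        self.parent = {}

    def add(self, x):
        self.parent.setdefault(x, x)

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry

def collect_voids(masterlist, ell):
    """Group the tetrahedra outside the dual complex into dual sets of voids."""
    cofaces = {}  # triangle -> Delaunay tetrahedra that have it as a face
    for s in masterlist:
        if len(s) == 4:
            for tri in combinations(s, 3):
                cofaces.setdefault(tri, []).append(s)
    uf = UnionFind()
    uf.add(OUTSIDE)
    for sigma in reversed(masterlist[ell:]):  # scan from back to front
        if len(sigma) == 4:
            uf.add(sigma)
        elif len(sigma) == 3:
            tets = cofaces[sigma]
            first = tets[0]
            # A triangle on the boundary of the triangulation has one coface only.
            second = tets[1] if len(tets) > 1 else OUTSIDE
            uf.union(first, second)
    groups = {}
    for x in uf.parent:
        groups.setdefault(uf.find(x), []).append(x)
    outside_root = uf.find(OUTSIDE)
    return [sorted(g) for root, g in groups.items() if root != outside_root]

if __name__ == "__main__":
    # Four outer vertices 0..3 and one vertex 4 in the middle; the dual complex
    # is the hollow boundary of the outer tetrahedron plus the isolated vertex 4.
    outer = list(range(4))
    masterlist = ([(i,) for i in range(5)]
                  + list(combinations(outer, 2)) + list(combinations(outer, 3))
                  + [(i, 4) for i in outer]
                  + [(i, j, 4) for i, j in combinations(outer, 2)]
                  + [t + (4,) for t in combinations(outer, 3)])
    print(collect_voids(masterlist, 15))
```

Because faces precede their cofaces in the masterlist, both tetrahedra around a triangle have already been added when the scan reaches the triangle. The group that contains the dummy is the unbounded component and is dropped from the result.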

Another form of output is the description of the total measurement as a sum of contributions over individual atoms. This makes sense for volume and area but is done only for the latter. Depending on the type of area measurement, the software outputs a file name.contrib that contains the contribution of each individual atom, and it does this for the space-filling diagram, the voids, and the outside fringe. In the checking option, the software compares for each atom the area contribution to the space-filling diagram with the sum of contributions to the voids and the outside fringe. It also checks whether the sum of contributions really adds up to the total area. A detailed documentation of the Volbl software is given in [3].

Area formula. All analytic formulas needed to measure the common intersection of up to four balls are straightforward, except possibly the area of the intersection of up to three caps. A formula for the area follows from the Gauss-Bonnet theorem in differential geometry, but we prefer to derive it with elementary means. The cap $C_j$ on a sphere $S_i$ consists of the portion of $S_i$ inside the sphere $S_j$. Equivalently, the cap contains all points of $S_i$ whose power distance from $S_i$ is no less than that to $S_j$. We define the width $w_j$ of $C_j$ equal to the distance between the two parallel planes that cut $C_j$ from $S_i$, one through the bounding circle and the other tangent to the sphere. Let $r_i$ be the radius of $S_i$ and $\rho_j$ the radius of the circle bounding $C_j$, as illustrated in Figure IX.14. The area of the cap is then $a_j = w_j / (2 r_i)$ times the area of the sphere, which is $4 \pi r_i^2$. From here on we measure areas as fractions of the area of $S_i$.

Figure IX.14: To the left, the shaded cap has radius $\rho_j$ and width $w_j$. To the right, the shaded bigon has angles $\varphi$ and arc lengths $\lambda_j$ and $\lambda_k$.

Consider now the intersection of two caps, $C_j$ and $C_k$. Since all simplices in $K$ are independent, we may assume that the intersection is a bigon, as shown in Figure IX.14. We let $\varphi$ be the angle at the two vertices and $\lambda_j$ and $\lambda_k$ the lengths of the two arcs, all measured as fractions of a full circle. We approximate the bigon by a spherical $n$-gon, whose edges are by definition great-circle arcs. The area of that $n$-gon is

$$\frac{1}{2} \sum \alpha - \frac{n-2}{4},$$

where the sum adds all angles in the $n$-gon. This is because a triangulation produces $n-2$ spherical triangles each contributing one half times the sum of the three angles minus one quarter to the area.

To construct the $n$-gon, we approximate each of the two circles by a regular spherical polygon, with $n_j$ vertices for $C_j$ and $n_k$ vertices for $C_k$. The points are placed slightly outside the circles so that the areas of the polygons are exactly the areas of the caps. Let $\alpha_j$ and $\alpha_k$ be the angles in the two polygons. To compute $\alpha_j$ we recall that the area of the cap is $a_j$. Hence $\frac{1}{2} n_j \alpha_j - \frac{n_j - 2}{4} = a_j$, which gives

$$\alpha_j = \frac{1}{2} - \frac{1 - 2 a_j}{n_j}, \quad \text{and symmetrically} \quad \alpha_k = \frac{1}{2} - \frac{1 - 2 a_k}{n_k}.$$

Assuming that $\lambda_j$ and $\lambda_k$ are rational, we can find infinitely many integers $n_j$ and $n_k$ so that the two polygons share two vertices near the vertices of the bigon. The angles at the two shared vertices approach $\varphi$ as $n$ goes to infinity. By construction, the $n$-gon has $\lambda_j n_j - 1$ vertices with angle $\alpha_j$ and $\lambda_k n_k - 1$ vertices with angle $\alpha_k$. We plug the values for $\alpha_j$ and $\alpha_k$ into the formula for the area of the $n$-gon and get

$$\mathrm{area}\,(C_j \cap C_k) = \varphi - \frac{\lambda_j}{2} (1 - 2 a_j) - \frac{\lambda_k}{2} (1 - 2 a_k),$$

after eliminating the terms that vanish when $n$ goes to infinity. Similarly, for the intersection of three caps with angles $\varphi_1$, $\varphi_2$, $\varphi_3$ and arc lengths $\lambda_j$, $\lambda_k$, $\lambda_l$, the area of the approximating $n$-gon is obtained in the same way, namely

$$\mathrm{area}\,(C_j \cap C_k \cap C_l) = \frac{\varphi_1 + \varphi_2 + \varphi_3}{2} - \frac{1}{4} - \frac{\lambda_j}{2} (1 - 2 a_j) - \frac{\lambda_k}{2} (1 - 2 a_k) - \frac{\lambda_l}{2} (1 - 2 a_l).$$

Note that the formulas give the precise area of the intersection of two or three caps since the approximating spherical $n$-gon is only a tool in the proof and not used in the formula.

Bibliographic notes. The structural biology literature distinguishes between numerical and analytical approaches to measuring molecules. For the latter approach, we would decompose the molecule into simple pieces and give a formula for the size of each piece. An example is Connolly's work [1] on computing the area of a molecular surface. The idea of using inclusion-exclusion for size computations goes back to Kratky [4], who shows that there is a short inclusion-exclusion formula for the area of the intersection of a finite set of disks in the plane. His proof is existential and superseded by explicit formulas that can be derived by the same methods as described in Sections IX.1 and IX.2. Scheraga and coauthors [5] implement an inclusion-exclusion formula for a union of balls based on Kratky's work, but the lack of an explicit expression occasionally leads to miscalculations [2].
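The two elementary facts used in the derivation, the area of a cap and the area of a spherical polygon, are easy to check in normalized units. The following sketch uses function names chosen here for illustration; areas are fractions of the sphere and angles are fractions of a full circle. The numerical routine integrates the band area with the midpoint rule only to cross-check the closed form for the cap.

```python
import math

def cap_fraction(r, w):
    """Area of a cap of width w on a sphere of radius r, as a fraction of the sphere."""
    return w / (2.0 * r)

def polygon_fraction(angles):
    """Area of a spherical n-gon with the given normalized interior angles."""
    n = len(angles)
    return 0.5 * sum(angles) - (n - 2) / 4.0

def cap_fraction_numeric(r, w, steps=100000):
    """Midpoint-rule integration of the cap area, for comparison with cap_fraction."""
    theta0 = math.acos(1.0 - w / r)  # angular radius of the cap
    h = theta0 / steps
    # area / (4 pi r^2) = (1/2) * integral of sin(theta) from 0 to theta0
    return 0.5 * h * sum(math.sin((i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    print(cap_fraction(1.0, 1.0))            # hemisphere
    print(polygon_fraction([0.25] * 3))      # octant: three right angles
    print(polygon_fraction([0.1, 0.1]))      # lune with angle 0.1 at both vertices
    print(cap_fraction_numeric(2.0, 0.5), cap_fraction(2.0, 0.5))
```

A hemisphere covers half of the sphere, the triangle with three right angles covers one eighth, and a lune with normalized angle 0.1 covers one tenth, which is what the two closed forms return.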

[1] M. L. Connolly. Analytical molecular surface calculation. J. Appl. Cryst. 16 (1983), 548–558.
[2] L. R. Dodd and D. N. Theodorou. Analytic treatment of the volume and surface area of molecules formed by an arbitrary collection of unequal spheres intersected by planes. Molecular Physics 72 (1991), 1313–1345.
[3] H. Edelsbrunner and P. Fu. Measuring space filling diagrams and voids. Rept. UIUC-BI-MB-94-01, Beckman Inst., Univ. Illinois, Urbana, Illinois, 1994.
[4] K. W. Kratky. The area of intersection of n equal circular disks. J. Phys. A: Math. Gen. 11 (1978), 1017–1024.
[5] G. Perrot, B. Cheng, K. D. Gibson, J. Vila, K. A. Palmer, A. Nayeem, B. Maigret and H. A. Scheraga. MSEED: a program for rapid determination of accessible surface areas and their derivatives. J. Comput. Chem. 13 (1992), 1–11.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $T$ be a triangulation of a set of $n$ points in the plane. Let $L$ be a line that avoids all points. Prove that $L$ intersects at most $2n - 4$ edges of $T$ and that this upper bound is tight for every $n \geq 3$.

Chapter X

Derivatives

The derivative of surface area under deformation is an important term in the simulation of molecular and atomic motion. In the case of van der Waals or solvent accessible diagram, it is related to the length of the circular arcs in the boundary.

X.1 Implicit Solvent Model
X.2 Weighted Area Derivative
X.3 Weighted Volume Derivative
X.4 Derivative Software
Exercises

X.1 Implicit Solvent Model

[Give a general introduction and work out the relationship with area and volume derivatives.]

X.2 Weighted Area Derivative

[Talk about the unweighted and the weighted area derivatives.] [Explain the results and discuss the continuity issue of the functions.]

[1] R. Bryant, H. Edelsbrunner, P. Koehl and M. Levitt. The area derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2002.

X.3 Weighted Volume Derivative

[Talk about the unweighted and the weighted volume derivatives.] [Explain the results and discuss the continuity issue of the functions.]

[1] H. Edelsbrunner and P. Koehl. The weighted volume derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2003.

X.4 Derivative Software

[Discuss Patrice's ProShape software.]

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $T$ be a triangulation of a set of $n$ points in the plane. Let $L$ be a line that avoids all points. Prove that $L$ intersects at most $2n - 4$ edges of $T$ and that this upper bound is tight for every $n \geq 3$.

21. 21 alpha shape. 1 chain. 57 body (inside a skin). 28 affine hull. 40 gradient. 62   . 29 codon. 103 . 96 coordinate system. 19 Connolly surface. non-degenerate. 60 Corey-Pauling-Koltun model. 49 homology group. 28 alpha complex. weighted. 32 . 28 convex polyhedron. 116 canonical basis. 36 Euler characteristic. 11 linear algebra. mean. 16 continuous function. 23 group. 44 homotopy equivalence. 32 . 16 join. 40 edge flip. 9 element. 44 homotopy type. 48 Helly’s theorem. 96 integral line. Gaussian. 32 . 63 graphical user interface. 20. 9 attachment. 28 convex hull. 44 image (of a function). 45 kernel. 9 atomic number. 48 length scale. normal. 51. 48 homotopic map. 20. 32 cycle group. 40 electron. 29 boundary group. 60 Gouraud shading. 103 area. dihedral. 69 edge contraction. 32 gene. solid. 45 homotopy. 45 convex combination. 16 coset. 23 amino acid. 24 face (of a polyhedron). 61 indicator function. 5 angle. 49 boundary homomorphism. 61 homeomorphism. 44 contractible. 96 e exact arithmetic. 24 fundamental theorem of linear algebra. 51. 32 Gaussian curvature. 116 Hessian. 32 gluing map. 48 Johnson-Mehl model. 2 geodesic. 21 Alpha Shape software. 7 affine combination. 61 . 9 -sampling. 49 . 60 central dogma. 63 interval tree. restricted. 103 index (of a critical point). 32 . 2 dual complex. 48 facet.S UBJECT I NDEX 149 Subject Index active site. 35 . 60 backbone. 57 cell (in a complex). 9 atomic weight. 103 Dirichlet tessellation. 49 deformation retraction. 48 chain complex. 114 . 100 Lennard-Jones function. 101 dual set. 35 coaxal system. 36 length. 23 . 49 chromosome. 24 isomorphism. 65 basis (of a group). 61 critical point theory. 62 dihedral angle. principal. persistent. 59 curvature (of a curve). 57 homomorphism. 96 independent collection. 5 coherent triangulation. 45 Delaunay triangulation. 96 face (of a simplex). 19 DNA (deoxyribonucleic acid). 100 atom. 3 closed ball property. persistent. 3 genome. 51 Betti number. 49 Brunn-Minkowski theorem. 
96 Euler-Poincar´ theorem. 44 homology class. 96 filtration. 48 inclusion-exclusion. 18 diffeomorphism. 96 Euler relation. 48 critical point. 5 barycentric coordinates. 60 differential topology. 51 Gauss map. 20 independent simplex.

regular. 44 topological subspace. 32 spherical triangle. 104 Volbl software. 4 transversal. 60 topological equivalence. 44 topology. 5 Protein Data Bank. 3 signature. 69. 35 ribosome. 60 smooth map. 23 proton. 61 Morse theory. 48 simulated perturbation. 17 principal curvature. 57 persistent homology group. 6 RNA (ribonucleic acid). 48 Ramachandran plot. 10 van der Waals radius. 15 vector field. 57 piecewise linear. 63 van der Waals potential. 32 nucleotide. convex. 69 pdb-file. 116 mixed cell. 28 persistent Betti number. 65 pocket. 96 tangent space. 100 Voronoi diagram. 3 residue. 35 metamorphosis. 9 Morfi software. 18 union-find. 44 topological space. 72 orthogonal spheres. weighted Delaunay. 30 molecular mechanics. 19 . 63 velocity vector. 15 molecule. 7 speed (of a curve). 24 regular triangulation. 64 triangulation. 14 specificity. 23 van der Waals surface. 44 open set. 24 skin. 60 map. 32. 10 molecular skin. 32 vertex insertion. 32 normal form. 25. 30 mixed complex.150 S UBJECT I NDEX linear independence. 44 open set (of simplices). 6 rank (of a group). 107 unstable manifold. 65 manifold. 15 . 40 smooth manifold. 44 matrix (of a homomorphism). 114 normal vector. 29 Skin Meshing software. 48 . 96 potential energy. 32 mesh. 40 void. 51 regular point. 32 principal simplex. 35. 55. 60 solid angle. 23 normal curvature. 48 simplicial complex. 68 polyhedron. 102 . 60 tangent vector. 56. 71 simplex. 59 Morse-Smale function. restricted. 63 star. 27 molecular surface. 35 . 30. 23 pencil (of circles). 19 . 18 orthosphere. 106 volume. 55 normal form algorithm. 2 open ball. weighted. 24 principle of inclusion-exclusion. 55 mean curvature. 41 Minkowski sum. 23 . 24 singular simplex. additively weighted. 44 supporting hyperplane. 44 topological type. 64 mouth (of a pocket). 61 regular simplex. 51 lower star. 39 morphing. 60 partial order. 22 parametrization. 69 neutron. 5 restricted Delaunay triangulation. 100 stable manifold. 17 power distance. 65 stereographic projection. 
17 x-ray crystallography. 19 replication (of DNA). 9 quotient group. coherent. 9 NMR (nuclear magnetic resonance). 15 space-filling diagram. 100 subspace topology. 84 Morse complex. 18. 11 power diagram. 35 restricted Voronoi diagram. 103 solvent accessible surface. 96 protein. 64 Morse function. 44 transcription (of DNA to RNA).

C.. 38 Dirichlet. 109 Maillot.-W. H.. H. M.. 83 Guillemin. 114. S. P. 16. W. 11 M¨ ucke.. J. 19 Delfinado. 34 Bruggesser.. 31 Darby. 93 Mehl. B. N. Z. P.. T. 87. 109 O’Neill. 34.... 105. 114 Levitt... 58. 84. J. 70. 38 Besl. 70. 117 Hughes. B. 105. B.. 11 Miller. D. 84.. M. 16 Lee. 84 Chew. W... W. 62 Munkres. 77 Helly.... 84 Leach. 109 Edelsbrunner. W. W.. 26 Bray. H. 11.. R. I..-L. 70 Lam.. 4 Milnor. R. N. N. M.. L. J. 38 McCleary. M.... F.. 26 Bern. P. 8 Cormen.. D.. F. H. R. H. 46 Kirkpatrick... J. 16 Jorgensen. C.. G. D. L. A. C. K.. 99. 58.. D. 34. 74.. 99. E. L. D. L. 114 Leray. F. J. 8 Delaunay. T. 42. 99 Martinetz. R.. V. T. A.. C. G.. A. 70 Cheng. 32. 87.... 22 Amenta.. J. Q. A. M.. 25..AUTHOR I NDEX 151 Author Index Akkiraju. 54.. 22 Klee. 26 Gr¨ unbaum. 93 Giblin. J. 70. M. 46.. 46. 31. 22.. 99 Facello. 115 Frobenius. 11 Aurenhammer. 65.. E. 34. 65 Basch. 76. 117 Kratky. H. 8 Johnson. 77. 4 Gelfand.. R. 92 Mani. 74. 4 Gromov.. R. 99 Capoyleas.. 19 Bondi. A.. P. V. H. P. 11 Kapranov. 46 Letscher. 114 Creighton. N. 22. K. P. T. D. F. 65.. J. 54. W. 58 Naiman. F. 8 Alexandrov. 8 Bruce. K.. 11.. 16 Bader. N.. 93 Bhat.. K. F. 87 Cheng. 62 Hadwiger. J.. J. W.. 19 Gerstein. 109. 105.. 109 Corey. 38. J. R. 42. 31 Fu. L. F.. 58 McKay. 109 Kuntz.. 93 Lewis.. P. 109 Cheng. A. A.. 109 Gauss. M. 54 Dey. J...-G. B. G. T... G. 105 Griffith.. 19. 42. 62 Morse... V. 38 Ashcroft. 11 Clifford.. M. E. 77 Pauling. J. J. 26. (also Delone).. 4 Darboux. D. B. 34.. J. G.. 19 Kelley. P. 16 Alberts. 42. 83 Berman. J. 16 Leiserson. B. 82. J. S. F.. 70. 38. C. J. 50 Gibson. 4 Liang.. P. D. R.. 42 Forman. 31 Connolly. 115 Feiner... 16. T. S.. 26. 34 Palmer.. W.. M.. R.. 115 Eilenberg. 102 Harer. 105. M. 115 London. G. 99 Neyeem. L.. P. 32 Gelbart. 11 Bourne. 84.. 109 Gilliland. A. 50. 16. N. 22.. N. 113. 8 . 4 Mermin... L.. L. 38 Chothia. 26 Foley.. P.. A.. 26 Maigret. R.. 42 Johnson. 102. 19 Dodd. 8 Bronson. H. 77 Banchoff. B. L. 26 Billera.. M. E. C. 16 Mendel.. K. 42 Feng.. S. 
8 Crick. E. 117 Guibas. 74. E. J.. 8 Lewontin. 42. H. 109 Pascucci. I. W. 54 Euler... B. M. C.. 19.. 102 Nef.. J. M.. 117 Casati.. 54. V. L. 79 Bajaj.. 76.

91 Vila. G. 11 Van Dam. 16 Raff.152 AUTHOR I NDEX Pedoe. 77 Schl¨ L.. P. M. 77 Watson.. 19 Zhang. 109 Schey.. R. R. 114 Roberts. H.. G.. V.. 11 Tsai. 74 Wynn. 102 Tirado-Rives.. A... H... A...-M. 54 e.. E. 109 Threlfall.. A. G.. 77.. R.. 62 Thurston. J.. 66 Schikore. 26 Smale. M. W. 109 Poincar´ H.... 113 Scheraga... K.. S. 8 Sch¨ utte. M... 46. L. J. L. A.. 77 Van Oostrum. 8 Ramachandran.. K.. 109 Vleugels... 26 Westbrook. A. D. 114 . A. 11 Van der Waerden. 8 Sturmfels. 26 Will. H. I. R.. 113 Sherwood. H. K. D.. J. 50 Sasisekharan. 46. 117 Schulten. L. D... 16. 16 Woodward. 4 Storjohann. 38 Taylor. R.. J. J. N. A.. 102 Zelevinsky. 83 Zomorodian. 82 Richards. 113 Van Krefeld. C.. P. 42 Van der Waals.. N.. 4 Shindyalov.. 62 Qian. B. A.. 8 Ramos. 76. 38 Seidel.. 26 Rivest. C.. H. 19 Wagon. 65. R. R.. B. A. 22 Seifert. 34. 8 Rotman.. 58 Strang. S.. 117 Wallace. 54. P. J. Y. J. 54 Stern. N. M.. J.. C. Pollack. 66 Steenrod. 99 afli. 74. D... Schneider. 62 Shah.. H. 19 Sullivan. 8 Wang. 77 Varzi. N. 58.. 91 Voronoi... J.... E. 62 Walter. V. R. G. J.. W. 4 Weissig. 62 Stryer. R. 11 Theodorou. 70 Veltkamp. M. L.. 38 Sharir. F. 31 Perrot. N..
