INTRODUCTION TO BIO-GEOMETRY

Herbert Edelsbrunner Departments of Computer Science and Mathematics Duke University

Table of Contents
Prologue
I Bio-molecules
II Geometric Models
III Surface Meshing
IV Connectivity
V Shape Features
VI Density Maps
VII Match and Fit
VIII Deformation
IX Measures
X Derivatives
Subject Index
Author Index

Preface
[Mention the pioneers who early on recognized the importance of geometry in structural molecular biology: Fred Richards, Michael Levitt, Michael Connolly.]

[Mention that my book on the “Geometry and Topology for Mesh Generation” is complementary/a prerequisite to this book. In particular, it covers the construction of Delaunay triangulations in detail, and it describes the simulation of simplicity as a general idea to deal with non-generic situations.]

[This book is really about alpha shapes in a broad sense. It might be useful to describe the history of that research in short.
1981. Vancouver. Conception of idea with Kirkpatrick and Seidel.
1985-89. Graz and Urbana. SoS, Delaunay software, Alpha Shape software with Ernst Mücke, Harald Rosenberger, and Patrick Moran.
1990-93. Urbana and Berlin. Surface triangulations, Betti numbers, inclusion-exclusion, CAVE with Ping Fu, Ernst Mücke, Cecil Delfinado, Nataraj Akkiraju, and Jiang Qian.
1994-95. Hong Kong. Morphing, molecular skin, with Ping Fu, Siu-Wing Cheng, Ka-Po Lam, and Ho-Lun Cheng.
1995-98. Urbana. Flow and pockets, skin surfaces with Ho-Lun Cheng, Tamal Dey, Michael Facello, Jie Liang, Shankar Subramaniam, Claire Woodworth.
1999-2001. Duke. Skin triangulation, hierarchy, Morse complexes with Ho-Lun Cheng, Alper Üngör, Afra Zomorodian, David Letscher, John Harer, Vijay Natarajan.
2002-2003. Duke and Livermore. Docking, Reeb graphs, Jacobian manifolds with Johannes Rudolph, Sergei Bespamyatnikh, Vicky Choi, John Harer, Valerio Pascucci, Vijay Natarajan, Ajith Mascarenhas.
2000-2005. ITR Project. Derivatives, interfaces, software with Robert Bryant, Patrice Koehl, Michael Levitt, Andrew Ban, Johannes Rudolph, Lutz Kettner, Rachel Brady, and Daniel Filip.]

[This book is based on notes developed during teaching the courses on “Sphere Geometry” in the Spring of 2000, and on “Bio-geometric Modeling” in the Spring of 2001 and the Fall of 2002, all at Duke University. These courses were either taken for credit or audited at least occasionally by Luis von Ahn, Tammy Bailey, Yih-En (Andrew) Ban, Robert Bryant, Ho-Lun Cheng, Vicky Choi, Anne Collins, Abhijit Guria, Tingting Jiang, Looren Looger, Ajith Mascarenhas, Gopi Meenakshisundaram, Nabil Mustafa, Vijay Natarajan, Xiuwen Ouyang, Anindya Patthak, Ken Roberts, Apratim Roy, Scott Schmidler, Xiaobai Sun, Yusu Wang, Shumin Wu, Alper Üngör, Peng Yin and Afra Zomorodian.]

Herbert Edelsbrunner Durham, North Carolina, 2002

To do or think about (March 15, 2004).

General. Fix the software for creating the index and glossary.
General. Should the Exercise sections be labeled so the page heading is more uniform?
Chapter III. Replace 23- by 03- and 23-collapses.
Chapter V. Should Section V.2 on Topological Persistence be reorganized by first presenting the algebra and second the algorithm?
Chapter V. Add the interface software description.
Chapter VIII. Write the introduction to Deformation.
Chapter X. Write a new chapter on area and volume derivatives and related topics. Write a section on the Weighted Area Derivative. Write a section on the Weighted Volume Derivative.
Write the section on Simultaneous Critical Points.
Write the section on Construction and Simplification.
Write the section on Shape Space.
Write the section on Rigidity.
Write the section on Molecular Dynamics.
Write the section on Spheres in Motion.
Mention new results on scheduling.
Find out about finding the best bi-chromatic matching.
Exercises. Come up with questions for the chapters that still lack them; add a few more questions where some exist.

Chapter I

Bio-molecules

This chapter discusses the three main classes of organic macromolecules involved in the hereditary and life maintenance mechanisms of living beings: DNA, RNA, and proteins. All mentioned molecules are between large and huge. They are relatively simple locally but exceedingly complicated in their totality. Because of the complexity and the large variety, it should not be surprising that there are exceptions to almost everything meaningful that can be said about them. Perhaps it is more surprising that anything of broad validity can be said at all.

DNA is the stuff that genetic material is made of. RNA is mostly but not entirely an intermediate product copying portions of the DNA (transcription) and turning this information into working proteins (translation). Proteins act like machines that define the cell cycle as an ongoing process. Each cell is like a society whose members have specialized tasks, which they accomplish in a complicated net of interactions. According to the central dogma of biology, proteins are created in two steps from DNA, which carries the genetic information:

DNA --replication--> DNA --transcription--> RNA --translation--> Protein

We talk briefly about the processes indicated by the three arrows and focus on the structure of the players involved. We begin by describing the chemical structure of DNA and RNA in Section I.1. We then explain the translation from RNA to proteins in Section I.2 and talk about the structural organization of proteins in Section I.3. Finally, we present some of the fundamental premises and results of molecular mechanics in Section I.4.

I.1 DNA and RNA
I.2 Proteins and Amino Acids
I.3 Structural Organization
I.4 Molecular Mechanics
Exercises

I.1 DNA and RNA

DNA (or deoxyribonucleic acid) is the material that forms the genome, which is a complete set of the genetic material of a living organism. Compared to standard genomics texts, the treatment of DNA in this section is coarse and lacks many important details. We begin by looking at the small level and work our way up the multi-scale structure of DNA.

Chemical structure of DNA. DNA has three chemical components: phosphate, a deoxyribose sugar, and four nitrogenous bases, namely adenine, guanine, cytosine, and thymine. The first two bases are double-ring and the last two are single-ring structures. The chemical components are arranged in groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar, and one of the four bases. A nucleotide is conveniently referred to by the first letter of its base. Figure I.2 sketches the chemical structure of the nucleotide A and shows the chemical structures of the remaining three bases. We obtain the nucleotides G, C and T by substituting the corresponding base for adenine in Figure I.2.

We use boldface edges to connect atoms that are joined by two covalent bonds. The covalent bonding in the ring structures of the nitrogenous bases is more interesting. All atoms in the ring share electrons as a group and we draw some double bonds just to indicate the total number of extra shared electrons. For example, the hexagonal ring of cytosine has a total of eight covalent bonds, which we may think of as four thirds of a covalent bond between every contiguous pair.

Figure I.1: A short piece of the DNA double-helix, with atoms shown as tightly packed and partially overlapping spheres.

Figure I.2: The chemical structure of the DNA nucleotide with adenine as the nitrogenous base above, and the chemical structure of the other three nitrogenous bases (guanine, cytosine, thymine) below.

Double helix. As discovered by Watson and Crick in 1953, DNA consists of two strands of nucleotides twisted into the shape of a double helix, as depicted in Figure I.1. The backbone of each strand is a repeating phosphate-deoxyribose sugar polymer. The phosphate and the sugar groups in the backbone are connected by phosphodiester bonds. The attachment of these bonds to the sugar groups is illustrated in Figure I.3. The carbons of the sugar group are numbered from 1' to 5'. One part of the phosphodiester bond is between the phosphate and the 5'-carbon, and the other is between the phosphate and the 3'-carbon. We think of the backbone as oriented in the direction of the path that starts at the 5'-carbon, passes through the 4'-carbon, and ends at the 3'-carbon. The bases are attached to the 1'-carbons.

In the double stranded DNA molecule, the two backbones are in opposite, or anti-parallel, orientation. The two strands of DNA are held together by weak hydrogen bonds between complementary bases. Adenine interacts with thymine and guanine with cytosine. The two bases of a pair are said to be complementary. Interactions between base pairs hold the two strands together, forming the structure of a spiraling staircase. This implies that the sequence of bases along one strand determines the sequence of bases along the other: reverse the reading direction and replace each base by its complement.

5' AATCGCGTACGCG 3'
3' TTAGCGCATGCGC 5'

Replication is based on this simple rule of complementarity and makes essential use of the relatively weak bonds between the two strands. A protein machine builds new DNA strands by separating the two old strands and complementing each by a new anti-parallel strand.

Figure I.3: Chemical structure of a very short segment of DNA. The numbers 1' to 5' order the carbon atoms of each sugar group, giving each strand an orientation. The dotted connections between the nitrogenous bases indicate hydrogen bonds.

Chromosomes. Each cell of an organism contains a copy of the entire genome. In the case of a human cell, this amounts to about two meters of DNA partitioned into twenty-three pairs of chromosomes per cell. Summed over all cells of the body, the total length of DNA is more than a hundred times the distance between the earth and the sun. Since humans are small relative to that distance, this implies that the DNA must be thin and efficiently packed. Indeed, each chromosome is a long thread (a double-strand) that is densely folded around protein scaffolds.

How is a long thread of DNA converted into the relatively thick and worm-like structure visible through the electron microscope? On the lowest level, the DNA is wrapped twice around a configuration of eight histones (a special protein). The beads of wrapped histones assume a coiled structure (a solenoid) stabilized by another type of histone that runs along its central axis. It takes one more level of packaging to convert the solenoid into the three-dimensional structure we call a chromosome. This higher level uses a core scaffold made of another enzyme, topoisomerase II. This enzyme has the ability to pass a strand of DNA through another, which is a much needed operation during packing and unpacking the DNA. The best evidence suggests that the solenoid arranges in loops emanating from the scaffold, which itself assumes the form of a spiral.

A gene is a subsequence of the DNA capable of being transcribed to produce a functional RNA molecule. Note that this definition depends on the rather complicated process of transcription, which can fail for a variety of reasons.

Chemical structure of RNA. We begin by looking at the chemical features of RNA. There are three main differences to DNA.
1. RNA is a single-stranded nucleotide chain and can therefore assume a much greater variety of geometric shapes than DNA.
2. RNA has ribose sugar in its nucleotides, which differs from deoxyribose sugar by one additional oxygen atom.
3. RNA nucleotides carry the bases adenine, guanine, and cytosine, but substitute uracil for thymine found in DNA. Uracil forms hydrogen bonds with adenine just as thymine does.
Figure I.4 illustrates the chemical difference between RNA and DNA by showing a ribonucleotide containing uracil.

Figure I.4: Chemical structure of the RNA nucleotide with uracil as the nitrogenous base.
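The complementarity rule behind replication (reverse the reading direction and replace each base by its complement) can be sketched in a few lines of Python. This is only an illustration: the names COMPLEMENT and reverse_complement are our own, and the input is the short strand from the example pair of anti-parallel strands above.

```python
# Watson-Crick pairing for DNA bases: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5' to 3' direction.

    `strand` is assumed to be given in 5' to 3' order and to contain
    only the letters A, C, G, T.
    """
    # Reverse the reading direction, then replace each base by its complement.
    return "".join(COMPLEMENT[base] for base in reversed(strand))


top = "AATCGCGTACGCG"               # read 5' to 3'
bottom = reverse_complement(top)    # partner strand, also read 5' to 3'
print(bottom)
print(bottom[::-1])                 # partner strand as aligned under `top`, read 3' to 5'
```

Applying the function twice returns the original strand, which is the sense in which each strand determines the other.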

RNA is classified into different types depending on its function. The vast majority is messenger RNA (or mRNA), which acts as an intermediary structure in the synthesis of proteins. There is also functional RNA produced by a small number of genes, which is not translated into protein. Examples are transfer RNA (or tRNA), which brings amino acids to the mRNA during the translation process, and ribosomal RNA (or rRNA), which helps coordinate the assembly of amino acids to proteins.

Transcription. The transcription process, which makes RNA, is similar to the replication process of DNA. During the transcription of a gene, the two strands of DNA are separated locally, and one strand acts as a template for RNA synthesis. Free ribonucleotides align along the DNA template. The process is catalyzed by another protein machine, the RNA polymerase complex, which moves along the DNA adding ribonucleotides to the growing RNA, as sketched in Figure I.5. The resulting RNA sequence is the same as the non-template sequence of the gene, except that U replaces T. Each individual transcription works in three steps.

Initiation. RNA polymerase binds to a promoter segment of DNA located in front of the gene. It then unwinds the DNA and begins the synthesis of an RNA molecule.
Elongation. RNA polymerase moves along the DNA, maintaining a transcription bubble to expose the template strand. It compares free ribonucleotides with the next exposed DNA base and adds a complementary match.
Termination. Specific sequences in the DNA signal the chain termination by triggering the release of the RNA strand and the polymerase.

A gene is thus not only marked but indeed defined by the promoter segment preceding and the terminating sequence succeeding it. Electron microscope pictures show that the transcription of DNA to RNA is a highly parallel process in which a row of RNA polymerase complexes follow each other along the gene and produce RNA concurrently.

Figure I.5: The RNA grows in the 5' to 3' direction, in this case by adding a nucleotide carrying uracil to the chain.

Bibliographic notes. The idea that traits are hereditary is old, but the detailed mechanism of how it comes about started to unfold only recently. The groundwork for our current understanding was laid in the nineteenth century by Gregor Mendel, when he discovered the basic rules of the hereditary mechanism [2]. An English translation of this work can be found in [3]. It was long known that DNA is critically involved in that mechanism, but it took until the work of Watson and Crick in 1953 to discover the chemical structure of DNA [5, 6]. The book by Watson [4] is an enjoyable personal account of the years preceding the discovery of that structure. Today there are many books on the subject, and most of the material in this section is taken from [1, Chapters 2 and 3].

[1] A. J. F. Griffiths, W. M. Gelbart, J. H. Miller and R. C. Lewontin. Modern Genetic Analysis. Freeman, New York, 1999.
[2] G. Mendel. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines, Abhandlungen, Brünn 4 (1866), 3–47.
[3] C. Stern and E. R. Sherwood. The Origin of Genetics: A Mendel Source Book. Freeman, 1966.
[4] J. D. Watson. The Double Helix. Atheneum, New York, 1981.
[5] J. D. Watson and F. H. C. Crick. Molecular structure of nucleic acid. A structure for deoxyribose nucleic acid. Nature 171 (1953), 737–738.
[6] J. D. Watson and F. H. C. Crick. Genetic implications of the structure of deoxyribonucleic acid. Nature 171 (1953), 964–967.
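As a small illustration of the transcription rule described in this section, the following Python sketch pairs each base of a DNA template with its complementary ribonucleotide. The function name transcribe and the convention of passing the template in 3' to 5' order are assumptions made for this example; the biological process is of course far more involved.

```python
# Pairing used during transcription: the RNA carries U where DNA would carry T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, read 5' to 3', made from a DNA template strand.

    The template is assumed to be written in 3' to 5' order, so that the
    RNA grows left to right in its own 5' to 3' direction.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


non_template = "AATCGCGTACGCG"   # 5' to 3'
template = "TTAGCGCATGCGC"       # 3' to 5', complementary to non_template
rna = transcribe(template)
print(rna)
# The RNA equals the non-template strand with U in place of T.
print(rna == non_template.replace("T", "U"))
```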

In this section. As can be seen in Alanine Cysteine Aspartate Glutamate Phenylalanine Glycine Histidine Isoleucine Lysine Leucine Ala Cys Asp Glu Phe Gly His Ile Lys Leu A C D E F G H I K L Methionine Asparagine Proline Glutamine Arginine Serine Threonine Valine Tryptophan Tyrosine Met Asn Pro Gln Arg Ser Thr Val Trp Tyr M N P Q R S T V W Y I. Chemical structure. two amino acids are linked by a peptide bond whose creation releases water.9 have pentagonal and hexagonal ring ¢ L R R H D Figure I. Different residues are distinguished by their side-chains. as illustrated in Figure I. H N H H C C OH O H H N H OH2 H N H H C O C N H H C C OH O C C OH O     Table I.7: The two isomers of an amino acid. the carbon.7.9. + R R Figures I.6: Two amino acid residues joined by a peptide bond. Most of the internal nodes are carbon atoms. Amino acids that are linked into a polypeptide chain are referred to as residues. This tetrahedron has two orientations. we mark double and partially double bonds by boldface edges.2 Proteins and Amino Acids Proteins are polypeptide chains obtained by translation from strands of messenger RNA. All unlabeled nodes are either carbon or hydrogen atoms. nature uses only twenty to build proteins. The two oriented forms are referred to as isomers and distinguished by letters L and D. ¡ ¡   Glycine Alanine O O Threonine S Cysteine O O Serine Aspartate N NH2 COOH COOH NH 2 S N N N O O Glutamine Lysine Methionine Glutamate O N Cα Cα Arginine H Figure I. We list their names together with their three-letter codes and single-letter abbreviations in Table I. . Four of the five amino acids   R R Valine Isoleucine Leucine O N Figure I. A protein is a linear sequence of amino acids connected to each other by peptide bonds. As before. Among a much larger variety of amino acids. with rare occurrences of oxygen.2 Proteins and Amino Acids 5 Amino acids.6. residues differ widely in size and structure. a carboxyl group. 
-carbon and carbon atoms is the backbone of the protein. we sketch the translation process and discuss the chemical structure of proteins.8: The fifteen amino acids without cycle in their chemical structure.8 and I. C . sketched in Figure I. linked to an amino group. As shown in Figure I. The resulting repeating sequence of nitrogen. and a side-chain.1. Each amino acid consists of a central carbon atom.I. which is part of the backbone. nitrogen and sulfur atoms. The shaded circle is the -carbon on the backbone. are at the vertex positions of a tetrahedron around C . Only L-amino acids occur in nature as building blocks of proteins. one being the mirror image of the other. The fifteen amino acids sketched in Figure I. Asparagine The four neighbors of an -carbon. one hydrogen atom. codes and abbreviations of the twenty amino acids that occur as building blocks of natural proteins.8 may be viewed as trees rooted at the -carbon.1: Names.

6 structures. Incidentally. The codon XYZ is A A G C U Tyr Tyr Cys Lys Asn Glu Asp Gln His Lys Asn Glu Asp Gln His Arg Ser Gly Gly Arg Arg G Arg Ser Gly Gly Arg Arg Trp Cys Thr Thr Ala Ala Pro Pro Ser Ser Translation. Each tRNA is a short sequence of about 80 nucleotides. which implies that the map is not injective but uses redundancy to reduce the number of outcomes.2: The genetic code. which are UAA. In many cases. This explains the relative uniformity among the four residues in any one slot of Table I. This unique feature locally restricts the flexibility of the backbone.2. N Proline N Tryptophan O N O Tyrosine Phenylalanine N Histidine Figure I. Since codons are triplets of nucleotides. and complementary substrings shown.9: The five amino acids with cyclic chemical structure. mapped to one of the residues in the row of X and the column of Y. Genetic code. The four positions inside that slot correspond to A. A tRNA Table I. one of which is sketched in Figure I.10: Transfer RNA with anti-codon at the bottom. an accurate match at the first two positions suffices and a mismatch at the third position can be tolerated. The sequence of nucleotides is read consecutively in groups of three. although that one also binds to methionine. The redundancy is in part due to multiple tRNA molecules carrying the same residue and in part because there is flexibility in how the tRNA reads the codons. which forms a cycle by having its chain connect back to the nitrogen next to the -carbon along the backbone. called codons.10. Some residues correspond to more codons than others. and UGA. The correct reading frame is identified by starting the translation always at a start codon. Complementary subsequences form double-helix substructures that further fold up to characteristic ‘clover leaf’ formations. UAG. The translation process is more involved than transcription because it converts information between two languages that use different alphabets. As mentioned above.   I B IO . 
There are only twenty residues.2.3. The start codon is AUG and maps to methionine. . The complete map is shown in Table I. The initiator tRNA is a specific transfer RNA that recognizes this sequence and binds to methionine. AUG. The fifth amino acid is proline. it differs from the tRNA that binds to the AUG codon in the middle of the sequence. U in the second row. ¦ ¢   £¡¢ amino acid 3’ ¦ 5’ C Thr Thr Ala Ala Pro Pro Ser Ser Ile Ile Val Val Leu Leu Leu Phe U Met Ile Val Val Leu Leu Leu Phe G C G G A U U C U C G G A G C C C A G G G U C C G C C U A A G A C A C C U G U G anti−codon GAA Figure I. each producing an entirely different residue sequence. we have codons. the tRNA molecules are instrumental in translating codons into residues.MOLECULES The translation is accomplished by transfer RNA molecules that recognize codons through the same binding mechanism used for replication and transcription. as will be discussed in Section I. Since there are four different types of nucleotides. covalently attached amino acid at the top. G in the first row and C. there are apparently three possible reading frames. Empty entries correspond to the stop codons.

I. It consists of a small subunit and a large subunit. 5]. B. Considerably shorter and more focussed descriptions of proteins and protein structures can be found in [4. In some cases. For each codon. New York. C. England. DARBY AND T. Proteins: Structures and Molecular Properties. The translation process is facilitated by the ribosome. The codon and anticodon are matched in anti-parallel orientation. A. BAN . 1993. 7 [4] N. New York. S TRYER . N IESSEN . the translation even starts during transcription. which come together around an mRNA strand with the help of the initiator tRNA that contributes the first residue. £ ¢  ¡ . After the determination of the DNA structure in 1953. A LBERTS . K. it finds a tRNA with matching anti-codon and appends its amino acid as a residue to the carboxyl end of the growing polypeptide chain. Protein Structure. and a few more years to decipher the genetic code on which the dogma is based.2 Proteins and Amino Acids molecule matches the exposed codon of the mRNA with its anti-codon and contributes its residue to the polypeptide chain that grows at the other end. The protein chain and the mRNA are released and the ribosome dissociates into its two subunits. H ANSEN . J. Most of the twenty amino acids that occur in proteins have been identified in the nineteenth century. S TEITZ . Press. J. WALTER . 1988. Press. W ILKINSON . all three of which are comprehensive texts in their respective fields.to the 3-end is thus preserved by the orientation of the polypeptide chain from the amino group of the first to the carboxyl group of the last residue. 3. Essential Cell Biology. 1993. E. [3] T. 1998. The translation process ends when a stop codon is read. Oxford Univ. An Introduction to the Molecular Biology of the Cell. The orientation of the mRNA strand from the 5. Science 11 (2000). as always. [5] P. Similar to transcription. ROBERTS AND P. [1] B. it took only a few years for the community to agree on the central dogma. 
The complete atomic structure of the large ribo˚ somal subunit at A resolution. Freeman. before the mRNA strand is complete. Oxford Univ. R AFF . Garland. which is a large complex made from more than 50 different proteins and several RNA molecules. C REIGHTON . Biochemistry. Third edition. [6] L. C REIGHTON . England. E. Second edition. M. M OORE AND T. The material of this section is taken from [1. J OHNSON . D. M OODY AND A. with several ribosomes working concurrently and in sequence along the strand. The ribosome scans through the strand like a tape reader. The geometric structure of the ribosome has recently been resolved by x-ray crystallography [2]. B RAY. J. Protein Engineering. J. 878– 879. Freeman. L EWIS . the translation of an mRNA strand into a protein happens in parallel. 1990. Bibliographic notes. 6]. P. [2] N. New York. E. A. P.
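The genetic code and the role of the start and stop codons in fixing the reading frame, as discussed in this section, can be sketched as follows. The table is the standard genetic code written with single-letter residue abbreviations and `*` for stop; the names CODON_TABLE and translate, the example mRNA string, and the simplification of starting at the first AUG found are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, listed for codons in the order UUU, UUC, UUA, UUG,
# UCU, ... (third position varying fastest); '*' marks a stop codon.
BASES = "UCAG"
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string (5' to 3') into single-letter residues.

    Simplifying assumption: translation starts at the first AUG and the
    reading frame is kept until a stop codon or the end of the strand.
    """
    start = mrna.find("AUG")
    if start < 0:
        return ""                       # no start codon, nothing is translated
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":              # stop codon ends the chain
            break
        residues.append(residue)
    return "".join(residues)


print(len(CODON_TABLE))                 # number of codons, 4^3
print(translate("GGAUGGCUUGGUAAGC"))    # codons AUG GCU UGG UAA
```

Shifting the start position by one or two nucleotides would group the same strand into different triplets, which is why the start codon is needed to select one of the three reading frames.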

I.3 Structural Organization

We cannot hope to understand proteins without a good grasp of their multi-level structural organization. Most surprisingly, identical proteins fold up into identical shapes, and this is really the reason why geometry plays an important role in their study.

Bond rotation. Consider the three bonds from one α-carbon to the next along a protein backbone, and refer to it as a peptide unit. Figure I.6 shows its chemical and Figure I.11 its geometric structure. Because of partial double-bond character, there is no freedom to rotate around the peptide bond, which is the link between the carbon and the nitrogen atoms. There are however two possibly planar configurations: the trans form, in which Cα-C-N-Cα is relatively stretched (zig-zag), and the cis form, in which it curves in one direction (zig-zig). The two forms are distinguished by the rotation angle ω along the C-N bond, which by convention is 180° for the trans and 0° for the cis form. In contrast, the links between the α-carbon and the carbon and nitrogen atoms are single bonds with one-dimensional rotational degrees of freedom. As shown in Figure I.11, φ measures the rotation around the N-Cα bond, and ψ measures the rotation around the Cα-C bond. Again by convention, φ = 180° and ψ = 180° for the two coplanar trans forms. The φ and ψ angles measure rotations around the bonds preceding and succeeding every α-carbon atom. The conformation of the backbone is completely determined when φ and ψ are specified for each residue in the chain.

Figure I.11: The planarity of a peptide bond is caused by its partial double-bond character.

Ramachandran plot. A given residue prohibits some angles because of steric hindrances, which are physically prohibited collisions between atoms, and in this way restricts the rotational degree of freedom to a small region. The realizable angle pairs are visualized as a subset of the square of angle pairs. A larger residue will generally prohibit a larger range of angles than a smaller one. This so-called Ramachandran plot for glycine is sketched in Figure I.12. The side-chain of glycine is only H, which is the reason that a relatively large portion of the square of angle pairs is realizable. An interesting residue in this respect is proline, which differs from all others because it binds back to the backbone.

Figure I.12: The square represents all angle pairs (φ, ψ) and the shading indicates the region of disallowed pairs for glycine.

Two common motifs. A motif that is commonly observed in proteins is the α-helix, whose backbone forms a right-handed helix. A rotation takes about 3.6 residues and produces an axial separation of about 5.4 Å. Contiguous α-carbons are separated by about 100° in the rotation direction and by about 1.5 Å rise, which is measured along the axis. The structure is stabilized by hydrogen bonds between every CO group and the NH group four residues later. All side-chains lie outside the helix structure. The characteristic dihedral angles for a right-handed α-helix are roughly φ = −60° and ψ = −45°. Cartoon representations of protein structures usually draw α-helices as tubes. In Figure I.13 the tubes are visible as spiral sections of the ribbon.

Another recurring motif is the β-sheet, which is flat and made up of several β-strands. A strand can be obtained by stretching the α-helix until the axial distance between two contiguous α-carbons reaches about 3.3 Å. The stabilizing hydrogen bonds are between neighboring strands. They combine strands to sheets, in which neighboring strands can run in the same direction (parallel) or in opposite directions (anti-parallel).
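The angles φ, ψ and ω are dihedral angles, each determined by four consecutive backbone atoms: C, N, Cα, C for φ, then N, Cα, C, N for ψ, and Cα, C, N, Cα for ω. The following Python sketch computes a signed dihedral angle from four points, using the common convention that the cis arrangement gives 0° and the trans arrangement gives 180°. The function names and the sample coordinates are illustrative assumptions and do not come from a real structure.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    """Scalar product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Vector product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation around the bond p1-p2."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b1, b2)                  # normal of the plane through p0, p1, p2
    n2 = cross(b2, b3)                  # normal of the plane through p1, p2, p3
    x = dot(n1, n2)
    y = dot(b2, cross(n1, n2)) / math.sqrt(dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Four made-up points: the middle bond runs along the z-axis.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, abs(trans), quarter)
```

Evaluating such a function for every residue of a chain gives the list of (φ, ψ) pairs that a Ramachandran plot displays.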

Both options are illustrated in Figure I.14.

Figure I.13: Ribbon diagrams visualize proteins by emphasizing the backbone as it winds its way through the structure.

Figure I.14: Two parallel β-strands to the left and two anti-parallel ones to the right. The dotted edges represent stabilizing hydrogen bonds.

Protein architecture and function. It is common to distinguish four levels of organization in the description of protein architecture:

Primary structure refers to the sequence of residues along the oriented polypeptide chain.
Secondary structure refers to the spatial arrangement of residues that are near each other along the chain.
Tertiary structure refers to the spatial arrangement of residues that are far from each other along the chain.
Quaternary structure refers to the spatial arrangement of subunits of a protein.

A single protein may indeed contain more than one polypeptide chain. Each chain forms what we call a subunit, and quaternary structure addresses questions about their relative position and interaction. The description of quaternary structure includes the rather weak van der Waals forces, which affect atoms in short distance (within a few ångström). Although this force is weak compared to others, its accumulated influence is significant if two subunits have geometrically complementary shapes that permit a large number of atom pairs within the reach of the force. This accumulated effect thus prefers interactions between geometrically complementary shapes. In biology, this fact is expressed by saying that the van der Waals force creates specificity in the interaction.

That specificity plays a dominant role also in protein-protein and in protein-ligand interactions. A protein typically has a few regions embedded in its surface, so-called active sites, that are specific to interactions with other molecules. While active sites usually occupy only a small fraction of the surface, they decide protein function. Evidence for that claim can be provided by mutating a protein and distinguishing between mutations that preserve and that change the active sites.

Structure determination. Even though proteins are large molecules that typically consist of a few thousand atoms, they are not visible under an electron microscope. How do we then know anything about the structural organization of proteins? The primary source today is x-ray diffraction from protein crystals, but there are others and most notably images generated from nuclear magnetic resonance (or NMR) experiments. Both methods are complicated and laborious. We only scratch the surface by explaining the principal steps in the reconstruction of protein structures from x-ray diffractions:

1. Prepare a protein crystal.
2. Expose the crystal to x-ray beams and collect the diffractions.
3. Compute the electron density and from it derive the structure.

The x-ray experiment does not determine the element identities of the atoms, which have to be obtained from the known chemical structure threaded into the density. Since there are probably hundreds of thousands of different proteins, it would be desirable to automate the process. It seems that Step 1 is the main obstacle in reaching this goal, in part because some proteins are not known to form crystals at all. Step 2 requires an x-ray source, a device to rotate the crystal by small angles, and a detection device. For each angle, we get a two-dimensional picture of diffractions. The three-dimensional electron density is computed from a whole array of such pictures. A typical level surface of an electron density is shown in Figure I.15. The main mathematical tool in the construction
¥     
¡ £   ¤© ¢"¥ ¤©   £ © # §  £   ¤© ¡ § £  ¦© ¤© #   £ ¢¤ ¤¢  ¦# ¤¢  § £ ©  £ ¡  ¤¢  §  £ ¥ ¦# "!  ¡  £ ¥ ¡ "!  ¥ §  £ ¡ ¤¢  § ¡ £ ¡ ¦¦¥ ¤¢ 

I B IO - MOLECULES
§¦ ¤¨  £ ¡ §%§ ¤¨   £ ¡ # ¥ £ ¡ ¦¡ ¤¨ ¦¥ %$¨ ¡ £ § © # £ § ¦¥ %$¨ ¡ ¤¨ £  ¤¨  £ ¥ © £ ¡ ¤¨ ©   £  ¢"¥ ¤¨ ¡ £  § ¤¨ ¥  £      ¨
 

Table I.3: Incomplete records of the atoms that belong to an arginine residue. CA is the -carbon atom, CB the -carbon, etc.
¢

Figure I.15: The so-called chicken wire representation of a level surface of a three-dimensional density.

Bibliographic notes. The Ramachandran plot for realizable bond rotations goes back to work by Ramachandran and Sasisekharan [6]. The -helix has been suggested as a common motif in proteins by Pauling and collaborators in 1951 [4], and in the same year they also identified the -sheet [3]. This was a few years before these motifs had been observed in x-ray experiments. In the late 1950s, Max Perutz reconstructed the structure of hemoglobin from x-ray diffraction data [5], and John Kendrew did the same for myoglobin. A classic text on the x-ray crystallography method is [2]. The material on x-ray crystallography and PDB files presented in this section is taken from [1].
[1] L. J. BANASZAK . Foundations of Structural Biology. Academic Press, San Diego, California, 2000. [2] T. B LUNDELL AND L. J OHNSON . Protein Crystallography. Academic Press, New York, 1976. [3] L. PAULING AND R. B. C OREY. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37 (1951), 729–740. [4] L. PAULING , R. B. C OREY AND H. R. B RONSON . The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37 (1951), 205–211. [5] M. F. P ERUTZ . X-ray analysis of hemoglobin. Lex Prix Nobel, Stockholm, 1963. [6] G. N. R AMACHANDRAN AND V. S ASISEKHARAN . Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7 (1963), 95–99.
 

of the electron density is the Fourier transform. A fundamental difficulty in this step is that only the amplitudes (intensities) of the waveforms are observable, while the phase information must be obtained by different means. Protein data banks. After completing the structural study of a crystallized protein, investigators usually send their results to the Protein Data Base, which is a public repository of protein structures described in so-called PDB files. At the beginning of each file we find ancillary information, including the header, the name of the protein, the author, the reference to the corresponding journal article, etc. There is also information about non-standard components and about secondary structure elements. The main body of the file lists the coordinates of the observed atoms. They are always given in an orthonormal coordinate system, in which the length unit is one angstrom. Table I.3 illustrates the format by showing a small portion of a PDB file for hemoglobin, listing the coordinates of the atoms of an arginine residue. Note that there are no hydrogen atoms, since they are too small to be resolved by an x-ray experiment.

¡ ¥ £ # ¦# ¤¨  ¤§ ¨ © £   ©   £ © ¢¤¨    £   ¢¤¡ ¨ § ¨ £   ¡¦¡ ¤¨ ¥ £ ©   £ © ¦¥ ¤¨ ¦ ¤¨ ¥ £ # ¡  £ #  ¤¨  § £ © ¦ ¤¨    £ © ¢¤¨

ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM

N CA C O CB CG CD NE CZ NH1 NH2

ARG ARG ARG ARG ARG ARG ARG ARG ARG ARG ARG

0
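The ATOM records described under "Protein data banks" are fixed-width text lines, so they can be read by slicing columns. The following is a minimal sketch, assuming the standard PDB column layout (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, coordinates in 31-54). The names `Atom`, `format_atom_line`, `parse_atom_line` and `read_atoms` are hypothetical, and the coordinates in the usage note are made up for illustration; they are not taken from the hemoglobin file behind Table I.3.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str      # atom name, e.g. "CA" for the alpha-carbon
    residue: str   # three-letter residue name, e.g. "ARG"
    chain: str
    res_seq: int
    x: float       # coordinates in angstrom
    y: float
    z: float


def format_atom_line(atom: Atom) -> str:
    """Write one ATOM record using the fixed PDB column layout (simplified)."""
    return (
        f"ATOM  {atom.serial:>5d} {atom.name:<4s} {atom.residue:>3s} "
        f"{atom.chain}{atom.res_seq:>4d}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


def parse_atom_line(line: str) -> Atom:
    """Read one ATOM record by slicing its fixed-width columns."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def read_atoms(lines):
    """Keep only the ATOM records from an iterable of PDB lines."""
    return [parse_atom_line(line) for line in lines if line.startswith("ATOM  ")]
```

For example, `read_atoms([format_atom_line(Atom(1, "N", "ARG", "A", 92, 12.345, -7.89, 3.5))])` returns a one-element list holding the same record. Real files also carry occupancy, temperature factor and element columns, which this sketch ignores.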

I.4 Molecular Mechanics

After a protein has been created by translation, it folds into a shape, or conformation, that is determined by its sequence of residues. The folding process is a reaction to a multitude of forces that simultaneously act on every part of the protein. This section presents some of the current knowledge and efforts to model these forces. We begin by studying atoms and discuss covalent and non-covalent forces.

Atoms. Each atom has a positively charged massive nucleus, which is surrounded by a cloud of negatively charged electrons. The nucleus consists of protons, each contributing a unit positive charge, and of electrically neutral neutrons. The electrons are held in orbit by electrostatic attraction to the nucleus. Each electron has one unit of negative charge, which exactly neutralizes the positive charge of one proton. In total, we have the same number of protons and electrons and thus an electrically neutral atom, as illustrated in Figure I.16. Different elements consist of atoms with different numbers of protons. The atomic number is by definition the number of protons, which is also the number of electrons. The number of neutrons is usually about the same because too few or too many neutrons destabilize the nucleus. The atomic weight of an atom is the ratio of its mass to the mass of a single hydrogen atom. Because the mass of an electron is negligible, the atomic weight is almost exactly the number of protons plus the number of neutrons. Avogadro's number is useful in translating from the minuscule world of single atoms into a humanly more accessible scale. It is the number of hydrogen atoms in one gram of hydrogen, which is roughly $6.02 \times 10^{23}$. The mass of one hydrogen atom is therefore roughly $1.66 \times 10^{-24}$ gram which, by definition, is one dalton. One mole of an element is Avogadro's number of its atoms. In other words, if the mass of one atom of that element is $m$ daltons then the mass of one mole is $m$ grams. Table I.4 lists properties of elements that are commonly found in organic matter.

Figure I.16: A schematic picture of a hydrogen atom to the left and a carbon atom to the right.

element     symbol  #p  #n  electron shells
Hydrogen    H        1   0  1
Carbon      C        6   6  2, 4
Nitrogen    N        7   7  2, 5
Oxygen      O        8   8  2, 6
Sodium      Na      11  12  2, 8, 1
Magnesium   Mg      12  12  2, 8, 2
Phosphorus  P       15  16  2, 8, 5
Sulfur      S       16  16  2, 8, 6
Chlorine    Cl      17  18  2, 8, 7
Potassium   K       19  20  2, 8, 8, 1
Calcium     Ca      20  20  2, 8, 8, 2

Table I.4: Some elements together with their numbers of protons, neutrons and electrons distributed in the shells around the nucleus.
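The relation between daltons, moles and grams can be checked with a few lines of arithmetic. This is a small sketch under the approximation stated in the text (atomic weight roughly equals protons plus neutrons). The constant `AVOGADRO` is rounded, and the helper names and the water example are illustrative assumptions, not part of the original notes.

```python
AVOGADRO = 6.022e23  # atoms per mole (rounded)

# (protons, neutrons) for a few elements, as listed in Table I.4
NUCLEONS = {"H": (1, 0), "C": (6, 6), "N": (7, 7), "O": (8, 8)}


def atomic_weight(symbol: str) -> int:
    """Approximate atomic weight in daltons: protons plus neutrons."""
    protons, neutrons = NUCLEONS[symbol]
    return protons + neutrons


def molecule_mass_daltons(formula: dict) -> int:
    """Mass of one molecule in daltons, given element counts like {"H": 2, "O": 1}."""
    return sum(count * atomic_weight(symbol) for symbol, count in formula.items())


def dalton_in_grams() -> float:
    """One dalton is one gram divided by Avogadro's number."""
    return 1.0 / AVOGADRO


water = {"H": 2, "O": 1}
mass_da = molecule_mass_daltons(water)               # daltons per molecule
mass_one_molecule_g = mass_da * dalton_in_grams()    # grams per molecule
mass_one_mole_g = mass_one_molecule_g * AVOGADRO     # grams per mole
```

The number of daltons per molecule and the number of grams per mole agree, which is exactly the statement that $m$ daltons per atom corresponds to $m$ grams per mole.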

Covalent bonds. According to the Bohr model, electrons live in shells around the nucleus and populate inner shells before using outer ones. The first three shells from inside out can hold up to 2, 8 and 8 electrons, as indicated in Table I.4. The chemical properties of an atom are defined by the tendency to either empty or complete its partially filled shell, if any. One way of doing that is by sharing electrons. The shared electrons complete the outermost non-empty shells of both atoms involved. According to Table I.4, carbon, nitrogen and oxygen need four, three and two electrons to fill their outer shells. As illustrated in Figure I.17, this can for example be done by covalently binding to the same number of hydrogen atoms. We can now define a molecule as a connected component of the graph whose vertices are the atoms and whose edges are the covalent bonds.

Figure I.17: The geometry of covalent bonding for carbon, nitrogen, and oxygen.

When an atom covalently bonds to more than one other atom, then there is a preferred angle between pairs of bonds. For example for carbon, this angle is what we get by connecting the centroid of a regular tetrahedron with two of the vertices. Using elementary geometry we find this angle is $\arccos(-\frac{1}{3}) \approx 109.5^\circ$. Two atoms can also form a covalent double bond, which forces the nuclei closer together and is stronger than the corresponding single bond. It also prevents any torsional rotation around that bond, which is possible for single bonds. We need a sequence of four atoms and three covalent bonds to define the torsional angle of the middle bond. It is generally parametrized relative to the trans (zig-zag) coplanar configuration. For example for H$_3$C-CH$_3$, we have three bonds on each side of the middle bond. There is an energetic preference for staggering the covalent bonds on the two sides, which corresponds to three torsional angles spaced $120^\circ$ apart.

When two atoms that covalently bond are of different type then they generally attract the shared electrons to different degrees. The shared electrons will therefore have a bias towards one end of the structure or another. We then have a polar structure in which the positive charge is concentrated on one end and the negative charge on the other. Examples of polar covalent bonds are between hydrogen and oxygen and between hydrogen and nitrogen, as illustrated in Figure I.17. In contrast, the bond between hydrogen and carbon has the electrons attracted much more equally and is relatively non-polar.
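Both angles discussed above, the preferred angle between two bonds and the torsional angle of a middle bond, can be computed directly from atom coordinates. The sketch below is illustrative only: the helper names are hypothetical, and the torsion convention is an assumption (the cis configuration gives 0 degrees and the trans configuration gives 180 degrees), which may differ from the parametrization used in these notes by a constant offset.

```python
import math


def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_angle(a, center, b):
    """Angle in degrees at `center` between the bonds to `a` and to `b`."""
    u, v = sub(a, center), sub(b, center)
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def torsion_angle(p0, p1, p2, p3):
    """Torsional angle in degrees of the middle bond p1-p2 (cis = 0, trans = 180)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.degrees(math.atan2(norm(b2) * dot(b1, n2), dot(n1, n2)))


# Vertices of a regular tetrahedron whose centroid is the origin.
tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
tetrahedral_angle = bond_angle(tetrahedron[0], (0, 0, 0), tetrahedron[1])
```

Here `tetrahedral_angle` reproduces the value $\arccos(-1/3)$ obtained by elementary geometry, and four points in a planar zig-zag give a torsional angle of 180 degrees under the assumed convention.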

Non-covalent bonds. An atom can also donate an electron to another atom and thus create a complete outer shell. An example is sodium donating the only electron in its third shell to chlorine, which uses it to complete its third shell. As a result we get positively charged sodium cations and negatively charged chloride anions. Both are attracted to each other by electrostatic force and form a regular grid packing, in which each sodium cation is surrounded by six chloride anions, and vice versa. These arrangements are known as table salt. A weaker interaction, also based on electrostatic force, is generated by polar molecules. A prime example is water, which is partially positively charged at the two hydrogen ends. Water molecules thus tend to aggregate in small semi-regular structures, but this force is weak and bonds of this kind are constantly formed and broken. The polarity of water molecules is the basis for the difference between hydrophilic molecules, that are polar and therefore attract water, and hydrophobic molecules, that are non-polar and do not attract water. Another non-covalent force is responsible for the van der Waals interaction. Experimental observations point to a potential energy function roughly as graphed in Figure I.18. The corresponding force is the negative derivative, which is interpreted as a balance between an attractive and a repulsive force. The attraction is due to a dispersive force that can be explained using quantum mechanics. The repulsion also has a quantum mechanical explanation in terms of the Pauli principle, which prohibits any two electrons from having the same set of quantum numbers.

Figure I.18 (axes: energy versus distance): The van der Waals force is obtained by adding the attractive force (derivative of dashed curve) and the repulsive force (derivative of the dotted curve).

It is useful to keep the relative strengths of the various forces in mind. Table I.5 gives estimates of the amount of energy necessary to break one mole of bonds.

bond type       strength in vacuum  strength in water
covalent              90.0                90.0
ionic                 90.0                 3.0
hydrogen               4.0                 1.0
van der Waals          0.1                 0.1

Table I.5: Relative strength measured in kilo-calories per mole necessary to break the bonds. Water molecules interfere with ionic and hydrogen bonds, which are therefore considerably weaker in a solution than in a vacuum.
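The statement that the van der Waals force is the negative derivative of the potential, and that it balances attraction against repulsion, can be made concrete with a worked equation. The functional form below is an assumption: it takes the repulsive (dotted) and attractive (dashed) curves of Figure I.18 to be inverse powers of the distance $r$ with exponents 12 and 6, as in the Lennard-Jones function used in the force field discussion. The constants $A$ and $B$ are placeholders.

```latex
\[
  V(r) = \frac{A}{r^{12}} - \frac{B}{r^{6}}, \qquad A, B > 0,
\]
\[
  F(r) = -\frac{dV}{dr} = \frac{12A}{r^{13}} - \frac{6B}{r^{7}},
\]
\[
  F(r^{*}) = 0 \iff r^{*} = \left(\frac{2A}{B}\right)^{1/6},
  \qquad V(r^{*}) = -\frac{B^{2}}{4A}.
\]
```

For $r < r^{*}$ the repulsive term dominates and $F > 0$ pushes the atoms apart; for $r > r^{*}$ the attractive term dominates. The unique minimum of $V$ at $r^{*}$ is the balance point mentioned in the text.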

Force field. To get a handle on how molecules move, we define the potential energy of a system of atoms. The general assumption is that the system develops towards a minimum. To model the potential energy accurately, we would have to work with quantum mechanics, which is beyond the scope of this book and also beyond the capabilities of current computations for large organic molecules. The alternative is molecular mechanics, which uses classical mechanics to model the forces that act on atoms. The simplest such model sums five contributions to the potential energy, three accounting for covalent bonds and two for non-covalent bonds. We use a vector $x \in \mathbb{R}^{3n}$ to describe the state of a system of $n$ atoms and define the potential energy as a function $V: \mathbb{R}^{3n} \to \mathbb{R}$. In its simplest form, that energy is written as

$$V(x) = \sum_{\text{bonds}} \frac{k_i}{2} (\ell_i - \ell_{i,0})^2 + \sum_{\text{angles}} \frac{h_i}{2} (\theta_i - \theta_{i,0})^2 + \sum_{\text{torsions}} \frac{v_i}{2} \bigl(1 + \cos(n \omega_i - \gamma_i)\bigr) + \sum_{\text{atoms}} \sum_{\text{atoms}} \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}} + \sum_{\text{atoms}} \sum_{\text{atoms}} 4 \varepsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right].$$

This formula contains various constants that depend on the type of atom or interaction involved. We briefly look at each one of the five terms.

Bond length. The first sum approximates the energy penalty for differing from the reference length, $\ell_{i,0}$, by a quadratic function. The strength, $k_i$, is relatively large, namely several hundred kilo-calories per mole.

Bond angle. The second sum approximates the energy penalty for differing from the reference angle, $\theta_{i,0}$, again by a quadratic function. The strength, $h_i$, is considerably less than for bond length, namely about one one-hundredth or even less.

Torsional rotation. The third sum approximates the energy for different torsional angles around a bond. Angles that lead to staggered arrangements of bonds at both sides are energetically preferred. This preference is modeled by a cosine function with $n$ minima and the same number of maxima.

Electrostatic interaction. The fourth sum adds the electrostatic potential between every pair of atoms in the system. The constants $q_i$ and $q_j$ are the charges, $\varepsilon_0$ is the dielectric constant of the medium, and $r_{ij}$ is the distance between the two atoms.

Van der Waals interaction. The fifth sum approximates the van der Waals potential by the Lennard-Jones 12-6 function. As before, $r_{ij}$ is the distance between the two atoms. The collision constant, $\sigma_{ij}$, marks where the function crosses the zero line, and $-\varepsilon_{ij}$ is the value at the unique minimum.

It is clear that $V$ as defined is only a rough approximation of the real potential energy that drives the behavior of the system. Whether or not that approximation suffices depends on what we use it for. One of the applications of force fields is the simulation of molecular motion.

Molecular dynamics. Recall Newton's three laws of motion:

1. A body continues to move in a straight line at constant velocity unless a force acts upon it.
2. The rate of change of the momentum equals the force.
3. To every action there is an equal and opposing reaction.

Let $x$ be the trajectory of a point with mass $m$. Its location at time $t$ is $x(t)$, its velocity is $\dot{x}(t)$, and its momentum is $m \dot{x}(t)$. The rate of change of the velocity is also referred to as the acceleration, $\ddot{x}(t)$. Newton's second law can now be written as $F = m \ddot{x}$, where $F$ is the force acting upon $x$. Suppose we write the force as the negative gradient of a potential function: $F = -\nabla V$. Using this notation, Newton's second law is expressed by the differential equation $m \ddot{x} = -\nabla V(x)$. A trajectory is a solution to this equation. In simple cases, the trajectory can be computed analytically. For example, if the potential is stationary and proportional to one over the norm, $V(x) = -c / \|x\|$ for some constant $c > 0$, then $F(x) = -c \, x / \|x\|^3$. Both the gravitational and the electrostatic potentials have this form. In this case, the generic trajectory is an ellipse with one focus at the origin, as illustrated in Figure I.19.

Figure I.19: A generic trajectory when the magnitude of the attraction to the origin decreases with the square distance.

The problem in molecular dynamics is significantly more involved. We have $n$ bodies (atoms) and the energy potential and force depend on the momentary locations of all bodies.
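The five contributions to the force field described above (quadratic penalties for bond lengths and bond angles, a cosine term for torsions, an electrostatic term, and a 12-6 Lennard-Jones term) can be written as small functions and summed. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the torsion term uses the common form with multiplicity `n` and a phase shift, and all numeric constants in the usage note are placeholders rather than parameters of any published force field.

```python
import math


def bond_energy(length, ref_length, k):
    """Quadratic penalty for deviating from the reference bond length."""
    return 0.5 * k * (length - ref_length) ** 2


def angle_energy(theta, ref_theta, h):
    """Quadratic penalty for deviating from the reference bond angle (radians)."""
    return 0.5 * h * (theta - ref_theta) ** 2


def torsion_energy(omega, barrier, n=3, phase=0.0):
    """Cosine term with n minima and n maxima over a full rotation (radians)."""
    return 0.5 * barrier * (1.0 + math.cos(n * omega - phase))


def coulomb_energy(qi, qj, r, dielectric=1.0):
    """Electrostatic potential between two point charges at distance r."""
    return qi * qj / (4.0 * math.pi * dielectric * r)


def lennard_jones(r, sigma, depth):
    """12-6 function: zero at r = sigma, minimum value -depth."""
    s6 = (sigma / r) ** 6
    return 4.0 * depth * (s6 * s6 - s6)


def total_energy(bonds, angles, torsions, charged_pairs, vdw_pairs):
    """Sum the five contributions; each argument is a list of parameter tuples."""
    return (sum(bond_energy(*b) for b in bonds)
            + sum(angle_energy(*a) for a in angles)
            + sum(torsion_energy(*t) for t in torsions)
            + sum(coulomb_energy(*c) for c in charged_pairs)
            + sum(lennard_jones(*p) for p in vdw_pairs))
```

With `n=3` and zero phase, `torsion_energy` vanishes at 60, 180 and 300 degrees and peaks at 0, 120 and 240 degrees, which matches a preference for staggered arrangements when the cis configuration is taken as angle zero (an assumed convention). A call such as `total_energy([(1.6, 1.5, 300.0)], [], [(math.pi, 2.0)], [], [(3.8, 3.4, 0.1)])` evaluates a toy system with one bond, one torsion and one non-bonded pair.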

As before, we represent the collection of $n$ atoms by a point $x \in \mathbb{R}^{3n}$. The energy potential is the function $V$ defined earlier, and the force acting on $x$ is $F = -\nabla V(x)$. Newton's second law of motion can now be written as $M \ddot{x} = -\nabla V(x)$, where the mass vector $M$ multiplies each component of the acceleration vector with the mass of the corresponding atom. The classic two-body problem is the special case in which $n = 2$ and $V$ is the sum of the two corresponding gravitational potentials. In this case, the generic trajectories are again ellipses. Already for three bodies, there is no analytic solution and one has to resort to numerical methods to approximate the trajectories. The problem in molecular dynamics is even more difficult because the potential function is considerably more complicated than a sum of gravitational potentials. One of the difficulties in the simulation is the near cancellation of large forces so that relatively weak residuals gain a decisive influence. Even small inaccuracies in the model or the computation can lead to false decisions and possibly spoil the entire remainder of the simulation. The currently available numerical solutions are inadequate to simulate the entire folding process even for small proteins. Simulating motion with molecular dynamics is an important topic in computational biology.

Bibliographic notes. The first half of this section is a highly simplified introduction of atoms and bonds, and we refer to physics texts such as [1, Chapters 19 and 20] for further details. The material on force fields is taken from Leach [4]. The van der Waals potential derives its name from the work of van der Waals, who quantified the deviation of rare gas from ideal gas behavior. The origin of the force is a fluctuation of electrostatic charge in atoms. The explanation of the dispersive contribution in terms of quantum mechanics is due to London [5]. To determine the constants needed to parametrize the mathematical formulation of a force field is far from trivial. The definition of the van der Waals radii used to parametrize the Lennard-Jones functions is just one example. There are various approaches to determine these radii. Bondi [2] looks for the distances of closest approach between atoms to determine van der Waals radii. Tsai et al. [7] analyse the most common distances between atoms in small molecule crystals in the Cambridge Structural Database. Finally, Jorgensen and Tirado-Rives [3] derive parameters in an attempt to reproduce thermodynamic properties in computer simulations. Numerical algorithms for molecular dynamics can be found in Leach [4] and Schlick [6].

[1] N. W. Ashcroft and N. D. Mermin. Solid State Physics. Harcourt Brace, Orlando, Florida, 1976.
[2] A. Bondi. Molecular Crystals, Liquids and Glasses. Wiley, New York, 1968.
[3] W. L. Jorgensen and J. Tirado-Rives. The OPLS potential functions for proteins. Energy minimization for crystals of cyclic peptides and crambin. J. Amer. Chem. Soc. 110 (1988), 1657–1666.
[4] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[5] F. London. Zur Theorie und Systematik der Molekularkräfte. Zeitschrift für Physik 63 (1930), 245–279.
[6] T. Schlick. Molecular Modeling and Simulation. Springer-Verlag, New York, 2002.
[7] J. Tsai, R. Taylor, C. Chothia and M. Gerstein. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 290 (1999), 253–266.
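Returning to the molecular dynamics discussion of this section, the differential equation $m \ddot{x} = -\nabla V(x)$ can be integrated numerically when no analytic solution is at hand. The sketch below uses the velocity Verlet scheme, which is one common choice and not necessarily the method of the cited texts, on the simple potential $V(x) = -c / \|x\|$ in the plane, where the exact trajectory is known to be an ellipse with one focus at the origin. The initial conditions, the step size and all names are illustrative assumptions.

```python
import math


def force(x, c=1.0):
    """Negative gradient of V(x) = -c / |x|: attraction towards the origin."""
    r = math.hypot(x[0], x[1])
    return (-c * x[0] / r ** 3, -c * x[1] / r ** 3)


def total_energy(x, v, m=1.0, c=1.0):
    """Kinetic plus potential energy of the moving point."""
    return 0.5 * m * (v[0] ** 2 + v[1] ** 2) - c / math.hypot(x[0], x[1])


def velocity_verlet(x, v, dt, steps, m=1.0, c=1.0):
    """Integrate m x'' = F(x); return the list of visited positions and the final velocity."""
    trajectory = [x]
    f = force(x, c)
    for _ in range(steps):
        v_half = (v[0] + 0.5 * dt * f[0] / m, v[1] + 0.5 * dt * f[1] / m)
        x = (x[0] + dt * v_half[0], x[1] + dt * v_half[1])
        f = force(x, c)
        v = (v_half[0] + 0.5 * dt * f[0] / m, v_half[1] + 0.5 * dt * f[1] / m)
        trajectory.append(x)
    return trajectory, v


# Start at distance 1 from the origin, moving slower than a circular orbit would.
x0, v0 = (1.0, 0.0), (0.0, 0.8)
trajectory, v_end = velocity_verlet(x0, v0, dt=1e-3, steps=5000)
radii = [math.hypot(px, py) for px, py in trajectory]
```

For these initial conditions, conservation of energy and angular momentum gives extreme distances of $1$ and $8/17 \approx 0.4706$ from the origin. The computed `max(radii)` and `min(radii)` can be compared against these values as a sanity check on the integrator, and the drift in `total_energy` gives a second one.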

Exercises

1. Palindromic Sequences. A double-strand of DNA has no preferred direction, but we can orient it so one direction is forward and the other is backward. In either direction, we read the strand in the 5' to 3' direction, as usual. Call a single strand of DNA a palindromic sequence if it is the same as the complementary strand read backwards. (i) Given a strand, how would you determine whether or not it is a palindromic sequence? (ii) Give an algorithm that finds the longest subsequence that is palindromic.

2. Counting strings. Call two linear or cyclic pieces of double-stranded DNA the same if they can be oriented so we read the same string of nucleotides in the two forward directions. (i) How many different linear pieces of double-stranded DNA of length $n$ are there? (ii) How many different cyclic pieces of double-stranded DNA of length $n$ are there? [Beware of palindromic sequences.]

3. Amino Acids. Draw the graph whose nodes are the acyclic amino acids that has an arc connecting two nodes iff one amino acid can be obtained from the other by the replacement or addition of a single atom. (i) Is the graph connected? (ii) Does every connected component have a path that passes through every node exactly once?

4. Ramachandran Plot. Download a PDB file and extract the sequence of $\varphi$ and $\psi$ angles along the backbone. Draw the result in form of a Ramachandran plot.

5. Structure Repositories. Descriptions of protein structures are publicly available at the Protein Data Bank (www.rcsb.org/pdb) and the Swiss Bioinformatics Center (expasy.hcuge.ch). (i) Download a PDB file from either repository and extract the string of single-letter abbreviations describing the amino acid sequence. (ii) Is the relative frequency of amino acids you observe related to the relative number of codons that encode them?

6. Lattices. The arrangement of atoms in a folded protein is often compared to that in a crystal lattice. Sketch two such lattices by drawing the atoms as points and connecting neighboring atoms by straight edges. (i) The face-centered cube (or FCC) lattice consisting of all points with integer coordinates whose sum is even: $(i, j, k) \in \mathbb{Z}^3$ such that $i + j + k \equiv 0 \pmod{2}$. (ii) The body-centered cube (or BCC) lattice consisting of all points with all even or all odd integer coordinates: $(i, j, k) \in \mathbb{Z}^3$ such that $i \equiv j \equiv k \equiv 0 \pmod{2}$ or $i \equiv j \equiv k \equiv 1 \pmod{2}$.

7. Regular Tetrahedron. A regular tetrahedron has four equilateral triangles as faces, which meet along six equally long edges. (i) Determine the dihedral angle formed by two faces meeting along a common edge. (ii) Determine the solid angle formed by three faces meeting at a common vertex. [By convention, the full dihedral angle is $2\pi$, which is the length of the unit circle, and the full solid angle is $4\pi$, which is the area of the unit sphere.]

8. Elliptic Trajectory. Let the energy potential be defined by $V(x) = \frac{1}{2} \|x\|^2$. The force it exerts on a point $x$ is $F(x) = -\nabla V(x) = -x$. Prove that the generic trajectory in this force field is an ellipse centered at the origin.


Chapter II

Geometric Models

A surprising finding in the research on proteins is the importance of geometric shape in their functioning. A protein is a peptide chain of amino acids that folds up and forms a shape. In a natural environment, like proteins fold up to the same shapes, but this might be a result of evolutionary selection. The details of that shape in terms of its cavities, protrusions, dynamics, and energetics determine how it interacts with other molecules. By and large, the shape seems to determine how proteins interact with each other and with other molecules. This finding is usually expressed as a causal chain of responsibilities:

SEQUENCE → SHAPE → FUNCTION

At the current stage of our biological knowledge, there is an overwhelming accumulation of sequence information, which is due, in part, to the near completion of several large-scale genome projects. Although the number of proteins for which the three-dimensional structure has been resolved and is stored in the Protein Data Bank is in the thousands, this is only a small fraction of the wealth of available sequence information. The goal of studying the geometry of proteins is therefore two-fold: the development of new computational tools to help determine or refine structure information, and understanding the relationship between shape and function.

We have seen the bio-chemist's view in Chapter I, which aims at pruning the immense variety by limiting attention to physically or chemically likely configurations. The rest of this book takes a complementary view by concentrating on mathematical models and computational data structures that arise in the study of proteins, and in doing so, we develop a language suitable for studying details of our models. In this chapter, we introduce some of the basic geometric models useful in representing molecular shape. In Section II.1, we introduce space-filling diagrams as the primary geometric model of molecules. In Section II.2, we use Voronoi diagrams to decompose space-filling diagrams. In Section II.3, we introduce alpha shapes, which are dual to space-filling diagrams and are our preferred computational representation. Finally in Section II.4, we talk about the Alpha Shape software and discuss how it can be used.

II.1 Space-filling Diagrams
II.2 Power Diagrams
II.3 Alpha Shapes
II.4 Alpha Shape Software
Exercises

II.1 Space-filling Diagrams

A space-filling diagram associates a molecule with a portion of the three-dimensional space it occupies. An atom is represented by a ball (a solid sphere) and a molecule is the union of balls of its atoms. The tacit assumption in constructing such a diagram is that the locations of the atoms in three-dimensional space are known. We study such unions first in the plane and then in space.

Union of disks. Let $B$ be a finite set of disks in the Euclidean plane, which we denote as $\mathbb{R}^2$. We specify each disk $b_i$ by its center $z_i$ and its radius $r_i$. The union of the disks, $\bigcup B$, has a boundary that consists of circular arcs meeting at common vertices. An example is shown in Figure II.1. Four of the eight disks contribute two arcs each to the boundary. A single disk can contribute any non-negative number of arcs. It is also possible that an arc is an entire circle, which has no endpoints. The total number of arcs is however rather limited. If there are $n$ disks whose union is a simply connected region, then the number of arcs is bounded by a small constant times $n$. Even if we allow more general configurations, the number of arcs remains at most linear in $n$. The upper bound is a consequence of the relationship between arcs in the boundary of the union and angles in the Delaunay triangulation, which will be explained later in this chapter. Hints towards proving the upper bound can be found among the exercises at the end of this chapter.

Figure II.1: Union of disks in the plane.

Rolling circle. We can make the boundary of the disk union smoother by substituting blending curves for the vertices where the circular arcs meet. To this end we roll a circle of radius $r$ on the outside about the boundary. At any moment during the motion, the circle touches the boundary but never intersects the interior. The center of the circle thus traces out a curve at distance $r$ away from the boundary. This curve is the boundary of $\bigcup B_r$, obtained by growing every disk $b_i$ to radius $r_i + r$. The construction is illustrated in Figure II.2. The front of the rolling circle describes the rounded boundary, which consists of convex and reflex circular arcs, as in Figure II.2. More formally, this new curve is the boundary of the portion of $\mathbb{R}^2$ that is not covered by any placement of the open disk bounded by the rolling circle. We can imagine creating that portion with a milling machine whose material removing stylus has the shape of the rolling circle. We note that the rounded boundary of $\bigcup B$ is by and large tangent continuous but can have cusps at places where the rolling circle cannot quite squeeze through two disks. There are no cusps in Figure II.2, but there would be if the two disks to the lower left were just a little smaller. In cases where tangent continuity is important, we may turn the cusps into crossings by adding arcs connecting the cusps. We thus obtain a tangent continuous immersion of a curve in $\mathbb{R}^2$.

Figure II.2: On the outside, the boundary of the union of uniformly grown disks, and on the inside the rounded boundary of the original union.

Union of balls. Let now $B$ be a finite set of balls (solid spheres) in three-dimensional Euclidean space, which we denote as $\mathbb{R}^3$. Similar to the two-dimensional case, we specify each ball $b_i$ by its center $z_i$ and its radius $r_i$. Figure II.3 shows the union of balls that represent gramicidin, which is a small protein of barely more than 300 atoms. To understand the structure of the boundary of the union, we study the portion contributed by a single sphere. The sphere bounding $b_i$ intersects the other balls in a finite collection of caps. The interior of each cap lies in the interior of the union, and the portion of the sphere not covered by any cap is the contribution of the sphere to the boundary of the union. The caps form the same structure as the disks discussed earlier, only that they live on a (two-dimensional) sphere instead of $\mathbb{R}^2$. The structural description of a finite union of balls is thus recursive in the dimension. The same type of symmetry can also be observed in dimensions beyond three.

Figure II.3: A union of balls representation of the gramicidin protein.

The number of arcs and vertices in the boundary of a union of balls in $\mathbb{R}^3$ can be quite a bit higher than the same numbers for a union of disks in $\mathbb{R}^2$. To count the faces, arcs and vertices, we first note that a single sphere intersects the other balls in fewer than $n$ caps. By analogy to disks in the plane, the number of arcs in the boundary of the union of caps is at most linear in $n$. Since each arc has at most two endpoints (if it is a full circle then it has no endpoints) and each endpoint belongs to two arcs, the number of vertices is no larger than the number of arcs. To count the faces contributed by our sphere, we recall that these are the connected components of the complement of the union of caps. We will see that these components are related to the triangles of the Delaunay triangulation, which implies that the number of faces on this one sphere is also at most linear in $n$. To get bounds on the total number of faces, arcs and vertices, we multiply by $n$ and note that each arc belongs to at least two and each vertex belongs to at least three spheres. We conclude that the numbers of faces, arcs and vertices are each at most quadratic in $n$. It can be shown that for each value of $n$, there are configurations of $n$ balls with at least some constant times $n^2$ faces, edges and vertices. This shows that the upper bounds are asymptotically tight. However, the numbers for well packed sets of spheres, which are common for proteins, are much smaller and typically only a constant times $n$.

Rolling sphere. We can again get a smoother boundary by rolling a sphere of radius $r$ about $\bigcup B$. The center of that sphere moves along the boundary of the union of grown balls, $\bigcup B_r$, and its front sweeps out blending surfaces that cover cusps and crevices of the original boundary. Figure II.4 shows such a rounded surface representation of gramicidin. There are convex sphere patches that correspond to faces of the boundary of $\bigcup B_r$, reflex torus patches that correspond to arcs, and reflex sphere patches that correspond to vertices. When we look carefully, we can detect a self-intersection of the surface in Figure II.4. There is a hole whose rounded surface penetrates through the outer surface roughly in the middle of the picture. This happens because the tunnel connecting the hole to the outside is slightly too narrow for the rolling sphere to squeeze through.

Figure II.4: A molecular surface representation of the gramicidin protein.

In the application of space-filling diagrams to biology, the radii of the balls are usually the van der Waals radii of the atoms, and the boundary of $\bigcup B$ is referred to as the van der Waals surface. Relative to that surface, the spheres in Figure II.3 have enlarged radii. The union of convex patches is sometimes referred to as the contact surface because that is where the rolling sphere touches $\bigcup B$. Similarly, the union of reflex patches (tori and spheres) is referred to as the re-entrant surface. The radius $r$ is chosen so that the rolling sphere approximates a water molecule, and the boundary of $\bigcup B_r$ is referred to as the solvent accessible surface.
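Two of the notions above translate directly into simple computations: membership of a point in the union of balls, with or without growing the radii, and the portion of one sphere that is not covered by caps of other balls and therefore contributes to the boundary. The sketch below estimates the latter by sampling points on the sphere. It is an illustration under stated assumptions, not the analytic method used by the software cited in these notes; the function names, the Fibonacci-lattice sampling and the example radii are all hypothetical choices.

```python
import math


def in_union(point, balls, growth=0.0):
    """True if `point` lies in the union of the balls, each radius enlarged by `growth`."""
    return any(math.dist(point, center) <= radius + growth for center, radius in balls)


def fibonacci_sphere(n):
    """Return n roughly evenly spread points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        rho = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return points


def exposed_fraction(i, balls, growth=0.0, samples=2000):
    """Estimate the fraction of sphere i that is not covered by caps of other balls."""
    center, radius = balls[i]
    r = radius + growth
    others = [ball for j, ball in enumerate(balls) if j != i]
    exposed = 0
    for u in fibonacci_sphere(samples):
        p = tuple(c + r * ui for c, ui in zip(center, u))
        if not any(math.dist(p, c) < rad + growth for c, rad in others):
            exposed += 1
    return exposed / samples


# Two unit balls whose centers are one unit apart: each covers a cap of the other.
pair = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0)]
fraction = exposed_fraction(0, pair)
```

For the two overlapping unit balls, the covered cap has height one half, so the exact exposed fraction is three quarters, and the sampled estimate should be close to that. Calling `in_union` with `growth` set to the rolling-sphere radius tests membership in the grown union $\bigcup B_r$ instead of $\bigcup B$.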

It leads to the Voronoi diagram of this section. F U AND J. P. L. Each face of the boundary sweeps out a (three-dimensional) cell in . J. H. let be the set of points with . The rounded surface is usually referred to as the molecular surface.4 are computed using the software described in [1]. Graphics Appl. and Bibliographic notes. Surveys 23 (1991). [3] M. C ONNOLLY. Uniform growth. It follows in particular that is a connected cell. We get the boundary of by drawing the sphere bounding each ball only inside its own Voronoi cell. [1] N. The cell of is the set of points at least as close to as to any other weighted point.3 and the molecular surface in Figure II. We refer to Aurenhammer [2] for a survey of Voronoi diagrams. Since common intersection of the . 345–405. Space-filling diagrams have a long tradition in biochemistry and are similar to the CPK mechanical models named after Corey. [2] F. We describe the same complex as a Voronoi diagram of the set of points with weights . and . Increasing all radii of a set of circles or spheres continuously and at the same rate is referred to as the JohnsonMehl model of growth [4]. . 16 (1996). and each vertex sweeps out a curved edge in the common boundary of generically three membranes and three cells. and we get a structural re-arrangement whenever we sweep over a vertex of the Voronoi diagram. Viewing geometric protein structures from inside a CAVE. ¢ Figure II. The same is true for and every . The molecular surface is sometimes referred to as the Connolly surface. ACM Comput. each arc sweeps out a (two-dimensional) membrane separating two cells. the boundary of consists of patches of such hyperboloids. which is . Since the membranes bounding the are all sheets of two-sheeted hyperboloids. 6 (1983). All these patches are visible in their entirety if viewed from . Voronoi diagrams — a study of a fundamental geometric data structure. 
Define the weighted distance of a point from equal to the Euclidean distance minus the weight: . the boundary of the union sweeps out the Voronoi diagram. this implies that is also star-shaped and that lies also in its kernel. which is sometimes referred to as the additively weighted Voronoi diagram. A KKIRAJU . the line segment connecting and lies entirely in . Consider the case of two weighted points. Q IAN . Analytic molecular surface calculation. By construction.5: Two-dimensional Voronoi diagram generated by uniformly growing the disks. chapter 1]. Figure II. We can now see how structural differences between and arise: when we grow the balls. II G EOMETRIC M ODELS is the star-shaped and that lies in its kernel. An algorithm that computes cells of the additively weighted Voronoi diagram in has been developed and implemented by Will [8]. The solvent accessible surface in Figure II. 58–61. Pauling and Koltun [5. 7]. this property is expressed by saying that is   2      ¢ ¡      2  2    ¢   ¢ ¤ §¢    ¡ ¢  £¢ 0   ¤  ¢ 2  ¤ £¢   B   0 p ¢  2 2   ¡ 0 0 ¥ ¨¦ ©§ ©B    ¥ ©B    £¢ ¡ ¡ B ¡ ¤ ¢ #B     ©B    ¤ ¢   ¢ ¢ ¥ ¥     £¢ B rp   0   ¢ #B    p  ¢ ¢ ¥ ¢ ¡ B !p p  ¡ B   ¥  ¢ ¢ ¥   B !p   ¢ 3  ¢ ¡ . the arcs of the patches meet up in pairs along the membranes and in triplets along the curved edges of the Voronoi diagram.20 surface.5 illustrates the definition in two dimensions. The variations of these models discussed in this section have been introduced by Lee and Richards [6. In geometry. 548–558. Appl. E DELSBRUNNER . named after Michael Connolly who wrote early software constructing this surface [3]. their algorithms and applications. Crystallogr. If one ball is contained in the interior of the other then its cell is empty. The boundary of and of do not necessarily have the same combinatorial structure. The points of this membrane satisfy which is the equation of one sheet of a two-sheeted hyperboloid. 
We can understand structural changes by observing how they are introduced while we continuously grow the balls. Observe that for every point . we have two non-empty cells separated by a two-dimensional membrane. IEEE Comput. Otherwise. AURENHAMMER .

[4] W. A. Johnson and R. F. Mehl. Reaction kinetics in processes of nucleation and growth. Trans. Am. Inst. Mining Metall. AIMME 135 (1939), 416–458.
[5] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[6] B. Lee and F. M. Richards. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55 (1971), 379–400.
[7] F. M. Richards. Areas, volumes, packing and protein structures. Ann. Rev. Biophys. Bioeng. 6 (1977), 151–176.
[8] H.-M. Will. Computation of Additively Weighted Voronoi Cells for Applications in Molecular Biology. Diss. ETH 13188, ETH Zürich, Switzerland, 1999.
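The weighted distance behind the additively weighted Voronoi diagram of Section II.1 can be sketched in a few lines. This is only an illustration under assumed conventions: each ball is a hypothetical `(center, radius)` pair, and the function names are not taken from any software mentioned in the text.

```python
import math

def weighted_distance(x, center, radius):
    """Euclidean distance from x to the center minus the weight (the radius)."""
    return math.dist(x, center) - radius

def nearest_ball(x, balls):
    """Index of the ball whose additively weighted Voronoi cell contains x.

    `balls` is a list of (center, radius) pairs; ties go to the lowest index.
    """
    return min(range(len(balls)),
               key=lambda i: weighted_distance(x, balls[i][0], balls[i][1]))

# Two balls on the x-axis; the membrane crosses the axis at x = 1.5.
balls = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 2.0)]
print(nearest_ball((1.0, 0.0, 0.0), balls))
print(nearest_ball((1.6, 0.0, 0.0), balls))
```

Evaluating `nearest_ball` on a grid of sample points gives a crude picture of the cells and of the hyperboloid membrane between them.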

II.2 Power Diagrams

If we grow the square radii of a finite collection of spheres or balls, we get a decomposition of space into convex polyhedra. This decomposition is known as the power diagram and has a variety of applications in molecular modeling.

Growing square radii. As in Section II.1, we let $B$ be a finite set of balls $B_i = (z_i, r_i)$. We grow each ball to radius $\sqrt{r_i^2 + t}$ at time $t$. The set of balls at time $t$ is denoted as $B_t$. The Taylor series expansion of the radius as a function of time is

$\sqrt{r_i^2 + t} = r_i + \frac{t}{2 r_i} - \frac{t^2}{8 r_i^3} + \ldots$

The first order approximation of the growth is one half the inverse of the radius. Hence, larger balls grow slower than smaller ones. Of course, smaller balls never really catch up except in the limit:

$\lim_{t \to \infty} \left( \sqrt{r_i^2 + t} - \sqrt{r_j^2 + t} \right) = 0.$

We are interested in the surface swept out by the intersection of the spheres bounding $B_i$ and $B_j$ and claim it is a plane. The points $x$ that belong to both spheres at time $t$ satisfy $\|x - z_i\|^2 - r_i^2 - t = 0$ and $\|x - z_j\|^2 - r_j^2 - t = 0$. Varying $t$ has the same effect as dropping the requirement that the two expressions vanish. Instead we just require that they both be equal, so we get $\|x - z_i\|^2 - r_i^2 = \|x - z_j\|^2 - r_j^2$. We have

$2 \langle x, z_j - z_i \rangle = \|z_j\|^2 - \|z_i\|^2 - r_j^2 + r_i^2,$

which is linear in $x$. We see the circle at which the two spheres intersect sweeps out a plane. It follows that the membranes swept out by the arcs of the boundary of the union are pieces of planes.

Power distance. We can describe the decomposition of space implied by the square radius growth model as a Voronoi diagram for yet another weighted distance function. The appropriate function in this case is the power distance of a point $x$ from a ball $B_i$ defined as the square distance from the center minus the weight, $\pi_i(x) = \|x - z_i\|^2 - r_i^2$. The square of the radius, $r_i^2$, is sometimes referred to as the weight of the point $z_i$. The power distance is negative if $x$ lies inside $B_i$, zero if $x$ lies on the boundary of $B_i$, and positive if $x$ lies outside $B_i$. If $x$ lies outside $B_i$, the power distance of $x$ is the square length of a tangent line segment from $x$ to the bounding sphere. Using the same algebraic manipulations as above, we can show that the set of points with equal power distance from two balls form a plane. The two planes are indeed the same. As indicated in Figure II.6, this plane may separate the two bounding spheres, intersect both, or lie on the same side of both. Think of the three configurations as snap-shots in an animation in which the center of the small circle moves towards the center of the large circle. At first, the line moves in the same direction but then comes to a halt and reverses its direction moving away from the center of the large circle.

Figure II.6: The line of equal power distance separates if the two circles are disjoint and not nested, it passes through their intersection if that is non-empty, and it passes outside if the two circles are nested.

Power diagram. The power or (weighted) Voronoi cell of a ball $B_i$ under the power distance is the set of points at least as close to $B_i$ as to any other ball, $V_i = \{ x \mid \pi_i(x) \le \pi_j(x) \text{ for all } j \}$. If we denote by $H_{ij}$ the set of points whose power distance from $B_i$ is at most as large as the power distance from $B_j$ then $V_i = \bigcap_j H_{ij}$. In words, $V_i$ is the intersection of a finite number of half-spaces and thus a convex polyhedron. This polyhedron may be bounded or unbounded, and it is even possible that it is empty. The power or (weighted) Voronoi diagram of $B$ is the collection of cells together with the polygons, edges, and vertices shared by the cells. Figure II.7 illustrates the definitions in two dimensions by showing the Voronoi diagram of the same eight disks used in earlier figures. Every polygon is shared by two cells, and in the generic case every edge is shared by exactly three and every vertex is shared by exactly four cells.
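The power distance and the plane of equal power distance can be made concrete with a short sketch. The representation of a ball as a `(center, radius)` pair and the names `power_distance` and `bisector_plane` are assumptions made for this illustration only.

```python
def power_distance(x, center, radius):
    """Square distance from the center minus the weight (square radius)."""
    return sum((a - b) ** 2 for a, b in zip(x, center)) - radius ** 2

def bisector_plane(ball_i, ball_j):
    """Plane of equal power distance, returned as (normal, offset).

    A point x lies on the plane iff <normal, x> == offset. The formula follows
    from expanding |x - z_i|^2 - r_i^2 = |x - z_j|^2 - r_j^2.
    """
    (zi, ri), (zj, rj) = ball_i, ball_j
    normal = tuple(2 * (b - a) for a, b in zip(zi, zj))
    offset = (sum(b * b for b in zj) - sum(a * a for a in zi)
              - rj ** 2 + ri ** 2)
    return normal, offset

ball_i = ((0.0, 0.0, 0.0), 1.0)
ball_j = ((4.0, 0.0, 0.0), 2.0)
normal, offset = bisector_plane(ball_i, ball_j)
print(normal, offset)  # the plane is x = offset / normal[0]
```

Because the quadratic terms cancel, the bisector is a plane regardless of the two radii, which is the reason the cells of the power diagram are convex polyhedra.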

Figure II.7: Power or weighted Voronoi diagram of eight disks in the plane.

Delaunay triangulation. The (weighted) Delaunay triangulation of $B$ is dual to the (weighted) Voronoi diagram. It is obtained by connecting $z_i$ and $z_j$ by an edge if the cells $V_i$ and $V_j$ share a common polygon. Similarly, $z_i$, $z_j$ and $z_k$ are connected by a triangle if $V_i$, $V_j$ and $V_k$ share a common edge, and $z_i$, $z_j$, $z_k$ and $z_l$ are connected by a tetrahedron if $V_i$, $V_j$, $V_k$ and $V_l$ share a common vertex. Assuming the balls in $B$ are in general position, this exhausts all possible types of overlap among the Voronoi cells. If the balls are not in general position, we can perturb them ever so slightly to move them into general position. Since complexes of tetrahedra are difficult to draw, we illustrate the definitions by showing a two-dimensional Delaunay triangulation in Figure II.8.

Figure II.8: Delaunay triangulation drawn over the dual Voronoi diagram of eight disks in the plane. The Delaunay triangles are transparent so they do not obstruct the structure of the Voronoi diagram underneath.

We refer to an element of a Delaunay triangulation as a simplex, which can be a vertex, an edge, a triangle or a tetrahedron. Observe that we reverse dimensions when we go from the Voronoi diagram to the Delaunay triangulation: cells become vertices, polygons become edges, edges become triangles, and vertices become tetrahedra. Similarly, we reverse the inclusion direction. For example, a Voronoi polygon belongs to a Voronoi cell iff the corresponding Delaunay edge contains the corresponding Delaunay vertex.

Number of simplices. We can count the simplices using the Euler relation, which says that the alternating sum of simplices is always equal to 1. Before counting the simplices in three dimensions, let us warm up to the challenge by counting the simplices of a two-dimensional Delaunay triangulation. Writing $v$, $e$ and $t$ for the numbers of vertices, edges and triangles, we have $v - e + t = 1$. The number of vertices is at most the number of disks, hence $v \le n$. Observe that every triangle has three edges and every edge belongs to at most two triangles. Hence $3t \le 2e$. Combining this inequality with the Euler relation implies $e \le 3n - 3$ and $t \le 2n - 2$.

In three dimensions, we write $v$, $e$, $t$ and $T$ for the numbers of vertices, edges, triangles and tetrahedra. The Euler relation here is $v - e + t - T = 1$. The number of vertices is at most the number of balls, hence $v \le n$, and the number of edges is at most the number of pairs of vertices, hence $e \le \binom{n}{2}$. Similarly, we note that each tetrahedron has four triangles and each triangle belongs to at most two tetrahedra, hence $4T \le 2t$. Combining this with the Euler relation implies $T \le e - v + 1$ and $t \le 2(e - v + 1)$, both of which are less than $n^2$.

There are Delaunay triangulations that have almost this many simplices, but they require a placement of the balls that would be rather unlike the configurations we observe for proteins. Typically, each atom is surrounded by its neighbors in the Delaunay triangulation. The neighbors are near the central atom and are therefore packed in a small amount of space, implying there can only be a small constant number of them. It follows that the number of edges in the Delaunay triangulation is at most some constant times $n$. Similarly, also the number of triangles and tetrahedra are at most some constant times $n$.
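The Euler relation used in the two-dimensional count is easy to check on a small example. The sketch below assumes a triangulation given simply as a list of vertex-index triples; the helper name `count_simplices` is hypothetical.

```python
from itertools import combinations

def count_simplices(triangles):
    """Return (v, e, t) for a triangulation given as vertex-index triples."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(pair) for tri in triangles
             for pair in combinations(tri, 2)}
    return len(vertices), len(edges), len(triangles)

# A square cut into two triangles along a diagonal.
v, e, t = count_simplices([(0, 1, 2), (0, 2, 3)])
print(v, e, t, v - e + t)
```

For a triangulated disk the alternating sum is 1, and the inequality that every edge belongs to at most two triangles can be checked from the same counts.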

Orthospheres. Let now $x$ be a vertex of the Voronoi diagram of $B$. Assuming the generic case, $x$ has equal power distance from four balls, and larger power distance from all others. Suppose for a moment that the balls all have zero radius. Then each Voronoi vertex is equally far from four points and coincides with the center of the circumsphere of these points. We will use the concept of orthogonality to generalize this property to the case where the $B_i$ have not necessarily zero and not necessarily equal radii. Two spheres or balls $B_i$ and $B_j$ are orthogonal if $\|z_i - z_j\|^2 = r_i^2 + r_j^2$. The name is justified because the two tangent planes defined at any point common to the bounding spheres of $B_i$ and $B_j$ form a right angle between them. Let $S$ be the sphere with center $x$ and weight equal to the common power distance of $x$ from the four balls. That sphere is orthogonal to the four balls, and we refer to it as the orthosphere of the four balls. If the four balls had zero radius, $S$ would be their circumsphere. Algebraically, there is no difficulty at all if the weight of $S$ is negative and its radius is therefore imaginary. Note that $S$ is further than orthogonal from all other balls, that is, the power distance of $x$ from every other ball exceeds the weight of $S$. This property can be used to characterize Delaunay tetrahedra for a generic set of balls. Specifically, a tetrahedron connecting points $z_i$, $z_j$, $z_k$ and $z_l$ belongs to the Delaunay triangulation of $B$ iff the orthosphere of $B_i$, $B_j$, $B_k$ and $B_l$ is further than orthogonal from all other balls in $B$.

Acyclicity. Given a fixed viewpoint, we can order two tetrahedra if one lies in front of the other one, as seen from the viewpoint. We call this the visibility ordering with respect to the given viewpoint. It turns out that this relation can in general have cycles but is acyclic for Delaunay triangulations. We need some notation. Let $x$ be the viewpoint and write $\sigma \prec \tau$ if there is a half-line that emanates from $x$ and passes through the interior of the Delaunay tetrahedron $\sigma$ before it passes through the interior of the Delaunay tetrahedron $\tau$. We use orthospheres to prove that the relation is acyclic.

ACYCLICITY LEMMA. The visibility ordering of the Delaunay tetrahedra with respect to any fixed viewpoint is acyclic.

PROOF. Let $\sigma \prec \tau$ and let $L$ be a half-line that emanates from $x$ and passes through the interiors of $\sigma$ and $\tau$. We may assume that $L$ does not intersect any edge of the Delaunay triangulation. The half-line passes through a sequence of Delaunay tetrahedra, $\sigma_1, \sigma_2, \ldots, \sigma_k$, and we have $\sigma = \sigma_i$ and $\tau = \sigma_j$ for some $i < j$. Any two consecutive tetrahedra share a triangle. It follows that the orthospheres of $\sigma_i$ and of $\sigma_{i+1}$ are orthogonal to the three balls whose centers span that triangle. The plane of points with equal power distance from the two orthospheres thus contains the shared triangle. The viewpoint is on $\sigma_i$'s side of that plane, which implies that the power distance of $x$ from the orthosphere of $\sigma_i$ is less than that from the orthosphere of $\sigma_{i+1}$. By transitivity, the power distance of $x$ from the orthosphere of $\sigma$ is less than its power distance from the orthosphere of $\tau$, whenever $\sigma \prec \tau$. In other words, the power distance increases along chains of the relation. Since real numbers are totally ordered, we conclude that $\prec$ is acyclic.

Bibliographic notes. Power diagrams of discrete sets of weighted points have been studied by Carl Friedrich Gauss more than 150 years ago in the context of quadratic forms [6]. In reference to subsequent work by Dirichlet [3] and Voronoi [8], these diagrams are often referred to as Dirichlet tessellations or Voronoi diagrams. The dual triangulations have been introduced considerably later by Boris Delaunay (also Delone) [2]. It is common to reserve the name Delaunay triangulation for unweighted points and to refer to the duals of power diagrams as regular triangulations [1] or coherent triangulations [7]. We prefer to be economical with terms and refer to them as (weighted) Delaunay triangulations. Algorithms for constructing weighted Delaunay triangulations in $\mathbb{R}^2$ and $\mathbb{R}^3$ are discussed in [4, Chapters I and V]. That reference also explains how to computationally cope with ambiguities in the construction caused by non-generic input sets. Upper bounds on the number of Delaunay simplices for "well-spaced" points in $\mathbb{R}^3$ can be found in [5].

[1] L. J. Billera and B. Sturmfels. Fiber polytopes. Ann. Math. 135 (1992), 527–549.
[2] B. Delaunay. Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskii i Estestvennyka Nauk 7 (1934), 793–800.
[3] P. G. L. Dirichlet. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew. Math. 40 (1850), 209–227.
[4] H. Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.

[5] J. Erickson. Dense point sets have sparse Delaunay triangulations. In "Proc. 13th Ann. ACM-SIAM Sympos. Discrete Alg., 2002", 125–134.
[6] C. F. Gauss. Recursion der Untersuchungen über die Eigenschaften der positiven ternären quadratischen Formen von Ludwig August Seeber. J. Reine Angew. Math. 20 (1840), 312–320.
[7] I. M. Gelfand, M. M. Kapranov and A. V. Zelevinsky. Discriminants, Resultants and Multidimensional Determinants. Birkhäuser, Boston, 1994.
[8] G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine Angew. Math. 133 (1907), 97–178, and 134 (1908), 198–287.
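The orthosphere of four balls in Section II.2 can be computed by solving a small linear system: subtracting the condition of equal power distance for the first ball from the other three leaves three linear equations in the center. The following is a minimal sketch under assumed conventions (balls as `(center, radius)` pairs with integer or rational data, Cramer's rule, hypothetical function names); it is not the method of any particular software.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def orthosphere(balls):
    """Center and weight (square radius) of the sphere orthogonal to four balls."""
    z0, r0 = balls[0]
    rows, rhs = [], []
    for z, r in balls[1:]:
        # 2 <x, z - z0> = |z|^2 - r^2 - |z0|^2 + r0^2
        rows.append([Fraction(2 * (a - b)) for a, b in zip(z, z0)])
        rhs.append(Fraction(sum(a * a for a in z) - r * r
                            - sum(b * b for b in z0) + r0 * r0))
    d = det3(rows)
    if d == 0:
        raise ValueError("ball centers are not in general position")
    center = []
    for k in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][k] = rhs[i]          # Cramer's rule: replace column k
        center.append(det3(m) / d)
    weight = sum((c - b) ** 2 for c, b in zip(center, z0)) - r0 * r0
    return tuple(center), weight

def further_than_orthogonal(center, weight, ball):
    """True if the power distance of the center from the ball exceeds the weight."""
    z, r = ball
    return sum((c - a) ** 2 for c, a in zip(center, z)) - r * r > weight

four = [((0, 0, 0), 1), ((2, 0, 0), 1), ((0, 2, 0), 1), ((0, 0, 2), 1)]
center, weight = orthosphere(four)
print(center, weight)
```

A negative `weight` corresponds to the imaginary radius mentioned in the text and needs no special treatment, since only the square radius is ever used.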

II.3 Alpha Shapes

Recall that the Delaunay triangulation is the dual of the Voronoi diagram. In this section, we generalize this construction and consider the dual of the Voronoi diagram restricted to within the union of the defining balls. In this context, we refer to it as the dual shape of the union.

Dual complex. Observe that the Voronoi cells decompose the union of balls in $B$ into convex cells $B_i \cap V_i$. The dual complex records the non-empty common intersections among these cells. Let $I$ be a subset of the index set. Then

$K = \{ \sigma_I \mid \bigcap_{i \in I} (B_i \cap V_i) \neq \emptyset \},$

where $\sigma_I$ is the convex hull of the centers of the balls with index in $I$. Equivalently, $\sigma_I \in K$ iff the common intersection of Voronoi cells has a non-empty intersection with the union of balls: $\bigcap_{i \in I} V_i \cap \bigcup_i B_i \neq \emptyset$. Note that this is just a more formal way of explaining the duality transformation we used in the last section to construct the Delaunay triangulation from the Voronoi diagram. The underlying space is the set of points contained in simplices of $K$. Figure II.9 illustrates the definition for the set of disks used in many of the previous figures. In the special case, in which the balls have non-empty pairwise but no non-empty triple-wise intersections, the dual complex looks like the ball-and-stick diagram common in chemistry and biology. There, each stick represents a covalent bond, while here, it represents the geometric overlap between two balls.

Figure II.9: The dual complex is drawn on top of the Voronoi decomposition of the union of disks. The nine edges correspond to the pairwise intersections and the two triangles to the triplewise intersections of the clipped Voronoi cells.

Independence. Recall that a simplex belongs to the dual complex iff the corresponding clipped balls (the $B_i \cap V_i$) have a non-empty common intersection. This condition has an interesting consequence on how the $B_i$ themselves may intersect. In a nut-shell, there can be at most four balls (one more than the dimension of the space), and they can form only one combinatorially distinct intersection pattern. We first discuss this pattern for general sets that are not necessarily balls. Call a collection $X$ of sets independent if for every subcollection $X' \subseteq X$ there is a point inside every set in $X'$ and outside every set not in $X'$:

$\bigcap_{A \in X'} A \setminus \bigcup_{A \in X \setminus X'} A \neq \emptyset.$

A collection of size $k$ has $2^k$ subcollections. For this collection to be independent, there must be $2^k$ points whose patterns of inclusion in the sets are pairwise different. We use the pigeonhole principle to show that the maximum number of independent disks in the plane is three. Let $r_k$ be the maximum number of regions we can get by drawing $k$ circles in the plane. We have $r_1 = 2$ and $r_{k+1} \le r_k + 2k$ because the $(k+1)$-st circle intersects the other $k$ circles in at most two points each. These points cut the $(k+1)$-st circle into at most $2k$ arcs, and each arc cuts at most one region into two. The number of regions is therefore $r_k \le k^2 - k + 2$. Hence, $r_4 \le 14 < 16 = 2^4$, which implies that at most three disks can be independent. For each $k \le 3$ there is a (combinatorially) unique independent configuration shown in Figure II.10. The same argument also works in three dimensions, where it can be used to show that the maximum number of independent balls is four. Again, there is only one possible intersection pattern for four independent balls.

Figure II.10: The independent configurations of one, two, and three disks in the plane.
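The pigeonhole argument for independent disks can be replayed numerically. The sketch below only evaluates the recurrence stated in the text (each new circle adds at most two arcs per existing circle, and each arc splits at most one region); the function name `max_regions` is an assumption of this illustration.

```python
def max_regions(k):
    """Upper bound on the number of regions formed by k circles in the plane."""
    regions = 2                      # one circle: inside and outside
    for existing in range(1, k):
        regions += 2 * existing      # the next circle is cut into <= 2*existing arcs
    return regions

for k in range(1, 6):
    # Independence of k disks needs 2**k distinct inclusion patterns.
    print(k, max_regions(k), 2 ** k, max_regions(k) >= 2 ** k)
```

The bound meets the required $2^k$ patterns for $k \le 3$ and falls short from $k = 4$ on, which matches the claim that at most three disks can be independent.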

Independent simplices. Recall that each simplex in the Delaunay triangulation is spanned by the centers of a small collection of balls: two for an edge, three for a triangle, four for a tetrahedron, and so on. In discussions of combinatorial properties, we sometimes forget the difference and think of the simplex as this collection of balls. In this spirit, we call the simplex independent if the collection of balls is independent. We will prove shortly that all simplices in the dual complex are independent. This is a fairly strong statement since it limits the balls to a single intersection pattern. The following lemma is the key to proving that all simplices in the dual complex are independent.

INDEPENDENCE LEMMA. A collection of four balls in $\mathbb{R}^3$ is independent iff the (unique) vertex $x$ of the corresponding Voronoi diagram is contained in the union: $x \in \bigcup_i B_i$.

The lemma holds in any dimension, and can be proved by induction over the dimension. To avoid the complications of a discussion for general dimensions, we assume the lemma for disks (or rather for caps on a sphere) and prove it for balls in $\mathbb{R}^3$.

PROOF. Assume first that $x$ lies outside the union, and consider one of the four balls, for example $B_0$. The sphere bounding $B_0$ intersects the other balls in three caps. The circles bounding these caps lie in the three planes bounding the Voronoi cell of $B_0$. A particular such configuration is illustrated in Figure II.11. The three planes meet at $x$, and because $x$ lies outside the sphere, the three caps are not independent. So there exists a subset not represented by any point on the sphere. It can still be that there is a point outside [...] contained in [...], but then [...]. In other words, the collection of four balls is not independent.

To prove the reverse, we assume that the collection is not independent. Then there is a ball, for example $B_0$, whose bounding sphere intersects the other three balls in three non-independent caps. But this implies that the Voronoi vertex lies outside the sphere, and therefore outside the union, as claimed.

Figure II.11: The planes bounding the Voronoi cell intersect the sphere in three circles.

Given three balls, we get three disks of maximum size by intersecting them with the plane that passes through the centers. This plane intersects the Voronoi diagram of the balls in the Voronoi diagram of the disks. As mentioned above, the Independence Lemma also holds for three disks in the plane. But this implies that three balls are independent iff the (unique) line in the corresponding Voronoi diagram has a non-empty intersection with the union of the three balls. Similarly, two balls are independent iff the (unique) plane in the corresponding Voronoi diagram has a non-empty intersection with their union. But this is exactly the criterion for a simplex to belong to the dual complex. It follows that each simplex in $K$ is independent.

Filtration. We return to the idea of growing the balls continuously and watch how the union changes. We need some notation. We let time go from $-\infty$ to $\infty$ and grow the weight of each ball to $r_i^2 + t$ at time $t$. Instead of time, we use the square root, $\alpha = \sqrt{t}$, as the index for time varying sets. The main reason for this convention is that for $r_i = 0$, the radius of the ball at time $t$ is $\alpha$. Each ball has zero weight at time $t = -r_i^2$ and negative weight and therefore imaginary radius before that time. Let $B_\alpha$ be the collection of balls and $K_\alpha$ the dual complex of $B_\alpha$ at time $t = \alpha^2$. We refer to $K_\alpha$ as the $\alpha$-complex and to its underlying space as the $\alpha$-shape of $B$. By construction, the Voronoi cells of the balls are unchanged at all times. It follows that the dual complexes that arise throughout time are subcomplexes of one and the same Delaunay triangulation. Furthermore, since the portions of the Voronoi cells covered by the balls can only grow, the dual complexes can also only get larger in time. For small enough (large enough negative) time, all radii are imaginary, and the dual complex is empty. For large enough time, the union covers all Voronoi vertices, and the dual complex is equal to the Delaunay triangulation. There are only finitely many simplices and therefore only finitely many subcomplexes of the Delaunay triangulation that arise as dual complexes during the growth process. We thus have a sequence of complexes that begins with the empty complex and ends with the Delaunay triangulation:

$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m.$

We refer to this sequence as a filtration of the Delaunay triangulation. Figure II.12 illustrates the construction by showing three complexes in the filtration generated by eight disks in the plane. To translate between continuous time and discrete rank, we define a function that maps each $\alpha$ to the rank $i$ such that $K_\alpha = K_i$.

Figure II.12: Three unions of disks and the corresponding dual complexes. The first complex contains all vertices but only two edges and no triangles. From the first to the third complex, the edges become thinner and the triangles become lighter.

Ordering simplices. We can sort the Delaunay simplices in the order in which they enter the dual complex. Define the birth-time of a simplex $\sigma$ as the minimum time $t$ such that $\sigma \in K_\alpha$ for all $\alpha^2 \ge t$. The difference between two contiguous complexes in the filtration consists of all simplices whose birth-time coincides with the creation of the second complex. Often two contiguous complexes $K_i$ and $K_{i+1}$ differ by only one simplex, $\sigma$. In this case, the birth-time of $\sigma$ coincides with the time it becomes independent. Geometrically, this case is characterized by a non-empty common intersection between the affine hull of $\sigma$ and the Voronoi cells of its vertices. Let the orthosphere of $\sigma$ be the smallest sphere orthogonal to all balls whose centers are vertices of $\sigma$. The time $\sigma$ becomes independent is also the time the orthosphere of $\sigma$ dies or shrinks to a point.

Sometimes, however, the difference between $K_i$ and $K_{i+1}$ consists of two or more simplices. All these simplices are born at the same time. In the absence of any degeneracy, all these simplices are faces of a single simplex, $\sigma$, that also belongs to the difference. In the generic case, their orthospheres die at different times, with the orthosphere of $\sigma$ dying last at the common birth-time. Figure II.13 illustrates this case. The triangle connecting all three centers and the edge connecting the centers of the two larger disks are born at the same time, namely when all three disks reach the shared Voronoi vertex. This is also the time when the three disks become independent, but the pair of larger disks became independent earlier.

Figure II.13: The two larger disks are independent, but the dual edge does not belong to the dual complex because their common intersection is disjoint from the corresponding Voronoi edge.

We represent the filtration by sorting the Delaunay simplices by birth-time, and in case of a tie by dimension. Remaining ties are broken arbitrarily. Every dual complex is a prefix of this ordering, and because of the tie breaking rule, every prefix is a complex, even if it does not coincide with a dual complex. This property of the ordering will be crucial for the algorithm in Chapter IV that computes the connectivity.

Bibliographic notes. Alpha shapes and alpha complexes have been introduced by Edelsbrunner, Kirkpatrick and Seidel [3] in 1983 for finite sets of points in the plane. About a decade later, the concept has been generalized to three dimensions and made available as a software package with graphical user interface [4]. The unexpected popularity of that software in structural biology triggered the development of further geometric concepts useful in structural biology, some of which are explained in this book. The main reason for the popularity is the duality between space-filling diagrams and alpha shapes as explained in this and the two preceding sections. To fully develop that duality, alpha shapes had to be extended to take into account weights, and this has been described in complete generality in [2]. That generalization benefitted from adopting the language of simplicial complexes, which has been developed decades earlier in the area of combinatorial topology [1, 5].

[1] P. S. Alexandrov. Combinatorial Topology. Dover, New York, 1998 (republication of translation of the original Russian edition from 1947).
[2] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.

[3] H. Edelsbrunner, D. G. Kirkpatrick and R. Seidel. On the shape of a set of points in the plane. IEEE Trans. Inform. Theory IT-29 (1983), 551–559.
[4] H. Edelsbrunner and E. P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] P. J. Giblin. Graphs, Surfaces and Homology. Second edition, Chapman and Hall, London, 1981.
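The ordering of simplices by birth-time from Section II.3, with ties broken by dimension, can be sketched as follows. The input format (a list of `(vertices, birth_time)` pairs) and both function names are assumptions made for this illustration; the birth-times in the example are invented and chosen to mimic the situation of Figure II.13, where an edge and a triangle are born together.

```python
from itertools import combinations

def filtration_order(simplices):
    """Sort (vertices, birth_time) pairs by birth-time, then by dimension."""
    return sorted(simplices, key=lambda s: (s[1], len(s[0])))

def every_prefix_is_complex(ordered):
    """Check that each simplex appears only after all of its proper faces."""
    seen = set()
    for vertices, _ in ordered:
        for k in range(1, len(vertices)):
            for face in combinations(vertices, k):
                if frozenset(face) not in seen:
                    return False
        seen.add(frozenset(vertices))
    return True

simplices = [
    (("a", "b", "c"), 2.0),                      # triangle, tied with edge a-c
    (("a", "c"), 2.0), (("a", "b"), 1.0), (("b", "c"), 1.5),
    (("a",), -1.0), (("b",), -0.5), (("c",), -0.25),
]
order = filtration_order(simplices)
print([vertices for vertices, _ in order])
print(every_prefix_is_complex(order))
```

Sorting by the pair (birth-time, number of vertices) places every face before the simplices that contain it whenever their birth-times tie, which is what makes each prefix a complex.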

II.4 Alpha Shape Software

This section introduces the basic Alpha Shape software and explains how to go from a standard description of protein structures to the visualization of their alpha shapes. The discussion is more descriptive and less analytical than in the previous three sections. Specifically, we take four steps to construct and visualize alpha shapes in an interactive graphical user interface:

> pdb2alf name.pdb name
> delcx name
> mkalf name
> alvis name

The details of the discussion apply to Version 4.1 of the Alpha Shape software executed on an SGI workstation running under the UNIX operating system and may differ for other versions and platforms.

Data format. The main public source for structural protein data is the Protein Data Bank (pdb) mentioned in Chapter I. Only a fraction of the information is needed to construct alpha shapes. Specifically, for each atom we only need its coordinates in three-dimensional space and its radius. The coordinates are explicitly given in the file, but the radius must be inferred from the atom type. This is done according to published translation tables that map atoms to van der Waals radii. Unfortunately, there is no universally agreed upon table. Some differences are due to different methods used to derive radii, including measurements of closest approach, molecular mechanics calculations, etc. One of the most problematic elements is hydrogen (H), which accounts for almost 50% of the number of atoms found in organic matter. Hydrogen atoms sometimes donate their electrons to complete the shells of other atoms and thus can exist without any shell and radius to speak of. Hydrogen atoms are generally not represented in pdb-files, but can be inferred to some accuracy from the types and relative positions of the other atoms in the protein. In the common unified atom model, the van der Waals radii of larger atoms are adjusted to include the bonded hydrogen atoms.

We can extract the coordinates and the radii using software that is part of the Alpha Shapes distribution. Given a pdb-file, name.pdb, we call

> pdb2alf -r 1.4 name.pdb name

to read name.pdb and create a new file name that contains a line for each atom listing its three coordinates and the van der Waals radius. The -r option allows for the specification of a radius increment that is applied to every atom in the file. In our example, this radius increment is 1.4 Å, which is the most common approximation used for the size of water molecules. The resulting set of balls thus defines the solvent accessible diagram representing the interaction with the surrounding water.

Delaunay triangulation. The first step towards computing alpha shapes is to construct the Delaunay triangulation of the set of balls. This is accomplished by the command

> delcx name

The Delaunay complex program creates a file name.dt that represents the Delaunay triangulation. We briefly mention the algorithmic ingredients used. The basic strategy is incremental, adding one ball at a time to the triangulation. Using an arbitrary ordering of the $n$ balls, we write $B_i$ for the set of the first $i$ balls and $D_i$ for the Delaunay triangulation of $B_i$. With this notation, the algorithm can be written as follows.

  for $i = 1$ to $n$ do
    INSERT the $i$-th ball into $D_{i-1}$ to obtain $D_i$
  endfor.

The $i$-th ball is inserted through a sequence of flip operations, see the earlier sections of this chapter. The flips are performed depending on the outcomes of only two types of primitive tests needed in the construction of the Delaunay triangulation:

ORTHOGONALITY: decide whether a ball is closer or further than orthogonal to the orthosphere of four other balls.
ORIENTATION: decide whether a ball center is on the positive or negative side of the oriented plane spanned by three other ball centers.

Both tests reduce to the sign of the determinant of a small matrix and can be decided without computing intermediate geometric information. The efficient and robust construction of the Delaunay triangulation in $\mathbb{R}^3$ is not entirely straightforward. The operations are ambiguous if the balls are in non-generic position, and so is the Delaunay triangulation. To cope with the related robustness problem, we use exact arithmetic and simulated perturbation. Exact arithmetic guarantees the correct execution of flips in all generic and therefore unambiguous cases, and simulated perturbation reduces ambiguous cases in a consistent manner to unambiguous ones.
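The remark that the primitive tests reduce to the sign of a small determinant can be illustrated with a minimal sketch of the ORIENTATION test in exact rational arithmetic. This is not the code of the Alpha Shape software. The function names `det3` and `orientation` are hypothetical, the sign convention (which side counts as positive) is an assumption, and degenerate inputs simply return 0 rather than being resolved by simulated perturbation.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def orientation(p, q, r, s):
    """Sign (+1, -1, or 0) of det[p - s; q - s; r - s].

    The sign tells on which side of the oriented plane through p, q, r
    the point s lies; 0 means the four points are coplanar (a non-generic
    case). Which side is called positive is an assumed convention.
    """
    # Convert every coordinate to an exact rational number so that the
    # sign of the determinant is never affected by round-off.
    p, q, r, s = ([Fraction(x) for x in pt] for pt in (p, q, r, s))
    rows = [[u - v for u, v in zip(pt, s)] for pt in (p, q, r)]
    d = det3(rows)
    return (d > 0) - (d < 0)


print(orientation((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, -1)))  # 1
```

Because all intermediate values are exact, the decision stays correct for coordinates of any magnitude, at the price of slower arithmetic than floating point.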

The use of exact rather than floating-point arithmetic poses a challenge to the efficiency of the code. A common remedy is to use so-called floating-point filters: calculate in floating-point arithmetic, bound the error, and redo the computation in exact arithmetic if the error is too large to guarantee a correct decision.

Another challenge to the efficiency of the code is the inherent size of the Delaunay triangulation. The Delaunay triangulation in $\mathbb{R}^3$ can have a number of simplices that is quadratic in the number of balls, $n$. For example, if the centers of the balls lie on the moment curve and all radii are equal, then every pair of vertices forms an edge in the Delaunay triangulation, as shown in Figure II.14. Fortunately, the balls of organic molecules are usually well packed and have Delaunay triangulations of size at most proportional to $n$. The danger remains that one of the intermediate triangulations is large. Then we spend a lot of time constructing that triangulation, only to destroy most of it before arriving at the final triangulation. This danger is quite real as systematic enumerations of the data tend to generate subconfigurations with relatively large Delaunay triangulations. The remedy here is to add the balls in a random sequence. In other words, we apply a random permutation to the input sequence and construct the Delaunay triangulation following this permutation.

Figure II.14: Edge-skeleton of the Delaunay triangulation of twenty one points on the moment curve in $\mathbb{R}^3$.

Filtration. As mentioned in Section II.3, dual complexes obtained by growing the square radii form a nested sequence of subcomplexes of the Delaunay triangulation. This is the filtration of $\alpha$-complexes. The sequence is generated by calling

> mkalf name

The make alpha shape filtration program reads the Delaunay triangulation in name.dt and generates a new file, name.alf, that stores the filtration along with some auxiliary data structures. We represent the filtration by the sequence of Delaunay simplices ordered by birth-time. The software refers to the sorted sequence of simplices as the 'masterlist'. It stores each simplex $\sigma$ several times, marking when $\sigma$ is born, when $\sigma$ becomes a face of another simplex, and when $\sigma$ becomes interior to the alpha complex. Suppose the three events happen at times $t_1 \le t_2 \le t_3$. Then, for a given value of $\alpha$, the simplex $\sigma$ is

  not in the alpha complex  if $\alpha < t_1$,
  singular                  if $t_1 \le \alpha < t_2$,
  regular                   if $t_2 \le \alpha < t_3$,
  interior                  if $t_3 \le \alpha$.

The combinatorial topology term for being singular is principal and means that $\sigma$ is not a face of any other simplex. The simplex is regular if it belongs to the boundary but is not principal, and it is interior if it is completely surrounded by other simplices. Some of the three events may coincide. For example, a tetrahedron is interior as soon as it is born, so $t_1 = t_2 = t_3$. As explained in Section II.2, a simplex whose orthosphere dies strictly before the simplex is born is never singular, so $t_1 = t_2$. Finally, a simplex in the boundary of the Delaunay triangulation can never become interior, so $t_3 = \infty$.

The main reason for recording all this information is to determine how to draw $\sigma$ in the graphical interface. Given a value of $\alpha$, we need quick access to the simplices of the various types in the alpha complex. For this purpose, we store the existence intervals in a number of interval trees. Each such tree stores some number $m$ of intervals in space O($m$), and for a given moment $\alpha$, it enumerates the $k$ simplices whose intervals contain $\alpha$ in time O($\log m + k$).

Visualization. We finally discuss the visualization interface of the Alpha Shapes software. The necessary support structures are computed and the graphics user interface is opened by executing

> alvis name

The alpha shape visualization program uses both the Delaunay triangulation file, name.dt, and the filtration file, name.alf. Figure II.15 shows four alpha complexes of the relatively small gramicidin protein. In each case, we only show the singular simplices together with the regular triangles. The interface consists of a visualization panel, a signature panel, and a scene panel. All alpha complexes are shown in the first, but which complex is shown and how it is shown is decided in the other two panels.
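The classification of a simplex by its three event times can be written as a short lookup. The sketch below is only an illustration under stated assumptions: the function name `classify` and the argument names are hypothetical, and the choice of which inequalities are strict is an assumed convention rather than something taken from the software.

```python
import math


def classify(alpha, born, attached, interior=math.inf):
    """Type of a simplex in the alpha complex for the given alpha.

    born     -- time at which the simplex enters the filtration
    attached -- time at which it becomes a face of another simplex
    interior -- time at which it becomes interior (infinity for
                simplices on the boundary of the Delaunay triangulation)
    """
    if not (born <= attached <= interior):
        raise ValueError("event times must be ordered")
    if alpha < born:
        return "not in complex"
    if alpha < attached:
        return "singular"
    if alpha < interior:
        return "regular"
    return "interior"


# A tetrahedron is interior as soon as it is born: all three times agree.
print(classify(1.0, 1.0, 1.0, 1.0))  # interior
```

An interval tree answers the same question for all simplices at once by reporting every interval that contains the query value, which avoids scanning the whole masterlist.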

Figure II.15: Four alpha complexes of gramicidin.

The visualized complex is selected in the signature panel. To support that selection, the panel displays a variety of functions (or signatures) that illustrate how the complexes change with time. Instead of mapping the time to a property of the complex, the signatures map the index $i$ to the property of the $i$-th complex in the filtration. For example, the three default signatures map each index to the number of singular edges, the area of the boundary, and the volume of the underlying space of the complex. Figure II.16 shows the signature panel and the three default signatures for gramicidin. All signatures that count rather than measure are displayed in log-scale. To facilitate the reconstruction of the map from time, the panel contains a signature that maps the index to time. Specifically, it shows the log-scale graph of this map. A particular index, $i$, is selected by the position of a vertical bar in the signature panel and by clicking the Alpha Shape button in the scene panel.

Figure II.16: Signature panel of the Alpha Shape visualizer.

The scene panel is shown in Figure II.17. The buttons in the middle of the scene panel provide control over how simplices are drawn: colored, shaded, seamless, in wireframe, or with gaps created through a slow explosion. The matrix on the right hand side can be used to select the types of displayed simplices. By default, only the singular vertices, edges, triangles and the regular triangles are shown. Different settings can be used to highlight different aspects of an alpha complex. For example, the 1-skeleton of the Delaunay triangulation shown in Figure II.14 is obtained by drawing all edges of the last alpha complex while suppressing the display of all triangles and tetrahedra.

Figure II.17: Scene panel of the Alpha Shape visualizer.

Bibliographic notes. The Alpha Shape software was created by Ernst Mücke as part of his doctoral work at Urbana-Champaign. The best documentation of the algorithm and data structures used in the software are still his thesis [6] and the original paper on the topic [4]. After a period of rapid development directed by Ping Fu at the National Center for Supercomputing Applications, the software reached version 4.1 in 1996, which is still the most recent version distributed on the web [7]. The Delaunay triangulation software in the Alpha Shapes distribution is based on a variety of algorithmic techniques described in a recent text by Edelsbrunner [3]. The interval tree used for fast retrieval of simplices is explained in [2]. As mentioned earlier, the largest resource for structural protein data is the Protein Data Bank [1], which can be accessed via the web [8]. A survey of geometric measurements of proteins including a discussion of different tables for van der Waals radius assignment can be found in [5].

[1] H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov and P. E. Bourne. The Protein Data Bank. Nucleic Acids Res. 28 (2000), 235–242.
[2] H. Edelsbrunner. A new approach to rectangle intersections – part I. Internat. J. Comput. Math. 13 (1983), 209–219.
[3] H. Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.
[4] H. Edelsbrunner and E. P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] M. Gerstein and F. M. Richards. Protein geometry: distances, areas, and volumes. Chapter 22 in The International Tables for Crystallography, Vol. F, M. G. Rossmann and E. Arnold (eds.), Kluwer, Dordrecht, the Netherlands, 2001, 531–539.
[6] E. P. Mücke. Shapes and Implementations in Three-dimensional Geometry. Rept. UIUCDCS-R-93-1836, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1993.
[7] Alpha Shapes web-site at www.alphashapes.org; see also the software collection in biogeometry.duke.edu.
[8] Protein Data Bank web-site at www.rcsb.org/pdb.

Exercises

1. The filtration of water. A water molecule consists of one oxygen and two hydrogens: H$_2$O.
   (i) Look up the standard geometric model (determined by radii, bond length and bond angle).
   (ii) Describe the Voronoi diagram and the sequence of alpha complexes of the model.

2. Empty Voronoi cell. Call a disk $b$ in a finite collection of disks redundant if its Voronoi cell is empty.
   (i) Prove that if there are disks $b_1$, $b_2$, and $b_3$ in the collection such that (a) [...] for the orthocenter of $b_1$, $b_2$, $b_3$, and (b) [...] lies in the triangle [...], then $b$ is redundant.
   (ii) Prove that the sufficient conditions given in (i) are also necessary. In other words, prove that if $b$ is redundant then there exist disks $b_1$, $b_2$, and $b_3$ in the collection that satisfy Conditions (a) and (b).

3. Independent half-spaces. A half-plane is the set of points on or on one side of a line in $\mathbb{R}^2$. Similarly, a half-space is the set of points on or on one side of a plane in $\mathbb{R}^3$, and a cap is the intersection of a sphere with a half-space. [...] What is the maximum number of independent
   (i) half-planes in $\mathbb{R}^2$,
   (ii) half-spaces in $\mathbb{R}^3$,
   (iii) caps on a sphere in $\mathbb{R}^3$?

4. Barycentric subdivision. The barycentric subdivision of a simplex $\sigma$ is obtained by adding the barycenter of $\sigma$ (also known as the centroid or center of mass) as a new vertex and connecting it to the simplices in the barycentric subdivisions of the faces.
   (i) How many vertices, edges, triangles and tetrahedra are in the barycentric subdivision of a tetrahedron?
   (ii) Use the Alpha Shape software to create the barycentric subdivision of a regular tetrahedron. [You will need to use weights to make the barycentric subdivision of the tetrahedron the Delaunay triangulation of the points.]

5. Number of arcs. Let $B$ be a set of $n$ disks in the plane. The boundary of the union of the disks consists of circular arcs contributed by the circles.
   (i) Assuming the boundary of the union is a single closed curve, use tree-like cyclic sequences to prove that it consists of at most [...] (maximal) circular arcs. Is this bound tight?
   (ii) Prove that in general the number of (maximal) circular arcs in the boundary of the union is at most [...]. Is this bound tight?

6. Sphere arrangements. Let [...] be the maximum number of cells we get by drawing $n$ spheres in $\mathbb{R}^d$.
   (i) Show that [...].
   (ii) Give a formula for [...] that works for all positive [...].
   [You might consider answering question (ii) before question (i).]

7. Tree-like sequences. Given an alphabet of $n$ letters, form a sequence but refrain from placing any letter twice in a row. The sequence is tree-like if there are no two letters that alternate more than twice. In other words, subsequences of the form $\ldots a \ldots b \ldots a \ldots b \ldots$ and $\ldots b \ldots a \ldots b \ldots a \ldots$ are prohibited. Examples of tree-like sequences of four letters are [...] and [...].
   (i) Prove that a tree-like sequence over an alphabet of $n$ letters has length at most [...]. Is this bound tight?
   (ii) Define a tree-like cyclic sequence by prohibiting cyclic subsequences of the form [...]. Prove that a tree-like cyclic sequence over an alphabet of $n$ letters has length at most [...].

8. Binomial coefficients. Let $k \le n$ be two positive integers and recall that the binomial coefficient $\binom{n}{k}$ is the number of ways we can choose $k$ elements from a collection of $n$ elements. Recall also that [...].
   (i) Show that [...].
   (ii) Show that [...].
   [We note that the relation in (ii) neatly generalizes the formula [...]. The generalization is not quite as neat if we sum powers rather than binomial coefficients.]

Chapter III

Surface Meshing

Recall the different types of space-filling diagrams we discussed in Chapter II. The van der Waals and the solvent accessible models are both unions of finitely many balls in three-dimensional space and differ only in the radii. We have also discussed the molecular surface model that is obtained by rolling a sphere about the van der Waals model. Corners and crevices are filled up and the surface consists of spheres connected by blending torus patches and inverted sphere patches. In this chapter, we introduce a model that is similar to the molecular surface. We call this the molecular skin model. Its surface consists of spheres connected by blending hyperboloid patches and inverted sphere patches. The surface is piecewise quadratic and has a number of attractive properties not shared by the other space-filling models. One is the continuity of the normal direction, another the continuity of the maximum principal curvature. Both properties are crucial for the construction of good quality meshes, which may be used to support numerical computations over the surface. Another interesting property is an inside-outside symmetry that implies the existence of locally perfectly complementary molecular skin models. In other words, for each cavity we may construct a molecular skin representation whose boundary matches that of the molecule. The molecular skin also lends itself to represent deformations, and some of the possibilities along these lines will be discussed in Chapter VIII.

This chapter is organized in four sections. In Section III.1, we give the geometric definition of the molecular skin and show how it can be decomposed into quadratic patches. In Section III.2, we discuss various notions of curvature of a surface, and we show that the maximal principal curvature is a continuous map over the molecular skin. In Section III.3, we describe the algorithm that constructs a molecular skin in terms of a triangle mesh. Finally in Section III.4, we present software for constructing molecular skin in two- and three-dimensional space, and we use that software to illustrate some of the properties of these curves and surfaces.

III.1 Molecular Skin
III.2 Curvature
III.3 Adaptive Meshing
III.4 Skin Software
Exercises

III.1 Molecular Skin

Almost everything we will say in this section applies equally well to spheres of any fixed dimension. Even though the case of spheres in $\mathbb{R}^3$ is most relevant for the study of molecules, there is sufficient pedagogical advantage to first talk about circles in $\mathbb{R}^2$.

Circles and paraboloids. Recall that the weighted square distance function of a circle with center $z \in \mathbb{R}^2$ and radius $r$ is the map $\pi: \mathbb{R}^2 \to \mathbb{R}$ defined by $\pi(x) = \|x - z\|^2 - r^2$. As illustrated in Figure III.1, its graph is a paraboloid of revolution in $\mathbb{R}^3$ that intersects $\mathbb{R}^2$ in the circle. In other words, the circle is the zero-set of the weighted square distance function. All paraboloids that arise as weighted square distance functions have the form

  $\pi(x) = \|x\|^2 - 2 \langle x, z \rangle + \|z\|^2 - r^2$.

The three parameters correspond to the three degrees of freedom represented by the center and the radius.

Figure III.1: A circle in $\mathbb{R}^2$ is the zero-set of its weighted square distance function.

Functions from $\mathbb{R}^2$ to $\mathbb{R}$ form a vector space under the usual notions of scaling and addition. We will use only a small part of that vector space, namely the one consisting of functions of the above form. Given a collection of such functions $\pi_i$, we can generate another such function by affine combination, $\pi = \sum_i \gamma_i \pi_i$, where the $\gamma_i$ are real numbers with $\sum_i \gamma_i = 1$. The new function is a convex combination of the $\pi_i$ if all $\gamma_i$ are non-negative. We compute the center and radius of the zero-set of $\pi$. Writing $z_i$ and $r_i$ for the center and radius that belong to $\pi_i$, we have

  $\pi(x) = \sum_i \gamma_i (\|x\|^2 - 2 \langle x, z_i \rangle + \|z_i\|^2 - r_i^2) = \|x\|^2 - 2 \langle x, \sum_i \gamma_i z_i \rangle + \sum_i \gamma_i (\|z_i\|^2 - r_i^2)$.

The center is therefore $z = \sum_i \gamma_i z_i$ and the square radius is $r^2 = \|z\|^2 - \sum_i \gamma_i (\|z_i\|^2 - r_i^2)$. Given a collection of circles, the affine hull is the set of zero-sets of affine combinations of the corresponding weighted square distance functions, and similarly the convex hull is the subset of zero-sets of convex combinations. It is possibly easier to develop an intuition for combining circles than for combining paraboloids.

Pencils. Given two intersecting circles $a$ and $b$, the affine hull consists of all circles that pass through the same two intersection points. Indeed, if $\pi_a(x) = \pi_b(x) = 0$ then $\gamma_a \pi_a(x) + \gamma_b \pi_b(x) = 0$ for all coefficients $\gamma_a$ and $\gamma_b$. The centers of the circles in the affine hull are therefore the points on the line that passes through $z_a$ and $z_b$. We call the resulting family a pencil of circles. If instead of the affine hull we take the convex hull, then we get the subset of circles whose centers are the points on the line segment with endpoints $z_a$ and $z_b$. If $a$ and $b$ are disjoint then the affine hull is again a pencil but this time of pairwise disjoint circles, like the vertical family sketched in Figure III.2.

Figure III.2: Circles sampled from a coaxal system consisting of two orthogonal pencils.

Recall that a circle $a$ is orthogonal to $b$ if $\|z_a - z_b\|^2 = r_a^2 + r_b^2$, or equivalently, if $\pi_a(z_b) - r_b^2 = 0$. If $c$ is orthogonal to $a$ and to $b$ then it is also orthogonal to every circle in the affine hull of $a$ and $b$. To see this elementary fact, let $\pi = \gamma_a \pi_a + \gamma_b \pi_b$ with $\gamma_a + \gamma_b = 1$ and note that

  $\pi(z_c) - r_c^2 = \gamma_a (\pi_a(z_c) - r_c^2) + \gamma_b (\pi_b(z_c) - r_c^2)$,

which is a combination of two terms that are both zero and thus vanishes as required.
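The formulas for the center and the square radius of an affine combination of circles can be checked with a small sketch. The function name `affine_combination` and the representation of a circle as a pair (center, square radius) are choices made here for illustration only. Exact weights (integers or `Fraction`) are assumed so that the test for weights summing to one is meaningful.

```python
from fractions import Fraction


def affine_combination(circles, weights):
    """Circle whose weighted square distance function is sum_i w_i * pi_i.

    circles -- list of ((x, y), r2), with r2 the square radius
    weights -- exact numbers (int or Fraction) that sum to 1
    Returns ((x, y), r2) of the combined circle; r2 may be negative,
    in which case the circle has an imaginary radius.
    """
    if sum(weights) != 1:
        raise ValueError("weights of an affine combination must sum to 1")
    zx = sum(w * x for w, ((x, y), r2) in zip(weights, circles))
    zy = sum(w * y for w, ((x, y), r2) in zip(weights, circles))
    # Constant term of the combined function: sum_i w_i (|z_i|^2 - r_i^2).
    const = sum(w * (x * x + y * y - r2)
                for w, ((x, y), r2) in zip(weights, circles))
    return (zx, zy), zx * zx + zy * zy - const


# Two circles through the points (0, 1) and (0, -1).
a = ((-1, 0), 2)
b = ((1, 0), 2)
half = Fraction(1, 2)
print(affine_combination([a, b], [half, half]))
```

With weights 1/2 and 1/2 the result is the unit circle centered at the origin, which again passes through (0, 1) and (0, -1), in agreement with the description of a pencil.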

Suppose we are now given two circles $a$ and $b$ and two more circles $c$ and $d$ both orthogonal to $a$ and $b$. Then every circle in the affine hull of $a$ and $b$ is orthogonal to both $c$ and $d$ and thus to every circle in the affine hull of $c$ and $d$. In other words, we have two pencils in which each circle in the first pencil is orthogonal to each circle in the second pencil. Such a configuration is illustrated in Figure III.2 and is referred to as a coaxal system.

Skin and body. The convex hull of two circles is an infinite family of circles, but the union of their disks is just the union of the two original disks. We introduce a shrinking operation that reduces small circles less than big ones and this way generates a smooth envelope. Specifically, for a circle $a$ with center $z_a$ and radius $r_a$ we define $\sqrt{a}$ as the circle with the same center and radius $r_a / \sqrt{2}$. Similarly, for a family of circles $A$ we define $\sqrt{A} = \{ \sqrt{a} \mid a \in A \}$. An example can be seen in Figure III.3, which sketches a shrunken pencil of circles.

Figure III.3: The dotted circles belong to the affine hull and the solid circles are reduced.

Formally, the skin of the collection of circles is the envelope of the reduced circles in the convex hull. The body is the union of disks bounded by circles in the reduced convex hull. It is the region in $\mathbb{R}^2$ bounded by the skin. In other words, the skin is the boundary of the body. The smallest non-trivial example is the skin of two circles. If these circles intersect in two points then the skin is a dumbbell, as shown in Figure III.5. It consists of two circles connected by a blending hyperbola arc. More general curves than just hyperbolas can be constructed by taking the convex hull of a finite collection of circles, then shrinking every circle in the family, and finally taking the envelope.

Figure III.5: The skin of two intersecting circles is the envelope of a reduced line segment of circles.

Envelopes. We are interested in the envelope of a shrunken pencil. Suppose $A$ is a pencil and all its circles pass through the points $(0, 1)$ and $(0, -1)$. We parametrize by the $x_1$-coordinate of the circle centers. The circle with center $(t, 0)$ has square radius $t^2 + 1$, and the corresponding reduced radius is $\sqrt{(t^2 + 1)/2}$. The reduced circle with center $(t, 0)$ is the zero-set of

  $f(x_1, x_2, t) = (x_1 - t)^2 + x_2^2 - (t^2 + 1)/2$

for fixed value of $t$. The same parametrization of the family of reduced circles gives a function of three variables, whose zero-set is a surface in $\mathbb{R}^3$. The collection of all reduced circles is the projection of the entire zero-set. It can be visualized as a leaning hour-glass of circles, as in Figure III.4. The envelope of $\sqrt{A}$ is the projection of the silhouette of the zero-set as viewed along the $t$ direction. It is the set of points for which $\partial f / \partial t = -2 x_1 + t$ vanishes. From $\partial f / \partial t = 0$ we get $t = 2 x_1$. The envelope is therefore the zero-set of $f(x_1, x_2, 2 x_1) = x_2^2 - x_1^2 - 1/2$, which is a hyperbola.

Figure III.4: Sections of the zero-set of $f$, viewed from the positive $t$ direction.

Orthogonality and complementarity. The skin of three circles is already more difficult to understand, at least directly. We thus take an indirect approach and first study what happens when orthogonal circles shrink. Let $a$ and $b$ be two orthogonal circles. We thus have

  $\|z_a - z_b\|^2 = r_a^2 + r_b^2 \ge (r_a + r_b)^2 / 2 = (r_a/\sqrt{2} + r_b/\sqrt{2})^2$.

Taking roots left and right implies that the radii of $\sqrt{a}$ and $\sqrt{b}$ add up to at most the distance between the two centers. Furthermore, we have equality iff $r_a = r_b$. In other words, the reduced versions of any two orthogonal circles touch if they are of the same size and they are disjoint in all other cases.
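The claim about reduced orthogonal circles is easy to check numerically. The following sketch assumes a circle is given as a triple (x, y, r). The helper names `is_orthogonal`, `shrink`, and `gap` are hypothetical and used here only for illustration, and floating-point tolerances stand in for exact arithmetic.

```python
import math


def is_orthogonal(a, b, tol=1e-9):
    """True if the square distance of the centers equals r_a^2 + r_b^2."""
    (ax, ay, ar), (bx, by, br) = a, b
    d2 = (ax - bx) ** 2 + (ay - by) ** 2
    return math.isclose(d2, ar ** 2 + br ** 2, abs_tol=tol)


def shrink(c):
    """Reduced circle: same center, radius divided by sqrt(2)."""
    x, y, r = c
    return (x, y, r / math.sqrt(2))


def gap(a, b):
    """Distance between centers minus sum of radii (0 means touching)."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) - (a[2] + b[2])


a = (0.0, 0.0, 1.0)
b = (math.sqrt(2), 0.0, 1.0)   # orthogonal to a, same size
c = (math.sqrt(5), 0.0, 2.0)   # orthogonal to a, different size

print(is_orthogonal(a, b), gap(shrink(a), shrink(b)))
print(is_orthogonal(a, c), gap(shrink(a), shrink(c)))
```

The first gap is zero up to round-off (the reduced circles touch) and the second is positive (the reduced circles are disjoint), matching the two cases in the text.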

We apply this result to the coaxal system consisting of orthogonal pencils $A$ and $B$, where $A$ is the affine hull of two intersecting circles. As shown earlier, the envelope of $\sqrt{A}$ is a hyperbola. We claim that the envelope of $\sqrt{B}$ is the exact same hyperbola. To see this, we first note that a circle in $\sqrt{B}$ can at most touch the hyperbola, for if it crossed, we would have two crossing reduced circles contradicting the orthogonality of the two corresponding original circles. Furthermore, every circle in $\sqrt{B}$ for which there is an equally large circle in $\sqrt{A}$ touches the hyperbola because it touches that circle. The two envelopes are therefore the same hyperbola. As shown in Figure III.6, the two asymptotic lines of the hyperbola intersect at a right angle. The smallest separating circle that touches both branches belongs to $\sqrt{A}$ and has the same size as the two osculating circles that both belong to $\sqrt{B}$. These circles touch the hyperbola and have the same curvature as the hyperbola at that point.

Figure III.6: Hyperbola with orthogonal asymptotic lines, smallest separating circle, and two osculating circles.

The complementarity of the bodies extends from the case of two orthogonal pencils to the case in which $A$ consists of a single circle and $B$ contains all circles orthogonal to that circle. The skin of $A$ is trivially a circle. The set $B$ is a two-parameter family spanned by three circles, and the same reasoning as for pencils implies that the skin of $B$ is the same circle.

Decomposition. The skin of any finite set of circles can be decomposed into simple pieces. A single circle defines a (smaller) circle, a pair of circles defines a hyperbola, and a triplet of circles defines an inverted circle. We thus claim that the skin of a finite set $A$ consists of circles, connected to each other by blending hyperbola and inverted circle arcs, each defined by at most three of the circles. We will not prove this claim and instead give an explicit construction of the decomposition, which is facilitated by a complex assembled from Voronoi and Delaunay polyhedra. As usual, we let $X \subseteq A$ be an index set and use it to denote the Voronoi polyhedron $\nu_X$. The corresponding Delaunay simplex is $\delta_X$. The corresponding mixed cell is the Minkowski sum of shrunken copies of both, $\mu_X = (\nu_X + \delta_X)/2$. The mixed complex consists of all mixed cells and their faces. If $|X| = 1$, the mixed cell is the shrunken and translated copy of a two-dimensional Voronoi cell. If $|X| = 2$ then $\mu_X$ is the Minkowski sum of two orthogonal edges and therefore a rectangle. If $|X| = 3$ then $\mu_X$ is a shrunken and translated copy of a Delaunay triangle. Figure III.7 illustrates the construction by showing the mixed complex decomposing the skin into circle and hyperbola arcs.

Figure III.7: The mixed complex and the skin of four circles.

A rather intuitive explanation of the construction can be obtained by drawing the Voronoi diagram and the Delaunay triangulation on two parallel planes in $\mathbb{R}^3$, as sketched in Figure III.8. We decompose the slab between the two planes into pyramids and tetrahedra, which are the convex hulls of corresponding Voronoi polyhedra and Delaunay simplices. The mixed complex is then obtained by intersecting the pyramids and tetrahedra with the plane parallel to and halfway between the other two planes.

Symmetry. Note that the construction of the mixed complex is symmetric in the Voronoi diagram and the Delaunay triangulation. [The order of the chapters on skin and pockets has changed now, which requires a local rewrite here and elsewhere in Chapter III.] As explained there, the mixed complex of $A$ is the same as the mixed complex of the collection of circles $B$ introduced in Chapter V. Suppose $A$ contains only circles with real radii. Then $B$ contains a circle centered at each Voronoi vertex (including those at infinity) with the radius chosen so that it is orthogonal to the circles that define the vertex.

The Voronoi diagram of $A$ is then the Delaunay triangulation of $B$, the Delaunay triangulation of $A$ is the Voronoi diagram of $B$, and the mixed complexes of $A$ and $B$ are the same. We have seen that the skins of two orthogonal pencils are the same hyperbola. Similarly, the skins of one circle and the affine hull of three orthogonal circles are the same circle. Since the mixed complex decomposes the entire skin of $A$ into such cases, it follows that the skin of $A$ is the same as that of $B$. Note however that the two bodies are not the same but rather complementary.

Figure III.8: The top, middle, and bottom planes carry the Delaunay triangulation, the mixed complex, and the Voronoi diagram.

Bibliographic notes. The material of this section is taken from [3], where skin surfaces are introduced as orientable manifolds in $\mathbb{R}^3$. That paper also proves that the body of a finite collection of spheres has the same homotopy type as the dual complex. There is another interpretation of the vector space of circles exploited in this section. It has been discovered in the nineteenth century and published at more or less the same time in three different languages by Clifford [1], Darboux [2], and Frobenius [4]. It identifies each circle in $\mathbb{R}^2$, with center $z$ and radius $r$, with the point $(z, \|z\|^2 - r^2)$ in $\mathbb{R}^3$. Under this interpretation, the convex hull of a set of circles corresponds to the usual convex hull of points in $\mathbb{R}^3$, and the symmetry between $A$ and $B$ can be explained as a polarity between two convex polyhedra. This interpretation is prominently used in the geometry text by Pedoe [5].

[1] W. K. Clifford. Problem 1748. Mathematical Questions and Solutions from the Educational Times 44 (1865), 144.
[2] M. G. Darboux. De points, de cercles et de sphères. Annales de L'École Normale, Series 2 (1872), 323–392.
[3] H. Edelsbrunner. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] G. Frobenius. Anwendungen der Determinantentheorie auf die Geometrie des Masses. J. Reine Angew. Math. 79 (1875), 185–247.
[5] D. Pedoe. Geometry: a Comprehensive Course. Dover, New York, 1988.

III.2 Curvature

The skin curves introduced in Section III.1 generalize straightforwardly to surfaces in $\mathbb{R}^3$. In this section, we study the curvature of these surfaces. The Curvature Variation Lemma proved at the end of this section will play a major role in the meshing algorithm to be discussed in Section III.3.

Curves. A closed space curve is a map of a circle to three-dimensional space, $\gamma: \mathbb{S}^1 \to \mathbb{R}^3$. Note that a curve has a parametrization and the counter-clockwise orientation of the circle gives a sense of direction. It is smooth if the derivatives of all orders exist. Usually we need only a small number of derivatives, and the assumption of the existence of infinitely many is convenient but not necessary. The velocity vector at the point $\gamma(s)$ is $\dot\gamma(s)$ and the speed is the length of that vector, $\|\dot\gamma(s)\|$. The tangent vector is the normalized velocity vector, $T(s) = \dot\gamma(s) / \|\dot\gamma(s)\|$, which is defined as long as the speed is non-zero. We can think of $T$ as the Gauss map from $\mathbb{S}^1$ to $\mathbb{S}^2$, as illustrated in Figure III.9.

Figure III.9: A closed space curve to the left and its Gauss map to the right.

It is often convenient to assume unit speed. In this case $T = \dot\gamma$ and the second derivative, $\ddot\gamma$, is normal to the first. The curvature is the length of that second derivative, $\kappa(s) = \|\ddot\gamma(s)\|$. The normal vector is the normalized second derivative, $N(s) = \ddot\gamma(s) / \|\ddot\gamma(s)\|$, which is defined as long as $\kappa(s) \ne 0$. Geometrically, the curvature is one over the radius of the osculating circle at $\gamma(s)$, which is the circle in the plane spanned by the tangent vector and the normal vector.

Surfaces. Let $\mathbb{M}$ be a smooth surface or 2-manifold in $\mathbb{R}^3$. For a point $x \in \mathbb{M}$ we let $U \subseteq \mathbb{M}$ be a neighborhood, $V$ an open set in $\mathbb{R}^2$, and $\varphi: V \to U$ a parametrization. Derivatives are taken along curves on the surface. For each curve in the plane we consider the space curve obtained by composing it with $\varphi$. For example, to compute the tangent plane at $x$, we take the tangent vectors of two curves that cross at $x$. They span the tangent plane, as illustrated in Figure III.10.

Figure III.10: Construction of tangent plane from two tangent vectors.

There are several notions of curvature of a surface, and all are obtained by considering the curvature of curves drawn on the surface. The curvature of a curve $\gamma$ on $\mathbb{M}$ consists of a portion forced by how the surface curves in space and another portion accounting for how $\gamma$ curves within the surface. The second contribution vanishes for geodesics. The curve is a geodesic at $x$ if its normal agrees with the surface normal at $x$, and if it does we call its curvature the normal curvature of $\mathbb{M}$ at $x$ in the direction of the tangent vector. There is a circle of tangent vectors, and for each one we get a normal curvature. The principal curvatures at $x$ are the minimum and maximum normal curvatures, $\kappa_1 \le \kappa_2$. Let $T_1$ and $T_2$ be the corresponding tangent directions. If $\kappa_1 = \kappa_2$ then all normal curvatures are the same and the point is an umbilic point of the surface. By a result of Euler, the principal curvatures determine all other normal curvatures at $x$.

EULER'S THEOREM. If $\kappa_1 \ne \kappa_2$ then the directions $T_1$ and $T_2$ are orthogonal, and if $T_1$ and $T_2$ are orthogonal then the normal curvature in the direction $T = T_1 \cos\theta + T_2 \sin\theta$ is $\kappa_1 \cos^2\theta + \kappa_2 \sin^2\theta$.

This implies that if $\kappa_1 \ne \kappa_2$ then all other normal curvatures are strictly between the two principal curvatures, and the principal directions are therefore unique. Two other common notions of curvature are the mean curvature, $(\kappa_1 + \kappa_2)/2$, and the Gaussian curvature, $\kappa_1 \kappa_2$. In contrast to the other notions, the Gaussian curvature is intrinsic. In other words, it is preserved by isometries, which are transformations that preserve the distance between points measured as lengths of connecting paths. This is a famous result of Gauss.

THEOREMA EGREGIUM. The Gaussian curvature is an isometric invariant.
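Euler's formula for the normal curvature, together with the mean and the Gaussian curvature, can be tried out in a few lines. The function names below are hypothetical, and the sketch assumes the standard form of Euler's theorem, in which the normal curvature at angle theta from the first principal direction is k1 cos^2(theta) + k2 sin^2(theta).

```python
import math


def normal_curvature(k1, k2, theta):
    """Normal curvature at angle theta from the first principal direction."""
    return k1 * math.cos(theta) ** 2 + k2 * math.sin(theta) ** 2


def mean_curvature(k1, k2):
    return (k1 + k2) / 2


def gaussian_curvature(k1, k2):
    return k1 * k2


k1, k2 = 1.0, 3.0
samples = [normal_curvature(k1, k2, i * math.pi / 12) for i in range(1, 6)]
# Every sampled direction strictly between the principal directions
# has a normal curvature strictly between k1 and k2.
print(all(k1 < k < k2 for k in samples))
print(mean_curvature(k1, k2), gaussian_curvature(k1, k2))
```

For a sphere of radius 2 both principal curvatures are 1/2, so every direction gives the same normal curvature and every point is umbilic.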

Skin surfaces. Recall that the skin defined by a finite set of circles in $\mathbb{R}^2$ is the envelope of the infinite family of circles in the convex hull, each reduced by a factor $1/\sqrt{2}$. Furthermore, the mixed complex defined by the circles decomposes the skin into circle and hyperbola arcs. Similarly, the skin of a finite set of spheres in $\mathbb{R}^3$ is the envelope of the infinite family of spheres in the convex hull, each reduced by a factor $1/\sqrt{2}$. The mixed complex that decomposes the surface consists of the four types of cells illustrated in Figure III.11. Each cell corresponds to a set $X$ of one to four spheres. Within each mixed cell, we have a sphere or a hyperboloid patch. The cases are summarized in Table III.1.

Figure III.11: Typical mixed cells. From left to right we have $|X| = 1, 2, 3$ and 4.

Table III.1: The cardinality of $X$ listed in the first column determines the dimensions of the corresponding Voronoi polyhedron and Delaunay simplex as well as the type of the mixed cell and of the skin patch.

  card X   dim Voronoi   dim Delaunay   mixed cell          skin patch
  1        3             0              convex polyhedron   sphere
  2        2             1              polygonal prism     hyperboloid
  3        1             2              triangular prism    hyperboloid
  4        0             3              tetrahedron         sphere

The two sphere cases are symmetric and differ from each other by the surface orientation: in the case $|X| = 1$, the body lies locally inside the sphere, and in the case $|X| = 4$, it lies locally outside the sphere. Similarly, the two hyperboloid cases are symmetric and differ from each other by the surface orientation. In the case $|X| = 2$, the symmetry axis of the hyperboloid is the affine hull of the Delaunay edge and the (orthogonal) symmetry plane is the affine hull of the Voronoi polygon. The hyperboloid can either be one-sheeted (an hour-glass) or two-sheeted. We have a one-sheeted hyperboloid if the two spheres intersect in a circle and a two-sheeted one if they are disjoint. The common limiting case is a double-cone defined by two touching spheres. Either way, the body is on the side of the infinite ends of the symmetry axis. In the case $|X| = 3$, the symmetry plane is the affine hull of the Delaunay triangle and the symmetry axis is the affine hull of the Voronoi edge. Whether the hyperboloid is one-sheeted, a double-cone, or two-sheeted depends on whether the two spheres orthogonal to the three spheres with indices in $X$ intersect in a circle, touch in a point, or are disjoint. In either case, the body lies on the side of the infinite circle in the symmetry plane.

We can translate and rotate every sphere and hyperboloid to standard form, which we define as

  $x_1^2 + x_2^2 + x_3^2 = R^2$,
  $x_1^2 + x_2^2 - x_3^2 = \pm R^2$.

The second equation defines a hyperboloid with the apex at the origin, the symmetry axis along the $x_3$-axis, and the symmetry plane $x_3 = 0$. We have a one-sheeted hyperboloid for $+R^2$ and a two-sheeted one for $-R^2$, as shown in Figure III.12.

Figure III.12: The sphere, the one-sheeted hyperboloid, and the two-sheeted hyperboloid.

Maximum normal curvature. For the sphere, the normal curvature at every point is $1/R$ in every tangent direction. The situation is more complicated for the hyperboloid. Consider the hyperbola in standard form in $\mathbb{R}^2$, and note that both the one-sheeted and the two-sheeted hyperboloid can be obtained by rotating the hyperbola about a symmetry axis. Every point of the hyperbola is sandwiched between two equally large circles, as illustrated in Figure III.13. The maximum normal curvature at a point $x$ of the hyperboloid is one over the radius of the largest sphere that passes through $x$ and touches but does not cross the hyperboloid. As shown in Figure III.13, this radius is the same as the distance of $x$ from the origin. In short, for every point $x$ of a sphere or hyperboloid in standard form, the maximum normal curvature is $\kappa(x) = 1/\|x\|$. The maximum normal curvature varies continuously over the skin because the common radius of the sandwiching spheres varies continuously.

Figure III.13: Every point of the hyperbola is sandwiched between two equally large circles.

Curvature variation. We have seen that within a mixed cell, $1/\kappa$ is simply the distance to the center. By the definition of the mixed complex, this is a continuous function on $F$. We strengthen the result by showing that $1/\kappa$ varies rather slowly.

CURVATURE VARIATION LEMMA. For all points $x, y \in F$ we have $|1/\kappa(x) - 1/\kappa(y)| \le \|x - y\|$.

In fact, we extend $1/\kappa$ to a function defined on all of $\mathbb{R}^3$ and show that the extension has Lipschitz constant one. Within a mixed cell, the function is the distance to the center, and the triangle inequality gives the Lipschitz bound. By applying this to the pieces of the line segment from $x$ to $y$ contained in different mixed cells, we obtain the result. We note that the extension of $1/\kappa$ to a function on $\mathbb{R}^3$ describes the maximum normal curvature of all skin surfaces in the family defined by the power growth model of the spheres, as introduced in Chapter II.

Bibliographic notes. The books by Bruce and Giblin [1] and by O'Neill [4] are good introductory texts to curves and surfaces and other topics in differential geometry. The skin surfaces in $\mathbb{R}^3$ are obtained by extending the results of Section III.1 by one dimension. A more direct treatment of the general-dimensional case can be found in [3]. The specific results on the curvature and the curvature variation of skin surfaces are taken from [2].

[1] J. W. Bruce and P. J. Giblin. Curves and Singularities. Second edition, Cambridge Univ. Press, England, 1992.
[2] H.-L. Cheng, T. K. Dey, H. Edelsbrunner and J. Sullivan. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] H. Edelsbrunner. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] B. O'Neill. Elementary Differential Geometry. Second edition, Academic Press, San Diego, 1997.
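The statement that the length scale of a patch in standard form is the distance to the origin, and that it changes no faster than the distance between points, can be checked numerically. The Python sketch below samples one patch only. The parametrization by hyperbolic functions and all function names are assumptions chosen for this illustration, and the check covers a single mixed cell rather than the full lemma.

```python
import math
import random


def length_scale(x):
    """Distance of x from the origin, which equals 1/kappa(x) for a point
    on a sphere or hyperboloid in standard form."""
    return math.sqrt(sum(c * c for c in x))


def max_normal_curvature(x):
    """Maximum normal curvature at a point x of a patch in standard form."""
    return 1.0 / length_scale(x)


def hyperboloid_point(R, u, theta, one_sheeted=True):
    """Point on x1^2 + x2^2 - x3^2 = +R^2 (one sheet) or -R^2 (two sheets)."""
    if one_sheeted:
        r, z = R * math.cosh(u), R * math.sinh(u)
    else:
        r, z = R * math.sinh(u), R * math.cosh(u)
    return (r * math.cos(theta), r * math.sin(theta), z)


def lipschitz_violations(points):
    """Count pairs whose length scales differ by more than their distance."""
    bad = 0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            if abs(length_scale(x) - length_scale(y)) > math.dist(x, y) + 1e-12:
                bad += 1
    return bad


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [
        hyperboloid_point(1.0, rng.uniform(-2.0, 2.0), rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(50)
    ]
    print("pairs violating the Lipschitz bound:", lipschitz_violations(sample))
```

Within one patch the bound is just the triangle inequality, so no violations are expected. The lemma itself extends the bound across mixed cells.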

III.3 Adaptive Meshing

In this section, we focus on constructing an explicit representation of a molecular skin surface. We choose a triangle mesh realized in $\mathbb{R}^3$ that is a good approximation of the surface and has good numerical properties.

Triangulations. Recall that a triangulation of a surface $F$ is a simplicial complex whose underlying space is homeomorphic to $F$. Since $F$ is a 2-manifold, it follows that the simplicial complex is the closure of its triangle set, every edge belongs to exactly two triangles, and the star of every vertex forms a disk. Note that the last property implies the first two. We construct a triangulation by first selecting points on $F$ and second connecting these points with edges and triangles. Given the Delaunay triangulation of the input spheres, we have sufficient information to sample points and to compute their maximum normal curvature values. Specifically, for each Delaunay simplex $\sigma_X$ we construct the mixed cell $\mu_X$. The center of this cell is the point at which the affine hull of $\sigma_X$ intersects the affine hull of the dual Voronoi polyhedron $\nu_X$. It is also the center of the corresponding sphere or the apex of the corresponding hyperboloid. Next, we translate the mixed cell so its center moves to the origin. Furthermore, if $\sigma_X$ or $\nu_X$ is an edge then we rotate it into vertical position. The sphere or hyperboloid defined by $X$ is then in standard form, which can be sampled. For each sampled point we compute the maximum normal curvature from its distance to the origin and we obtain the corresponding point on $F$ by the inverse translation and rotation.

Let $P$ be the set of points sampled on $F$. We use it as the vertex set of the triangulation, which we construct as the dual of a decomposition of $F$. Specifically, for each point $p \in P$, the restricted Voronoi cell is

  $\nu_p = \{ x \in F \mid \|x - p\| \le \|x - q\| \text{ for all } q \in P \}$,

where distance is measured in $\mathbb{R}^3$, as usual. It is the intersection of $F$ with the Voronoi polyhedron of $p$ in $\mathbb{R}^3$, $\nu_p = F \cap V_p$. The restricted cells decompose $F$ into closed regions that overlap along common pieces of their boundaries. Locally the picture is rather similar to that of a Voronoi diagram in $\mathbb{R}^2$. The restricted Delaunay triangulation, $D_P$, is the collection of simplices with non-empty common intersection of the corresponding restricted Voronoi cells,

  $D_P = \{ \mathrm{conv}\, T \mid T \subseteq P, \ \bigcap_{p \in T} \nu_p \neq \emptyset \}$.

The construction is illustrated in Figure III.14. We note that $D_P$ is a subcomplex of the (unrestricted) Delaunay triangulation of $P$ in $\mathbb{R}^3$.

Figure III.14: Local decomposition into restricted Voronoi cells and dotted dual restricted Delaunay triangulation.

Closed ball property. One trouble with the restricted Delaunay triangulation is that it may not be homeomorphic to $F$ and thus not triangulate the surface. Indeed, it is easy to come up with cases where $D_P$ is not even a 2-manifold. A sufficient condition for $D_P$ to triangulate $F$ is what we call the closed ball property. It requires that each common intersection of restricted Voronoi cells is topologically a closed ball of the appropriate dimension. We formulate this condition in terms of the three-dimensional Voronoi polyhedra defined by $P$. Assuming general position, the common intersection $V_T$ of the Voronoi polyhedra of $k$ points $T \subseteq P$ has dimension $4 - k$, and we require that $F \cap V_T$ is either empty or homeomorphic to a closed ball of dimension $3 - k$. Depending on the cardinality of $T$ we have a closed disk, a closed interval, or a single point.

Proving that the closed ball property implies $D_P$ triangulates $F$ is not difficult. Decompose the restricted Voronoi diagram by adding a point in the middle of each arc and inside each cell and connect each point to the points on the boundary. The star of every point inside a restricted cell is a triangular decomposition of that cell. The star of every restricted Voronoi vertex consists of six triangular regions that can be homeomorphically mapped to the six triangles in the barycentric subdivision of the dual restricted Delaunay triangle. By construction of $D_P$, the triangles in the two barycentric subdivisions are connected the same way so we have a homeomorphism between $F$ and the underlying space of $D_P$, which is illustrated in Figure III.15.

Figure III.15: To the left a barycentric subdivision of a portion of a Voronoi diagram drawn with solid lines. To the right the isomorphic barycentric subdivision of the corresponding portion of the dual Delaunay triangulation drawn with dashed lines.

$\varepsilon$-sampling. The question remains how we sample the points such that the restricted Voronoi diagram has the closed ball property. Since $F$ is smooth, small neighborhoods are fairly flat and the restricted Voronoi diagram behaves locally similar to the (unrestricted) Voronoi diagram of a set of points in the plane. In other words, a dense enough sample of points should have the closed ball property. This intuition can be made precise by formalizing the concept of density. Recall that $\kappa(x)$ is the maximum normal curvature at a point $x \in F$. Around $x$ we spread points at distance roughly proportional to $1/\kappa(x)$. We therefore define $\varrho(x) = 1/\kappa(x)$ and call it the length scale at $x$. The Curvature Variation Lemma of Section III.2 states that for any two points $x, y \in F$, the difference in length scale is at most the distance between them in $\mathbb{R}^3$, $|\varrho(x) - \varrho(y)| \le \|x - y\|$. An $\varepsilon$-sampling is a subset $P \subseteq F$ such that for each point $x \in F$ there exists a point $p \in P$ at distance $\|x - p\| \le \varepsilon \varrho(x)$. Showing that a sufficiently small $\varepsilon$ implies the closed ball property for the restricted Voronoi diagram is rather tedious and we omit the proof.

HOMEOMORPHISM THEOREM. If $P$ is an $\varepsilon$-sampling of $F$ with $0 \le \varepsilon \le \varepsilon_0$, then the restricted Delaunay triangulation of $P$ is homeomorphic to $F$.

The precise upper bound, $\varepsilon_0$, is a root of a function which arises in the proof of the Homeomorphism Theorem.

Even sampling. The points of an $\varepsilon$-sampling can locally not be too far apart, but they can be arbitrarily close together. In other words, on a microscopic scale, the points can be placed every way one likes and the mesh can be arbitrarily ugly. To improve the mesh, we impose conditions on the size of edges and triangles that imply both upper and lower bounds on the spacing between sampled points. Let the size of an edge $ab$ be half its length, $R_{ab} = \|a - b\|/2$, and the size of a triangle $abc$ be the radius of its circumcircle, $R_{abc}$. For edges we worry about them getting too short, so we compare size with the larger length scale at the endpoints, $\varrho_{ab} = \max\{\varrho(a), \varrho(b)\}$. For triangles we worry about them getting too large, so we compare size with the minimum length scale at the vertices, $\varrho_{abc} = \min\{\varrho(a), \varrho(b), \varrho(c)\}$. We use two constants, $C$ and $Q$, to express the conditions on the size. The constant $C$ controls how closely the triangulation approximates $F$, and $Q$ controls the quality of the triangles. We refer to the two conditions as the Lower and Upper Size Bounds,

  [L]  $R_{ab}/\varrho_{ab} > C/Q$ for every edge $ab \in D_P$,
  [U]  $R_{abc}/\varrho_{abc} < CQ$ for every triangle $abc \in D_P$.

It is not necessary to bound the edge lengths from above because an edge $ab$ with $R_{ab}/\varrho_{ab} \ge CQ$ would belong to two triangles that both violate [U]. Symmetrically, we do not need to bound the triangle sizes from below because a triangle $abc$ with $R_{abc}/\varrho_{abc} \le C/Q$ would have three edges that violate [L].

Mesh quality. The constants $C$ and $Q$ have to be chosen judiciously. For example $Q < 1$ would immediately lead to irreconcilable requirements on edge and triangle sizes. Furthermore, $C$ cannot be too large, else we would contradict the $\varepsilon$-sampling condition stated in the Homeomorphism Theorem. Without going into details, we state that feasible choices of $C$ and $Q$ exist. In particular, these constants imply that $P$ is an $\varepsilon$-sampling for a sufficiently small value of $\varepsilon$. More precisely, they imply that $P$ is either an $\varepsilon$-sampling or it grossly violates the condition for $\varepsilon$-sampling. An example of such a gross violation are four points close together on a sphere. The points form a tetrahedron whose edges and triangles may very well satisfy the Size Bounds, but the boundary of the tetrahedron is a miserable approximation of the much larger sphere. Fortunately, such a gross violation of the condition cannot be created from an $\varepsilon$-sampling without the intermediate generation of triangles that grossly violate [U]. The algorithm discussed below is unable to generate such triangles. The two Size Bounds together imply a reasonably large lower bound on the angles inside triangles of the restricted Delaunay triangulation.
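The two Size Bounds can be tested for individual edges and triangles once the length scales at the vertices are known. The Python sketch below assumes that [L] compares the edge size with C/Q times the larger length scale at the endpoints and that [U] compares the circumradius with C times Q times the minimum length scale at the vertices. The function names and the constants used in the example are hypothetical and are not the values recommended in the text.

```python
import math


def edge_size(a, b):
    """Size of an edge: half its length."""
    return 0.5 * math.dist(a, b)


def circumradius(a, b, c):
    """Size of a triangle: radius of its circumcircle, computed from the
    side lengths with Heron's formula (infinite for degenerate triangles)."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    s = 0.5 * (la + lb + lc)
    area_sq = s * (s - la) * (s - lb) * (s - lc)
    if area_sq <= 0.0:
        return math.inf
    return la * lb * lc / (4.0 * math.sqrt(area_sq))


def satisfies_lower(a, b, rho_a, rho_b, C, Q):
    """Assumed form of [L]: edge size over larger length scale exceeds C/Q."""
    return edge_size(a, b) / max(rho_a, rho_b) > C / Q


def satisfies_upper(a, b, c, rho_a, rho_b, rho_c, C, Q):
    """Assumed form of [U]: circumradius over minimum length scale is below C*Q."""
    return circumradius(a, b, c) / min(rho_a, rho_b, rho_c) < C * Q


if __name__ == "__main__":
    C, Q = 0.06, 1.5  # hypothetical constants chosen only for this example
    a, b, c = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, math.sqrt(3.0) / 2.0, 0.0)
    rho = 10.0  # hypothetical length scale shared by the three vertices
    print("[L] holds for ab:", satisfies_lower(a, b, rho, rho, C, Q))
    print("[U] holds for abc:", satisfies_upper(a, b, c, rho, rho, rho, C, Q))
```

In a mesh generator these predicates would be evaluated for every edge and triangle of the restricted Delaunay triangulation, with the length scales obtained from the standard form of the patch that contains each vertex.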

 

MINIMUM ANGLE LEMMA. A triangle $abc$ that satisfies [U] and whose edges satisfy [L] has minimum angle larger than $\arcsin(1/Q^2)$.

PROOF. Let $abc$ be the triangle and $R_{abc}$ its circumradius. Assuming the angle $\alpha$ at $c$ is the smallest angle, we have $ab$ of length $2R_{ab} = 2R_{abc}\sin\alpha$ as the shortest edge. We have $\varrho_{ab} \ge \varrho_{abc}$ by definition of length scale, since $\varrho_{ab}$ is the larger length scale at the endpoints of $ab$ and $\varrho_{abc}$ is the minimum length scale at the vertices of $abc$. Using [L] and [U] we thus get

  $\sin\alpha = R_{ab}/R_{abc} > \frac{(C/Q)\,\varrho_{ab}}{CQ\,\varrho_{abc}} \ge 1/Q^2$.

Hence $\alpha > \arcsin(1/Q^2)$, as claimed.

Since the angles of a triangle sum to $180^\circ$, the maximum angle is smaller than $180^\circ - 2\arcsin(1/Q^2)$.

Density modification. Given an $\varepsilon$-sampling, we can enforce the Size Bounds by contracting short edges and inserting points near the circumcenters of large triangles. Given a triangle $abc$ that violates [U], we add the dual restricted Voronoi vertex $x$ as a new point to $P$. The insertion may cause new violations of [U] and thus trigger new point insertions.

  void VERTEXINSERTION:
    while there is a triangle abc violating [U] do
      add the dual restricted Voronoi vertex x of abc to P
    endwhile.

The details of the algorithm that modifies the restricted Delaunay triangulation to reflect the addition of $x$ are omitted. A vertex insertion may cause other vertex insertions, but this cannot go on forever because we will eventually violate the Lower Size Bound. Given an edge $ab$ that violates [L], we contract it by removing one of its endpoints. We are not able to exclude the possibility that the removal creates new violations of [L], and it certainly can create new violations of [U].

  void EDGECONTRACTION:
    while there is an edge ab violating [L] do
      remove one of the endpoints of ab from P;
      VERTEXINSERTION
    endwhile.

The details of the algorithm are again omitted. An edge contraction may perhaps cause other edge contractions, but this cannot go on forever because we will eventually violate the Upper Size Bound. It is possible that an edge contraction causes a vertex insertion, but a vertex insertion cannot create edges of size below the allowed threshold. This is what prevents infinite loops in spite of the algorithm's partially conflicting efforts to simultaneously avoid short edges and large triangles. To prove this claim, we consider a triangle $abc$ that causes the addition of its dual restricted Voronoi vertex $x$.

NO-SHORT-EDGE LEMMA. Every edge $xp$ created during the addition of $x$ has ratio $R_{xp}/\varrho_{xp} > C/Q$.

PROOF. We have $R_{abc}/\varrho_{abc} \ge CQ$ because $abc$ violates [U]. The sphere with center $x$ that passes through $a$, $b$, and $c$ has radius $R \ge R_{abc}$ and contains no vertices other than $x$ inside. Every new edge $xp$ has therefore length $2R_{xp} \ge R$. Assume without loss of generality that $\varrho(a) = \varrho_{abc}$. We use the Curvature Variation Lemma to derive upper bounds for the length scales at $x$ and $p$:

  $\varrho(x) \le \varrho(a) + \|x - a\| \le R/(CQ) + R$,
  $\varrho(p) \le \varrho(x) + \|x - p\| \le R/(CQ) + R + 2R_{xp}$.

Hence $\varrho_{xp} = \max\{\varrho(x), \varrho(p)\} \le 2R_{xp}\,(1/(CQ) + 2)$, using $R \le 2R_{xp}$, and therefore

  $R_{xp}/\varrho_{xp} \ge CQ/(2 + 4CQ)$.

For constants with $Q^2 - 4CQ > 2$ we have $CQ/(2 + 4CQ) > C/Q$, as claimed.

Scheduling. [Summarize the results on scheduling edge contractions and vertex insertions described in [5].]

Bibliographic notes. The restricted Delaunay triangulation is a generalization of the dual complex of a ball union. It can be used to triangulate surfaces and other spaces embedded in a Euclidean space. Besides the dual complex literature, there are several other partially dependent roots of the idea, namely the surface meshing method by Chew [3], the neural net work by Martinetz and Schulten [6], the formulation of the closed ball property by Edelsbrunner and Shah [4], and the surface reconstruction algorithm by Amenta and Bern [1]. The last of the four papers also introduces $\varepsilon$-samplings of surfaces, although in a slightly different formulation in which the distance to the medial axis replaces the length scale. All results that are specific to skin surfaces are taken from [2]. The algorithm in that paper is more general than what is explained in this section and maintains the surface mesh while it moves in space.

[1] N. Amenta and M. Bern. Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22 (1999), 481–504.
[2] H.-L. Cheng, T. K. Dey, H. Edelsbrunner and J. Sullivan. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] L. P. Chew. Guaranteed-quality mesh generation for curved surfaces. In "Proc. 9th Ann. Sympos. Comput. Geom., 1993", 274–280.
[4] H. Edelsbrunner and N. R. Shah. Triangulating topological spaces. Internat. J. Comput. Geom. Appl. 7 (1997), 365–378.
[5] H. Edelsbrunner and A. Üngör. Relaxed scheduling in dynamic skin triangulation. In "Japanese Conf. Comput. Geom., 2002", to appear.
[6] T. Martinetz and K. Schulten. Topology representing networks. Neural Networks 7 (1994), 507–522.
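The angle guarantee of the Minimum Angle Lemma in Section III.3 can be evaluated for a given quality constant and compared with the angles of a concrete triangle. The Python sketch below assumes that the bound has the form arcsin(1/Q^2), which is what follows if [L] and [U] bound the size over length scale ratios by C/Q and by CQ. The function names and the sample value of Q are hypothetical.

```python
import math


def min_angle_bound_deg(Q):
    """Assumed lower bound arcsin(1/Q^2) on the smallest angle, in degrees.
    Requires Q >= 1 so that the argument of arcsin is at most one."""
    return math.degrees(math.asin(1.0 / (Q * Q)))


def triangle_angles_deg(a, b, c):
    """Interior angles at a, b, c of the triangle abc, by the law of cosines."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)

    def angle(opposite, s1, s2):
        cos_val = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_val))))

    return angle(la, lb, lc), angle(lb, la, lc), angle(lc, la, lb)


if __name__ == "__main__":
    Q = 1.5  # hypothetical quality constant, not a value taken from the text
    bound = min_angle_bound_deg(Q)
    angles = triangle_angles_deg((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 3.0, 0.0))
    print(f"guaranteed minimum angle for Q = {Q}: {bound:.2f} degrees")
    print("angles of the sample triangle:", [round(x, 2) for x in angles])
    print("sample triangle meets the bound:", min(angles) > bound)
```

The comparison in the last line is only a sanity check on one triangle. The lemma itself applies to triangles whose sizes satisfy both Size Bounds.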


III.4 Skin Software

In this section, we use two pieces of software to visualize the various geometric concepts introduced earlier in this chapter. Using the Morfi software, we can visualize concepts that are difficult if not impossible to show in $\mathbb{R}^3$.

Skin curves. The Morfi software is two-dimensional and constructs skin curves from finite sets of circles. In Figure III.16 we see seven disks whose union is decomposed into convex regions by the Voronoi diagram. Superimposed on this decomposition is the skin curve with shaded body and the dual complex. Note that the disk union contains the body and the body contains the dual complex. This is always true. Furthermore, the disk union, the body, and the dual complex all have the same homotopy type. The skin shrinks the arcs in the boundary of the disk union and smoothly blends between the shrunken arcs using pieces of hyperbolas and inverted circles. Most striking is the blending for the quadrangular hole roughly in the middle of the figure, which is converted into an almost entirely circular hole in the body.

Figure III.16: Voronoi decomposition of disk union with superimposed skin, body, and dual complex.

Mixed complex. As explained in Section III.1, the mixed complex consists of shrunken Voronoi polygons, rectangles, and shrunken Delaunay polygons. An example is the mixed complex illustrated in Figure III.17. It decomposes the skin into circular and hyperbolic arcs. The collection of circles generating the diagram in Figure III.17 is degenerate, which can be seen from the fact that there are three shrunken Delaunay triangles but also two shrunken Delaunay quadrangles. One of the quadrangles contains most of the hole in the body. The portion of the hole boundary inside that quadrangle is circular while the portions outside the quadrangle are hyperbolic. Observe also that the five Delaunay polygons visible within the mixed complex apparently have eight vertices (not double-counting the shared ones). We see only seven of them in Figure III.16 because one of the eight radii is imaginary. Where is its center in Figure III.16?

Figure III.17: Decomposition of the skin and body by the mixed complex.

Simulated smoothing. We return to an issue left open in Section V.1, where we considered the minimum weighted square distance function of a collection of circles. The zero-set of this function is the envelope of the circles, and the preimage of any real value is the envelope of the correspondingly grown or shrunken circles. In Section V.1 we claimed that there is an infinite family of smooth approximations of this function that all have the same critical points, namely the points where dually corresponding Voronoi and Delaunay polyhedra intersect. One function in this family is the trajectory of the skin curve. Following the notation of Chapter II, we think of $t$ as time and denote the collection of circles at time $t$ by $B_t$. The trajectory is the function that maps each point $x$ to the moment in time at which $x$ belongs to the skin of $B_t$. We generalize this construction to a parameter $s$ by letting $f_s$ be the trajectory of the modified skin curves. Specifically, the $s$-skin is the envelope of the circles in the convex hull that are reduced by a factor $\sqrt{s}$, and $f_s$ maps every point $x$ to the moment in time at which $x$ belongs to the $s$-skin of $B_t$. Note that for $s = 1/2$ we get the skin as defined in Section III.1, and for $s = 1$ the $s$-skin is the envelope of the original disks. We construct the family such that $f_s$ approaches the minimum weighted square distance function as $s$ goes to 1. Figure III.18 illustrates the construction by showing the modified skins for several values of $s$. Observe that the bodies bounded by the $s$-skins are nested. The innermost $s$-skin shown is also the envelope of the orthogonal circles as defined in Section III.1. For $s < 1$, the height function $f_s$ is differentiable and, assuming non-degeneracy of the input circles, it is twice differentiable at the critical points. This is sufficient to justify the Morse theoretic reasoning about the non-smooth function used in Section V.1 to define pockets.

Figure III.18: From inside out the sequence of $s$-skins for increasing values of $s$.

Meshed skin surfaces. In $\mathbb{R}^3$, we compute triangulated skin surfaces using the Skin Meshing software. It takes as input a set of spheres and constructs a mesh by maintaining a triangulation of the skin of the spheres $B_t$, with the time continuously increasing from minus infinity to zero. Figure III.19 shows a portion of this mesh for a small molecule. The image is created by slicing the surface with a plane and removing the front portion of the surface. The complete surface has genus one, and the slicing plane is chosen to cut right through the narrow part of the tunnel. The image of the mesh in Figure III.19 should be compared with the rendering of the same surface in Figure III.20. The apparent smoothness is an illusion created by Gouraud shading, which is a graphics technique that interpolates between normal directions to generate the smooth impression. Recall that the conditions [L] and [U] given in Section III.3 guarantee that the mesh adapts its local density to the maximum normal curvature. Note that highly curved areas detectable in Figure III.20 correspond to high density regions in Figure III.19.

Figure III.19: Cut-away view of the mesh of a small molecule of about forty atoms. Only the edges of the mesh and the cut boundary are shown.

Growing the mesh. As mentioned earlier, the mesh is constructed by maintaining it while growing the spheres. At the beginning, all spheres are imaginary, the skin is the empty surface, and the mesh is the empty complex. As time increases, the surface moves and the software updates the mesh accordingly. At time $t = 0$ we have the mesh of the skin of the input spheres. The growth of the spheres implies a deformation of the surface, which is facilitated by a motion of the mesh vertices in $\mathbb{R}^3$. The algorithm thus reduces to executing a sequence of elementary operations. We classify the operations according to the adaptation purpose they serve.

Shape adaptation. The algorithm moves vertices normal to the surface, which, as it turns out, is along the integral lines of the skin trajectory. We use edge flips to maintain the mesh as the restricted Delaunay triangulation of the moving vertices.

Curvature adaptation. We use edge contractions to eliminate edges that violate [L] and vertex insertions to eliminate triangles that violate [U].
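The minimum weighted square distance function considered under Simulated smoothing is easy to evaluate directly. The Python sketch below assumes that each circle is stored as a center together with a squared radius, which may be negative for an imaginary circle, and that growing in time adds t to every squared radius. The function names and the two sample circles are hypothetical. The sketch evaluates the non-smooth function only and does not compute skins.

```python
def weighted_square_distance(x, center, radius_sq):
    """Squared Euclidean distance from x to the center minus the squared radius."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - radius_sq


def min_weighted_square_distance(x, circles):
    """Minimum weighted square distance from x to a collection of circles,
    each given as a pair (center, squared radius)."""
    return min(weighted_square_distance(x, c, r2) for c, r2 in circles)


def in_union_at_time(x, circles, t):
    """True if x lies in the union of the disks whose squared radii are
    increased by t (assumed growth model)."""
    return min_weighted_square_distance(x, circles) <= t


if __name__ == "__main__":
    circles = [((0.0, 0.0), 1.0), ((3.0, 0.0), 1.0)]  # two hypothetical unit circles
    midpoint = (1.5, 0.0)
    t = min_weighted_square_distance(midpoint, circles)
    print("time at which the midpoint is reached:", t)
    print("inside the union at time 1.0:", in_union_at_time(midpoint, circles, 1.0))
```

The value returned for a point is the moment at which the growing disk union first reaches it, which is the sense in which the function acts as a trajectory.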

Topology adaptation. There are four types of topological changes that occur, and they correspond to the four types of generic critical points of three-dimensional Morse functions. A component is born at a minimum, a handle is created at an index-1 saddle, a tunnel is closed at an index-2 saddle, and a void is filled at a maximum. We use metamorphoses to change the mesh connectivity accordingly. Observe that the surface around a handle is the same as that around a tunnel, namely a two-sheeted hyperboloid that flips over to a one-sheeted hyperboloid, or vice versa. The only difference is the reversal of inside and outside. Each handle creates a tunnel in the complement. By closing a tunnel we also remove the handle that forms it. Two of the four types of metamorphoses can be seen at work in Figure III.21. From the first snapshot to the second, we see two new handles appear. From the second snapshot to the third, we see both tunnels disappear again.

Figure III.20: Smoothly shaded rendering of the mesh in Figure III.19.

Quantification. The Skin Meshing software comes with a quantification panel that displays parameters used in the meshing algorithm, provides various measurements of mesh quality, and indicates the number of operations executed during the construction. Figure III.22 shows the panel after the construction of a mesh. [This panel needs to be updated to fit the text.] The two most important parameters are $C$, which controls the numerical approximation of the surface, and $Q$, which controls the size of the angles. The three other parameters shown in the panel control how the metamorphoses are performed. The correctness of the algorithm is guaranteed only if the inequalities referred to as Conditions (I) to (V) are all satisfied. The software permits other parameter settings since a violation of the inequalities does not necessarily imply a failure of the algorithm. In our experience, the software works fine for small violations but breaks down for moderate ones. The panel displays measurements of mesh quality, including size versus length scale ratios of edges and triangles and the angles inside and between triangles. The quality measures do not include the special edges and triangles that facilitate topological changes and purposely violate some of the properties required for the rest of the mesh. Note that in Figure III.22, the ratios all lie inside the allowed interval defined by [L] and [U]. As proved in Section III.3, the algorithm guarantees that the smallest angle inside any (non-special) triangle in the mesh is larger than the bound of the Minimum Angle Lemma, and the smallest angle observed in the mesh is indeed larger than this bound.

Figure III.22: The quantification panel of the Skin Meshing software.

Bibliographic notes. The two-dimensional Morfi software has been developed by Ka-Po (Patrick) Lam, and is described in his master thesis [4]. The software has been used in [2] to explain two-dimensional skin geometry and its application to deforming two-dimensional shapes into each other. The three-dimensional Skin Meshing software has been developed by Ho-Lun Cheng [1, 5]. Computer graphics techniques used in displaying shapes, including Gouraud shading, can be found in [3].

Figure III.21: Three snap-shots of the deforming triangulation of a molecular skin defined by continuously growing spheres. From left to center, we note two metamorphoses that each add a handle in the front. From center to right, we note a metamorphosis that closes a tunnel on the left.

[1] H.-L. Cheng. Dynamic and Adaptive Surface Meshing under Motion. Ph.D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 2001.
[2] S.-W. Cheng, H. Edelsbrunner, P. Fu and K.-P. Lam. Design and analysis of planar shape deformation. Internat. J. Comput. Geom. Appl. 19 (2001), 205–218.
[3] J. D. Foley, A. van Dam, S. K. Feiner and J. F. Hughes. Computer Graphics. Principles and Practice. Second edition, Addison-Wesley, Reading, Massachusetts, 1990.
[4] K.-P. Lam. Two-dimensional geometric morphing. Master thesis, Dept. Comput. Sci., Hong Kong University of Science and Technology, 1996.
[5] Molecular Skin web-site in the software collection at biogeometry.duke.edu.
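The effect of the topology adapting metamorphoses of Section III.4 on a closed mesh can be monitored through the Euler characteristic, since a closed, connected, orientable surface of genus g satisfies V - E + T = 2 - 2g for V vertices, E edges, and T triangles. The Python sketch below computes the genus from a list of triangles given as vertex index triples. The function names and the small test meshes are hypothetical, and the formula is the standard one rather than a routine of the Skin Meshing software.

```python
def euler_characteristic(triangles):
    """Number of vertices minus edges plus triangles of a triangle mesh
    given as a list of vertex index triples."""
    vertices = set()
    edges = set()
    for a, b, c in triangles:
        vertices.update((a, b, c))
        for u, v in ((a, b), (b, c), (a, c)):
            edges.add((min(u, v), max(u, v)))
    return len(vertices) - len(edges) + len(triangles)


def genus(triangles):
    """Genus of a closed, connected, orientable triangulated surface."""
    return (2 - euler_characteristic(triangles)) // 2


def grid_torus(n, m):
    """Triangulated torus from an n-by-m grid with periodic indices
    (use n, m >= 3 so that all edges and triangles are distinct)."""
    def vid(i, j):
        return (i % n) * m + (j % m)

    tris = []
    for i in range(n):
        for j in range(m):
            tris.append((vid(i, j), vid(i + 1, j), vid(i + 1, j + 1)))
            tris.append((vid(i, j), vid(i + 1, j + 1), vid(i, j + 1)))
    return tris


if __name__ == "__main__":
    tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    print("genus of tetrahedron boundary:", genus(tetrahedron))
    print("genus of grid torus:", genus(grid_torus(3, 3)))
```

Adding a handle raises the genus by one and closing a tunnel lowers it by one, so recomputing the genus after each metamorphosis gives a cheap consistency check on the mesh connectivity.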

Exercises

1. Something about triangles. Let $abc$ be a triangle in the plane. We write $h_a$ for the height of $a$ defined as the distance of $a$ from the closest point on the line passing through $b$ and $c$. Similarly, we write $h_b$ and $h_c$ for the heights of $b$ and $c$. Prove that the radius of the circumcircle satisfies [...].

2. Curvature in the plane. Note that the curvature of a molecular skin curve in $\mathbb{R}^2$ is not continuous.
(i) Give an example illustrating that the curvature is not continuous.
(ii) Introduce a new function (perhaps similar to the maximum normal curvature $\kappa$) that is continuous over the skin curve.

3. Total curvature. Define the total curvature of a surface $F$ as the integral of the maximum principal curvature, $\int_F \kappa(x)\, dx$.
(i) Calculate the total curvature for a sphere.
(ii) Calculate the total curvature for the portion of a double-cone within a unit-sphere around its apex.

4. Total square curvature. Define the total square curvature of a surface $F$ as the integral of the maximum principal curvature squared, $\int_F \kappa(x)^2\, dx$.
(i) Calculate the total square curvature for a sphere.
(ii) Let $H$ be the portion of a hyperboloid of revolution within a unit sphere around the apex. Show that the total square curvature of $H$ goes to infinity as the hyperboloid approaches its asymptotic double-cone.
(iii) Prove that the number of points in a minimal $\varepsilon$-sampling of $F$ (as defined in Section III.3) is proportional to the total square curvature of $F$.

5. Pencils of spheres. Let us extend the concept of a coaxal system of circles to three dimensions. For this purpose assume $A$ and $B$ are two spheres that are both orthogonal to the spheres $X$, $Y$, and $Z$.
(i) Prove that every affine combination of $A$ and $B$ is orthogonal to $X$, $Y$, and $Z$.
(ii) Prove that every affine combination of $X$, $Y$, and $Z$ is orthogonal to $A$ and $B$.
(iii) In the light of (i) and (ii), what is the analog of a coaxal system in $\mathbb{R}^3$?


Chapter IV

Connectivity

Given a shape or a space, we can ask whether or how it is connected. It might not be immediately obvious what this question means, but we can draw from precise definitions developed in topology to answer it. However, we need to be aware that there are perfectly well-defined and reasonable but different precise notions that correspond to the intuitive idea of connectivity. For example, for two spaces $\mathbb{X}$ and $\mathbb{Y}$ to be "connected the same way" could mean they are topologically equivalent ($\mathbb{X} \approx \mathbb{Y}$), they are homotopy equivalent ($\mathbb{X} \simeq \mathbb{Y}$), or they have isomorphic homology groups ($H_k(\mathbb{X}) \cong H_k(\mathbb{Y})$ for all $k$). The three notions are progressively weaker:

$\mathbb{X} \approx \mathbb{Y} \implies \mathbb{X} \simeq \mathbb{Y} \implies H_k(\mathbb{X}) \cong H_k(\mathbb{Y})$ for all $k$.

In words, the classification of spaces by homology groups is coarser than that by homotopy equivalence, which in turn is coarser than that defined by topological equivalence. In spite of the apparent weakness, homology is the most important tool to study connectivity. [We should stress that homology in this topological context has a precise algebraic meaning, which is in sharp contrast to how the term is used in biology (e.g. homology modeling of proteins), where it indicates a vague notion of similarity.] Given two triangulated spaces, there is a polynomial-time algorithm that computes and compares their homology groups. If the groups are not isomorphic then we know that the two spaces are different, meaning they are neither homotopy equivalent nor topologically equivalent. However, if their homology groups are isomorphic then we still do not know whether the two spaces are the same also under the two stricter definitions of sameness.

In this chapter, we focus on algorithms computing the homology groups of molecules represented by space-filling diagrams. In Section IV.1, we prove that space-filling diagrams are homotopy equivalent to their dual alpha shapes, which implies the two have isomorphic homology groups. In Section IV.2, we formally define homology groups and their ranks, the Betti numbers. In Section IV.3, we describe an incremental algorithm for Betti numbers, which is fast but limited to complexes in three dimensions. In Section IV.4, we present the classic matrix algorithm for Betti numbers, which is significantly slower but not limited to three-dimensional space.

IV.1 Equivalence of Spaces
IV.2 Homology Groups
IV.3 Incremental Algorithm
IV.4 Matrix Algorithm
Exercises

IV.1 Equivalence of Spaces

The space-filling diagram of a molecule is a subset of $\mathbb{R}^3$, and with induced subspace topology it is a topological space. We study the connectivity of this space by considering equivalence classes defined by continuous maps between spaces.

Topological spaces. Recall that a map $f : \mathbb{X} \to \mathbb{Y}$ is continuous if for every $x \in \mathbb{X}$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that if $x, y \in \mathbb{X}$ have distance less than $\delta$ then the points $f(x), f(y) \in \mathbb{Y}$ have distance less than $\varepsilon$. To check whether or not $f$ is continuous, we thus have to be able to measure the distance between points in both sets. According to a more general definition, $f$ is continuous if the preimage of every open set in $\mathbb{Y}$ is open in $\mathbb{X}$. Here we only need to distinguish between open and non-open sets. This distinction is the motivation for the following definition. A topological space is a set $\mathbb{X}$ together with a system $\mathcal{U}$ of subsets of $\mathbb{X}$ such that

(i) $\emptyset \in \mathcal{U}$ and $\mathbb{X} \in \mathcal{U}$,
(ii) $\bigcup \mathcal{Z} \in \mathcal{U}$ for every subsystem $\mathcal{Z} \subseteq \mathcal{U}$, and
(iii) $\bigcap \mathcal{Z} \in \mathcal{U}$ for every finite subsystem $\mathcal{Z} \subseteq \mathcal{U}$.

The system $\mathcal{U}$ is called the topology of $\mathbb{X}$ and the sets in $\mathcal{U}$ are the open sets of $\mathbb{X}$. To get comfortable with these abstract ideas requires a number of concrete examples. Here is one. Let $\mathbb{X} = \mathbb{R}^3$ be the three-dimensional Euclidean space. An open ball is the set of points at distance less than some $\varepsilon > 0$ from a fixed point, and an open set is a union of open balls. Note that the common intersection of finitely many open sets is again open, but this is not necessarily true for infinitely many open sets. For example, the common intersection of the open balls of points at distance less than $\varepsilon$ from the origin, for all $\varepsilon > 0$, is just the origin itself, which is not an open set. We thus see that the restriction to finite subsystems in condition (iii) is necessary.

If $\mathbb{Y}$ is a subset of $\mathbb{X}$, we can induce the subspace topology, which is the system $\{ \mathbb{Y} \cap U \mid U \in \mathcal{U} \}$. The space $\mathbb{Y}$ together with this system is a topological subspace of the pair $(\mathbb{X}, \mathcal{U})$. The two-dimensional sphere, $\mathbb{S}^2$, is a subset of $\mathbb{R}^3$, and if we choose its intersections with open sets in $\mathbb{R}^3$ as the open sets in its topology, then it is a topological subspace of $\mathbb{R}^3$. Another topological subspace of $\mathbb{R}^3$ is the two-dimensional Euclidean plane, $\mathbb{R}^2$.

Topological equivalence. Now that we know what a topological space is, we can define when two are the same. A homeomorphism is a bijective map that is continuous and whose inverse is continuous. We write $\mathbb{X} \approx \mathbb{Y}$ if a homeomorphism exists and say that $\mathbb{X}$ and $\mathbb{Y}$ are homeomorphic, topologically equivalent, and that they have the same topological type. Note that the identity is a homeomorphism, the inverse of a homeomorphism is a homeomorphism, and the composition of two homeomorphisms is a homeomorphism. In other words, being homeomorphic is reflexive, symmetric and transitive, so $\approx$ is indeed an equivalence relation for topological spaces. As suggested by Figure IV.1, there are spaces that have the same topological type and look vastly different, and there are spaces that look quite similar and do not have the same topological type.

Figure IV.1: The circle on the left is topologically equivalent to the trefoil knot in the middle, but both are not topologically equivalent to the annulus on the right.

An interesting example of a pair of non-homeomorphic spaces are the sphere and the plane. After embedding both in $\mathbb{R}^3$, we can map points from the sphere to the plane by stereographic projection from the north-pole, $N$, as illustrated in Figure IV.2. This map between $\mathbb{S}^2 - \{N\}$ and $\mathbb{R}^2$ is indeed a homeomorphism, but there is no homeomorphism between $\mathbb{S}^2$ and $\mathbb{R}^2$.

Figure IV.2: The stereographic projection maps the sphere (minus the north-pole) to the plane. The lower hemisphere maps to the shaded disk and the upper hemisphere to the complement of that disk.
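The stereographic projection mentioned in this section can be made concrete with a small sketch. It assumes the unit sphere centered at the origin, the north-pole at (0, 0, 1), and the plane z = 0 as the target; these coordinates and the function names `stereographic` and `inverse_stereographic` are illustrative choices, not taken from the text. The two maps are continuous and inverse to each other away from the north-pole, which is what makes the punctured sphere homeomorphic to the plane.

```python
import math


def stereographic(p):
    """Project a point p = (x, y, z) of the unit sphere from the north-pole
    (0, 0, 1) to the plane z = 0 (assumed coordinate convention)."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the north-pole has no image in the plane")
    return (x / (1.0 - z), y / (1.0 - z))


def inverse_stereographic(q):
    """Map a point q = (u, v) of the plane back to the punctured unit sphere."""
    u, v = q
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))


# The south-pole maps to the origin and the equator maps to the unit circle,
# so the lower hemisphere lands inside the unit disk.
print(stereographic((0.0, 0.0, -1.0)))   # (0.0, 0.0)
print(stereographic((1.0, 0.0, 0.0)))    # (1.0, 0.0)
```

Composing the two functions in either order returns the input up to rounding, which is a numerical hint (not a proof) of bijectivity.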

Homotopy equivalence. Next we introduce an equivalence relation that is less sensitive to the local dimension of spaces than topological equivalence. We begin by comparing maps between the same spaces. Two continuous maps $h, k : \mathbb{X} \to \mathbb{Y}$ are homotopic if there is a continuous map $H : \mathbb{X} \times [0,1] \to \mathbb{Y}$ with $H(x, 0) = h(x)$ and $H(x, 1) = k(x)$ for all $x \in \mathbb{X}$. We write $h \simeq k$ and call $H$ a homotopy between $h$ and $k$. We may think of the parameter $t \in [0,1]$ as time and sweep out the image of $H$ by the images of the $h_t$, where $h_t(x) = H(x, t)$. This definition is illustrated in Figure IV.3. The only requirements $H$ has to satisfy is that it starts with $h_0 = h$, ends with $h_1 = k$, and that it is a map. $H$ is not required to be injective, which is the same as saying that the image of $H$ may be self-intersecting.

Figure IV.3: In this example, $h$ and $k$ both map the circle into three-dimensional space, and $H$ maps the circle times $[0,1]$ to the cylinder connecting the two images of the circle.

Two spaces $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent if there are continuous maps $f : \mathbb{X} \to \mathbb{Y}$ and $g : \mathbb{Y} \to \mathbb{X}$ such that $g \circ f$ is homotopic to the identity on $\mathbb{X}$ and $f \circ g$ is homotopic to the identity on $\mathbb{Y}$. We write $\mathbb{X} \simeq \mathbb{Y}$ and say that the two spaces have the same homotopy type. Note that $\simeq$ is reflexive, symmetric, and transitive and is therefore indeed an equivalence relation for topological spaces. It is easy to show that two topologically equivalent spaces are also homotopy equivalent. To see that the reverse is not true we note that the annulus in Figure IV.1 is homotopy equivalent to the circle, but the two are not topologically equivalent. The simplest homotopy type is that of a point. A space is contractible if it is homotopy equivalent to a point. For example, a disk is contractible but a circle is not. Similarly, a ball is contractible but a sphere is not.

Deformation retraction. If $\mathbb{Y}$ is a topological subspace of $\mathbb{X}$ then we may prove that the two spaces are homotopy equivalent by constructing a map that retracts $\mathbb{X}$ to $\mathbb{Y}$. A deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ is a continuous map $R : \mathbb{X} \times [0,1] \to \mathbb{X}$ with

(i) $R(x, 0) = x$ for all $x \in \mathbb{X}$,
(ii) $R(x, 1) \in \mathbb{Y}$ for all $x \in \mathbb{X}$, and
(iii) $R(y, t) = y$ for all $y \in \mathbb{Y}$ and all $t \in [0,1]$.

Note that $R$ is a homotopy between the identity on $\mathbb{X}$ and the map $x \mapsto R(x, 1)$, which maps $\mathbb{X}$ to $\mathbb{Y}$ and is the identity on $\mathbb{Y}$. As illustrated in Figure IV.4, there is a deformation retraction from the double annulus to the figure-8 curve, but there is no deformation retraction to the circle. (Why not?)

Figure IV.4: The arrows indicate a deformation retraction from the double annulus to the figure-8 curve.

DEFORMATION RETRACTION LEMMA. If $R$ is a deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ then $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent.

PROOF. We construct maps $f : \mathbb{X} \to \mathbb{Y}$ and $g : \mathbb{Y} \to \mathbb{X}$ with the required properties. Define $f(x) = R(x, 1)$ and $g(y) = y$. Then $g \circ f$ is homotopic to the identity on $\mathbb{X}$ because $R$ is a homotopy between the two maps. Furthermore, $f \circ g$ is equal to the identity on $\mathbb{Y}$ and therefore certainly homotopic to it.

Decomposition into joins. We construct a deformation retraction between a union of balls and its dual complex using a decomposition into joins. In general, a join between two sets $A$ and $B$ in some Euclidean space is the union of closed line segments that connect points in $A$ with points in $B$, and it is defined iff any two such line segments are either disjoint or meet at a common endpoint. Figure IV.5 uses two kinds of joins to decompose the difference between the union and the dual complex of a set of disks, namely triangles and disk sectors. A triangle is the join between a point and an edge and a sector is the join between a circular arc and a vertex.

Figure IV.5: The union of disks is decomposed into the underlying space of the dual complex and two types of joins connecting that complex to the boundary of the union.

Let $B$ be a finite collection of closed balls in $\mathbb{R}^3$. We assume general position and construct a deformation retraction from the union, $\bigcup B$, to the underlying space of the dual complex. Recall that the boundary of $\bigcup B$ consists of sphere patches separated by circular arcs connecting corners. To be specific, we define a patch as the contribution of the sphere bounding a ball in $B$ to the boundary of $\bigcup B$. It does not have to be connected or simply connected. Similarly, we define an arc and a corner as the contribution of the intersection of two and of three spheres to the boundary of $\bigcup B$. An arc may be a full circle, or any number of intervals along the circle. A corner may be empty, a point, or a pair of points. The decomposition is constructed by forming the join between every patch, arc, and corner and its dual vertex, edge, and triangle. Figure IV.5 illustrates the construction in the plane. There are four corners that are point pairs, and they correspond to the four principal edges of the dual complex. (As defined in Section II.4, an edge is principal if it is not face of any other simplex in the complex. In the Alpha Shape software, such an edge is referred to as singular.) There are also four arcs that consist of more than one component each, and they correspond to the vertices on the boundary of the dual complex that are exposed to the outside in more than one interval of directions.

Shrinking joins. We get a deformation retraction from the union to the underlying space of the dual complex by shrinking joins from outside in. Each join is the union of line segments $xy$ with $x$ on the boundary of $\bigcup B$ and $y$ on the boundary of the dual complex. We shrink by defining [...] for every point on the line segment $xy$. The deformation retraction is obtained by shrinking all joins simultaneously. It is illustrated in Figure IV.6, which shows the image of the retraction at time $t = 1/2$. A triangle in the decomposition shrinks from its outer vertex towards the opposite edge, which belongs to the dual complex. It turns into a trapezium whose height decreases and reaches zero at time $1$. A disk sector shrinks from its outer arc towards its center, which is a vertex of the dual complex. It maintains its shape while getting smaller until it reaches the size of a point. Figure IV.7 shows an entire sequence of shapes during the deformation retraction visualized for the model of gramicidin also shown in Figure II.3.

Figure IV.6: The decomposition after shrinking the joins half way to zero.

There is a technical problem at the very beginning of the shrinking process that arises already in two dimensions. Specifically, the outer vertex of each triangle join belongs to more than one line segment and thus retracts towards more than one point of the dual complex. To finesse this difficulty, we choose $\varepsilon > 0$ and move the points differently in the time interval $[0, \varepsilon]$. In the assumed case in which $B$ is in general position, this initial motion needs to bridge the non-zero gap between the boundary of $\bigcup B$ and the boundary of the image of $R$ at time $\varepsilon$. By choosing $\varepsilon$ small, we can make the gap arbitrarily small and easy to bridge.

Figure IV.7: Six snap-shots of the deformation retraction from the union of balls representation of gramicidin to the dual complex.

Bibliographic notes. Homeomorphisms, homotopies, and deformation retractions are covered in most texts of algebraic topology, including Seifert and Threlfall [6] and Munkres [5]. Subtleties of the definitions of a topology and of a topological space are discussed in texts on general topology, including Kelley [2] and Munkres [4]. The particular deformation retraction used to prove the homotopy equivalence between a union of balls and its dual complex is taken from Edelsbrunner [1]. That equivalence can also be derived from general theorems about coverings. The Nerve Lemma says that a space is homotopy equivalent to the nerve of a finite open cover whose sets have either empty or contractible common intersections. We can turn the Voronoi cells of a union of balls into such a cover and get the homotopy equivalence result from that lemma. The history of the Nerve Lemma is complicated because different versions have been discovered independently by different people. Maybe the paper by Leray [3] is the first publication on that topic.

[1] H. EDELSBRUNNER. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] J. L. KELLEY. General Topology. Springer-Verlag, New York, 1955.
[3] J. LERAY. Sur la forme des espaces topologiques et sur les points fixes des représentations. J. Math. Pures Appl. 24 (1945), 95–167.
[4] J. R. MUNKRES. Topology. A First Course. Prentice Hall, Englewood Cliffs, New Jersey, 1975.
[5] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[6] H. SEIFERT AND W. THRELFALL. A Textbook of Topology. Academic Press, San Diego, California, 1980.

IV.2 Homology Groups

This section introduces homology groups as an algebraic means to characterize the connectivity of a topological space. To keep the discussion reasonably elementary, we restrict it to triangulated spaces and to addition modulo 2.

Triangulations. In the preceding chapters, we have talked about triangulations in an intuitive geometric sense. In topology, the term has a precise meaning, which we now develop. A simplex $\sigma$ is the convex hull of an affinely independent point set $S$. If $S$ has cardinality $k+1$ then $\sigma$ has dimension $k$ and is also referred to as a $k$-simplex. A face of $\sigma$ is the convex hull of a subset $T \subseteq S$. Since $S$ has $2^{k+1}$ subsets, $\sigma$ has the same number of faces, including the empty set and $\sigma$ as its two improper faces. A simplicial complex is a finite collection of simplices with pairwise proper intersections that is closed under the face relation. In other words, a finite collection $K$ of simplices is a simplicial complex

(i) if $\sigma \in K$ and $\tau$ is a face of $\sigma$ then $\tau \in K$, and
(ii) if $\sigma, \sigma' \in K$ then $\sigma \cap \sigma'$ is either empty or a face of both.

Recall that the underlying space of $K$ is the union of all simplices, denoted as $|K|$. A simplicial complex can be used to represent a topological space, and we have seen an example in Section II.3, where the dual complex of a space-filling diagram was used to represent a molecule. We proved in Section IV.1 that the underlying space of the dual complex is homotopy equivalent to the space-filling diagram. A topologically more accurate representation would have a homeomorphic underlying space. We thus define a triangulation of a topological space $\mathbb{X}$ as a simplicial complex $K$ whose underlying space is topologically equivalent, $|K| \approx \mathbb{X}$. The remainder of this section introduces the algebraic concepts we will use to define homology groups of triangulated spaces.

Abelian groups. A group is a set $G$ together with an associative operation $+$ for which there is a zero and an inverse for every group element. The group is abelian if the operation is commutative. Examples are the infinite group of integers with addition, $\mathbb{Z}$, and the finite cyclic group of $n$ elements, $\mathbb{Z}_n$, with addition mod $n$. A subset $H \subseteq G$ forms a subgroup if $(H, +)$ is a group.

Suppose $G$ is abelian and $H$ is a subgroup. The cosets of $H$ are the sets $x + H = \{ x + y \mid y \in H \}$, for $x \in G$. Observe that $x' \in x + H$ implies $x' + H = x + H$, that is, two cosets are either disjoint or the same. By definition, there is a bijection between $H$ and each coset $x + H$. If $G$ is finite this implies that all cosets have the same cardinality and the number of cosets is $|G| / |H|$. The quotient $G$ divided by $H$, denoted as $G/H$, is the collection of cosets. Addition in the quotient group is defined by $(x + H) + (y + H) = (x + y) + H$. We note that it does not matter which representatives we choose in computing the sum of the two cosets. So if $x' \in x + H$ and $y' \in y + H$ then $x' + y' \in (x + y) + H$. The resulting coset is always the same, so addition is indeed well defined.

Figure IV.8: Partition of $G$ into cosets defined by $H$, for the case in which $H$ contains a quarter of the elements. The four cosets shown are $H$ (containing $0$), $x + H$, $y + H$, and $x + y + H$.

A homomorphism between groups $G$ and $G'$ is a function $f : G \to G'$ that commutes with addition, that is, $f(x + y) = f(x) + f(y)$. The kernel of $f$ is the subset of $G$ whose elements map to $0 \in G'$, and the image is the subset of $G'$ whose elements have preimages in $G$:

$\ker f = \{ x \in G \mid f(x) = 0 \}$,
$\mathrm{im}\, f = \{ y \in G' \mid \text{there is } x \in G \text{ with } f(x) = y \}$.

An isomorphism is a bijective homomorphism. Its kernel is the zero element of $G$ and its image is the entire $G'$.

Chain complex. Let $K$ be a simplicial complex. We construct groups by defining what it means to add sets of simplices. Call a set of $k$-simplices a $k$-chain. The sum of two $k$-chains is the symmetric difference of the two sets, $c + d = (c \cup d) - (c \cap d)$. This is like adding modulo 2 where $1 + 1 = 0$, since a simplex is absent from $c + d$ iff it belongs to neither or to both chains. $C_k$ is the set of $k$-chains and $(C_k, +)$ is the group of $k$-chains. The zero of this chain group is the empty set. We connect chain groups of different dimensions by homomorphisms that map chains to their boundary. The boundary of a $k$-simplex is the set of its $(k-1)$-faces, and the boundary of a chain is the sum of boundaries of its simplices, $\partial_k c = \sum_{\sigma \in c} \partial_k \sigma$. Observe that the boundary of the sum of two chains is the sum of their boundaries, $\partial_k (c + d) = \partial_k c + \partial_k d$. We thus have a boundary homomorphism $\partial_k : C_k \to C_{k-1}$, for every $k$. The sequence of chain groups connected by boundary homomorphisms is the chain complex of $K$:

$\ldots \to C_{k+1} \to C_k \to C_{k-1} \to \ldots$,

where the arrow leaving $C_k$ is $\partial_k$. Figure IV.9 illustrates the sequence but contains information about subgroups that will be introduced shortly.

Cycles and boundaries. There are two types of chains that are particularly important to us: the ones without boundary and the ones that bound. A $k$-cycle is a $k$-chain $c$ with $\partial_k c = \emptyset$. The set $Z_k$ of $k$-cycles is the kernel of the $k$-th boundary homomorphism, $Z_k = \ker \partial_k$. Two $k$-cycles add up to another $k$-cycle, which implies that $Z_k$ is a subgroup of $C_k$. A $k$-boundary is a $k$-chain $c$ for which there exists a $(k+1)$-chain $d$ with $\partial_{k+1} d = c$. The set $B_k$ of $k$-boundaries is the image of the $(k+1)$-st boundary homomorphism, $B_k = \mathrm{im}\, \partial_{k+1}$. Two $k$-boundaries add up to another $k$-boundary, which implies that $B_k$ is a subgroup of $C_k$.

FUNDAMENTAL LEMMA OF HOMOLOGY. $\partial_k \partial_{k+1} d = \emptyset$, for every $(k+1)$-chain $d$.

PROOF. Note that $\partial_k \partial_{k+1} \sigma = \emptyset$ for every $(k+1)$-simplex $\sigma$. This is because every $(k-1)$-simplex in the boundary of $\sigma$ belongs to exactly two $k$-simplices in that boundary, and hence cancels in the sum modulo 2. The rest follows because taking boundary commutes with adding:

$\partial_k \partial_{k+1} d = \sum_{\sigma \in d} \partial_k \partial_{k+1} \sigma$,

which is the empty set, as required.

In other words, the boundary of every boundary is empty. Equivalently, every $k$-boundary is a $k$-cycle, so $B_k$ is a subgroup of $Z_k$. We can therefore draw the relationship between the sets of chains, cycles, and boundaries as sketched in Figure IV.9.

Figure IV.9: The chain complex and the groups of cycles and boundaries contained in the chain groups. Each chain group $C_k$ contains the cycle group $Z_k$, which contains the boundary group $B_k$, and $\partial_k$ maps $C_k$ onto $B_{k-1}$.

Homology groups. The $k$-th homology group is the quotient of the $k$-th cycle group divided by the $k$-th boundary group, $H_k = Z_k / B_k$. The cosets are the elements of $H_k$ and are referred to as homology classes. Two cycles $c$ and $c'$ belong to the same homology class iff $c + c' \in B_k$. This assumes of course that $c$ and $c'$ have the same dimension, else $c + c'$ would not be defined. The size of $H_k$ is a measure of how many $k$-cycles are not $k$-boundaries.

As an example consider a triangulated torus. All 0-chains are 0-cycles and half of them are 0-boundaries, namely the ones with even cardinality. Hence $H_0$ has two elements and is isomorphic to $\mathbb{Z}_2$. There is only one non-empty 2-cycle and no non-empty 2-boundary. Hence $H_2$ is also isomorphic to $\mathbb{Z}_2$. The two non-bounding 1-cycles labeled $a$ and $b$ generate a first homology group of four elements, as shown in Figure IV.10. It is isomorphic to $\mathbb{Z}_2^2$, which is the group of pairs of elements of $\mathbb{Z}_2$ with component-wise addition modulo 2. If $k > 2$ then $H_k$ is the trivial group consisting only of one element.

Figure IV.10: The curves $a$ and $b$ represent the homology classes $a + B_1$ and $b + B_1$, which generate the homology group $H_1$. The addition table of this group is

  +   | 0     a     b     a+b
  0   | 0     a     b     a+b
  a   | a     0     a+b   b
  b   | b     a+b   0     a
  a+b | a+b   b     a     0

An important property of homology groups is that they are the same for triangulations of homeomorphic and of homotopy equivalent spaces. In other words, the homology groups are properties of the space and not artifacts of the complexes used to represent that space. In particular, we get the same homology groups for different triangulations of a topological space. Similarly, the homology groups of (any triangulation of) a union of balls are the same as the homology groups of the dual complex. Proving that this is really the case is beyond the scope of this book.
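Addition of chains modulo 2 and the boundary homomorphism translate directly into set operations. The following is a minimal sketch, assuming a simplex is represented by a `frozenset` of vertex labels and a chain by a Python `set` of simplices; the function names are illustrative and not part of the text. It checks the Fundamental Lemma of Homology on one small example rather than proving it.

```python
from itertools import combinations


def boundary_of_simplex(simplex):
    """Return the set of (k-1)-faces of a k-simplex (empty for a vertex)."""
    k = len(simplex) - 1
    if k == 0:
        return set()
    return {frozenset(face) for face in combinations(simplex, k)}


def add_chains(c, d):
    """Sum of two chains modulo 2, which is their symmetric difference."""
    return c ^ d


def boundary(chain):
    """Boundary of a chain: the modulo-2 sum of the boundaries of its simplices."""
    result = set()
    for simplex in chain:
        result = add_chains(result, boundary_of_simplex(simplex))
    return result


tetrahedron = {frozenset({0, 1, 2, 3})}
triangles = boundary(tetrahedron)      # the four triangles of the tetrahedron
print(len(triangles))                  # 4
print(boundary(triangles) == set())    # True: the boundary of a boundary is empty
```

Representing chains as sets makes the "1 + 1 = 0" rule automatic: a face that appears in the boundaries of two simplices of the chain is toggled twice and disappears.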

Betti numbers. The most useful aspects of homology groups are their ranks, which have intuitive interpretations in terms of the connectivity of the space. The concept of a rank applies equally well to chain, cycle, boundary and homology groups. All these groups are idempotent, that is, $x + x = 0$ for every element $x$. Given a subset of such a group $G$, we can form all sums of elements in the subset and thus generate a subgroup. This operation can also be expressed in the terminology of linear algebra, where the subgroup is known as the linear hull. The subset is a basis if it is minimal and generates the entire group. Even though there is no unique basis, all bases have the same size. By definition, the rank of $G$ is the size of a basis. Because $G$ is idempotent, that size is the binary logarithm of the number of group elements, $\mathrm{rank}\, G = \log_2 |G|$. If the group is the $k$-th homology group of a space, the rank is known as the $k$-th Betti number of that space: $\beta_k = \mathrm{rank}\, H_k$.

Revisiting the example above, we see that the Betti numbers of the torus are $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$; all other Betti numbers vanish. Similarly for the two-dimensional sphere we have $\beta_0 = 1$, $\beta_1 = 0$, and $\beta_2 = 1$. For the closed disk we have $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. In general, the 0-th Betti number is the number of connected components. To see this remember that a 0-cycle bounds iff it contains an even number of vertices in each component. Note also that exactly half of the subsets of a finite set have even cardinality. If there are $c$ components and $n$ vertices then $\mathrm{rank}\, Z_0 = n$ and $\mathrm{rank}\, B_0 = n - c$. It follows that $\beta_0 = c$. Similarly, the 1-st and 2-nd Betti numbers have intuitive interpretations as the number of independent non-bounding loops and the number of independent non-bounding shells. For example, the closed disk has one component, no non-bounding loop, and no shell. For subsets of $\mathbb{R}^3$, the homology groups of dimensions $k > 2$ are all trivial and the corresponding Betti numbers are all zero.

Euler characteristic. Consider a simplicial complex $K$ and let $s_k$ be the number of its $k$-simplices. By definition, the Euler characteristic is the alternating sum of these numbers: $\chi = \sum_k (-1)^k s_k$. We show that $\chi$ is also the alternating sum of Betti numbers. The number of $k$-simplices in the complex is also the rank of the chain group, $s_k = \mathrm{rank}\, C_k$. Note that if $f : G \to G'$ is a homomorphism, then the rank of $G$ is equal to the sum of ranks of the kernel and the image. Since $\partial_k : C_k \to C_{k-1}$ is a homomorphism with kernel $Z_k$ and image $B_{k-1}$, we have $\mathrm{rank}\, C_k = \mathrm{rank}\, Z_k + \mathrm{rank}\, B_{k-1}$. Since $H_k = Z_k / B_k$, we have $\mathrm{rank}\, H_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k$. Using corresponding lowercase letters for ranks, we rewrite these relations as $s_k = z_k + b_{k-1}$ and $\beta_k = z_k - b_k$, hence

$\chi = \sum_k (-1)^k (z_k + b_{k-1}) = \sum_k (-1)^k (z_k - b_k) = \sum_k (-1)^k \beta_k$.

We state this result because it is important and so we can use it for later reference.

EULER-POINCARÉ THEOREM. $\chi = \sum_k (-1)^k \beta_k$.

This relation can often be used to quickly find the Euler characteristic of a space without constructing a triangulation and counting simplices. For example, the Euler characteristic of the two-dimensional sphere is $\chi = 1 - 0 + 1 = 2$ and that of the torus is $\chi = 1 - 2 + 1 = 0$. Earlier we derived $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$ for the closed disk and therefore $\chi = 1$. Two spaces with different Euler characteristics have homology groups that are different in at least one dimension. Indeed, the spaces are then neither homotopy equivalent nor topologically equivalent. Note that this implies that the disk, the sphere and the torus are pairwise non-homeomorphic. This is hardly surprising but not easy to prove with elementary means.

Bibliographic notes. Homology groups have been developed at the end of the nineteenth and the beginning of the twentieth centuries. The French mathematician Henri Poincaré is usually credited with the conception of the idea [4]. He named the ranks of the homology groups after the Italian mathematician Betti, who introduced a slightly different version of the numbers years earlier. The beginning of the twentieth century witnessed parallel developments of homology groups that differed in the elements they added (simplices, cubes, general cells, ...) and the coefficient groups they used. Eventually, all this work was unified by axiomizing the assumptions under which homology groups exist [1]. Today, homology is a general method within algebraic topology. We refer to Giblin [2] for an intuitive introduction to that area and to Munkres [3] and Rotman [5] for more comprehensive sources.

[1] S. EILENBERG AND N. STEENROD. Foundations of Algebraic Topology. Princeton Univ. Press, New Jersey, 1952.
[2] P. J. GIBLIN. Graphs, Surfaces and Homology. Chapman and Hall, London, 1981.
[3] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[4] H. POINCARÉ. Complément à l'analysis situs. Rendiconti del Circolo Matematico di Palermo 13 (1899), 285–343.
[5] J. J. ROTMAN. An Introduction to Algebraic Topology. Springer-Verlag, New York, 1988.

IV.3 Incremental Algorithm

The Betti numbers of a simplicial complex can be computed incrementally, by adding one simplex at a time. In this section, we describe the details of this algorithm, which is particularly well-suited for filtrations.

Adding a simplex. We analyze what happens to the Betti numbers when we add a simplex $\sigma$ to a complex $K$. Let $K' = K \cup \{\sigma\}$ and assume that all proper faces of $\sigma$ belong to $K$, so $K'$ is also a complex. Assuming $K'$ is a complex in $\mathbb{R}^3$, it cannot have any 3-cycle. By observing how $\sigma$ fits into $K$, we can determine the Betti numbers of $K'$ from those of $K$. In the case analysis, we mention only the Betti numbers that change, and we write $\beta_k$ for the Betti numbers of $K$ and $\beta_k'$ for those of $K'$.

Case $\sigma$ is a vertex. Being a vertex, $\sigma$ cannot connect to $K$ and thus forms a component by itself. Therefore, $\beta_0' = \beta_0 + 1$.

Case $\sigma$ is an edge. There are two sub-cases depending on whether the endpoints of $\sigma$ belong to the same component or to two different components. Both cases are illustrated in Figure IV.11. In the first case, $\sigma$ closes a loop and we have $\beta_1' = \beta_1 + 1$, and in the second case $\sigma$ connects two components and we have $\beta_0' = \beta_0 - 1$.

Figure IV.11: The edge closes a loop on the left and connects two components on the right.

Case $\sigma$ is a triangle. Again we have two sub-cases, both illustrated in Figure IV.12. If $\sigma$ completes a 2-cycle then $\beta_2' = \beta_2 + 1$. Otherwise, $\sigma$ closes a tunnel and we have $\beta_1' = \beta_1 - 1$.

Figure IV.12: To the left, the triangle completes a surface, while to the right, it just closes a tunnel formed by the surface holes.

Case $\sigma$ is a tetrahedron. Adding $\sigma$ can only turn a non-bounding 2-cycle (its boundary) into a 2-boundary. Hence, $\beta_2' = \beta_2 - 1$.

Observe that the four cases follow one and the same rule: if $\sigma$ belongs to a non-bounding cycle in $K'$ then we increment the Betti number of the dimension of $\sigma$ and, otherwise, we decrement the Betti number of dimension one less than that of $\sigma$. This is justified by the equation developed in Section IV.2: adding a $k$-simplex always increments the rank of the $k$-th chain group, and it does this by either incrementing the rank of the $k$-th cycle group or that of the $(k-1)$-st boundary group.

Algorithm. To compute the Betti numbers of a complex, we form a filtration that ends with that complex:

$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_n = K$.

All $K_i$ are complexes, and it is convenient to assume that any two contiguous complexes differ by only one simplex: $K_i = K_{i-1} \cup \{\sigma_i\}$. For example, we may sort the simplices in non-decreasing order of dimension and take all prefixes of that sequence. Alternatively, we may use the filtration of a Delaunay triangulation introduced in Section II.3. In the latter case, the filtration contains all alpha complexes and we get the Betti numbers of all of them in one sweep. The algorithm is but a simple scan along the filtration.

  integer$^3$ BETTI:
    $\beta_0 = \beta_1 = \beta_2 = 0$;
    for $i = 1$ to $n$ do
      $k = \dim \sigma_i$;
      if $\sigma_i$ belongs to a $k$-cycle in $K_i$
        then $\beta_k$++ else $\beta_{k-1}$--
      endif
    endfor;
    return $(\beta_0, \beta_1, \beta_2)$.

The only difficult part of the algorithm is deciding whether or not $\sigma_i$ belongs to a $k$-cycle. We study this problem after illustrating the algorithm for a small example.

Betti numbers of the dunce cap. The dunce cap is best created from a triangular piece of soft cloth. As illustrated in Figure IV.13, all three sides are equally long and are glued to each other with matching orientations. To run our algorithm, we need a triangulation of the dunce cap. It is not difficult to construct one, but we have to avoid pitfalls such as creating edges that share more than one endpoint and triangles that share more than one edge. A valid triangulation is shown in Figure IV.14. When we run our algorithm, we first add all vertices, then all edges, and finally all triangles.
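The scan described above can be sketched in a few lines once the cycle test is available. The sketch below does not use the classification procedures developed in this section; as a stand-in it decides whether a simplex belongs to a cycle by reducing its boundary modulo 2 against the boundaries of earlier simplices (a generic linear-algebra test, named here plainly as a substitute). The function names, the `frozenset` representation of simplices, and the 7-vertex triangulation of the torus used in the example are assumptions made for illustration.

```python
from itertools import combinations


def dimension_sorted_filtration(top_simplices):
    """List all non-empty faces of the given simplices by non-decreasing dimension."""
    faces = set()
    for simplex in top_simplices:
        for size in range(1, len(simplex) + 1):
            faces.update(frozenset(f) for f in combinations(simplex, size))
    return sorted(faces, key=lambda s: (len(s), sorted(s)))


def betti_numbers(filtration):
    """Scan the filtration once and return [beta_0, beta_1, ...] modulo 2."""
    position = {}   # simplex -> its index in the filtration
    reduced = {}    # largest face index -> reduced boundary chain with that largest face
    betti = [0] * max(len(s) for s in filtration)
    for j, simplex in enumerate(filtration):
        k = len(simplex) - 1
        position[simplex] = j
        # Boundary of the simplex as a set of filtration indices (empty for a vertex).
        chain = set()
        if k > 0:
            chain = {position[frozenset(f)] for f in combinations(simplex, k)}
        # Cancel against earlier reduced boundaries, always at the largest index.
        while chain and max(chain) in reduced:
            chain = chain ^ reduced[max(chain)]
        if not chain:
            betti[k] += 1          # the simplex belongs to a k-cycle: it creates
        else:
            reduced[max(chain)] = chain
            betti[k - 1] -= 1      # a (k-1)-cycle becomes a boundary: it destroys
    return betti


# Boundary of a tetrahedron (a triangulated sphere) and a 7-vertex torus.
sphere = dimension_sorted_filtration([(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)])
torus = dimension_sorted_filtration(
    [(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)]
    + [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)]
)
print(betti_numbers(sphere))   # [1, 0, 1]
print(betti_numbers(torus))    # [1, 2, 1]
```

For the torus the simplex counts are 7 vertices, 21 edges, and 14 triangles, so the alternating sum 7 - 21 + 14 = 0 agrees with 1 - 2 + 1 = 0, as the Euler-Poincaré relation requires. The reduction is slower than the near-linear classification the text aims for; it is used here only because it is short and works in any dimension.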

and finally all triangles. After adding the thirteen vertices, we have $\beta_0 = 13$, and $\beta_1 = \beta_2 = 0$. The evolution of Betti numbers while adding the edges in lexicographic order is shown in Table IV.1. There are 27 triangles in the triangulation, each closing a tunnel and thus decrementing $\beta_1$. Indeed, no collection of triangles has zero boundary, which can be proved by observing that three edges belong to three triangles each and all other edges belong to two triangles each. The final result is therefore $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Indeed, the dunce cap is connected, all its closed curves bound, and the surface formed by the triangles does not enclose any volume in $\mathbb{R}^3$.

Figure IV.13: In the first step, we glue two sides of the triangle, thus forming a cone with a seam. In the second step, we glue the seam along the rim of the cone (not shown).

Figure IV.14: A triangulation of the dunce cap.

  edge  $\beta_0$  $\beta_1$  |  edge  $\beta_0$  $\beta_1$  |  edge  $\beta_0$  $\beta_1$  |  edge  $\beta_0$  $\beta_1$
  12    12   0   |  28    3    1   |  3C    2   10   |  56    1   19
  13    11   0   |  29    3    2   |  45    1   10   |  5D    1   20
  16    10   0   |  2A    3    3   |  46    1   11   |  67    1   21
  17     9   0   |  2B    2    3   |  47    1   12   |  78    1   22
  19     8   0   |  2D    2    4   |  48    1   13   |  89    1   23
  1A     7   0   |  35    2    5   |  49    1   14   |  9A    1   24
  1C     6   0   |  36    2    6   |  4A    1   15   |  AB    1   25
  1D     5   0   |  37    2    7   |  4B    1   16   |  BC    1   26
  23     5   1   |  38    2    8   |  4C    1   17   |  CD    1   27
  25     4   1   |  3B    2    9   |  4D    1   18   |

Table IV.1: Evolution of $\beta_0$ and $\beta_1$ while adding the edges of the triangulation in Figure IV.14.

Classifying vertices and edges. We now return to the problem of deciding whether the addition of a simplex increases the rank of a cycle group or that of a boundary group. In the former case, we say the simplex creates, and in the latter case it destroys. All vertices create, but edges can create or destroy. For example, the edge $uv$ in Figure IV.11 creates on the left and destroys on the right. To distinguish between the two cases, we maintain the components of the complex throughout the filtration using a union-find data structure, which represents a system of pairwise disjoint sets: the elements are the vertices and the sets are components of the complex at any moment in time. The data structure supports three types of operations:

  ADD$(u)$: add $\{u\}$ as a new singleton set to the system.
  FIND$(u)$: return the set that contains vertex $u$.
  UNION$(A, B)$: substitute $A \cup B$ for the sets $A$ and $B$ in the system.

The algorithm scans the filtration from left to right and classifies each vertex and each edge as either creating or destroying:

  for $j = 1$ to $m$ do
    case $\sigma_j$ is a vertex $u$: $\sigma_j$ creates; ADD$(u)$;
    case $\sigma_j$ is an edge $uv$: $A$ = FIND$(u)$; $B$ = FIND$(v)$;
      if $A = B$ then $\sigma_j$ creates
      else $\sigma_j$ destroys; UNION$(A, B)$
      endif
  endfor.

Standard implementations of the union-find data structure take barely more than constant time per operation. To be more precise, let $A$ be the extremely fast growing Ackermann function. Its inverse, $\alpha$, is extremely slow growing. To get a faint idea of how slowly the inverse grows, we note that $\alpha(m)$ cannot be bounded from above by any constant, but it stays below a very small constant unless $m$ is larger than the estimated number of electrons in the universe. Any sequence of $m$ operations takes time at most proportional to $m \alpha(m)$. For all practical purposes, this means that each operation takes only constant time.

Classifying triangles and tetrahedra. In three-dimensional Euclidean space, every tetrahedron destroys but triangles can destroy or create. Deciding whether or not a triangle belongs to a cycle is not quite as straightforward

as it is for an edge. However, with an extra assumption on the filtration, we can use the dual graph of the complement to classify triangles and tetrahedra the same way as we classified edges and vertices. The most convenient version of this assumption is that the last complex in the filtration, $K_m$, is a triangulation of $\mathbb{S}^3$. Think of $\mathbb{S}^3$ as the one-point compactification of $\mathbb{R}^3$. Given a Delaunay triangulation in $\mathbb{R}^3$, we can construct such a triangulation by adding a dummy vertex and connecting it to all boundary simplices of the Delaunay triangulation. In $\mathbb{R}^3$ and also in $\mathbb{S}^3$, every closed surface bounds a volume. In other words, a triangle completes a 2-cycle iff it decomposes a component of the complement into two. We keep track of the connectivity of the complement through its dual graph, whose nodes are the tetrahedra and whose arcs are the triangles. Figure IV.15 illustrates this construction in two dimensions. Adding a triangle to the complex effectively removes an arc from the dual graph of the complement. Deciding whether removing an arc splits a component is more difficult than deciding whether adding an arc connects two components. We therefore scan the filtration backward, from right to left:

  for $j = m$ downto $1$ do
    case $\sigma_j$ is a tetrahedron: $\sigma_j$ destroys, unless $j = m$, in which case it creates; ADD$(\sigma_j)$;
    case $\sigma_j$ is a triangle: let $\tau$ and $\upsilon$ be the tetrahedra that share $\sigma_j$; $A$ = FIND$(\tau)$; $B$ = FIND$(\upsilon)$;
      if $A = B$ then $\sigma_j$ destroys
      else $\sigma_j$ creates; UNION$(A, B)$
      endif
  endfor.

The algorithm requires that each triangle is shared by two tetrahedra, but this is exactly what compactification does for us when it adds tetrahedra outside the boundary triangles of the Delaunay triangulation. The running time for classifying all triangles and tetrahedra is again proportional to $m \alpha(m)$.

Figure IV.15: A subcomplex of the Delaunay triangulation and the dual graph of the complement. The region outside the Delaunay triangulation is represented by a single node.

Summary. The entire algorithm consists of three passes over the filtration:

  1. a forward pass to classify all vertices and edges,
  2. a backward pass to classify all triangles and tetrahedra,
  3. a forward pass to compute the Betti numbers.

Figure IV.16 illustrates the result of the algorithm. In the first two passes, we maintain a union-find data structure, which takes time proportional to $m \alpha(m)$. The third pass does only a constant amount of work per step, namely incrementing or decrementing a counter. The total running time is therefore at most proportional to $m \alpha(m)$.

Figure IV.16: The evolution of the Betti number $\beta_1$ (the number of tunnels) in the filtration of gramicidin, which is shown in Figures II.3 and II.15.
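The forward pass over vertices and edges can be sketched in a few lines of Python. This is a minimal illustration rather than the Alpha Shape implementation: the class `UnionFind`, the function `classify_vertices_and_edges`, and the constants `VERTICES` and `EDGES` are names made up for this sketch, and the edge list is the one read off Table IV.1. The union-find structure uses weighted union and path compression, a standard way to obtain the near-constant time per operation mentioned in the text.

```python
class UnionFind:
    """Disjoint sets with weighted union and path compression (hypothetical helper)."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def add(self, u):
        # ADD: insert {u} as a new singleton set.
        self.parent[u] = u
        self.size[u] = 1

    def find(self, u):
        # FIND: return the representative of the set containing u.
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[u] != root:  # path compression
            nxt = self.parent[u]
            self.parent[u] = root
            u = nxt
        return root

    def union(self, a, b):
        # UNION: merge the sets with distinct representatives a and b.
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]


def classify_vertices_and_edges(vertices, edges):
    """Scan vertices, then edges; return (beta0, beta1, history of edge steps)."""
    uf = UnionFind()
    beta0 = beta1 = 0
    history = []
    for u in vertices:          # every vertex creates a component
        uf.add(u)
        beta0 += 1
    for u, v in edges:
        a, b = uf.find(u), uf.find(v)
        if a == b:              # the edge creates: it closes a loop
            beta1 += 1
        else:                   # the edge destroys: it merges two components
            uf.union(a, b)
            beta0 -= 1
        history.append((u + v, beta0, beta1))
    return beta0, beta1, history


# Vertices and edges of the dunce cap triangulation, as listed in Table IV.1.
VERTICES = "123456789ABCD"
EDGES = ("12 13 16 17 19 1A 1C 1D 23 25 28 29 2A 2B 2D 35 36 37 38 3B "
         "3C 45 46 47 48 49 4A 4B 4C 4D 56 5D 67 78 89 9A AB BC CD").split()

beta0, beta1, history = classify_vertices_and_edges(VERTICES, EDGES)
```

After the last edge, the sketch reports one component and 27 loops, in agreement with the final row of Table IV.1. The 27 triangles of the dunce cap then each destroy one of these loops, which gives the final Betti numbers stated in the text.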

Bibliographic notes. The incremental algorithm for computing Betti numbers described in this section is taken from [2]. It exploits the fact that the connectivity of the complex determines the connectivity of the complement. This relation is a manifestation of Alexander duality, which is studied in algebraic topology [3, Chapter 3]. This algorithm has been implemented as part of the Alpha Shape software, which computes the Betti numbers of typically thousands of complexes in the filtration of a protein structure in less than a second. The key to achieving this performance is a fast implementation of the union-find data structure, namely one with running time proportional to $m \alpha(m)$ for $m$ operations. The details of such an implementation can be found in most algorithm texts, including [1, Chapter 22]. A proof that the running time cannot be improved from $m \alpha(m)$ to $m$ has been given by Tarjan [4].

[1] T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, Massachusetts, 1990.
[2] C. J. A. Delfinado and H. Edelsbrunner. An incremental algorithm for Betti numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Design 12 (1995), 771–784.
[3] A. Hatcher. Algebraic Topology. Cambridge Univ. Press, England, 2002.
[4] R. E. Tarjan. A class of algorithms which require nonlinear time to maintain disjoint sets. J. Comput. System Sci. 18 (1979), 110–127.

IV.4 Matrix Algorithm

In this section, we develop the linear algebra view of homology and formulate a matrix algorithm for computing Betti numbers. After explaining the algorithm for addition modulo two, we extend it to integer addition.

Incidence matrices. Let $K$ be a simplicial complex with $k$-simplices $g_i$ and $(k-1)$-simplices $h_j$. The $k$-th incidence matrix is $D_k = (a_i^j)$, where $a_i^j = 1$ iff $h_j$ is a face of $g_i$, and $a_i^j = 0$ otherwise. Using this notation, we can write the $k$-th boundary homomorphism in matrix form:

  $\partial_k g_i = \sum_j a_i^j h_j$.

Recall that the $g_i$ form a basis of the $k$-th chain group, $C_k$, and similarly the $h_j$ form a basis of $C_{k-1}$. The above formula thus expresses the boundary of every basis element of $C_k$ as a sum of basis elements of $C_{k-1}$. To make this interpretation of the incidence matrix useful for computing Betti numbers, we need to consider more general bases, of which we have many. These can be generated by performing elementary row and column operations:

  exchange row $i$ with row $r$;
  exchange column $j$ with column $s$;
  add row $r$ to row $i$;
  add column $j$ to column $s$.

Exchanging two rows or columns is equivalent to reindexing the $g_i$ or $h_j$. As illustrated in Figure IV.17, adding row $r$ to row $i$ has the effect of replacing $g_i$ by $g_i + g_r$. Adding column $j$ to column $s$ has the effect of replacing $h_j$ by $h_j + h_s$. (Since we deal with idempotent groups, subtraction is the same as addition.) Note that the effect is not symmetric: the basis of $C_k$ changes at the modified row, while the basis of $C_{k-1}$ changes at the modifying column.

Figure IV.17: The effect of elementary row and column operations on the bases of $C_k$ and $C_{k-1}$.

Normal form algorithm. The matrix is in normal form if its non-zero entries are lined up along an initial segment of the main diagonal, as illustrated in Figure IV.18. We can use Gaussian elimination to transform the incidence matrix into normal form. After a few elementary row and column operations, $D_k$ is no longer the $k$-th incidence matrix, but it still describes a correspondence between bases of $C_k$ and $C_{k-1}$.

  for $l = 1$ to the smaller of the numbers of rows and columns do
    if NONZERO$(l)$ then
      forall rows $i > l$ do if $a_i^l = 1$ then row $i$ = row $i$ + row $l$ endfor;
      forall columns $j > l$ do if $a_l^j = 1$ then col $j$ = col $j$ + col $l$ endfor
    endif
  endfor.

The algorithm uses a boolean function NONZERO that makes sure that during the $l$-th iteration the $l$-th diagonal entry, $a_l^l$, is non-zero. It does this by exchanging rows and columns: it looks for a non-zero entry $a_r^s$ with $r, s \geq l$, exchanges row $l$ with row $r$ and column $l$ with column $s$, and returns whether it succeeded. The function fails to make $a_l^l$ non-zero iff all entries in the remaining sub-matrix are zero. In what follows, we use the phrase "assume without loss of generality" (w.l.o.g.) as a short-form for expressing that there is another case that can be handled symmetrically.

The algorithm consists of three nested loops. Letting $n$ bound the numbers of rows and columns, the running time is therefore at most proportional to $n^3$.

Figure IV.18: The normal form of the $k$-th incidence matrix.

Deriving the Betti numbers. Suppose we have transformed all incidence matrices of $K$ into normal form. As illustrated in Figure IV.18, the $k$-th matrix has $c_k = \mathrm{rank}\, C_k$ rows and $c_{k-1}$ columns. The zero-rows correspond to $k$-cycles, so their number is $z_k = \mathrm{rank}\, Z_k$. It follows that the number of non-zero entries along the main diagonal is $b_{k-1} = \mathrm{rank}\, B_{k-1} = c_k - z_k$. The $k$-th Betti number is the rank of the $k$-th cycle group minus the rank of the $k$-th boundary group: $\beta_k = z_k - b_k$. We can thus derive the Betti numbers from the sizes and numbers of non-zero entries in the normal form matrices. We note that the ranks of the incidence matrices suffice for computing the Betti numbers and it is not necessary to go all the way to normal form. Either way, the running time of the algorithm is cubic in the number of simplices in the complex.

Integer coefficients. The matrix algorithm can be extended to coefficients in $\mathbb{Z}$ instead of $\mathbb{Z}_2$. Before discussing the necessary modifications, we talk about what this means in terms of adding simplices and chains. We start at the beginning. An ordered $k$-simplex is an ordering of the vertices of a $k$-simplex. Two ordered simplices have the same orientation if their orderings differ by an even number of transpositions. Each simplex has two orientations, except if it is a vertex, in which case it has only one. To set the stage, we give each simplex in $K$ an arbitrary but fixed orientation, and for a given oriented simplex $\sigma$, we write $-\sigma$ for the other orientation of the otherwise same simplex. A $k$-chain is a function from the $k$-simplices to the integers. It is convenient to write this function as a formal polynomial:

  $c = \sum_i n_i \sigma_i$,

where $n_i$ is the function value of $\sigma_i$. We add two chains componentwise, by adding the coefficients of like simplices:

  $c + c' = \sum_i (n_i + n_i') \sigma_i$.

By definition, the boundary of $\sigma = [u_0, u_1, \ldots, u_k]$ is the alternating sum of ordered $(k-1)$-simplices obtained by dropping one vertex at a time:

  $\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [u_0, \ldots, \hat{u}_i, \ldots, u_k]$,

where the hat marks the deleted vertex. We can check that the boundary is independent of the ordering, as long as it belongs to the same orientation, and that it is the negative boundary for an ordering of the opposite orientation: $\partial_k (-\sigma) = -\partial_k \sigma$. Similarly, we can check that the Fundamental Lemma of Homology still holds: $\partial_{k-1} \partial_k \sigma = 0$. As before, we define the group of $k$-chains, $C_k$, the group of $k$-cycles, $Z_k$, and the group of $k$-boundaries, $B_k$. The $k$-th homology group is again $H_k = Z_k / B_k$, and the $k$-th Betti number is the rank of that homology group: $\beta_k = \mathrm{rank}\, H_k$.

Torsion. A curious new phenomenon that arises with the use of integer addition is algebraic torsion. It does not occur for spaces that can be embedded in $\mathbb{R}^3$, so it is not part of people's immediate experience. Maybe the simplest topological space whose homology groups have torsion is the Klein bottle. It can be constructed from a rectangular piece of paper by gluing opposite sides as shown in Figure IV.19. Since it has torsion, we know that the Klein bottle cannot be embedded in $\mathbb{R}^3$, and when we draw it, we have to allow for a self-intersection. The 1-cycle marked around the neck of the bottle does not bound, but twice that 1-cycle bounds. This is what causes torsion. To describe the phenomenon more generally, we need the fact that every finitely generated abelian group is isomorphic to a direct sum (Cartesian product) of copies of $\mathbb{Z}$ and of cyclic groups:

  $G \cong \mathbb{Z}^{\beta} \oplus \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \ldots \oplus \mathbb{Z}_{t_\ell}$.

Furthermore, we may require that all $t_i$ are larger than one and that $t_i$ divides $t_{i+1}$ for each $i$. This extra condition fixes $\ell$ and the $t_i$. The abelian group is thus the direct sum of a free subgroup, $\mathbb{Z}^{\beta}$, and the rest, which is referred to as its torsion subgroup. The rank of the group is the number of copies of $\mathbb{Z}$, namely $\beta$. The $t_i$ are the torsion coefficients. For the Klein bottle, we have $H_1 \cong \mathbb{Z} \oplus \mathbb{Z}_2$ and $H_2 = 0$. We thus get different Betti numbers for addition modulo 2 and for integer addition, namely $\beta_0 = 1$, $\beta_1 = 2$, $\beta_2 = 1$ for addition modulo 2 and $\beta_0 = 1$, $\beta_1 = 1$, $\beta_2 = 0$ for integer addition. The Betti numbers obtained for $\mathbb{Z}_2$ and $\mathbb{Z}$ (or other coefficient groups) are not necessarily the same, but their alternating sums are both equal to the Euler characteristic: $\chi = 1 - 2 + 1 = 1 - 1 + 0 = 0$. Indeed, the Euler-Poincaré Theorem is true independent of the type of coefficients we choose to define homology groups and Betti numbers.

Figure IV.19: A triangulated rectangular piece of paper glued to form a Klein bottle.

Algorithm revisited. For integer coefficients, we can determine the homology groups directly from the normal forms of all incidence matrices. The normal form of a bases transition matrix is the same as before, except that we now allow entries in the main diagonal that are neither zero nor one. Specifically, the initial sequence of ones is followed by integers $t_1, t_2, \ldots$, all larger than one, such that $t_i$ divides $t_{i+1}$ for each $i$. As for $\mathbb{Z}_2$ coefficients, we get the rank of the $k$-th homology group from the $k$-th and the $(k+1)$-st normal form matrices: $\beta_k = z_k - b_k$. We get the torsion coefficients from the $(k+1)$-st normal form matrix: they are the diagonal entries that exceed one.

We modify the above algorithm to transform the incidence matrix into normal form. First we extend the elementary row and column operations by allowing the multiplication of entire rows or columns by non-zero integers. A more substantial modification is needed within the function NONZERO, which now attempts to turn the next diagonal entry, $a_l^l$, into the smallest positive entry achievable by row and column operations. Unless the entire remaining sub-matrix is zero, this attempt will be successful and $a_l^l$ will divide every entry in the sub-matrix. To see this property, assume there is an entry $a_i^j$ that is not an integer multiple of $a_l^l$. If $i = l$, we get a positive integer smaller than $a_l^l$ in a single column operation. Symmetrically, if $j = l$, we get such a positive integer in a single row operation. Otherwise, we may assume that $a_l^l$ divides both $a_i^l$ and $a_l^j$, and we can make $a_i^l$ zero with a row operation. By adding row $i$ to row $l$ we keep $a_l^l$ unchanged and we change $a_l^j$ to $a_l^j + a_i^j$, which is not an integer multiple of $a_l^l$. Now we get a positive integer smaller than $a_l^l$ in a single column operation. In each case, this contradicts the minimality of $a_l^l$. Since $a_l^l$ divides every entry in the remaining sub-matrix, it will also divide the future non-zero diagonal entries. Hence, the algorithm generates the torsion coefficients with the required properties.

The running time of the algorithm is no longer guaranteed to be at most cubic in the number of simplices. Indeed, the sequence of operations is sensitive to the size of the integers that arise, and it is not even clear whether or not it is polynomial in the input size.

Bibliographic notes. The matrix algorithm presented in this section is taken from [2, Chapter 1]. The normal form it uses is sometimes referred to as the Smith normal form [3], and the algorithm is sometimes called the Smith normal form algorithm. As described, it is unclear whether or not its running time is polynomial in the input size. However, it is possible to modify the algorithm to guarantee polynomial running time [1, 4]. The Betti numbers obtained for $\mathbb{Z}_2$ and $\mathbb{Z}$ (or other coefficient groups) are not necessarily the same, but their differences are predictable and described by the Universal Coefficient Theorem of Homology [2, Chapter 7].

[1] R. Kannan and A. Bachem. Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8 (1979), 499–507.
[2] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[3] H. J. Smith. On systems of indeterminate equations and congruences. Philos. Trans. 151 (1861), 293–326.
[4] A. Storjohann. Near optimal algorithm for computing Smith normal forms of integer matrices. In "Proc. Internat. Sympos. Symbol. Algebraic Comput., 1997", 267–274.
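As a small illustration of the matrix algorithm of Section IV.4 for addition modulo two, the following Python sketch computes Betti numbers from the ranks of the incidence matrices alone, using the observation that it is not necessary to go all the way to normal form. It is a sketch under stated assumptions, not the notes' implementation: the function names `rank_mod2`, `incidence_matrix` and `betti_numbers` are hypothetical, simplices are represented as sorted tuples of vertex labels, and rows are indexed by $k$-simplices and columns by $(k-1)$-simplices.

```python
from itertools import combinations


def rank_mod2(rows, ncols):
    """Rank over Z/2 of a 0/1 matrix given as a list of rows (Gaussian elimination)."""
    m = [row[:] for row in rows]
    rank = 0
    for col in range(ncols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                # Adding rows modulo two is a componentwise XOR.
                m[r] = [x ^ y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def incidence_matrix(k_simplices, faces):
    """Entry is 1 iff the column's (k-1)-simplex is a face of the row's k-simplex."""
    index = {f: j for j, f in enumerate(faces)}
    matrix = []
    for s in k_simplices:
        row = [0] * len(faces)
        for f in combinations(s, len(s) - 1):
            row[index[f]] = 1
        matrix.append(row)
    return matrix


def betti_numbers(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices; returns [beta_0, beta_1, ...] mod 2."""
    top = len(simplices_by_dim)
    ranks = [0] * (top + 1)          # ranks[k] = rank of the k-th incidence matrix
    for k in range(1, top):
        d = incidence_matrix(simplices_by_dim[k], simplices_by_dim[k - 1])
        ranks[k] = rank_mod2(d, len(simplices_by_dim[k - 1]))
    # beta_k = z_k - b_k = (c_k - rank D_k) - rank D_{k+1}
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top)]


# Boundary of a tetrahedron (a triangulated 2-sphere) and of a triangle (a circle).
sphere = [list(combinations(range(4), d + 1)) for d in range(3)]
circle = [list(combinations(range(3), d + 1)) for d in range(2)]
sphere_betti = betti_numbers(sphere)
circle_betti = betti_numbers(circle)
```

For the hollow tetrahedron the sketch yields one component, no tunnel and one void, and for the hollow triangle one component and one loop. Working modulo two hides torsion, so a triangulated Klein bottle would need the integer version of the algorithm to reveal its torsion coefficient.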

Exercises

1. Amino acids. Take the graphs drawn in Figures I.8 and I.9 as definitions of the amino acids as (one-dimensional) topological spaces. Here an atom is a vertex and a bond is an edge, no matter whether or not it has (partial) double bond character.
   (i) Are there any two amino acids with isomorphic graphs? If yes, which ones?
   (ii) Calculate the Betti numbers and Euler characteristics of the graphs.
   (iii) Partition the collection of graphs into classes of the same homotopy type.

2. Equivalence classes. Consider the following topological spaces: a circle, a trefoil knot, a Möbius strip, a sphere with north-pole and south-pole removed, and a plane with origin removed.
   (i) Partition the collection into classes of same topological type.
   (ii) Partition the collection into classes of same homotopy type.

3. Stars and links. Let $K$ be the dual complex of a finite collection of balls in $\mathbb{R}^3$. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the link as the collection of faces of simplices in the star that do not belong to the star:
     $\mathrm{St}\, u = \{\tau \in K \mid u \in \tau\}$,
     $\mathrm{Lk}\, u = \{\sigma \leq \tau \mid \tau \in \mathrm{St}\, u, \sigma \notin \mathrm{St}\, u\}$.
   (i) Show that $\mathrm{Lk}\, u$ is a complex, that is, every face of a simplex in the link also belongs to the link.
   (ii) Assume $u$ is the center of the ball $b$. The sphere bounding $b$ intersects all other balls in caps. Show that $\mathrm{Lk}\, u$ is isomorphic to the dual complex of that collection of caps.

4. Joins and simplices. A tetrahedron can be defined as the join of two skew line segments in space. The halfway plane is parallel to both line segments and lies exactly halfway between them. Since the line segments are skew, the halfway plane separates the two line segments.
   (i) Show that the halfway plane intersects the tetrahedron in a parallelogram.
   (ii) Decomposing the line segments into $m$ and $n$ pieces implies a decomposition of the tetrahedron into $mn$ joins, which are smaller tetrahedra. Draw the decomposition and highlight the intersection with the halfway plane.

5. Simple graphs. A simple graph is a simplicial complex that consists of vertices and edges but has no triangles or higher-dimensional simplices. Let $n$ be the number of vertices and $m$ the number of edges. Use the language of homology groups to re-confirm the following formulas, which are well-known for simple graphs:
   (i) $m = n - 1$ if the graph is a tree,
   (ii) $m \geq n - 1$ if the graph is connected,
   (iii) $m \geq n - \beta_0$ in general.

6. Torus and projective plane. Take a rectangular piece of paper and orient the left and right sides from top to bottom and the top and bottom sides from left to right. You get a torus if you glue the left side to the right side and the top side to the bottom side, each time with matching orientations. You get a projective plane if you glue again the left to the right and the top to bottom sides but now with opposing orientations.
   (i) Triangulate the rectangle such that you get a valid triangulation for both ways of gluing its sides.
   (ii) Compute the Betti numbers of the torus and the projective plane by running either the incremental or the matrix algorithm (by hand) on your triangulations.

7. Protein structure. Download a protein structure from the pdb database and use the Alpha Shape software to compute the Betti numbers of its van der Waals and its solvent accessible diagrams.


Chapter V

Shape Features

The topological analysis of spaces, as discussed in Chapter IV, is an important first step, but by itself is insufficient to appropriately characterize the shape of protein structures. To decide what is appropriate, we need to have a purpose. The goal we have in mind is understanding how proteins interact with each other and with other molecules. It appears that organic life is based on computations performed by dynamically matching the (changing) pieces of a three-dimensional puzzle. There is overwhelming evidence that interesting events in such interactions happen preferably in cavities, which are partially protected regions in the protein or molecular assembly, and that local shape complementarity plays a significant role in making such events happen. A statement like this needs to be accompanied by a series of disclaimers: not every interaction is based on shape complementarity, interactions that are based on shape complementarity are not entirely so, and the relevant shape complementarity is local and imperfect. In other words, the situation is hopelessly complicated. Our goal in this chapter is to introduce mathematical and computational methods that allow us to start talking about the real problem in more precise terms.

We do this by introducing three essentially new concepts. In Section V.1, we make an attempt to give a precise meaning to cavities in proteins. The main idea here is to combine the topological concept of a hole with a minimum amount of geometric information, and this information is the evolution of the shape under growth. In Section V.2, we return to homology groups and introduce the concept of topological persistence. It is a measure of how important a topological feature is during the evolution. We see this as a tool to cope with imperfections as it permits us to distinguish topological features from topological noise. In Section V.3, we make an attempt to give a precise meaning to interfaces between interacting molecules. We define an interface as a two-dimensional sheet separating the molecules. While this idea seems simple enough, the details are tricky and require that we use what we learned about pockets and topological persistence. Finally, in Section V.4, we illustrate the concepts using the Alpha Shape software and extensions.

V.1 Pockets
V.2 Topological Persistence
V.3 Molecular Interfaces
V.4 Software for Shape Features
Exercises

the Voronoi cells.2 illustrates this view in two dimensions. . which we refer to as the mouths of the pocket.   0 ¢  ¡   ¡ ¡ gD D   0 ) " ¤       ¤   ¤     ¢ ¤ "    ¤ ¤ )    ¤ " ¥ " ¢ ¡ ¤ " ¥ " ¢ ¡  0 . which is the same as the number of voids in . Figure V. See Figure V. All we require is that a pocket be wider on the inside than at possible entrances from the outside. the balls cannot cover the entire space. which we define as a bounded connected component of the complement. consists of one or more connected components. The corresponding void in the dual complex consists of five triangles. Voids. According to this in remains model. we formalize the idea of a cavity in a protein by introducing the concept of a pocket in a spacefilling diagram. Starting at a point outside the space-filling diagram. the points in the shaded region have paths that end at Voronoi vertices.1: The union of disks has a single (shaded) void. In other words. Indeed. Exactly on component is unbounded (infinitely large).1 for an illustration of the definition in two dimensions. Each pocket is open where it borders the space-filling diagram and closed where it borders the outside. Recall that Figure V. It is convenient to use the one that gave rise to the sequence of alpha complexes. We extend it to the rest of space by using the circles that sweep out the Voronoi polygons and the intervals that sweep out the Voronoi edges. We may think of the growth as pushing the points on the boundary of the space-filling diagram outwards. Following the vectors. A pocket generalizes the concept of a void by relaxing the requirement it be disconnected " ¥ ¢ ¡ ¡ B in Chapter II. Definition of pockets. It follows that represents a homology class in the second homology group of . the vector field is defined by the sweeping spheres. which implies that the complement. In the interior of V. Its connected components are open twodimensional sets. . The points that flow to infinity form a single component. 
in normal direction. . Since the dual complex is a subcomplex of the Delaunay triangulation. for example. Suppose. we described a deformation retraction from the space-filling diagram. we may think of each void in as a collection of tetrahedra. The latter set of points may formally be defined as the intersection of the pocket with the closure of the outside. . the center of the ball fixed and the radius at time is equal to the square root of . but we should keep in mind that this choice does affect what we do and do not call a pocket. to the dual complex. and all other components are voids. a pocket is a maximal portion of space outside the spacefilling diagram that turns into a void before it is subsumed by the growing diagram. Figure V. Since is finite. that is a finite collection of closed balls in and is the space-filling representation of a molecule. This collection bounds in but not in . The plain existence of that retraction implies that for each void in we have a void in that contains the void in .72 V S HAPE F EATURES from infinity. Hence. To formalize this intuition. is the number of voids in . Note that voids are pockets without mouths. in the direction normal to the surface. The boundary is a collection of triangles in . we grow the space-filling diagram and observe how it changes: the relatively narrow entrances close before the inside disappears.2: The growing disks push the points on the boundary outwards. which we refer to as the outside. we can reverse the deformation retraction to show that the two voids have the same homotopy type. Indeed. We define a pocket as a connected component of the set of points whose paths do not go to infinity.1 Pockets In this section. The simplest type of pocket is a void. To make this idea concrete. the boundaries of the voids form a basis of that homology group. we follow vectors and thus form a path that may or may not go to infinity. we need to settle on a growth model.

In Case C and in the last sub-case each of Cases C and C . which marks the orthocenter of the triangle. sees a vertex of from the outside. edge.   ¢ Figure V. edges and vertices visible from . this is only possible if the ball centered at that vertex is contained inside the union of the balls centered at the other vertices of . namely when the space-filling diagram encounters a new vertex. Case M : . both illustrated in Figure V. polygon or cell of the Voronoi diagram. Its orthocenter is necessarily the corresponding Voronoi vertex. Case C : . Case is a tetrahedron. . This is unlikely to happen for molecular data and usually indicates a measurement or modeling mistake. Here we have three sub-cases depending on whether sees one. namely in              M1 C1 Case is a triangle and lies in the interior of the corresponding Voronoi edge. which is the moment when the -th ball changes from imaginary to real radius.4: The thin solid lines represent polygons that meet along a common edge in space. the three balls touch the Voronoi edge at the same moment they encounter the Voronoi polygon dual to the visible edge. Similar to voids. There are ten cases distinguished by the dimension of the dual Delaunay simplex. In four of the ten cases. ¨ ¡  ¡ ¨ ¡  ¡ £ ¨ ¢¡        ¨ ¡ ¨ ¨ . eventually touching it at .3. Metamorphoses and collapses. Case is an edge and lies in the interior of the corresponding Voronoi polygon. Here we have two sub-cases depending on whether sees one or two edges from the outside.V. In the second case. From left to right. In the first case.1 Pockets Evolution of dual complex.       ¨ ¨    ¢   Case M : . There are three generic subcases. the balls touch the edge at the same moment they encounter the two polygons and one cell dual to the two visible edges and the vertex they share. . The latter is defined combinatorially. only one simplex is added to the dual complex. it lies on ones side of the polygon. Case M : . and the relative position of its orthocenter. 
all illustrated in Figure V.4. we may associate a pocket of the space-filling diagram with a pocket of the dual complex. The solid dot marks the orthocenter of the Delaunay edge. Case C : . this edge intersects its dual Voronoi polygon. This cell is encountered at time .3: The vertical lines are side views of polygons in space. The four balls completely surround the Voronoi vertex before they reach it. There are two generic sub-cases. The dual complex changes only at discrete moment. . again by observing how the space-filling diagram changes as it grows. lies outside and sees two edges and their shared vertex. At the moment they touch. two or three triangles from the outside.     73 Case C : . polygons and cells that correspond to the triangles. Assuming lies outside the space-filling diagram. ¨ ¢¡ ¨ ¡  ¡ ¨ ¢¡  ¢ ¨  # ¨   ¨ C2 M2 C2    0 ¥   D ¨   Figure V. the smaller ball breaks through the outer sphere and starts sweeping out the Voronoi cell on the other side of the polygon. The three balls completely surround the Voronoi edge before they touch at . The four balls touch the Voronoi vertex at the same moment they touch the Voronoi edges. while on the right. the orthocenter lies inside the triangle. Case M : is a vertex and the orthocenter lies in the interior of the corresponding Voronoi cell. The two balls approach the polygon from the same side. We recall that is the point at which the affine hull of intersects the affine hull of its dual in the Voronoi diagram. The two balls approach the Voronoi polygon from both sides. On the left. lies outside and sees one edge. That edge appears as a solid dot.

Consistent with the discussion in Chapter III, we call the operations in Cases M0, M1, M2 and M3 metamorphoses, since they change the homotopy type. We will see shortly that the remaining six cases do not affect the homotopy type. They can be understood as inverses of the six types of collapses illustrated in Figure V.5.

Figure V.5: From left to right, top to bottom: collapsing a tetrahedron from a triangle (23-collapse), an edge (13-collapse) and a vertex (03-collapse), collapsing a triangle from an edge (12-collapse) and a vertex (02-collapse), and collapsing an edge from a vertex (01-collapse). In each case, the collapse removes the tetrahedron, the transparent triangles, the dashed edges, and the dotted vertices.

Recall that a principal simplex is not a face of any other simplex in the complex. A proper face $\tau$ of a principal simplex $\sigma$ is free if all simplices that contain $\tau$ are faces of $\sigma$. Such a pair defines a collapse, which is the operation that removes all simplices between and including $\tau$ and $\sigma$. Formally, the complex obtained from $K$ by collapsing the pair $(\tau, \sigma)$ is $K - \{\upsilon \mid \tau \le \upsilon \le \sigma\}$. It is convenient to specify the type using the dimensions $k = \dim \tau$ and $\ell = \dim \sigma$ and to talk about $k\ell$-collapses, for $0 \le k < \ell \le 3$. Each collapse can be realized as a deformation retraction that pushes a portion of $\sigma$'s boundary through $\sigma$ toward the remaining portion of the boundary. In the process, the retraction removes $\sigma$ and all faces of $\sigma$ that contain $\tau$. Being a deformation retraction, the operation does not affect the homotopy type of the complex, and neither does its inverse. With this notation, the changes in the dual complex described in Case C$\ell$ are caused by inverses of $k\ell$-collapses, for $0 \le k < \ell$.

Partial order. Using the classification into ten different operations, we may introduce a partial order on the Delaunay simplices, which we think of as a discretization of the flow along normal vectors. We are only interested in tetrahedra. As noted in Case C3, if the orthocenter $z$ of a Delaunay tetrahedron $\sigma$ lies outside $\sigma$ then it sees either one, two or three of the triangles. For each triangle visible from $z$, we define $\sigma \prec \tau$, where $\tau$ is the tetrahedron on the other side of the shared triangle. To cover the case in which the triangle lies on the boundary of the Delaunay triangulation, we introduce a dummy tetrahedron, $\omega$, that represents the space outside the triangulation. By definition, its orthocenter is at infinity, so $\omega$ can only be a successor but not a predecessor of other tetrahedra. This is what we call a sink of the relation. The other sinks are the tetrahedra that contain their orthocenters.

Note that $\sigma \prec \tau$ implies that the square radius of the orthosphere of $\sigma$ is less than that of the orthosphere of $\tau$. If $\sigma$ and $\tau$ are both (finite) Delaunay tetrahedra, this is true because their orthocenters are Voronoi vertices that lie on the same side of the plane separating $\sigma$ and $\tau$. As illustrated in Figure V.6, the two orthospheres intersect in a circle that lies in the separating plane and the orthocenter of $\tau$ is further from that plane than the orthocenter of $\sigma$. If $\tau = \omega$, this is true because the orthoradius of $\omega$ is infinity. This implies that the square radius increases along every chain of the relation. Hence, $\prec$ is acyclic and its transitive closure is a partial order.

Figure V.6: Think of the triangles as projections of tetrahedra and the circles as projections of spheres. The centers of both (dotted) orthospheres lie on the right of the separating plane.

Pockets of dual complex. We are now ready to define and compute the pockets of the dual complex using the partial order over the tetrahedra. The ancestor set of a tetrahedron $\tau$ contains $\tau$, its predecessors, if any, the predecessors of the predecessors, and so on:

$$A(\tau) = \{\sigma \mid \sigma = \sigma_0 \prec \sigma_1 \prec \ldots \prec \sigma_n = \tau, \; n \ge 0\}.$$

We have seen that a tetrahedron can have more than one successor. It is also possible that it belongs to more than one ancestor set, although this is not the common case. The pockets in the dual complex are defined by the tetrahedra that neither belong to the dual complex, $K$, nor to the ancestor set of $\omega$. Note that this is more conservative than collecting all tetrahedra outside $K$ that belong to ancestor sets of finite sinks. We may do the computation for individual pockets or for all pockets at once. We compute the pockets in two steps:

Step 1. Collect the tetrahedra that belong neither to $K$ nor to the ancestor set of $\omega$.
Step 2. Partition this collection into components.

To collect the tetrahedra, we assume the Delaunay simplices are given in a list ordered by birth-time. As illustrated in Figure V.7, the relation over the tetrahedra is acyclic and goes monotonically from left to right. In Step 1, we mark the tetrahedra in the ancestor set of $\omega$ by searching backward from $\omega$ along the pairs of the relation. Next, we mark the tetrahedra in the dual complex, which form a prefix of the sub-list of tetrahedra. To complete Step 1, we now collect all unmarked tetrahedra in a single scan through the list. The resulting collection contains the tetrahedra of all pockets. In Step 2, we call two tetrahedra in this collection adjacent if they share a triangle that is not in the dual complex. Based on this adjacency information, we can compute the connected components using standard graph algorithms, such as depth-first search or union-find.

Figure V.7: Ordered list of simplices with relation over the tetrahedra indicated by arrows.

Computing mouths is similar to computing pockets, only one dimension lower.

Step 1. Collect the boundary triangles not in $K$.
Step 2. Partition this collection into components.

In Step 1, we collect the triangles outside $K$ that belong to exactly one pocket tetrahedron. In Step 2, we call two triangles adjacent if they share an edge that does not belong to $K$. Finally, we use the same standard graph algorithms to compute components. See Figure V.8 for a two-dimensional illustration.

Figure V.8: The eight disks form one pocket, which connects to the outside along one mouth. The corresponding pocket in the dual complex consists of four triangles and a single mouth edge.

Bibliographic notes. In everyday language we barely make any difference between pockets and other holes, such as the ones counted by the Betti numbers. This has also been noticed by the philosophers Casati and Varzi [1], who introduce a concept they call a hollow which is similar at least in spirit to our formal notion of a normal pocket. The importance of cavities in drug design and discovery has been known for a while [4]. The formalization as pockets introduced in this section has been described in [3] and implemented as part of the Alpha Shapes software. The definition of a pocket is not purely topological and requires a crucial geometric component, namely the growth model of the input balls. This growth model forms the basis of the partial order over the Delaunay tetrahedra. An extension to include simplices of all dimensions has been used for reconstructing the surface of scanned point sets [2] and might have further applications in the analysis of protein shape.

[1] R. Casati and A. C. Varzi. Holes and Other Superficialities. MIT Press, Cambridge, Massachusetts, 1994.
[2] H. Edelsbrunner. Surface reconstruction by wrapping finite sets in space. In Discrete and Computational Geometry: The Goodman-Pollack Festschrift, B. Aronov, S. Basu, J. Pach and M. Sharir, eds., Springer-Verlag, Berlin, to appear.
[3] H. Edelsbrunner, M. Facello and J. Liang. On the definition and the construction of pockets in macromolecules. Discrete Appl. Math. 88 (1998), 83–102.
[4] I. D. Kuntz. Structure-based strategies for drug design and discovery. Science 257 (1992), 1078–1082.
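The two-step pocket computation of Section V.1 can be sketched in a few lines. This is only an illustration under assumed input conventions that are not part of the text: tetrahedra are given by hypothetical string labels, `successors` encodes the relation over the tetrahedra, `dual` is the set of tetrahedra in the dual complex, and `neighbors` lists for each tetrahedron its adjacent tetrahedra together with a flag telling whether the shared triangle belongs to the dual complex.

```python
from collections import defaultdict

OMEGA = "omega"  # hypothetical label for the dummy tetrahedron


def ancestor_set(sink, successors):
    """Search backward from `sink` along the pairs of the relation."""
    predecessors = defaultdict(list)
    for tet, succs in successors.items():
        for succ in succs:
            predecessors[succ].append(tet)
    seen, stack = {sink}, [sink]
    while stack:
        tet = stack.pop()
        for pred in predecessors[tet]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen


def pockets(tetrahedra, dual, successors, neighbors):
    """Step 1: collect unmarked tetrahedra. Step 2: split into components."""
    marked = ancestor_set(OMEGA, successors) | set(dual)
    pocket_tets = [t for t in tetrahedra if t not in marked]
    pocket_set = set(pocket_tets)
    components, seen = [], set()
    for start in pocket_tets:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:  # depth-first search over admissible adjacencies
            tet = stack.pop()
            component.append(tet)
            for other, triangle_in_dual in neighbors.get(tet, []):
                if other in pocket_set and not triangle_in_dual and other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(sorted(component))
    return components


# Toy input: "a" is in the dual complex, "b" flows to the outside,
# "c" flows into the finite sink "d", and "e" is a sink on its own.
tets = ["a", "b", "c", "d", "e"]
succ = {"b": [OMEGA], "c": ["d"], "d": [], "e": []}
nbrs = {"c": [("d", False), ("e", True)], "d": [("c", False)], "e": [("c", True)]}
result = pockets(tets, {"a"}, succ, nbrs)
```

In the toy input, "c" and "e" share a triangle of the dual complex, so they end up in different pockets even though they are adjacent in the Delaunay triangulation.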

V.2 Topological Persistence

In this section, we measure the life-time or persistence of a topological feature in an evolving topological space. A prime example of an evolving topological space is a space-filling diagram that grows in the way discussed in the preceding section. As before, we write $K_0 \subset K_1 \subset \ldots \subset K_m$ for the corresponding filtration. The $K_j$ are the complexes that arise during the evolution and, in the generic case, any two contiguous complexes differ either by a metamorphosis or an anti-collapse. Each anti-collapse may be viewed as a sequence of metamorphoses in which the later simplices destroy the topological features created by the earlier simplices. For example, the inverse of a 23-collapse consists of a triangle creating a void and a tetrahedron filling the same. The life-time of this void is zero because the triangle and the tetrahedron are added at the same moment. We will see that even if a triangle and a tetrahedron are added at different moments, it is possible to decide in an unambiguous manner whether or not the tetrahedron destroys what the triangle created. If it does, then we are talking about a void with positive life-time, and we may interpret that life-time as a measure of significance of the void. We may also interpret it as a shape measure of the corresponding pocket. Such measures can be used to distinguish between pockets with relatively wide and narrow entrances, and they are essential in the definition of molecular interfaces discussed in the next section.

The intuition. The idea of creation and destruction is the same as in Section IV.3 and depends on the effect on the Betti numbers: a $k$-simplex creates if its addition increases $\beta_k$ and it destroys if its addition decreases $\beta_{k-1}$. Consider the evolving two-dimensional space illustrated in Figure V.9 as an example. There are three events at which homology classes are created, namely when the two components get born at the points labeled M0 and when the components merge the second time at the upper point labeled M1. When the components merge the first time, a component gets destroyed, and when the hole gets filled, a 1-cycle gets destroyed. It should be clear that M2 destroys what the upper M1 created, and that the lower M1 destroys what the right M0 created. Nobody destroys the component created by the left M0. We will formalize the idea of pairing creations with destructions by revisiting the incremental algorithm for Betti numbers presented in Section IV.3.

Figure V.9: The region grows from two vertices, the two components merge twice, and the second merge creates a void that eventually disappears. The labels indicate the types of metamorphoses that correspond to the topological changes.

Incremental algorithm revisited. Recall that a single step in that algorithm computes the Betti numbers of a complex $K_j = K_{j-1} \cup \{\sigma\}$ from the Betti numbers of $K_{j-1}$. We study the algorithm in terms of matrices of boundary homomorphisms, which are displayed in Figure V.10. Let the dimension of $\sigma$ be $k$. The only matrices affected by adding $\sigma$ to the complex are the ones of $\partial_k$ and of $\partial_{k+1}$. The new column of the matrix of $\partial_{k+1}$ is zero because $\sigma$ is not a face of any simplex in $K_{j-1}$. Hence, the rank of the $k$-th boundary group is the same for $K_j$ as it is for $K_{j-1}$. On the other hand, the rank of the $(k-1)$-st boundary group may remain the same or it may increase. We can thus write the Betti numbers of $K_j$ in terms of the ranks of various groups defined for $K_{j-1}$, distinguishing two cases.

Figure V.10: The addition of $\sigma$ to the complex appends a column to the matrix of $\partial_{k+1}$ and a row to the matrix of $\partial_k$.

Case $\sigma$ creates. Then $\sigma$ belongs to a $k$-cycle, which implies that its row in the matrix of $\partial_k$ can be zeroed out. The rank of the $k$-th cycle group increases by one and the rank of the $(k-1)$-st boundary group remains the same:

$$\beta_k(K_j) = \beta_k(K_{j-1}) + 1, \qquad \beta_{k-1}(K_j) = \beta_{k-1}(K_{j-1}).$$

In words, the $(k-1)$-st Betti number remains unchanged and the $k$-th Betti number increases by one.

Case $\sigma$ destroys. Then $\sigma$ does not belong to a $k$-cycle. Its row in the matrix of $\partial_k$ can therefore not be zeroed out and we get a new non-zero entry in the normal form of that matrix. The rank of the $(k-1)$-st boundary group increases by one and the rank of the $k$-th cycle group remains the same:

$$\beta_{k-1}(K_j) = \beta_{k-1}(K_{j-1}) - 1, \qquad \beta_k(K_j) = \beta_k(K_{j-1}).$$

In words, the $(k-1)$-st Betti number decreases by one and the $k$-th Betti number remains unchanged.

The case analysis confirms that the incremental algorithm as described in Section IV.3 computes the Betti numbers correctly.

Recognizing creations. Besides re-proving the correctness of the incremental algorithm, the above analysis points the way to an alternative procedure for distinguishing creating from destroying simplices. Instead of a union-find data structure, we use elementary row operations, which are slower but more general. To describe how this is done, we call the column of the rightmost non-zero entry in a row its last column. We maintain inductively that each column is last for at most one row. Clearly, this property is satisfied by the matrix in Figure V.11 before the shaded last row is added. In general, columns in the matrix of $\partial_k$ correspond to individual $(k-1)$-simplices and rows represent cycles. When we add the row that corresponds to the new simplex $\sigma$, we use row operations to reinstate the property before adding the next row.

Figure V.11: The shaded rightmost non-zero entries identify last columns of rows.

To explain the algorithm, we let $i$ be the index of the row that corresponds to the new simplex, and we assume a function LASTCOL that returns the index of the last column of a row; it returns zero if that last column does not exist. Given a column $\ell$, we also assume a function ROW that returns the index of the row, among the first $i-1$ rows, for which $\ell$ is the last column. It returns zero if the row is not defined.

    boolean DOESCREATE(int i):
      while ℓ = LASTCOL(i) ≠ 0 do
        if h = ROW(ℓ) ≠ 0 then row i = row i + row h
          else return FALSE
        endif
      endwhile;
      return TRUE.

In words, we attempt to zero out the row from right to left. Since we only use row operations, the rank of the matrix does not change. After running Function DOESCREATE for the $i$-th row, that row is either zero, in which case the corresponding simplex creates, or it has a unique last column, in which case the simplex destroys. In either case, each column is again last for at most one row. We argue below that Function DOESCREATE computes more than just Betti numbers: it also determines how long a homological feature lasts along the filtration.

Persistent homology. To make this precise, we return to the situation in which the filtration represents meaningful information, such as scale in the case of alpha shapes. In general, we define persistence so it depends on the time when simplices are added to the complex in the filtration, but to simplify matters here, we re-define time equal to the index, that is, we say $\sigma_j$ is added at time $j$. Keeping this convention in mind, we now define the $p$-persistent $k$-th homology group of $K_j$ as the cycle group divided by the boundary group $p$ positions later in the filtration:

$$H_k^{j,p} = Z_k^j / (B_k^{j+p} \cap Z_k^j).$$

Taking the intersection of the boundary group with the cycle group is necessary for technical reasons to define the quotient group. Figure V.12 illustrates the difference between the $p$-persistent homology group and the usual or 0-persistent homology group.

Figure V.12: The cycle group $Z^j$ and its decompositions into solid $p$-persistent homology classes (cosets of $B^{j+p}$) and dotted 0-persistent homology classes (cosets of $B^j$).

The $p$-persistent $k$-th Betti number is the rank of the $p$-persistent $k$-th homology group: $\beta_k^{j,p} = \mathrm{rank}\, H_k^{j,p}$.

Interval property of persistence. We develop an intuitive picture of persistence using the distinction between creating and destroying simplices. Note that the number of creating $k$-simplices until position $j$ in the filtration is the rank of the cycle group: $\mathrm{rank}\, Z_k^j$. Similarly, the number of destroying $(k+1)$-simplices is the rank of the boundary group: $\mathrm{rank}\, B_k^j$. The Betti number is the surplus of creating versus destroying simplices: $\beta_k^j = \mathrm{rank}\, Z_k^j - \mathrm{rank}\, B_k^j$. Because Betti numbers are non-negative, every prefix contains at least as many creating $k$-simplices as destroying $(k+1)$-simplices. In particular, the creating $k$-simplices and destroying $(k+1)$-simplices are arranged like opening and closing parentheses in an expression, except that some closing parentheses may be missing at the end. We can therefore pair them up and form vertex disjoint intervals, each starting at the position of a creating $k$-simplex and ending at the position of a destroying $(k+1)$-simplex (or extending to infinity if there are no destroying simplices left). We use intervals that are closed to the left and open to the right. The Betti number at position $j$ is then the number of intervals that contain $j$. This is the convention we used to generate Figure IV.16. Any arbitrary pairing creating vertex disjoint intervals has this property for Betti numbers. (Can you prove that?) In contrast, there is exactly one pairing that has the following stronger property for persistent Betti numbers:

INTERVAL PROPERTY. The $p$-persistent $k$-th Betti number at position $j$ is the number of intervals that simultaneously contain $j$ and $j+p$.

We illustrate this property by drawing a right-angled isosceles triangle below every interval, as shown in Figure V.13. Each triangle is closed along the top and left edges but open along the hypotenuse. The $p$-persistent $k$-th Betti number of $K_j$ is represented by the point $(j, p)$ in the index-persistence plane. According to the Interval Property, it is the number of right-angled isosceles triangles that contain this point. Figure V.14 shows the persistent first Betti numbers of the space-filling diagram modeling the gramicidin protein. Observe the large triangular plateau, which corresponds to the dominant tunnel that passes through gramicidin.

Figure V.13: Each right-angled isosceles triangle in the index-persistence plane represents a non-bounding cycle that persists over the complexes covered by its interval. (Horizontal axis: index; vertical axis: persistence.)

Figure V.14: Graph of the number of tunnels in log-scale for gramicidin. The index in the filtration varies from left to right and the persistence from back to front.

Pairing. The pairing of simplices to obtain intervals satisfying the Interval Property is done using Function DOESCREATE explained above. Specifically, each destroying $k$-simplex corresponds to a non-zero row in the matrix of $\partial_k$ and is paired with the $(k-1)$-simplex that corresponds to the last column in that row. Note that this simplex indeed creates, as is witnessed by the cycle represented by the row. The persistence of a pair is the time-lag between the additions of the two simplices to the complex in the filtration. In the assumed simplified case in which $\sigma_j$ is added at time $j$, the persistence of the pair $(\sigma_i, \sigma_j)$ is the difference between indices: $j - i$. The running time of the pairing algorithm is roughly the same as that of the normal form algorithm described in Section IV.4, namely cubic in the number of simplices. Indeed, Function DOESCREATE spends fewer row operations per simplex than there are simplices, each taking time at most proportional to the length of a row, which is again at most the number of simplices.
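The pairing by row operations and the Interval Property can be illustrated with a small sketch. It is not the implementation of [1] in the bibliographic notes: it assumes modulo-2 arithmetic, stores each row as a Python set of column indices, uses a single index space for simplices of all dimensions, and counts positions from 0 rather than 1. All function names are hypothetical.

```python
from itertools import combinations


def persistence_pairs(filtration):
    """Pair creating with destroying simplices by reducing rows right to left.

    `filtration` lists simplices as sorted vertex tuples, faces before cofaces.
    Returns (pairs, unpaired), where pairs holds (creator, destroyer) indices.
    """
    index = {simplex: j for j, simplex in enumerate(filtration)}
    row_with_last = {}  # last column index -> reduced row having that last column
    pairs, unpaired = [], set()
    for i, simplex in enumerate(filtration):
        # Boundary row of the new simplex: the indices of its codimension-1 faces.
        row = {index[f] for f in combinations(simplex, len(simplex) - 1) if f}
        while row and max(row) in row_with_last:
            row ^= row_with_last[max(row)]  # row addition modulo 2
        if row:  # non-zero row: the simplex destroys
            last = max(row)
            row_with_last[last] = row
            unpaired.discard(last)
            pairs.append((last, i))
        else:    # zero row: the simplex creates
            unpaired.add(i)
    return pairs, sorted(unpaired)


def persistent_betti(filtration, pairs, unpaired, k, j, p):
    """Count intervals [a, b) of dimension k that contain both j and j + p."""
    intervals = list(pairs) + [(a, float("inf")) for a in unpaired]
    return sum(1 for a, b in intervals
               if len(filtration[a]) - 1 == k and a <= j and j + p < b)


# A filled triangle, added one simplex at a time.
triangle = [(0,), (1,), (2,), (0, 1), (1, 2), (0, 2), (0, 1, 2)]
pairs, unpaired = persistence_pairs(triangle)
```

For the filled triangle, the third edge creates a 1-cycle that the triangle destroys one step later, so the corresponding interval has persistence 1, while the first vertex is never paired and its interval extends to infinity.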

Bibliographic notes. The material for this section is taken from [1], where we find the definition of persistent Betti numbers, the algorithm and its correctness proof. We should note, however, that the implementation in [1] differs in two possibly significant aspects from the algorithm described in this section. First, the implementation uses a union-find data structure to classify simplices as creating or destroying, and second, it uses a sparse matrix representation that permits row operations in time proportional to the number of non-zero entries. The algorithm has been implemented and experimental results suggest it is considerably faster than the obvious cubic time bound. Persistent Betti numbers have been defined independently by Robins [3], who uses them to study the fractal nature of two-dimensional point patterns. Persistent homology groups are embedded in spectral sequences, which are special tables of related homology groups [2]. It might be interesting to explore the other groups in that table and to find meaningful interpretations in the context of alpha complexes.

[1] H. Edelsbrunner, D. Letscher and A. Zomorodian. Topological persistence and simplification. Discrete Comput. Geom. 28 (2002), 511–533.
[2] J. McCleary. A User's Guide to Spectral Sequences. Second edition, Cambridge Univ. Press, England, 2001.
[3] V. Robins. Toward computing homology from finite approximations. Topology Proceedings 24 (1999), 503–532.

V.3 Molecular Interfaces

The interface between two or more interacting molecules is the location of that interaction. In this section, we present a proposal for a surface or complex of surfaces that geometrically represents that interface. One of its applications is to display functions defined over the interface.

Interfaces without boundary. Our definition of a molecular interface is a formalization of two intuitions, namely that the best separation of two or more molecules is part of the Voronoi diagram and that the interesting portion of that separation is protected by a relatively tight seal. We will come back to the second intuition later and formalize the first intuition now. Consider an assembly of $k \ge 2$ molecules, each represented by a collection of balls $B_i$ in $\mathbb{R}^3$, and let $B$ be the collection of all balls. We use colors to keep track of the correspondence between balls and molecules. Specifically, if a ball $b$ belongs to $B_i$ then we say $b$ and its Voronoi cell have the color $i$. Recall that the Voronoi diagram of $B$ consists of a polyhedral cell for each ball and of the polygons, edges and vertices shared by the cells. The polygons, edges and vertices get their colors from the cells they belong to. While all cells are mono-chromatic, a polygon can be mono-chromatic or bi-chromatic depending on whether the two cells that share the polygon have the same or different colors. The interface between the $k$ colors is the subcomplex of the Voronoi diagram consisting of all bi-chromatic polygons and their edges and vertices. Figure V.15 illustrates the definition by showing the interface of two collections of disks in the plane.

Figure V.15: The solid bi-chromatic edges form the interface of the two collections of disks. The dotted mono-chromatic edges show the rest of the Voronoi diagram.

Local structure. For $k = 2$ colors, the interface has a particularly simple local geometric structure. In the generic case, every edge belongs to three and every vertex to four Voronoi cells. An interface edge belongs to two cells of one and to one cell of the other color, and exactly two of the three polygons sharing the edge are bi-chromatic and thus belong to the interface. There are two types of interface vertices: those that belong to three cells of one and one cell of the other color and those that belong to two cells of each color. As illustrated in Figure V.16, the local neighborhood of both types of vertices is a topological disk. We conclude that in the generic case the interface for $k = 2$ colors is a 2-manifold, which is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$. By construction, that 2-manifold is orientable, with the cells of one color on one side and the cells of the other color on the other side.

Figure V.16: The shaded polygons and their edges belong to the interface. On the left, we have three cells of one and one cell of the other color. On the right, we have two cells of each color.

For $k \ge 3$, the local structure of the interface can be more complicated because we may have tri-chromatic edges and tri- and four-chromatic vertices. For any two colors, we get a 2-manifold, but now these 2-manifolds meet along curves formed by tri-chromatic edges. This implies that for $k \ge 3$ colors, the interface is a two-dimensional complex of sheets, curves and vertices. Every sheet is a maximal component consisting of bi-chromatic polygons, edges and vertices of a given color pair. Similarly, every curve is a maximal component consisting of tri-chromatic edges and vertices of a given color triplet. Finally, every interface vertex is a four-chromatic vertex in the Voronoi diagram. Together, the sheets, curves and vertices form a complex in the sense that the boundary of every sheet consists of finitely many pairwise disjoint curves and vertices, and the boundary of every curve consists of finitely many interface vertices.
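The color bookkeeping behind the definition of the interface can be made concrete on the dual side: a Voronoi polygon is bi-chromatic exactly when the Delaunay edge dual to it connects two balls of different colors. The following sketch selects these edges and groups them by color pair. It assumes, purely for illustration, that balls are identified by integers, that `color` maps each ball to its molecule, and that the Delaunay edges are given as pairs of ball indices. The grouping is by color pair only; splitting each group into connected sheets would need an additional component search.

```python
from collections import defaultdict


def bichromatic_edges_by_color_pair(edges, color):
    """Group Delaunay edges whose endpoints have different colors.

    Each selected edge is dual to a bi-chromatic Voronoi polygon,
    that is, to a polygon of the interface.
    """
    groups = defaultdict(list)
    for a, b in edges:
        if color[a] != color[b]:
            pair = tuple(sorted((color[a], color[b])))
            groups[pair].append((a, b))
    return dict(groups)


# Four balls in three molecules (hypothetical labels).
color = {0: "red", 1: "red", 2: "blue", 3: "green"}
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (1, 3)]
groups = bichromatic_edges_by_color_pair(edges, color)
```

The mono-chromatic edge (0, 1) is dropped because its dual polygon separates two cells of the same molecule.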

Retraction. As defined above, the interface may go to infinity, which is sometimes a disadvantage. Our goal here is to shrink the interface back to where the molecules are sufficiently close to interact. It seems natural to do this with a distance threshold, but this would most certainly lead to the deletion of interior portions and produce fractured surfaces. We therefore shrink from outside in and use relative rather than absolute distance measurements to decide where to stop the process.

To describe the shrinking process, we consider the Delaunay triangulation, $D$, of the collection of balls $B$. We have mono-chromatic vertices and mono- as well as multi-chromatic edges, triangles and tetrahedra. The interface as defined above is dual to the subset of multi-chromatic simplices in $D$. Let $K$ denote the dual complex. In the first step, we retract the interface back to the multi-chromatic dual of the dual complex and its pockets. In the second step, we use topological persistence to shrink the interface even further. We will return to the second step later.

Note that the first step of the shrinking process is equivalent to removing all tetrahedra outside the dual complex that belong to the ancestor set of the dummy tetrahedron, $\omega$, which represents the space outside the Delaunay triangulation. We use 23-collapses to remove these tetrahedra. We simplify the algorithm by ignoring principal triangles, edges and vertices; in other words, we delete principal triangles, edges and vertices as soon as they arise. In this context, we consider a pair $(\tau, \sigma)$ collapsible if the pair is part of an anti-collapse in the construction of the filtration and the collapse of $\tau$ and $\sigma$ renders the other simplices in this anti-collapse principal. This is equivalent to saying that the effect of the 23-collapse is the inverse of that anti-collapse.

    void COLLAPSE(τ, σ):
      if σ ∉ K and (τ, σ) is collapsible then
        forall faces υ of σ with τ ≤ υ do
          delete υ
        endfor
      endif.

We define a retraction as a maximal sequence of collapses; in other words, we collapse as long as we can. In the implementation of this operation, we maintain a stack of candidate pairs. Initially, this stack contains all boundary triangles of the Delaunay triangulation together with their incident tetrahedra. During the process, we take pairs from the stack and add new pairs whenever we create new boundary triangles by collapsing.

    Complex RETRACT:
      while the stack is non-empty do
        (τ, σ) = POP; COLLAPSE(τ, σ)
      endwhile.

We may think of a retraction as successively removing sinks from an acyclic directed graph. It follows that the result of the operation is independent of the sequence in which the collapses are performed. The result of the retraction is the collection of tetrahedra in the dual complex together with the tetrahedra in the pockets. We further remove all mono-chromatic tetrahedra and let $T$ denote the remaining collection of multi-chromatic tetrahedra. The interface is now obtained as the dual of $T$.

Clipping. More specifically, for each bi-chromatic edge of the tetrahedra in $T$, we add the dual polygon to the interface. There are, however, complications because such a bi-chromatic edge may either be completely or only partially surrounded by tetrahedra in $T$. In the latter case, we clip the polygon before adding it to the interface. Figure V.17 illustrates this idea in two dimensions, but we should keep in mind that the situation in three dimensions is more complicated. A partially surrounded bi-chromatic edge corresponds to a polygon with two types of vertices: those dual to tetrahedra in $T$ and the others. We clip the polygon by cutting each edge connecting vertices of different types with the plane of the corresponding boundary triangle. If that plane does not intersect the dual Voronoi edge, which happens in rare cases, we clip at the endpoint that is closer to the plane. Finally, we connect the cut points in contiguous pairs and retain the portions of the polygon with vertices of the first type.

Figure V.17: The triangles drawn with solid edges are the bi-chromatic triangles constructed by the contraction algorithm. The boldface interface is dual to this collection and clipped at its boundary.

Initially. although it can be. The algorithm maintains a stack of triangletetrahedron pairs formed by the topological persistence algorithm. we get the interface by duality from the computed collection of tetrahedra. but we have to modify the retraction to allow for collapses of simplices in the dual complex. The running time is dominated by the topological persistence algorithm. For dual complex of contains bi-chromatic triangles. For a fixed . To decide whether or not to remove and in the first place. The dimension of is one larger than that of . Since a smaller threshold permits as many or more removals than a larger threshold. endfor. :   ¨ ¨       D   ¨ ¨       ¡     ¨   ¡ ¡   ¡     ¡      ¡ ¡  ¡      ¡ ¡ ¨  ¡    ¨ ¡    ¨ ¢      ¤ ¨ ¨ ¡ ¡   £ ¨       &¡      &¡   ¡   ¨   ¨ ¤ ¢     ¨ ¨ ¨ ¨ ¢  ¢ ¡          ¨ ¨ ¡ ¡ ¡     ¢ £  D ¡     ¡ ¡ ¡ ¨ ¨   endif . We now take the shrinking process beyond the retraction from the dummy tetrahedron. Eventually. if we use . Note that for we have ¢   Complex R ETRACT M ORE while the stack is non-empty do P OP . if then R EMOVE endwhile. we get a filtration that is parametrized in a way similar to the sequence of alpha shapes. for . we can implement the rest of the algorithm so it takes only constant time per simplex in the Delaunay triangulation. There are two kinds of one-dimensional elements: the original tri-chromatic curves and the new bi-chromatic curves outlining the sheet boundary created by shrinking. where and are the moments when and are born. In other words. but it is more complicated because is generally not a face of . and are the moments when and are born. which would remain. Because of our policy to delete principal triangles. edges and vertices. However. however. the interface is the original surface or complex defined by the set of bi-chromatic Voronoi . in which case we recurse for other pairs of simplices before deleting . 
namely the original four-chromatic vertices and the new tri-chromatic vertices forming the curve boundary created by shrinking. but we are only interested in the case in which is a triangle and is a tetrahedron. all other collapses can be ignored. For example. the interface shrinks with decreasing . In this case. © ¨ ¦ ¤ §§¨ ¥   ¡ ¡ ¡ £      &¡  ¢     &¡     ¢   ¨  ¨ on the boundary of the current set . We do the operation only if is a boundary triangle of and does not belong to the dual complex. forall triangles do P USH R ETRACT endif. Its two-dimensional elements are sheets defined by bi-chromatic Voronoi polygons. The monotonicity guarantees that the simplices between and are removed by recursive deletions so that can eventually be deleted. We may start with the set of all Delaunay tetrahedra. which takes cubic time to form the triangle-tetrahedron pairs.2 generates simplex pairs with the property that destroys what created. Indeed. gets deleted just because it becomes principal. If the retraction from reaches far enough. the stack contains all pairs with  (V. Here. we can further decrease the interface by making negative. it can happen that the retraction does not reach all the way. For . we remove principal triangles. £¤¥      This monotonicity property is important for the correctness of the algorithm because if the retraction from does not reach then this can only be because there is a triangle between and that split the void created by before it was destroyed by . . the interface is a two-dimensional complex. Finally. We first delete and then retract from . the interface is empty. edges and vertices as soon as they get created. With some care. We take all sheets and curves as open sets so the complex is a collection of pairwise disjoint open elements. Recall that the topological persistence algorithm of Section V.1)  ¡ ¡ ¡ £      &¡  £   ¢        Here. 
We now restate the algorithm and simplify its description by declaring a 23-collapse as a special case of a removal. Note that we may get different interfaces for different values of the threshold . we compare their persistence with a constant threshold and remove only if . We think of the operation that removes and as a generalization of a 23-collapse. A second potential advantage of this function over the inverse of the persistence is that it is dimensionless and thus amenable to the use of universally meaningful constant thresholds. Note. we may bias the shrinking process against large triangles and tetrahedra by using . But then the other part of the void must have been destroyed by a tetrahedron preceding in the filtration. unless the polygons. is the tetrahedron that shares with . the interface is guaranteed to be empty. This is done implicitly during the retraction. void R EMOVE : if then delete . As before. We note that it is possible to use other functions that satisfy the monotonicity property (V.82 Further retraction.1). there are two kinds of zero-dimensional elements. Global structure. V S HAPE F EATURES As before.

The elements of the interface are not necessarily simply connected. To explore this further, we excise thin strips along the curves to turn each sheet into a connected 2-manifold with boundary. Each component of the boundary is a closed curve outlining a hole in the 2-manifold. A classic result in topology says that two orientable 2-manifolds with boundary are homeomorphic if and only if they have the same genus and the same number of holes. We may think of this manifold as obtained by punching $h$ holes into a $g$-fold torus. Furthermore, the Euler characteristic of a 2-manifold with genus $g$ and $h$ holes is

$\chi = v - e + t = 2 - 2g - h,$

where $v$, $e$ and $t$ are the numbers of vertices, edges and triangles of an arbitrary triangulation of the 2-manifold. Given a sheet, it is easy to compute its Euler characteristic and to determine its number of holes. We then get the genus as $g = (2 - h - \chi)/2$.

Bibliographic notes. The material in this section is taken from the recent manuscript by Ban et al. [1]. There is evidence that the geometric interfaces shed new light on the hot-spot theory of protein-protein interaction [4]. A competing proposal for a geometric definition of molecular interfaces can be found in [3], where two independent real parameters are used to define the interface as a portion of the molecular surfaces of the two or more molecules. In topology, 2-manifolds with and without boundary have been studied for more than a century. The fact that the topological type of a connected orientable 2-manifold is determined by the genus and the number of holes can be found in a number of texts, including [2].

[1] Y.-E. Ban, H. Edelsbrunner and J. Rudolph. A definition of interfaces for protein oligomers. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[2] W. S. Massey. Algebraic Topology: an Introduction. Springer-Verlag, New York, 1967.
[3] A. Varshney, F. P. Brooks, Jr., D. C. Richardson, W. V. Wright and D. Manocha. Defining, computing and visualizing molecular interfaces. In "Proc. IEEE Visualization, 1995", 36–43.
[4] J. A. Wells. Binding in the growth hormone receptor complex. Proc. Natl. Acad. Sci. 93 (1996), 1–6.
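As an illustration of the genus computation described in this section, the following Python sketch recovers the genus of a sheet from a triangulation. It assumes the sheet is given as a list of vertex-index triples forming a connected orientable 2-manifold with boundary; the function names are hypothetical and are not part of the interface software. Holes are counted as closed curves of boundary edges, that is, edges that belong to exactly one triangle.

```python
from collections import defaultdict

def euler_characteristic(triangles):
    """Return v - e + t for a triangulated surface."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(pair) for a, b, c in triangles
             for pair in ((a, b), (b, c), (c, a))}
    return len(vertices) - len(edges) + len(triangles)

def count_holes(triangles):
    """Count boundary curves; a boundary edge lies in exactly one triangle."""
    incidence = defaultdict(int)
    for a, b, c in triangles:
        for pair in ((a, b), (b, c), (c, a)):
            incidence[frozenset(pair)] += 1
    # Graph formed by the boundary edges only.
    neighbors = defaultdict(set)
    for edge, count in incidence.items():
        if count == 1:
            u, v = tuple(edge)
            neighbors[u].add(v)
            neighbors[v].add(u)
    # On a 2-manifold with boundary, every connected component
    # of this graph is one closed curve, that is, one hole.
    seen, holes = set(), 0
    for start in neighbors:
        if start in seen:
            continue
        holes += 1
        stack = [start]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(neighbors[u] - seen)
    return holes

def genus(triangles):
    """Solve chi = 2 - 2g - h for g."""
    chi = euler_characteristic(triangles)
    h = count_holes(triangles)
    return (2 - h - chi) // 2

# An annulus: outer triangle 0,1,2 and inner triangle 3,4,5.
annulus = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4), (2, 0, 5), (0, 3, 5)]
print(euler_characteristic(annulus), count_holes(annulus), genus(annulus))
```

For the annulus, the six vertices, twelve edges and six triangles give Euler characteristic zero, and the two boundary curves account for it entirely, so the genus is zero.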

V.4 Software for Shape Features

In this section, we explore extensions of the Alpha Shape software that are concerned with connectivity information and shape features. We begin with signatures, then proceed to pockets, and finally look at molecular interfaces.

Betti number signatures. As explained in Section IV.2, the components, tunnels and voids of a complex in $\mathbb{R}^3$ are counted by the Betti numbers $\beta_0$, $\beta_1$ and $\beta_2$. They are computed by the algorithm explained in Section IV.3 and displayed to the right of the correspondingly labeled buttons in the signature panel shown in Figure V.19. To the left of each button we can toggle the display of the evolution of the number as a function of the index in the filtration. We refer to these functions as signatures of the data set.

Figure V.18: Three axis-parallel views of the 2,354-th dual complex in the filtration of a periodic zeolite molecule consisting of 1,296 atoms.

Figure V.19: The signature panel with the tunnel signature displayed in log-scale.

As an example consider the zeolite data shown in Figure V.18. Two of the three views are taken along tunnel systems that intersect orthogonally and give rise to a rather complicated cave system. Note that the tunnels shown in the second view are smaller in diameter than those shown in the third view. It follows that there are complexes in the filtration that have the tunnels in the first system closed while the tunnels in the second system are still open. The two systems can be detected in the tunnel signature shown in Figure V.19. The index 2,354 belongs to the higher of the two plateaus, which implies that both tunnel systems are open in the displayed complex. The persistence of the tunnels is formally defined in Section V.2. Figure V.20 shows the two-dimensional tunnel signature with filtration index increasing from left to right and persistence increasing from back to front. The noise in the signature decreases from back to front. The two persistent tunnel systems are visible as plateaus that escape the noise removal the longest.

Figure V.20: The graph of the number of tunnels, in log-scale, of the zeolite data.

Displaying pockets. Prior to developing and implementing pockets, we have experimented with other and more simple-minded ideas aimed at getting a handle on cavities in molecular data. One such idea was to display the difference between the Delaunay triangulation and the dual complex, or more generally the difference between two dual complexes in the filtration. This difference can be computed in the Alpha Shape software by first selecting the two indices and second pushing the 'Difference' button in the scene panel. The results are not encouraging because a typically large number of inessential simplices clutters the view of important cavities. In contrast, the dual set of a pocket usually gives a clear indication of the cavity. Figure V.21 shows all pockets of the zeolite data, and Figure V.22 shows the largest of the pockets in Figure V.21 from a different angle. We should keep in mind that the pocket in the dual complex is geometrically considerably larger than the pocket in the corresponding space-filling diagram. This effect is the reverse of that for the molecule, whose dual complex is considerably smaller than a corresponding space-filling diagram, as shown in Figure II.17. We observe the same phenomenon for the mouths of a pocket. The mouth regions are therefore visually easily identifiable.

Figure V.21: All pockets in the dual complex of the zeolite data for index 2,926.

Figure V.22: Side view of the largest pocket of the collection shown in Figure V.21.

Remember that pockets in the dual complex are not closed under the face relation. For example, two pockets may appear connected but are not because of missing shared triangles. In other words, the internal connectivity of the pockets is not immediately visible, which may lead to confusion. It is possible to visually inspect the connectivity by turning on the display of simplices of all dimensions in the scene panel, and using the explosion function to separate all simplices. Two boundary triangles that share a common edge may or may not belong to the same mouth depending on which shared edges belong to the pocket. The software indicates the presence or absence of boundary triangles by the choice of color.

Pocket panel. Pockets can be computed without opening the pocket panel, but a more detailed exploration requires interaction with the software, which is facilitated by that panel. The main design of the pocket panel, shown in Figure V.23, is similar to that of the signature panel. It contains a window for its own signatures, which start after the index of the first chosen complex. The second index can be chosen anywhere between the first index and the maximum. It is used to eliminate ancestor sets of tetrahedra whose indices are larger than or equal to the second index. In other words, all tetrahedra with such indices are treated like the dummy tetrahedron in the computation of pockets. This elimination of large pockets helps in the exploration of detailed structures, such as side pockets of larger pockets. An example is shown in Figure V.24, which shows the pockets filling the system of narrow tunnels visible in the second view in Figure V.18, but with the second index set such that the system of wider tunnels visible in the third view of Figure V.18 is still open. Both systems are shown as holes in Figure V.18. The pockets thus only fill the remains of the narrow tunnels, and as can be seen in the first view, these remains are not connected.

Figure V.23: Pocket panel of the Alpha Shape software.

Figure V.24: Three axis-parallel views of the pockets representing the narrow tunnel system decomposed into pieces by opening up the wide tunnel system.

The panel also provides a means to step through the sequence of individual pockets and to select pockets by their number of mouths. The interface also supports the display of individual pockets, as in Figure V.22. A useful feature is the 'Shapewire' button, which can be used to display the edge skeleton of the dual complex together with the pockets. The skeleton does not block the view and helps positioning the pockets relative to the complex.

Displaying interfaces. The interface software has been developed by Yih-En (Andrew) Ban but is not yet complete. It is currently not part of the Alpha Shape software. It is built on top of the Alpha Shape software but requires a variety of additional features to be useful to biologists. Some of these features can be seen in visualizations of interfaces presented in this section.

A human growth hormone example. [Say a few words about the particular two proteins.] [The input is a complexed collection of proteins.] [Mention the issue of water molecules, which we remove for simplicity.] [Show the sequence of figures illustrating the interface filtration.] [Talk about the weighted square distance function over the interface.] [Show one figure with iso-lines of that function.]

Bibliographic notes. The pocket software has been developed by Michael Facello and is described in his dissertation [1]. Using this software, Liang and collaborators [3] studied fifty-one proteins and their cavity structure. The most interesting outcome of that study is perhaps that in about 80% of the cases, the pocket with the largest volume is also the biologically active site of the molecule. In many instances, the largest pocket is assisted in its function by smaller auxiliary pockets in the vicinity. In another application, Liang and Dill [2] provide numerical evidence that proteins are packed tighter in the core than near the outside. The persistence software has been developed by Afra Zomorodian and is described in his dissertation [4].

[1] M. A. Facello. Geometric Techniques for Molecular Shape Analysis. Ph.D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.
[2] J. Liang and K. A. Dill. Are proteins well-packed? Biophysics J. 81 (2001), 751–766.
[3] J. Liang, H. Edelsbrunner and C. Woodward. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand binding. Protein Science 7 (1998), 1884–1897.
[4] A. Zomorodian. Analyzing and Comprehending the Topology of Spaces and Morse Functions. Ph.D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 2001.

Exercises

1. 2-manifolds. Recall that a 2-manifold is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$.
(i) Show that a two-dimensional simplicial complex in which every edge belongs to exactly two triangles is not necessarily a 2-manifold.
(ii) Show that a simplicial complex in which the closed star of every vertex is the triangulation of a disk is necessarily a 2-manifold.

2. Sperner's Lemma. Let $abc$ be a triangle and $K$ a triangulation of $abc$. The label of a vertex in that triangulation that lies on the edge $xy$ is either $x$ or $y$, for $xy \in \{ab, bc, ca\}$, and the label of a vertex in the interior of $abc$ is either $a$, $b$, or $c$.
(i) Prove that there exists at least one triangle in $K$ whose vertices have three different labels.
(ii) Strengthen the result in (i) by proving that the number of triangles with three different labels is odd.
(iii) What would be a natural generalization of these results from a triangle to a tetrahedron?

3. Barycentric subdivision. Let $K$ be a simplicial complex and let $\mathrm{Sd}\,K$ denote its barycentric subdivision.
(i) Show that each $k$-simplex in $K$ gives rise to $(k+1)!$ $k$-simplices in $\mathrm{Sd}\,K$.
(ii) Prove that the Euler characteristic of $K$ and $\mathrm{Sd}\,K$ are the same.

4. Collapsible complexes. We call a simplicial complex collapsible if there is a sequence of collapses that reduces it to a single vertex. Recall that a contractible topological space has the homotopy type of a point. Clearly, if $K$ is collapsible then its underlying space is contractible.
(i) Prove that if $K$ is embedded in $\mathbb{R}^2$ then $K$ is collapsible iff its underlying space is contractible.
(ii) Give an example of a simplicial complex embedded in $\mathbb{R}^3$ that is not collapsible but whose underlying space is contractible.

5. Connectivity of voids. A void of a space-filling diagram is by definition connected but can have handles and islands.
(i) How would you define the Betti numbers of a void?
(ii) Following your definition, can the Euler characteristic of a void be any integer or are there restrictions?

6. Ancestor sets in the plane. Consider the Delaunay triangulation of a finite point set in $\mathbb{R}^2$. Write $\tau \prec \sigma$ if the two Delaunay triangles $\tau$ and $\sigma$ share an edge and both orthocenters lie on $\sigma$'s side of that edge.
(i) Prove that $\prec$ is a partial order.
(ii) Prove that the ancestor sets of any two different sinks in the order are disjoint.

7. Gabriel graph. Let $S$ be a finite set of points in $\mathbb{R}^2$. The Gabriel graph of $S$ consists of all edges $ab$ for which $\|x - a\|^2 + \|x - b\|^2 \geq \|a - b\|^2$ for all points $x \in S$.
(i) Prove that all edges in the Gabriel graph belong to the Delaunay triangulation of $S$.
(ii) Prove that the Gabriel graph is connected.
(iii) Explain how the Gabriel graph relates to the ancestor sets of the sinks.

8. Paired parentheses. Consider a sequence of parentheses of a well-formed expression, such as for example (()())(). Each parenthesis has an integer position in the sequence, and the length of a pair is the position of the closing minus the position of the opening parenthesis. A pairing is a perfect matching between the opening and closing parentheses such that the opening parenthesis precedes the closing parenthesis in every pair.
(i) Given a pairing, let $L$ be the sum of lengths of the pairs.
(ii) Prove that $L$ depends on the given sequence but not on the pairing.


Chapter VI

Density Maps

Morse theory grew out of the study of the variational methods in analysis. The initial interest focused on high- and possibly infinite-dimensional settings. In this chapter, we introduce Morse theory with an emphasis on the two- and three-dimensional cases. While Morse theory requires differentiable spaces and thus seems to be built on rather specialized assumptions, we will see that many themes are familiar from Chapter IV. In some ways, Morse theory is but a different language or framework to talk about connectivity. The differentiability assumption allows the introduction of otherwise undefined concepts. Together with suitable non-degeneracy assumptions, it brings order into the complicated world of geometric form. Possibly the best known result in Morse theory is the relation between the critical points of a smooth real-valued function over a manifold and the Euler characteristic of that manifold. Because of this relation, Morse theory is sometimes also referred to as critical point theory.

We use two sections to introduce the basic setting of Morse theory and one to explain the concept of molecular pockets in Morse theoretic terms. In the second section, we make an effort to relate the Morse theoretic concepts with the discussion on connectivity.

[The material will have to be partially rearranged according to the following plan of sections:]

VI.1 Morse Functions
VI.2 Critical Points
VI.3 Morse-Smale Complexes
VI.4 Jacobian Submanifolds
Exercises

VI.1 Smooth vs. Piecewise Linear

This section introduces Morse functions as a crucial piece in the basic mathematical framework of Morse theory. Morse theory talks about manifolds and smooth functions over these manifolds. The primary goal is to find out about the topological type of the manifolds through a differential analysis of the functions. A Morse function is a smooth real-valued map over a manifold that satisfies certain non-degeneracy assumptions.

Sweeping a torus. The standard introductory example is the torus $M$ embedded in upright position in $\mathbb{R}^3$ and the height function this embedding defines. As shown in Figure VI.1, $h: M \to \mathbb{R}$ is defined by mapping each point to its distance from a horizontal plane below the torus. For each $a \in \mathbb{R}$, we consider the set of points with height less than or equal to $a$, $M_a = \{x \in M \mid h(x) \leq a\}$. It is instructive to look at the evolution of the homotopy type of $M_a$ as $a$ increases. The set $M_a$ changes its topology only at certain critical values of $a$. Each time the homotopy type of $M_a$ changes, we can interpret this event as attaching a cell of some dimension.

To define what attaching a cell exactly means, we need a few notions. A $k$-cell, $e^k$, is a space homeomorphic to the $k$-dimensional ball; note that the boundary of $e^k$ is a $(k-1)$-sphere. The attachment of $e^k$ to a space $X$ requires a continuous map $g: \partial e^k \to X$, which we refer to as the gluing map. Then $X$ with $e^k$ attached by $g$ is the space obtained by identifying every point $x \in \partial e^k$ with $g(x)$. All interior points of $e^k$ do not belong to $\partial e^k$ and are therefore not identified with any point of $X$. For $k = 0$ the boundary is empty, so attaching a point or 0-cell is the same as taking the disjoint union. The evolution of the torus during the sweep and the interpretation of attaching cells is illustrated in Figure VI.1.

Figure VI.1: Evolution of the torus in the sweep from bottom to top and the corresponding construction by attaching a 0-cell, two 1-cells, and a 2-cell.

Smooth manifolds. In order to relate the topological type to differential properties, we need to restrict ourselves to sets for which such properties are defined. We need some basic definitions from differential geometry to express these restrictions. A map $f: U \to V$ from an open set $U \subseteq \mathbb{R}^k$ to another open set $V \subseteq \mathbb{R}^\ell$ is smooth if the partial derivatives of all orders exist and are continuous. For general and not necessarily open sets $X \subseteq \mathbb{R}^k$ and $Y \subseteq \mathbb{R}^\ell$, the map $f: X \to Y$ is smooth if for every $x \in X$ there exists an open set $U$ containing $x$ and a smooth map $F: U \to \mathbb{R}^\ell$ that coincides with $f$ throughout $U \cap X$. Note that the composition of two smooth maps is smooth. A diffeomorphism is a smooth homeomorphism whose inverse is also smooth, and two spaces are diffeomorphic if there is a diffeomorphism between them.

A subset $M \subseteq \mathbb{R}^k$ is a smooth manifold of dimension $m$ if each $x \in M$ has a neighborhood $W \cap M$ that is diffeomorphic to an open subset $U \subseteq \mathbb{R}^m$. A particular diffeomorphism $g: U \to W \cap M$ is called a parametrization of $W \cap M$, and its inverse is called a coordinate system on $W \cap M$. As an example we may consider the 2-sphere, $S^2 = \{x \in \mathbb{R}^3 \mid \|x\| = 1\}$. We can cover $S^2$ with six open hemispheres defined by $x_i > 0$ and $x_i < 0$, for $i = 1, 2, 3$. As illustrated in Figure VI.2, each hemisphere can be parametrized by orthogonal projection to one of the coordinate planes.

Figure VI.2: The upper open hemisphere is parametrized by projection to the $x_1 x_2$-plane.

For a point $x \in M$, we can construct an $m$-dimensional hyperplane in $\mathbb{R}^k$ that best approximates $M$ near $x$. The tangent space at $x$, $TM_x$, is the $m$-dimensional hyperplane through the origin of $\mathbb{R}^k$ that is parallel to this best approximating hyperplane. The elements of the vector space $TM_x$ are called tangent vectors to $M$ at $x$.

Note that for every smooth curve in $M$ passing through $x$, the velocity vector of the curve at $x$ is a tangent vector and thus an element of the tangent space $TM_x$. A compact connected 1-dimensional manifold is a closed curve. A connected open proper subset of it is an open interval, which is homeomorphic to $\mathbb{R}$.

Critical points. Let $f: M \to \mathbb{R}$ be a smooth function, where $m$ is the dimension of the manifold $M$. Assuming a local coordinate system $(x_1, \ldots, x_m)$ in a neighborhood, a point $x$ is a critical point of $f$ if all derivatives vanish, that is, $\partial f / \partial x_i (x) = 0$ for all $i$. If $x$ is a critical point then $f(x)$ is a critical value. Non-critical points and non-critical values are also referred to as regular points and regular values. Consider the height function $h$ of the upright torus. Its critical points are the points with horizontal tangent planes, $p$, $q$, $r$ and $s$, marked in Figure VI.1. The homotopy type of the partial torus $M_a$ changes when $a$ passes the height value of the points $p$, $q$, $r$, $s$.

Just like the first derivative can be used to compute the best linear approximation to $f$, the second derivative can be used to compute the best quadratic approximation. Assuming a local coordinate system in a neighborhood, the Hessian of $f$ at $x$ is the matrix of second derivatives,

$H(x) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j} (x) \right)_{1 \leq i, j \leq m}.$

A critical point $x$ is non-degenerate if $H(x)$ is non-singular. Non-degenerate critical points are isolated, which means there is an open neighborhood without other critical points. We call $f$ a Morse function if all critical points are non-degenerate.

Index. The Hessian is symmetric and we can compute its eigenvalues. Assuming the Hessian is non-singular, all eigenvalues are non-zero. Recall that the eigenvectors define an orthogonal coordinate system in the tangent space of $M$. The index of $f$ at a non-degenerate critical point $x$ is the number of negative eigenvalues and is denoted as $\lambda(x)$. The index is then the number of eigenvector directions along which $f$ decreases. This fact is also expressed in the lemma of Morse.

MORSE LEMMA. Let $x$ be a non-degenerate critical point with index $\lambda$ of $f$. There is a neighborhood $U$ of $x$ and a local coordinate system $(y_1, \ldots, y_m)$ in $U$ with $y_i(x) = 0$ for all $i$ and

$f = f(x) - y_1^2 - \ldots - y_\lambda^2 + y_{\lambda+1}^2 + \ldots + y_m^2$

throughout $U$.

A quadratic function in two variables has only three types of critical points: minima, saddles, and maxima. The origin is a critical point for every possible assignment of signs to $\pm x_1^2 \pm x_2^2$, and it is a maximum for $-x_1^2 - x_2^2$, a saddle for $x_1^2 - x_2^2$ or $-x_1^2 + x_2^2$, and a minimum for $x_1^2 + x_2^2$. The saddle is the most interesting case of the three because a circle drawn around it has two peaks alternating with two pits. In contrast, a circle drawn around a regular point has only one peak and one pit, a maximum and a minimum. Critical points with small circles that oscillate more often than twice are necessarily degenerate.

For example, the indices of the critical points $p$, $q$, $r$, $s$ in Figure VI.1 are 0, 1, 1, and 2. Note that the dimensions of the cells attached to the evolving torus in Figure VI.1 are equal to the indices of the corresponding critical points. This is generally the case because a critical point with index $\lambda$ connects to the past along $\lambda$ directions. These directions span a $\lambda$-dimensional cell needed to realize the connections.

Degenerate critical points. As an example, consider the function $f(x) = x^3$ of one variable. The derivative vanishes at 0. The second derivative vanishes too, which identifies 0 as a degenerate critical point. Figure VI.3 illustrates the instability of the degenerate critical point. Geometrically, the degeneracy is manifested by the fact that an arbitrarily small perturbation can remove the critical point or turn it into two non-degenerate ones.

Figure VI.3: Graphs of three functions of one variable. Critical points are marked. The middle function has a degenerate critical point at 0, which is unfolded in different ways by the other two functions.

A similar degenerate critical point exists for the monkey saddle shown in Figure VI.4.
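The definitions of index and degeneracy can be tried out on functions of two variables. The following Python sketch is only an illustration under stated assumptions: the Hessian at a critical point is supplied by hand as the three numbers $f_{xx}$, $f_{xy}$, $f_{yy}$, the names `hessian_eigenvalues` and `classify_critical_point` are hypothetical, and the tolerance `eps` used to decide whether an eigenvalue is zero is an arbitrary choice.

```python
import math

def hessian_eigenvalues(fxx, fxy, fyy):
    """Eigenvalues of the symmetric matrix [[fxx, fxy], [fxy, fyy]]."""
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean - radius, mean + radius

def classify_critical_point(fxx, fxy, fyy, eps=1e-12):
    """Return (index, type) at a critical point, or (None, 'degenerate')."""
    low, high = hessian_eigenvalues(fxx, fxy, fyy)
    if abs(low) <= eps or abs(high) <= eps:
        return None, "degenerate"          # singular Hessian
    index = sum(1 for ev in (low, high) if ev < 0)
    return index, ("minimum", "saddle", "maximum")[index]

# Quadratic functions +-x^2 +- y^2 at the origin.
print(classify_critical_point(2, 0, 2))     # x^2 + y^2
print(classify_critical_point(2, 0, -2))    # x^2 - y^2
print(classify_critical_point(-2, 0, -2))   # -x^2 - y^2
# Monkey saddle x^3 - 3xy^2: the Hessian [[6x, -6y], [-6y, -6x]]
# is the zero matrix at the origin.
print(classify_critical_point(0, 0, 0))
```

The three quadratic cases give index 0, 1 and 2, matching the minimum, saddle and maximum. The last call reports a degenerate point because the Hessian of the monkey saddle, taken here as the graph of $x^3 - 3xy^2$, is singular at the origin.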

The monkey saddle may be specified as the graph of $f(x_1, x_2) = x_1^3 - 3 x_1 x_2^2$, which is the real part of $(x_1 + i x_2)^3$. The only critical point is $(0, 0)$. The matrix of second derivatives is

$H = \begin{pmatrix} 6 x_1 & -6 x_2 \\ -6 x_2 & -6 x_1 \end{pmatrix},$

which is zero at 0. As we go around a circle centered at the origin, the function has three peaks at the angles $0$, $2\pi/3$, and $4\pi/3$, and three pits at $\pi/3$, $\pi$, and $5\pi/3$.

Figure VI.4: Monkey saddle with degenerate critical point.

All critical points in the above examples are isolated, but there are others that are not. For example, if we lay down the torus on its side, the height function has a circle of minima and another circle of maxima. Similarly, for $f(x_1, x_2) = x_1^2$ the entire $x_2$-axis is critical, but none of its points are isolated.

Euler characteristic. Let $M$ be a compact and smooth manifold without boundary and $f: M \to \mathbb{R}$ a Morse function. Let $c_i$ be the number of critical points of index $i$. We will see in Section VI.2 that we can construct an $i$-cell for each index-$i$ critical point so that $M$ can be constructed by successive attachment of these cells. As always, the Euler characteristic is the alternating sum of cells, which is also the alternating sum of critical points,

$\chi = \sum_i (-1)^i c_i,$

no matter what Morse function we use. For example for the torus we get $c_0 - c_1 + c_2 = 0$. In words, for every minimum and maximum we get exactly one (non-degenerate) saddle point. For the sphere we get $c_0 - c_1 + c_2 = 2$. This implies that every Morse function of the sphere has at least two (non-degenerate) critical points. A minimum example is the ordinary height function, which has a minimum at the south-pole and a maximum at the north-pole.

Bibliographic notes. The original development of Morse theory from its variational background is described by Morse [3] and by Seifert and Threlfall [4]. Milnor's later book [2] emphasizes the topological analysis of manifolds and has since become a standard reference in Morse theory. Good introductory texts to the related subject of differential topology are the books by Guillemin and Pollack [1] and by Wallace [6]. A good introduction to linear algebra including an intuitive discussion of eigenvalues and eigenvectors is the book by Strang [5].

[1] V. Guillemin and A. Pollack. Differential Topology. Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
[2] J. Milnor. Morse Theory. Princeton Univ. Press, New Jersey, 1963.
[3] M. Morse. The Calculus of Variations in the Large. Amer. Math. Soc., New York, 1934.
[4] H. Seifert and W. Threlfall. Variationsrechnung im Großen. Published in the United States by Chelsea, New York, 1951.
[5] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
[6] A. Wallace. Differential Topology. First Steps. Benjamin, New York, 1968.
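Returning to the Euler characteristic paragraph of this section, the alternating sum can be checked by hand on the two examples discussed there. The worked equations below use $c_i$ for the number of critical points of index $i$; the counts for the torus are those of the upright height function in Figure VI.1, and the counts for the sphere are those of the ordinary height function.

```latex
\begin{align*}
  \text{torus:}  \quad & c_0 = 1,\; c_1 = 2,\; c_2 = 1
      & \chi &= 1 - 2 + 1 = 0, \\
  \text{sphere:} \quad & c_0 = 1,\; c_1 = 0,\; c_2 = 1
      & \chi &= 1 - 0 + 1 = 2.
\end{align*}
```

Any other Morse function on the sphere must keep the sum at 2, so adding a second maximum, for instance, forces one additional saddle: $1 - 1 + 2 = 2$.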

VI.2 Morse-Smale Complexes

In this section, we introduce the gradient of a Morse function and use it to construct the $k$-cells whose inductive attachment reproduces the evolution of the homotopy type of $M_a$, for continuously increasing real threshold $a$.

Gradient flow. The gradient of a linear map $f(x) = \langle u, x \rangle$ is the vector $u$. It is the projection of a normal vector of the graph of $f$ and points in the direction of the steepest ascent. The same concept can also be defined for a Morse function $f: M \to \mathbb{R}$. Assuming an orthonormal local coordinate system at $x$, the gradient of $f$ is

$\nabla f = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_m} \right).$

We can define it also without reference to a coordinate system. A vector field, $X$, maps every point $x \in M$ to a tangent vector $X(x) \in TM_x$. The gradient is the particular vector field that satisfies $\langle X, \nabla f \rangle = X[f]$ for every vector field $X$, where $X[f]$ is the directional derivative of $f$ along $X$. For example, if we have a smooth curve $\gamma$ with velocity vector $\dot{\gamma}$ then the derivative of $f \circ \gamma$ can be computed using the gradient as $\langle \dot{\gamma}, \nabla f \rangle$, same as for linear maps. The gradient vanishes precisely at all critical points of $f$.

If we start at a regular point and follow the gradient we trace out a path. This path is called an integral line, which is a solution to the ordinary differential equation $\dot{\gamma}(s) = \nabla f(\gamma(s))$. It depends smoothly on the initial condition, which is its regular starting point. Every regular point belongs to an integral line, and two maximal integral lines are either disjoint or the same. Two integral lines can therefore not cross. Neither can an integral line fork, and because we can reverse the gradient vector field by considering $-f$, two integral lines can also not merge. Every maximal integral line is open at both ends and thus a map of an open interval or, equivalently, of the real line. It approaches two critical points, which we refer to as its origin and destination. It is convenient to consider each critical point as an integral line by itself so that the collection of integral lines partitions $M$. The patterns of integral lines in the neighborhoods of a regular and several critical points on a smooth 2-manifold are shown in Figure VI.5.

Figure VI.5: From left to right, the flow in the neighborhoods of a regular point, a minimum, a saddle, and a maximum.

Stable manifolds. The stable manifold of a critical point $x$ is the union of integral lines with destination $x$ and, symmetrically, the unstable manifold is the union of integral lines with origin $x$. In a 2-manifold, the stable manifold of a minimum is the minimum itself. The stable manifold of a saddle is an open curve, which is the union of two integral lines and the saddle itself. The stable manifold of a maximum is an open disk, which is the union of a circle of integral lines and the maximum itself. All three cases are illustrated in Figure VI.6.

Figure VI.6: From left to right, the stable manifold of a minimum, a saddle, and a maximum of a two-dimensional Morse function.

Note that the dimension of each stable manifold is the index of the critical point that defines it. Each stable manifold is the injective image of an open ball. However, the closure of a stable manifold is not necessarily homeomorphic to a closed ball, as indicated by the examples in Figure VI.6. Nevertheless, the closure of each stable manifold is the union of (open) stable manifolds. The collection of stable manifolds thus satisfies the two conditions of an open complex: its cells partition $M$ and the boundary of every cell is a union of other cells. By symmetry, everything we said about stable manifolds is also true for unstable manifolds. The dimension of the unstable manifold of a critical point is the co-dimension of the stable manifold.
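An integral line can be approximated numerically by following the gradient in small steps. The sketch below is a simplification under stated assumptions: the function lives on the plane rather than on a general manifold, the gradient is estimated by central differences, and the line is traced with explicit Euler steps. The names `gradient` and `trace_integral_line` and all step sizes are hypothetical choices for illustration. The example function $\sin x \sin y$ has a maximum at $(\pi/2, \pi/2)$ whose stable manifold contains the open square $(0, \pi) \times (0, \pi)$.

```python
import math

def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of the gradient of f at (x, y)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def trace_integral_line(f, x, y, step=0.01, tol=1e-6, max_steps=100000):
    """Follow the gradient from a regular point until it nearly vanishes."""
    path = [(x, y)]
    for _ in range(max_steps):
        gx, gy = gradient(f, x, y)
        if math.hypot(gx, gy) < tol:       # close to the destination
            break
        x, y = x + step * gx, y + step * gy
        path.append((x, y))
    return path

def f(x, y):
    return math.sin(x) * math.sin(y)

path = trace_integral_line(f, 1.0, 0.5)
print(len(path), path[-1])   # the end point is close to (pi/2, pi/2)
```

Replacing `f` by its negative reverses the flow, so the same routine traces the line toward its origin, which is how the unstable manifolds would be sampled in this setting.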

Morse-Smale functions. We may refine the complexes of stable and unstable manifolds by forming unions of integral lines that agree on both limiting critical points. This amounts to overlaying the two complexes. In doing so, it is convenient to assume that the stable and unstable manifolds intersect in a generic manner. To explain what this means, we consider a point $x$ common to a stable manifold $S$ and an unstable manifold $U$. The intersection is transversal at $x$ if the tangent spaces $TS_x$ and $TU_x$ span the tangent space $TM_x$. Equivalently, the dimension of the intersection of the two tangent spaces is $\dim S + \dim U - m$. A Morse-Smale function is a Morse function whose stable and unstable manifolds intersect only transversally. For example, the height function of the upright torus in Figure VI.1 is Morse but not Morse-Smale because the stable 1-manifold of the upper saddle meets the unstable 1-manifold of the lower saddle along entire one-dimensional integral lines. In the case of the upright torus, it suffices to tilt it ever so slightly sideways in order to get transversality. Morse-Smale functions are again dense in the set of maps from $M$ to $\mathbb{R}$.

Assuming a Morse-Smale function, we define the Morse-Smale complex as the collection of connected components of intersections of stable and unstable manifolds. We can see in Figure VI.7 that it is indeed necessary to take components. The two bold 2-cells share the same origin and destination.

Figure VI.7: Solid stable and dashed unstable 1-manifolds with overlaid dotted iso-lines of a rectangular portion of a Morse-Smale function.

Shape of Morse-Smale cells. Note that all 2-cells in Figure VI.7 have four sides.

QUADRANGLE LEMMA. Every 2-cell of a two-dimensional Morse-Smale complex is a quadrangle.

PROOF. The vertices of a 2-cell alternate between saddles and other critical points, and the non-saddles alternate between minima and maxima. Any such cyclic sequence has length $4k$, for $k \geq 1$. We take two copies of a $4k$-gon and glue them together along the shared boundary. Saddles become regular points, minima remain minima, and maxima remain maxima. The result is a topological 2-sphere with $k$ minima and $k$ maxima. The Euler characteristic of the 2-sphere is $2 = k + k$, which implies $k = 1$.

In other words, all two-dimensional Morse-Smale cells are quadrangles, provided we count an arc twice if it bounds the cell on both sides.

The 3-cells of a Morse-Smale complex may have the structure of a cube, but they can also assume more general shapes with arbitrarily many saddles alternating between index-1 and index-2 separating the minimum from the maximum. A few examples of 3-cells are shown in Figure VI.8. The common features of all 3-cells are that they have one minimum and one maximum, and all 2-cells in the boundary are quadrangles.

Figure VI.8: Three 3-cells of a three-dimensional Morse-Smale complex. From left to right they have one, two, and three index-1 saddles and the same number of index-2 saddles.

Piecewise linear height functions. Height functions over manifolds occur in many practical problems, but they are never smooth in the mathematical sense of the word. An example is a surface of a molecule model and the electrostatic potential on this surface. The surface would typically be given as a triangulating simplicial complex $K$, and the function would be specified by its values at the vertices. Using linear interpolation, we can extend these values to a continuous function over the entire surface, as shown in Figure VI.9. We need some definitions to explain the linear interpolation. Each point $x$ of a triangle $abc$ is a convex combination of the three vertices, $x = \alpha a + \beta b + \gamma c$ with $\alpha + \beta + \gamma = 1$ and $\alpha, \beta, \gamma \geq 0$.

and . J. and adding the lower star of a critical point is similar to attaching a cell in the smooth case. Lower stars. The values computed for within the two triangles that share thus agree. Z OMORODIAN . saddle. The sequence of complexes ¤ Figure VI. It still shares many characteristics with Morse functions. Define the star of a vertex as the collection of simplices that contain . F. . . The height function is continuous but not smooth. It is convenient to assume pairwise different height values at all vertices so that each simplex belongs to exactly one lower star. 245–256. With this assumption.2 Morse-Smale Complexes 95 times between lower and higher values of as a -fold saddle. minima. minimum. which implies that is continuous. maximum. Furthermore. The Morse-Smale complex has been introduced recently in [2] along with algorithms for piecewise linear height functions over 2manifolds. for points along the edge we have . we may consider a vertex whose circle of neighbors alternates ¦ ¡ ¦  ¤   £ ¤ ¦  ¤  ¡ ¢   ¤  ¤     Note that the barycentric coordinates of the vertex of are and . Instead. The alternating sum of simplices in the lower stars of a regular point. It follows immediately that is the number of minima and maxima minus the number of saddles counted with multiplicity.  Another similarity between smooth and piecewise linear height functions arises when we sweep in the direcfor tion of increasing height. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds.10: The star of every vertex in the triangulation of a 2-manifold is an open disk.VI. The shaded portions are lower stars. §    ¢      ¨ § ¢    and . BANCHOFF . saddles. Figure VI. Critical points and curvature for embedded polyhedra. E DELSBRUNNER . This interpretation is consistent with the result that regular minimum saddle maximum Figure VI. [2] H. Indexing the vertices accordingly. 
The gradient and related concepts from vector calculus are intuitively described in the booklet by Schey [3]. . and we cannot remove them just by perturbing the height values. J. Assuming all . to appear. the alternating sum of critical points is equal to the Euler characteristic of . ¥ ¥    ¥     ¥    ¦ ¥   ¤¤  R ¤ ¡ $ R      ¨ ¤ § §    ¦ ¥ ¡  ¦    $  ¡    R   § ¢ ¢ ¤ ¢ ¢   ¡R        §  ¤ §      ¨ ¤ ¢  ¢ B  ¡ 0   $   B $ 0  B ¡ ¨ ¡ ¢  ¡ ¡ ¤ ¨ ¡       0       B     #B ¤         $ 0 $  0        ¡R ¡¢  ¡ ¢    ¡R    . Adding the lower star of a regular point does not change the homotopy type of . we define as the the union of the first lower stars and note that is a simplicial complex. Bibliographic notes. the lower stars partition the complex . More complicated lower stars are possible. The three parameters are unique and referred to as the barycentric coordinates of . The idea of writing a triangulated manifold as the disjoint union of lower stars goes back to Banchoff [1]. which implies that the linearly interpolated agrees with the value specified at . and -fold saddle are . [1] T. Geom. and the lower star as the subset for which is the highest vertex.10 illustrates the definitions by showing the lower stars of vertices that behave like regular points. Differential Geometry 1 (1967). The value at is now defined as the analogous combination of values at the vertices.9: Portion of a triangulated surface of a molecule. we sort the vertices in the order of increasing height. is a filtration and a discrete version of the evolution of during the sweep.. and maxima. Discrete Comput. The transversality condition for stable and unstable manifolds has its origin in dynamical system and is named after Steve Smale [4]. H ARER AND A.

[3] H. M. SCHEY. Div, Grad, Curl and All That. An Informal Text on Vector Calculus. Second edition, Norton, New York, 1992.
[4] S. SMALE. The Mathematics of Time. Essays on Dynamical Systems, Economic Processes, and Related Topics. Springer-Verlag, New York, 1980.
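The lower-star bookkeeping of this section can be sketched in a few lines. The following is a minimal illustration, not the algorithm of [2]: it assigns every simplex of a triangulated 2-sphere to the lower star of its highest vertex and forms the alternating sum per vertex. The octahedron, its vertex numbering, and the height values are assumptions chosen for illustration; they give two minima, two saddles, and two maxima.

```python
from itertools import combinations

def lower_star_sums(triangles, height):
    """Alternating sum (vertices - edges + triangles) of each lower star.

    Assumes pairwise different heights, so that every simplex belongs to
    exactly one lower star, namely that of its highest vertex.
    """
    edges = {frozenset(e) for t in triangles for e in combinations(t, 2)}
    sums = {v: 1 for v in height}              # each vertex counts itself
    for e in edges:
        sums[max(e, key=height.get)] -= 1      # edge goes to its highest vertex
    for t in triangles:
        sums[max(t, key=height.get)] += 1      # triangle goes to its highest vertex
    return sums

# Hypothetical example: an octahedron with poles 0 and 5 and equator 1, 2, 3, 4.
OCTAHEDRON = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),
              (5, 1, 2), (5, 2, 3), (5, 3, 4), (5, 4, 1)]
HEIGHT = {0: 5.0, 1: 0.0, 2: 2.0, 3: 0.5, 4: 2.5, 5: 3.0}

sums = lower_star_sums(OCTAHEDRON, HEIGHT)
euler = sum(sums.values())   # minima + maxima - saddles
```

Vertices with sum $1$ are minima or maxima, vertices with sum $-1$ are simple saddles, and the total agrees with the Euler characteristic of the 2-sphere.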

VI.3 Construction and Simplification

[Explain the sweep construction for two-dimensional Morse-Smale complexes using the simulation of differentiability.] [The most important part of the algorithm is maybe the handle slide, which is the only restructuring operation necessary to go between different complexes.] [That operation has been used in early work on Morse theory, maybe the first time by Smale(?).] [Build a hierarchy through prioritized cancellation.] [We can describe the cancellation as a combinatorial restructuring operation and we only need this one to go up the hierarchy.] [Again, there should be reference to the early mathematics literature on the topic of cancellation.]

Bibliographic notes.

[1] C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Visualization of scalar topology for structural enhancement. In “Proc. 9th Ann. IEEE Conf. Visualization, 1998”, 18–23.
[2] H. EDELSBRUNNER, J. HARER AND A. ZOMORODIAN. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.
[3] H. EDELSBRUNNER, J. HARER, V. NATARAJAN AND V. PASCUCCI. Hierarchy of Morse-Smale complexes for piecewise linear 3-manifolds. Manuscript, Dept. Comput. Sci., Duke Univ., Durham, North Carolina, 2001.
[4] M. VAN KREVELD, R. VAN OOSTRUM, C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Contour trees and small seed sets for iso-surface traversal. In “Proc. 13th Ann. Sympos. Comput. Geom., 1997”, 212–220.

VI.4 Simultaneous Critical Points

[Explain the work with John on the topic and mention papers by Hassler Whitney and books in Catastrophe Theory.]

Bibliographic notes.

[1] V. I. ARNOL’D. Catastrophe Theory. Third edition, Springer-Verlag, Berlin, Germany, 1992.
[2] H. EDELSBRUNNER AND J. HARER. Jacobian submanifolds of multiple Morse functions. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[3] T. POSTON AND I. STEWART. Catastrophe Theory and Its Applications. Dover, Mineola, New York, 1978.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let be a triangulation of a set of points in the plane. Let be a line that avoids all points. Prove that intersects at most edges of and that this upper bound is tight for every .


Chapter VII

Match and Fit

As a general theme in biology, questions are almost always about populations and rarely about individuals. This is particularly true on the molecular level. The molecules that participate in the mechanism of life tend to be large and composed of small molecules. Minor variations in the type or arrangement of the components are frequently inessential and do not alter the role of a molecule within the larger organization. But then again, there are seemingly small variations that do have significant consequences. The underlying question is one of definition: when do we call two molecules the same or of the same type, and how do we quantify and assess that notion of sameness. There are various approaches to the question applied to proteins, including the comparison of amino acid sequences, space curves modeling backbones, and shapes formed by space-filling diagrams.

Instead of asking how similar two shapes are, we may also ask the related question of how well two shapes fit side by side. It really makes sense only for space-filling diagrams and does not seem to apply to information expressed in terms of sequences and space curves. The complementarity question is a similarity question between one shape and (a portion of) the complement of another shape. The similarity question is at the core of human understanding, which crucially relies on classification to simplify and create order. The complementarity question, on the other hand, is at the root of natural and other re-production processes and it takes part in protein interaction, which forms the basis of functioning life.

As always in this book, we focus on mathematical and algorithmic methods that shed light on the broader biological issues. In Section VII.1, we explore rigid motions in three-dimensional Euclidean space and introduce quaternions as a tool to specify and compute with rotations. In Section VII.2, we study the problem of finding the best rigid motion for matching one point set with another. The measure of choice is the root mean square distance between the two sets. In Section VII.3, we look at the related problems of sampling a rigid motion and of covering the space of such motions with small neighborhoods. In Section VII.4, we apply the methods to questions of similarity and complementarity. In particular, we look at the problem of identifying matching subsequences with minimum root mean square distance and at score functions that assess the shape complementarity of two space-filling diagrams.

VII.1 Rigid Motions
VII.2 Optimum Motion
VII.3 Sampling and Covering
VII.4 Alignment
Exercises

VII.1 Rigid Motions

A motion in three-dimensional Euclidean space can be decomposed into a rotation and a translation. In this section we consider different ways to mathematically represent rotations, and we focus on quaternions, which provide a particularly elegant mathematical framework.

Rotation and translation. A rigid motion in $\mathbb{R}^3$ is an orientation-preserving isometry of three-dimensional Euclidean space. More formally, it is a map $\mu: \mathbb{R}^3 \to \mathbb{R}^3$ that preserves orientation and satisfies $\|\mu(x) - \mu(y)\| = \|x - y\|$ for every pair $x, y \in \mathbb{R}^3$. As illustrated in Figure VII.1, a rotation is a rigid motion that preserves the origin, and a translation is a rigid motion that preserves difference vectors. Every rigid motion can be written as the composition of a rotation and a translation. Using matrix notation, we can write $\mu(x) = Rx + t$, where $R$ is an orthonormal 3-by-3 matrix with unit determinant and $t$ is a 3-vector. The rotation matrix moves the unit coordinate vectors to the vectors $r_1$, $r_2$, and $r_3$ that make up the columns of $R$. The composition of any two rotations is another rotation. In other words, the rotations form the so-called special orthogonal group of 3-by-3 matrices, abbreviated as $SO(3)$. Note, however, that this group is not abelian because the multiplication of matrices and therefore the composition of rotations is not commutative.

Figure VII.1: The translation of the boldface original coordinate system preserves the directions of the axes while the rotation preserves their anchor point. The axes are labeled $x_1$, $x_2$, $x_3$.

The angle of rotation about a coordinate axis is referred to as an Euler angle. A rotation about a coordinate axis has a comparatively simple rotation matrix. For example, rotating by an angle $\varphi$ about the $x_1$-axis gives

$$ R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{pmatrix}. $$

Leonhard Euler proved that any rotation can be obtained by a sequence of three rotations about coordinate axes. It is important to specify the Euler angles in a fixed sequence as other sequences of the same angles usually specify different rotations. It is mostly true that two different triplets of angles specify different rotations, but there are exceptions. Consider for example a sequence of three rotations in which the middle rotation aligns the axis of the first rotation with the axis of the third. The composite rotation then depends only on a combination of the first and the third angle, so different triplets give the same rotation. In other words, the map $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1 \to SO(3)$ is not injective. This suggests that the Cartesian product of three circles is not an appropriate model and we will indeed see shortly that $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1$ is not homeomorphic to the space of rotations.

Quaternions. As an alternative to orthonormal 3-by-3 matrices, we may use quaternions to represent rotations. Quaternions can be viewed as a generalization of complex numbers:

$$ q = q_0 + q_1 I + q_2 J + q_3 K, $$

where $q_0$, $q_1$, $q_2$, and $q_3$ are real numbers and I, J and K are three different imaginary units. In preparation of an operation that multiplies two quaternions, we specify how to multiply the imaginary units:

$$ I I = J J = K K = -1, \quad I J = K = -J I, \quad J K = I = -K J, \quad K I = J = -I K. $$

Note that reversing two different imaginary units changes the sign of the result. If $p = p_0 + p_1 I + p_2 J + p_3 K$ is another quaternion then the product of $p$ and $q$ is

$$ pq = (p_0 q_0 - p_1 q_1 - p_2 q_2 - p_3 q_3) + (p_0 q_1 + p_1 q_0 + p_2 q_3 - p_3 q_2) I + (p_0 q_2 - p_1 q_3 + p_2 q_0 + p_3 q_1) J + (p_0 q_3 + p_1 q_2 - p_2 q_1 + p_3 q_0) K. $$

The product $qp$ has a similar form but six of the terms have their signs changed. Sometimes it is more convenient to think of a quaternion as a vector in $\mathbb{R}^4$. We can express the product of two quaternions in terms of an orthogonal 4-by-4 matrix and a vector. This can be done by expanding either the first or the second quaternion to a matrix, $pq = Pq = \bar{Q}p$, with

$$ P = \begin{pmatrix} p_0 & -p_1 & -p_2 & -p_3 \\ p_1 & p_0 & -p_3 & p_2 \\ p_2 & p_3 & p_0 & -p_1 \\ p_3 & -p_2 & p_1 & p_0 \end{pmatrix}, \qquad \bar{Q} = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & q_3 & -q_2 \\ q_2 & -q_3 & q_0 & q_1 \\ q_3 & q_2 & -q_1 & q_0 \end{pmatrix}. $$

Writing $Q$ for the matrix of multiplication with $q$ from the left, $\bar{Q}$ differs from $Q$ by having the lower right 3-by-3 submatrix transposed. Take a moment to verify that the matrices $Q$ and $\bar{Q}$ are indeed orthogonal: the products with their transposes are diagonal, $Q Q^T = \bar{Q} \bar{Q}^T = \langle q, q \rangle E$, where $E$ is the 4-by-4 identity matrix.

We start with a few properties. As usual, the scalar product is a real number: $\langle p, q \rangle = p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3$. We can use the scalar product to define the length of a vector: $\|q\| = \langle q, q \rangle^{1/2}$. Similar to complex numbers, the conjugate of a quaternion is obtained by negating the imaginary parts: $\bar{q} = q_0 - q_1 I - q_2 J - q_3 K$. Observe that the matrices associated with $\bar{q}$ are the transposes of those associated with $q$. Furthermore, the imaginary parts vanish when we multiply a quaternion with its conjugate: $q \bar{q} = \langle q, q \rangle$. This implies that every non-zero quaternion has an inverse, namely $q^{-1} = \bar{q} / \langle q, q \rangle$. In the special case when $q$ has unit length, we get $q^{-1} = \bar{q}$. Since the matrices are orthogonal, the scalar product is preserved if we multiply with a unit quaternion $q$. This is true from either side and we show it for multiplication from the left:

$$ \langle qp, qr \rangle = (Qp)^T (Qr) = p^T Q^T Q r = \langle q, q \rangle \langle p, r \rangle = \langle p, r \rangle. $$

This implies in particular that multiplying with $q$ also preserves length: $\|qp\| = \|p\|$. Hence, multiplication with a unit quaternion neither changes the angle nor the length.

Representing rotations. We use purely imaginary quaternions to represent vectors in $\mathbb{R}^3$ and compound multiplication with unit quaternions to represent rotations. While the product of two quaternions is another quaternion, we cannot use simple multiplications to represent rotations because the product of a unit quaternion and a purely imaginary quaternion is not in general purely imaginary. Instead, we use the composite product $r' = q r \bar{q}$. Notice that $q r \bar{q} = \bar{Q}^T Q r$, where $Q$ and $\bar{Q}$ are the 4-by-4 matrices that correspond to $q$. We expand the product of the two matrices in Table VII.1 and see that $r'$ is purely imaginary whenever $r$ is, as required. Since $q$ has unit length, both $Q$ and $\bar{Q}$ are orthonormal. It follows that the lower right 3-by-3 submatrix of $\bar{Q}^T Q$ is also orthonormal. This 3-by-3 matrix is the familiar rotation matrix $R$ that takes $r$ to $r'$. The expansion of $\bar{Q}^T Q$ given in Table VII.1 provides an explicit method for computing the orthonormal rotation matrix from the unit quaternion.

Table VII.1: Product of matrices in the representation of a rotation by composite multiplication with unit quaternions.

$$ \bar{Q}^T Q = \begin{pmatrix} \langle q, q \rangle & 0 & 0 & 0 \\ 0 & q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 0 & 2(q_2 q_1 + q_0 q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2 q_3 - q_0 q_1) \\ 0 & 2(q_3 q_1 - q_0 q_2) & 2(q_3 q_2 + q_0 q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2 \end{pmatrix} $$

The justification for $q r \bar{q}$ to represent a rotation is not yet complete. Another possibility is that it represents a reflection, which also preserves scalar products. However, a reflection reverses the orientation of a sequence of three vectors, and we can check that composite multiplication does not. To do this, we think of a quaternion as composed of a scalar and a vector, $q = (q_0, \mathbf{q})$. The rules for computing $pq$ can be rewritten as

$$ pq = (p_0 q_0 - \langle \mathbf{p}, \mathbf{q} \rangle, \; p_0 \mathbf{q} + q_0 \mathbf{p} + \mathbf{p} \times \mathbf{q}). $$

When $p$ and $r$ are purely imaginary then these results simplify to the scalar part $-\langle \mathbf{p}, \mathbf{r} \rangle$ and the vector part $\mathbf{p} \times \mathbf{r}$. If we now apply the composite product with a unit quaternion $q$, we get $p' r' = (q p \bar{q})(q r \bar{q}) = q (pr) \bar{q}$ and therefore $\langle \mathbf{p}', \mathbf{r}' \rangle = \langle \mathbf{p}, \mathbf{r} \rangle$ and $\mathbf{p}' \times \mathbf{r}' = (\mathbf{p} \times \mathbf{r})'$, which shows that the composite product preserves cross-products.

Axis and angle. We show that the rotation by an angle $\theta$ about the axis defined by the unit vector $u$ can be represented by the unit quaternion

$$ q = \cos\frac{\theta}{2} + \sin\frac{\theta}{2} \, (u_1 I + u_2 J + u_3 K), $$

always assuming $\|u\| = 1$. In the reverse direction, the imaginary part of $q$ gives the direction of the rotation axis, and the real part determines the angle of the rotation. As illustrated in Figure VII.2, an observer who looks against the direction of the axis sees the vector rotate in a counterclockwise order.

Figure VII.2: The rotation of the vector $r$ by an angle of $\theta$ about the line spanned by $u$. The three dotted vectors correspond to the terms in the formula of Rodrigues.

To prove the claimed correspondence, we write the vector rotated by $\theta$ about the axis defined by $u$ using the formula of Rodrigues,

$$ r' = \cos\theta \; r + \sin\theta \; (u \times r) + (1 - \cos\theta) \langle u, r \rangle u, $$

which can be seen from Figure VII.2. We show that this can also be written in the form $r' = q r \bar{q}$, with $q = (q_0, \mathbf{q})$ as given above. Tedious but straightforward calculations show

$$ q r \bar{q} = (q_0^2 - \langle \mathbf{q}, \mathbf{q} \rangle) \, r + 2 q_0 (\mathbf{q} \times r) + 2 \langle \mathbf{q}, r \rangle \mathbf{q}. $$

If we substitute $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2} \, u$ and use the identities $\sin\theta = 2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}$ and $\cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = 1 - 2\sin^2\frac{\theta}{2}$ then we obtain the formula of Rodrigues.

The above relationships provide a convenient conversion between unit quaternions and axis-angle pairs. Note that $-q$ represents the same rotation as $q$ and that non-antipodal pairs of unit quaternions represent different rotations. In other words, the unit sphere $\mathbb{S}^3$ in $\mathbb{R}^4$ is a double cover of the space of rotations in $\mathbb{R}^3$. The space obtained by identifying antipodal points of $\mathbb{S}^3$ is usually referred to as the real projective three-dimensional space, or $\mathbb{P}^3$ for short. It is a good model of the set of rotations in $\mathbb{R}^3$, although we usually prefer $\mathbb{S}^3$ because it is easier to imagine. Figure VII.3 illustrates the correspondence with a picture in one lower dimension.

Figure VII.3: The north- and south-poles correspond to the identity, and points on the equator correspond to rotations by $\pi$. The dashed great-circle through the two poles represents the set of rotations about a fixed axis.

Composing rotations. The composition of two rotations represented by the unit quaternions $p$ and $q$ is

$$ q (p r \bar{p}) \bar{q} = (qp) \, r \, (\bar{p} \bar{q}). $$

We have $\bar{p} \bar{q} = \overline{qp}$. Thus, composition of rotations corresponds to multiplication of quaternions, and from the product it is easy to again get the axis and the angle.

A more direct geometric description of the composition of two rotations uses the fact that every rotation can be written as the composition of two reflections. The axis of the corresponding rotation is the line common to the two planes, and the angle of rotation is twice the angle enclosed by the planes. The two planes defining the reflections are not unique, they just need to pass through the axis of rotation and enclose half the angle of rotation. To compose two rotations, we write each as the composition of two reflections, making sure that the second plane of the first rotation is also the first plane of the second rotation, as in Figure VII.4. The middle two reflections cancel and we are left with two reflections.
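The correspondence between unit quaternions, axis-angle pairs, and the formula of Rodrigues can be checked numerically. The following is a minimal sketch in plain Python, assuming quaternions are stored as 4-tuples $(q_0, q_1, q_2, q_3)$; the function names are chosen for illustration and are not taken from the text.

```python
import math

def qmul(p, q):
    """Product of two quaternions given as (real, I, J, K) coefficients."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0*q0 - p1*q1 - p2*q2 - p3*q3,
            p0*q1 + p1*q0 + p2*q3 - p3*q2,
            p0*q2 - p1*q3 + p2*q0 + p3*q1,
            p0*q3 + p1*q2 - p2*q1 + p3*q0)

def conj(q):
    """Conjugate: negate the three imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])

def from_axis_angle(u, theta):
    """Unit quaternion cos(theta/2) + sin(theta/2) u, for a unit vector u."""
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * u[0], s * u[1], s * u[2])

def rotate(q, r):
    """Composite product q r conj(q), with r read as a purely imaginary quaternion."""
    w = qmul(qmul(q, (0.0,) + tuple(r)), conj(q))
    return w[1:]          # the real part vanishes up to rounding

def rodrigues(u, theta, r):
    """The formula of Rodrigues for rotating r by theta about the unit axis u."""
    c, s = math.cos(theta), math.sin(theta)
    dot = sum(ui * ri for ui, ri in zip(u, r))
    cross = (u[1]*r[2] - u[2]*r[1], u[2]*r[0] - u[0]*r[2], u[0]*r[1] - u[1]*r[0])
    return tuple(c * ri + s * xi + (1.0 - c) * dot * ui
                 for ri, xi, ui in zip(r, cross, u))

# Usage: a quarter turn about the x3-axis, and the composition of two rotations.
p = from_axis_angle((0.0, 0.0, 1.0), math.pi / 2.0)
q = from_axis_angle((1.0, 0.0, 0.0), 0.7)
r = (1.0, 2.0, 3.0)
one_after_other = rotate(q, rotate(p, r))   # first p, then q
single_product = rotate(qmul(q, p), r)      # rotation represented by the product qp
```

Applying `p` and then `q` gives the same vector as applying the single quaternion `qmul(q, p)`, and negating all four coefficients of a quaternion leaves the rotation unchanged, in line with the double cover described above.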

Figure VII.4: We see three rotations defined by the axis-angle pairs $(u, \varphi)$, $(v, \psi)$, and $(w, \varrho)$. Each rotation is the composition of two reflections illustrated by the great-circles at which their planes meet the sphere.

Bibliographic notes. It is commonly acknowledged that quaternions have been discovered by Hamilton in 1844 [1]. It is less well known that a few years earlier, Rodrigues studied the composition of rotations in space and gave a purely geometric explanation that is equivalent to Hamilton’s algebra [5]. Even earlier, Gauss recorded his discovery of quaternions in his unpublished notebook in 1819. The exposition of quaternions and their connection to rotations chosen for this section follows [2]. We recommend the primer by Kuipers [3] for background on rotations and the text by Needham [4] for background on the more general context provided by complex analysis.

[1] W. R. HAMILTON. On a new species of imaginary quantities connected with the theory of quaternions. Proc. Irish Acad. 2 (1844), 424–434.
[2] B. K. P. HORN. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[3] J. B. KUIPERS. Quaternions and Rotation Sequences. Princeton Univ. Press, New Jersey, 1999.
[4] T. NEEDHAM. Visual Complex Analysis. Clarendon Press, Oxford, England, 1997.
[5] O. RODRIGUES. Des lois géométriques qui régissent les déplacements d’un système solide dans l’espace, et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. J. Math. Pures Appl. 5 (1840), 380–440.

to . In other words. That minimum is characterized by a vanishing gradient: As mentioned earlier. . The space of rigid motions is therefore six-dimensional. Quite the opposite is true. The translation minimizes the sum iff the origin is the centroid of the points : ©   ©   #       ¡  # ©     Optimum translation. the centroid of is . namely . Suppose we are given two finite collections of points in and a bijection between them. the translation that minimizes the root mean square distance between and is defined by . We need some notation to make this precise. and the main reason for this is the convenience provided by quadratic functions. Recall that the centroid of a collection of points is the average of the points. is also the sum of square distances of the points from the origin. the motion can be optimal only if translates the centroid of to the centroid of . We are interested in finding the rigid motion that minimizes the root mean square distance between and . for each . the (solid) difference vectors all radiate out from the origin. we are interested in moving one collection so it best matches the other. as §  ¨ § ¥      §  §     ¡  ¥ ¡    £       §     p  §     p ¥ ¥  ¥  ¥      Note that minimizing the root mean square distance is equivalent to minimizing the sum of square distances. Indeed. We may therefore simplify our problem by translating and independently translating such that   This implies that the best translation moves claimed. We consider rotations and translations separately.106 VII M ATCH AND F IT point for which the sum of the vectors to the points in the collection vanishes:   VII. . is a quadratic function with a unique minimum. Then the sum of square distances between the correspond- ing points. Note that rotating and taking the centroid commute. we solve it using quaternions representing rotations in three-dimensional space. and it might seem that computing the particular rigid motion that minimizes would be hopeless or at least difficult. 
we may apply it to the first collection and recompute the root mean square distance. Problem specification.5. We are now ready to prove that the best translation is the one that moves to . We use the root mean square or RMS distance to assess how similar the two collections are. In other words. This operation is illustrated in Figure VII. Let and be the two collections and assume that corresponds to .5: After moving the shaded points to the origin. Since every rigid motion can be written as a rotation followed by a translation. Let us move every point to the and move the translated copy of with it origin of to . § ¨¢ Given a rigid motion . More formally.2 Optimum Motion In this section. While entertaining the possibility that the two collections are structurally the same or at least similar. the centroids of and are and . we study an optimization problem that arises when one attempts to match two molecular structures or to fit two structures snug next to each other. ¦ ¡ # Figure VII. We begin by showing that the best translation moves to . the latter sum vanishes iff .       § B     B !p   #B ¢  p     §  § ¡  £       ¥  ¥ ¥   #B ¡  £   ¡  ¡  §       ¥   ©B ¢ e    ¡  ¢ ¡  §        ¡ £ ¥ ¢ ¡ ©   ¥ £ ¦ £   §   £   ¥ ¢   ¦   ¡     p  §    p ¡   ¥ £     ¤  ¢  §   ¡ ¥   ¢ ¡ ¥    §  B § ¡ £ #  ¢ £¡ #  ¡¡ ¢   ¢ £¡ © © § ¡ ©B  ¥      £ #  ¤  ¢      © ¡ © ©  ¥ ¡ £ ¤¡ £   § § £   §       £  # ¡ . After formulating the optimization problem. A crucial insight used in proving this fact is that the centroid is the only Optimum rotation. This measure is the square root of the average square distance: This implies that minimizes the sum of square distances from the . Recall also that every rigid motion can be decomposed into a rotation followed by a translation.

the dashed lines represent the zero-set and the boldface curve represents the graph of the restriction of that function to . and this maximum is attained for . Hence where . Since multiplication with a unit quaternion preserves scalar products. If there is no bijection specified between the two sets then the problem of finding the best rigid motion seems significantly more difficult. we may assume . The sum that we have to maximize can now be rewritten as £   ¥  ! !   ! ¦ ¡    "  ¢    #! " "B   ¢         ¡    ¡ ¢ #   ¢     §¡ ¢ ¡ $  ¤ ¡ The sums of the and the are not affected by is equivalent to maxithe rotation. the eigenvalues are all real. Short of being able to draw the graph of this function in . we have . we could of course try all bijections.2 Optimum Motion both centroids lie at the origin.    £    ¢ ¢   ¡¢ © . we have four eigenvalues. The sum of the square distances after the rotation is 107 for which the quadratic function gives a maxpoint imum. Recall that the eigenvalues of a square matrix are the complex numbers for which the determinant of vanishes. so minimizing mizing the sum of the . . We can compute such a with a modest amount of linear algebra. Take a moment to verify that each matrix in this sum is symmetric. Using quaternions. which drops two of the dimensions. The corresponding eigenvectors are pairwise orthogonal and therefore span . we have . Equivalently. Since the sum of symmetric matrices is again symmetric. we can express the rotation of a point as . We can thus write any quaternion as a linear combination of the eigenvectors. It is convenient to order them as . we have . and because is symmetric. the partially dotted circle represents . the surface represents the graph of the quadratic function over . we illustrate the idea in Figure VII.1. Without bijection. we have . 
Recall from the previous section that       ¡  ¥ " ¢  £ ¢ © ¡ ¡   ¢  ¢  ¢  £ ¢ ¢   ¢ ¢  ©  ¡¢    ¥ ¦ ¢  ¡¢    ©¢ ¦ ¢  ¡¢   ¢ " ¢  p  ¦p  p    p  ¢ §¨ p  ¦p   ¢¥ ¦ ¢   ¡¢  ¥  p    p ¡ ¤  ¡ £     p  ¢£¥   ¦ ¢  ¡¢ p  ¡ ¡ £   ¢  "     ¢   ¢   ¢ £  ¢ £ ¢     ¡   £  ¡  £ ¢ £  ¢  ¤ ¤ ¥£ ¤ £ ¤  ¦ ¦  ¢     ¢ £  £  ¢     § §      §  ¥ ¥ ¥ ¥  ¢  ¡ ¡   § ¢ §  §   ¢        ©  ¦ ¢ £ ¥ ¥ ¥ ¥   ¡¢          ¢    ¢        § § © §    £ ¥ ¥ ¢  ¢ ©  ¡ ¥ ¥   ¢ ¡  ¥ ¡ ¢ §  §   ¢    §     ©      ¡    ¡ ¡ ¢  ¡ ¡ ¢      §  ¡          Figure VII. Eigenvalues and -vectors. A more effective algorithm alternates between improving the     ¡ B   "     B ¡     "   ¢         B ¡    B ¢ ¢   ¢©     ¥ ¢  £ ¢ The two matrices are skew symmetric as well as orthogonal. The corresponding eigenvectors are the unit vectors such that . and because we are only interested in unit quaternions. The corresponding quaternion is . but that would take a long time.6: The plane represents . In other words. the optimum rotation is defined by the unit eigenvector that corresponds to the largest eigenvalue.6. Letting . as explained in Section VII. Our goal is to find a By the assumed ordering of the eigenvalues.VII. We can interpret geometrically as a quadratic function over four-dimensional Euclidean space. where is a unit quaternion and is the pure imaginary quaternion that corresponds to . Assuming and contain points each.

A 4 (1987). Given a rotation.108 root mean square distance by changing the bijection and by changing the motion. S TRANG . K ABSCH . the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1]. [2] O. In com- § ¡       # © ¢ ¡        $©  §    © ©  ¡    # © © loop       identity. Mach. Opt. ROTATE returns the rotation that minimizes the mean square distance under this permutation. J. After each iteration. Sect. M C K AY. we may use a subroutine A SSOCIATE . Since there are only finitely many permutations. Bibliographic notes. H EBERT. IEEE Trans. We use three subroutines to describe the iterative algorithm. 629–642. This implies that no permutation is tried twice. if RMSD . 27–52. at other times by the fact that finding the best bijection is not entirely straightforward. the root mean square distance decreases. 827–828. Int. Soc. RMSD returns the root mean square distance. [5] G. endif forever. Amer. Wellesley. Robotics Res. D. For background on linear algebra and how to compute the eigenvalues and eigenvectors of a symmetric matrix. [4] W. recognition. In the algorithm. it follows that the algorithm halts. we follow the exposition of the solution given by Horn [3]. The problem of finding the rotation that minimizes the root mean square distance between two point sets with given bijection in has been studied in various fields. [1] P. Acta Crystallogr. FAUGERAS AND M. 239–256. A popular version of the above algorithm uses injections from to instead of bijections. we refer to Strang [5]. 5 (1986). VII M ATCH AND F IT puter vision. we replace M ATCH by A SSOCIATE and do the remaining operations as before. Note however that we neither have a polynomial bound on the number of iterations nor a guarantee that the algorithm finds the globally optimal solution. Anal. H ORN . J. 
Bibliographic notes. The problem of finding the rotation that minimizes the root mean square distance between two point sets with given bijection in $\mathbb{R}^3$ has been studied in various fields, including x-ray crystallography [4] and computer vision [2]. In this section, we follow the exposition of the solution given by Horn [3]. For background on linear algebra and how to compute the eigenvalues and eigenvectors of a symmetric matrix, we refer to Strang [5]. The algorithm that attempts to minimize the root mean square distance between two point sets without specified bijection has also been described in several fields. In computer vision, the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1].

[1] P. J. Besl and N. D. McKay. A method for registration of 3-D shapes. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-14 (1992), 239–256.
[2] O. D. Faugeras and M. Hebert. The representation, recognition, and locating of 3-D objects. Int. J. Robotics Res. 5 (1986), 27–52.
[3] B. K. P. Horn. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[4] W. Kabsch. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A 34 (1978), 827–828.
[5] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.

VII.3 Sampling and Covering

In this section, we study two questions on rigid motions, namely how to sample uniformly at random and how to cover the space of motions most economically. We treat translations and rotations separately and spend most of our time on the more complicated case of rotations.

The size of a sphere. We prepare the discussion of sampling rotations by measuring the unit 2-sphere and the unit 3-sphere. For $\mathbb{S}^2$ embedded in $\mathbb{R}^3$, we sweep a plane normal to the $x_3$-axis and compute the area by integrating infinitesimal slices. The perimeter of the circle in which the plane cuts the sphere is $2\pi \sin\varphi$, where $\varphi$ is the angle between the $x_3$-axis and the points of the circle. Hence,

$$\mathrm{Area}(\mathbb{S}^2) = \int_0^{\pi} 2\pi \sin\varphi \, d\varphi = \int_{-1}^{1} 2\pi \, dx_3 ,$$

which we get by substituting $x_3 = \cos\varphi$. The total area of the 2-sphere is therefore $4\pi$. But note that the derivation shows more, namely that the area of the slice between two parallel planes at a constant distance is the same for all such planes, as long as both intersect the sphere. This fact has been known already to Archimedes and is often expressed by saying that the axial projection from the sphere to an enclosing cylinder preserves area. This projection is illustrated in Figure VII.7.

Figure VII.7: Illustration of Archimedes' theorem implying that the sphere and the enclosing truncated cylinder have the same area.

We use the same method to compute the volume of $\mathbb{S}^3$ embedded in $\mathbb{R}^4$. Sweeping a three-dimensional hyperplane normal to the $x_4$-axis, we get the volume by integrating the infinitesimal slices. The area of a slice is $4\pi r^2$, with the square radius equal to $r^2 = 1 - x_4^2 = \sin^2\varphi$. Hence,

$$\mathrm{Vol}(\mathbb{S}^3) = \int_0^{\pi} 4\pi \sin^2\varphi \, d\varphi .$$

The total volume of the 3-sphere is therefore $2\pi^2$. Note also that Archimedes' theorem does not extend to the 3-sphere, at least not in the straightforward manner from sections between parallel planes to sections between parallel hyperplanes.

Uniform sampling. Archimedes' theorem can be used to pick a point uniformly at random on $\mathbb{S}^2$.

  Step 1. Pick $x_3$ uniformly at random in $[-1, 1]$.
  Step 2. Pick $\theta$ uniformly at random in $[0, 2\pi)$. Define $r = \sqrt{1 - x_3^2}$. Return $u = (r\cos\theta, r\sin\theta, x_3)$.

The method may be viewed as picking a point on the enclosing cylinder and projecting it back to the 2-sphere. We now extend this method to $\mathbb{S}^3$ and thus to an algorithm for picking a rotation uniformly at random. Think of $u$ as the axis of rotation, so we just need to pick the angle of rotation about this axis. It would not be correct to pick an angle uniformly at random since this would favor small dislocations. In other words, the quaternions near the identity would be more likely than those far away from the identity. To pick the angle correctly, we return to what we learned from the above volume computation. The angle of rotation about the axis is twice the angular distance $\varphi$ from the identity on $\mathbb{S}^3$. We need to pick $\varphi$ from $[0, \pi]$, not uniformly but from a density that favors angles near the middle of the interval. Specifically, the density is $\frac{2}{\pi}\sin^2\varphi$, which is the area of the slice normalized to have unit total integral. The corresponding distribution function is

$$F(\varphi) = \frac{2}{\pi}\int_0^{\varphi} \sin^2 t \, dt = \frac{\varphi - \sin\varphi\cos\varphi}{\pi},$$

which monotonically increases and reaches 1 at $\varphi = \pi$. To pick an angle, we pick a number uniformly at random in $[0, 1]$, and we compute its preimage under the distribution function. To get a point uniformly at random on $\mathbb{S}^3$, we append Steps 1 and 2 above with

  Step 3. Pick $x$ uniformly at random in $[0, 1]$. Let $\varphi = F^{-1}(x)$. Return $(\cos\varphi, u\sin\varphi)$.

We get a random rotation by using the returned point as a unit quaternion. Alternatively, we get a random rotation by using [...] and [...] as Euler angles.
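The three steps translate directly into code. The following Python sketch is one possible rendering, assuming the density $\frac{2}{\pi}\sin^2\varphi$ on $[0,\pi]$ for the angular distance from the identity; the preimage under the distribution function is found by bisection, which is a choice made here for simplicity, and the function names are hypothetical.

```python
import math
import random

def sample_sphere2(rng):
    """Steps 1 and 2: uniform height and uniform angle give a uniform point on the 2-sphere."""
    x3 = rng.uniform(-1.0, 1.0)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - x3 * x3)
    return (r * math.cos(theta), r * math.sin(theta), x3)

def distribution(phi):
    """Distribution function of the density (2/pi) sin^2(phi) on [0, pi]."""
    return (phi - math.sin(phi) * math.cos(phi)) / math.pi

def inverse_distribution(x, tol=1e-12):
    """Preimage of x in [0, 1] under the monotone distribution function, by bisection."""
    lo, hi = 0.0, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if distribution(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_rotation(rng):
    """Step 3: combine a random axis with a correctly distributed angle into a unit quaternion."""
    u = sample_sphere2(rng)
    phi = inverse_distribution(rng.random())
    s = math.sin(phi)
    return (math.cos(phi), s * u[0], s * u[1], s * u[2])
```

A call such as `sample_rotation(random.Random(0))` returns a quaternion of unit length; the rotation it represents turns by the angle $2\varphi$ about the axis $u$.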

Covering the spaces of translations and rotations. We turn our attention to selecting a collection of rigid motions such that every possible motion has a selected motion nearby. It is convenient to measure the distance between translations and between rotations using the Euclidean metric. We will later analyze how these notions of distance relate to the effect of the motion on the root mean square distance between two sets in $\mathbb{R}^3$. The idea of guaranteeing that every possible motion has a nearby selected motion can be expressed by covering the space of motions with neighborhoods. Let [...] and let [...] be a collection of closed balls, all of radius $r$. We call the collection a covering if the union of its balls contains the entire space, and we call $r$ the covering radius.

Consider first translations, which we represent by 3-vectors or points in $\mathbb{R}^3$. We need infinitely many balls because $\mathbb{R}^3$ has infinite volume, but we are usually only interested in bounded portions of space. We study three lattices of points in some detail. The cube lattice consists of all integer points, the FCC or face-centered cube lattice adds all centers of cube faces, and the BCC or body-centered cube lattice adds all cube centers to the cube lattice. Figure VII.8 shows the portion of each lattice inside a cube of unit side-length and Table VII.2 lists some of their pertinent properties.

Figure VII.8: From left to right: the cube, the FCC and the BCC lattices.

                        CUBE     FCC      BCC
  points per cube       1        4        2
  packing radius        0.500    0.353    0.433
  volume (fraction)     0.523    0.740    0.680
  covering radius       0.866    0.500    0.559
  volume (fraction)     2.720    2.094    1.463

Table VII.2: Numerical assessment of how well the cube, the FCC and the BCC lattices pack and cover.

By counting fractions, we note that the FCC lattice has four times and the BCC lattice has twice as many points as the cube lattice. The packing radius is the largest radius we can assign to the points to get non-overlapping balls, and the volume is the fraction of the space covered by the packed balls. The covering radius is the smallest radius we can assign and still have the balls cover $\mathbb{R}^3$, and the volume is the total volume of the balls divided by the volume of the space they inhabit. The points with maximum distance to the lattice points are the cube centers, the edge centers and the midpoints between the face and the edge centers. We see that the FCC lattice leads to an effective packing while the BCC lattice leads to an effective covering. Indeed, both are known to be the respective best packing and covering lattices. If we use the centers of the covering balls as selected translations, we are guaranteed that every translation has a selected translation at a distance at most $r$ from it.

As an exercise we may estimate the number of balls we need to cover the unit 3-sphere. Recall that its volume is $2\pi^2$. Assuming $r$ is very small, the volume of a ball with radius $r$ in $\mathbb{S}^3$ is about $\frac{4\pi}{3} r^3$. If we believe that we cannot cover more economically than the BCC lattice in $\mathbb{R}^3$, we can use a straightforward volume argument to show that we need at least about $1.463 \cdot 2\pi^2 / (\frac{4\pi}{3} r^3) \approx 6.9 / r^3$ balls to cover the 3-sphere.
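The entries of Table VII.2 follow from the packing and covering radii of the three lattices. The short Python sketch below recomputes the volume fractions and the volume bound for covering the 3-sphere; the closed-form radii are standard facts about these lattices, and the names `LATTICES`, `densities` and `balls_to_cover_3sphere` are hypothetical.

```python
import math

UNIT_BALL = 4.0 * math.pi / 3.0  # volume of the ball of radius 1 in three dimensions

# lattice name -> (points per unit cube, packing radius, covering radius)
LATTICES = {
    "CUBE": (1, 1.0 / 2.0, math.sqrt(3.0) / 2.0),
    "FCC": (4, math.sqrt(2.0) / 4.0, 1.0 / 2.0),
    "BCC": (2, math.sqrt(3.0) / 4.0, math.sqrt(5.0) / 4.0),
}

def densities(points, packing_radius, covering_radius):
    """Ball volume per unit cube for the packing and for the covering."""
    return (points * UNIT_BALL * packing_radius ** 3,
            points * UNIT_BALL * covering_radius ** 3)

def balls_to_cover_3sphere(r):
    """Volume bound, assuming no covering beats the BCC covering density."""
    _, bcc_cover = densities(*LATTICES["BCC"])
    return bcc_cover * 2.0 * math.pi ** 2 / (UNIT_BALL * r ** 3)

for name, params in LATTICES.items():
    pack, cover = densities(*params)
    print(f"{name:4s} packing {pack:.4f} covering {cover:.4f}")
```

The table in the text lists these values truncated to three decimals.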

Sensitivity to small translations. Next, we address how translations affect the root mean square distance between two point sets. As in Section VII.2, let $A$ and $B$ be two collections of $n$ points in $\mathbb{R}^3$, with a bijection that maps $a_i$ to $b_i$, for each $i$. Recall that the root mean square distance between $A$ and $B$ is the square root of the average square distance between corresponding points. To simplify the analysis, we assume that the centroids of the two collections are both at the origin: $\sum_i a_i = \sum_i b_i = 0$. This implies that the vectors $a_i - b_i$ add up to 0, implying that the sum of scalar products with any vector vanishes: $\sum_i \langle a_i - b_i, t \rangle = 0$. After translating $B$ along $t$, the root mean square distance is

$$\mathrm{RMSD}(t) = \sqrt{\frac{1}{n}\sum_i \|a_i - b_i - t\|^2} = \sqrt{\|t\|^2 + \frac{1}{n}\sum_i \|a_i - b_i\|^2} .$$

To measure how fast the root mean square distance changes with varying translation vector, we compute the gradient. For the purpose of computing the gradient and its length, we consider a function over $\mathbb{R}^3$, its gradient and the length of the gradient:

$$f(t) = \sqrt{\|t\|^2 + c^2}, \qquad \nabla f(t) = \frac{t}{f(t)}, \qquad \|\nabla f(t)\| = \frac{\|t\|}{\sqrt{\|t\|^2 + c^2}} ,$$

where $c^2 = \frac{1}{n}\sum_i \|a_i - b_i\|^2$. The length of the gradient never exceeds 1. We have $c = 0$ if and only if $a_i = b_i$ for all $i$; in this case $f$ is the norm function, whose gradient is defined everywhere except at $t = 0$ and has length 1. Figure VII.9 illustrates this result by comparing the graphs obtained for equal and for non-equal corresponding points.

Figure VII.9: The hyperboloid approaches the graph of the norm function at plus and minus infinity.

Since the length of the gradient never exceeds 1, the difference between the root mean square distances for two translations is bounded from above by the norm of the difference vector:

$$|\mathrm{RMSD}(s) - \mathrm{RMSD}(t)| \le \|s - t\| .$$

In words, the root mean square distance as a function over the three-dimensional space of translations satisfies a Lipschitz condition with constant 1.

Sensitivity to small rotations. We repeat the analysis for rotations. Let $q$ be a unit quaternion. As for translations, we assume that both centroids are at the origin, and we express the root mean square distance between $A$ and the rotated copy of $B$ in terms of [...], where [...] are the eigenvalues of the matrix defined in the previous section. Call the root mean square distance from the centroid the radius of gyration. Since we assume that the centroids are at the origin, the radii of gyration of $A$ and $B$ are

$$R_A = \sqrt{\frac{1}{n}\sum_i \|a_i\|^2} \quad \text{and} \quad R_B = \sqrt{\frac{1}{n}\sum_i \|b_i\|^2} .$$

Using the radii of gyration we simplify the expressions for [...]. Going back to the definition of the matrix, we observe that the eigenvalues are [...]. The effect of the rotation represented by $q$ is best viewed in the direction opposite to the rotation axis, where [...] is the radius of gyration of [...] projected into the plane [...]. It is geometrically obvious that the total distance increases the fastest when each point moves in a direction straight away from its corresponding point. This is possible in the limit and characterized by the velocity vector of each rotated point being parallel to [...]. In this case, the length of the gradient is maximized if [...] for all $i$. Since the length of the gradient never exceeds a constant determined by the radii of gyration, the difference between the root mean square distances for two rotations is no more than that multiple of the norm of the difference vector. We see that the rotations satisfy a Lipschitz condition that is similar to that for translations, except that the constant now depends on the collection of points, in particular on their radii of gyration.
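The Lipschitz bound for translations is easy to check numerically. The Python sketch below compares the root mean square distance after a translation with the closed form $\sqrt{\|t\|^2 + c^2}$ and measures the slack in the bound; it assumes both point sets have been moved so that their centroids are at the origin, and the helper names are hypothetical.

```python
import math
import random

def center(points):
    """Translate a point set so that its centroid is at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in points]

def rmsd_after_translation(A, B, t):
    """Root mean square distance between A and the copy of B translated along t."""
    total = sum(sum((a[k] - b[k] - t[k]) ** 2 for k in range(3)) for a, b in zip(A, B))
    return math.sqrt(total / len(A))

def closed_form(A, B, t):
    """sqrt(|t|^2 + c^2), valid when both centroids are at the origin."""
    c2 = sum(sum((a[k] - b[k]) ** 2 for k in range(3)) for a, b in zip(A, B)) / len(A)
    return math.sqrt(sum(x * x for x in t) + c2)

def lipschitz_slack(A, B, s, t):
    """|s - t| minus the change in root mean square distance; never negative."""
    change = abs(rmsd_after_translation(A, B, s) - rmsd_after_translation(A, B, t))
    return math.dist(s, t) - change
```

For randomly generated centered point sets the slack stays non-negative, and it shrinks toward zero when corresponding points nearly coincide and the translations are large, which matches the picture of the hyperboloid approaching the graph of the norm.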

Bibliographic notes. The problem of sampling motions has been studied in various fields, including statistics, crystallography and molecular modeling. Various methods for picking a rotation uniformly at random have been published but not all are correct. In particular, it is important to notice that first picking a rotation axis and second a rotation angle favors quaternions close to the identity if we pick the angle uniformly at random. A popular method that is correct and different from the one described in this section is due to Marsaglia [4] and is reproduced in the exercise section of this chapter.

Packing and covering problems have been studied within mathematics and have generated a large body of literature [2, 3]. Surprisingly, many of the main questions in this area are still open. For example, it is not known whether or not the BCC lattice is the most economical covering of $\mathbb{R}^3$ with congruent balls. Very little is known about optimal packings and coverings in non-Euclidean spaces. The problem is challenging even in the relatively simple case of the 2-sphere, and for most numbers of points (or caps) only approximate solutions are known [1].

[1] J. Berman and K. Hanes. Optimizing the arrangement of points on the unit sphere. Math. Comput. 31 (1977), 1006–1008.
[2] J. Conway and N. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, New York, 1988.
[3] L. Fejes Tóth. Lagerungen in der Ebene, auf der Kugel und im Raum. Second edition, Springer-Verlag, New York, 1972.
[4] G. Marsaglia. Choosing a point from the surface of a sphere. Ann. Math. Stat. 43 (1972), 645–646.

VII.4 Alignment

In this section, we briefly discuss the two problems of match and fit for protein structures. We begin by studying how to match proteins and develop an algorithm that measures the similarity between two chains of atoms. Thereafter, we consider the related problem of docking a protein with its substrate.

Sequence alignment. Consider first the combinatorial (as opposed to geometric) version of the sequence alignment problem. We model a protein as a string over the alphabet of twenty amino acids: $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$. An alignment maps the $a_i$ to the $b_j$ in sequence, but it permits spaces on both sides. As illustrated in Table VII.3, we represent an alignment by a matrix consisting of two rows and $(m + n + k)/2$ columns, where $k$ is the total number of spaces. Columns with two spaces are disallowed. A match is a column of two equal non-space characters, and a mismatch is a column with two different non-space characters. An insertion is a column with a space at the top and a deletion is a column with a space at the bottom.

[Table VII.3: an example alignment of two strings over the letters A, C, Q and R, written as a matrix with two rows; its entries and the counts of matches and spaces in the caption are not recoverable.]

Longest common subsequence. For the moment, we restrict ourselves to alignments without mismatches. The common subsequence between two strings consists of all matches, and its length is the number of matches. Letting $\ell$ be the length of the longest common subsequence, $m + n - 2\ell$ is the minimum number of insertions and deletions needed to transform $A$ to $B$. We compute $\ell$ by dynamic programming. Let $L(i, j)$ be the length of the longest common subsequence of $a_1 \ldots a_i$ and $b_1 \ldots b_j$, and define $L(i, 0) = L(0, j) = 0$ for all $i$ and $j$. Then

$$L(i, j) = \begin{cases} L(i-1, j-1) + 1 & \text{if } a_i = b_j, \\ \max\{L(i-1, j), L(i, j-1)\} & \text{if } a_i \neq b_j. \end{cases}$$

To verify the recurrence relation note that every alignment ends with an insertion, a deletion or a match. In each case, removing the last column leaves an optimal alignment of shorter strings. In the third case, we need to show that the length of the common subsequence cannot increase if we do not use the match between $a_i$ and $b_j$. Indeed, without using that match we end with an insertion or a deletion, and we may move the last match to the end without decreasing the length. We turn the recurrence relation into an algorithm:

    integer LCS:
      for i = 1 to m do
        for j = 1 to n do
          if a_i = b_j then L[i,j] = L[i-1,j-1] + 1
          else L[i,j] = max{L[i-1,j], L[i,j-1]}
          endif
        endfor
      endfor;
      return L[m,n].

This algorithm is a typical example of the dynamic programming paradigm, which constructs an optimal solution from pre-computed optimal solutions to sub-problems. To store the solutions, the algorithm uses an array of $(m+1)$ times $(n+1)$ entries. Each entry takes constant time, which implies that the total running time is proportional to $mn$. Using a second array of the same size, we may keep track of the decisions made by the algorithm, and with this extra information, we can reconstruct the longest common subsequence itself, and not just compute its length.

The general alignment problem permits mismatches and assesses the score by rewarding each match and penalizing each mismatch. Assuming $\sigma(a_i, b_j)$ gives the score for having $a_i$ and $b_j$ in a single column, we get

$$S(i, j) = \max\{ S(i-1, j),\ S(i, j-1),\ S(i-1, j-1) + \sigma(a_i, b_j) \} .$$

We can think of every alignment as a directed path in the so-called edit graph of the two strings, which we illustrate in Figure VII.10. The path starts at the source in the upper left corner, takes vertical, horizontal and diagonal edges, and ends at the sink in the lower right corner. The dynamic programming algorithm can still be used to identify the best in a collection of exponentially many alignments. It does this in time proportional to $mn$.

Figure VII.10: The edit graph for the strings in the above example and the path that corresponds to the given alignment. The edge types are deletion, insertion, match and mismatch.
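The recurrence for the longest common subsequence can be written out as follows. This Python sketch is one way to do it and differs from the text in one detail that is an assumption of the sketch: instead of a second array of decisions, the subsequence is recovered by walking back through the table of lengths. The function name `lcs` is hypothetical.

```python
def lcs(a, b):
    """Return the length of a longest common subsequence of a and b, and one such subsequence."""
    m, n = len(a), len(b)
    # L[i][j] is the length for the prefixes a[:i] and b[:j]; row 0 and column 0 stay 0.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Walk back from the sink to the source, collecting the matches.
    matches = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            matches.append(a[i - 1])
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[m][n], "".join(reversed(matches))
```

With the length $\ell$ returned by `lcs`, the number of insertions and deletions needed to transform one string into the other is `len(a) + len(b) - 2 * l`.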

Gap penalties. A gap in the alignment is a sequence of contiguous insertions or of contiguous deletions. It is common to penalize a gap separately for its existence and an additional amount that depends on its length. This may be done by penalizing an insertion or deletion an amount $\gamma_0$ when it starts a gap and an amount $\gamma_1$ when it continues a gap. This gives rise to the following recurrence relations:

$$S(i, j) = \max\{ S(i-1, j-1) + \sigma(a_i, b_j),\ I(i, j),\ D(i, j) \},$$
$$I(i, j) = \max\{ S(i, j-1) - \gamma_0,\ I(i, j-1) - \gamma_1 \},$$
$$D(i, j) = \max\{ S(i-1, j) - \gamma_0,\ D(i-1, j) - \gamma_1 \},$$

where $I(i, j)$ is the score of the best alignment that ends with an insertion and $D(i, j)$ is the score of the best alignment that ends with a deletion. Using three arrays, we can again compute the best alignment with dynamic programming in time proportional to $mn$.

Chains of atoms. We can use the same algorithmic ideas to compute alignments between two sequences of atoms. Let the $a_i$ and the $b_j$ be the centers of the $\alpha$-carbon atoms along the backbones of two proteins. For now, we assume a fixed embedding in $\mathbb{R}^3$ and consider the alignment problem without applying any rigid motion. Using the root mean square distance between two sub-chains is problematic for two reasons. First, it does not lend itself to the dynamic programming algorithm and, second, it prefers shorter over longer sub-chains. Instead, we need a score function that balances the contributions of length and distance. One such function is obtained by combining square distances with gap penalties as follows. Letting $C_1$ and $C_2$ be positive constants, we reward a match between $a_i$ and $b_j$ by adding

$$\frac{C_1}{1 + \|a_i - b_j\|^2 / C_2} \qquad \text{(VII.1)}$$

to the score, and we penalize for gaps as before.

Next, we permit a rigid motion be applied to one of the chains. For each alignment between $A$ and $B$, we get a function that maps a rigid motion $\mu$ to the score between $A$ and $\mu(B)$. The upper envelope of the graphs is the motion-wise maximum of the score functions. This construction is illustrated in Figure VII.11. Consider the function $S$ defined as the motion-wise maximum of all these score functions, and let $\mu^*$ be the motion that maximizes $S$. The score of the best alignment between $A$ and $B$ is then $S(\mu^*)$, and the best alignment is the one whose score function attains $S(\mu^*)$ at $\mu^*$.

Figure VII.11: The horizontal axis represents the six-dimensional space of rigid motions.

Instead of computing the best motion for each alignment, we compute the best alignment for each of a dense sample of motions. We need some notation to formalize this idea. The idea of the algorithm is to sample the space of motions dense enough to guarantee an alignment with a score at least $S(\mu^*) - \varepsilon$, for some $\varepsilon > 0$. We thus aim at computing an approximately best alignment, but we may decrease $\varepsilon$ and thus get arbitrarily close to the optimum. This strategy makes sense in practice since in any case the locations of atoms are only known up to some precision.

Running time. Improving the approximation by decreasing $\varepsilon$ comes with a cost, namely higher running time because we evaluate $S$ for more rigid motions. We quantify the dependence by analyzing the running time depending on $\varepsilon$. The other parameters entering the analysis are the lengths of the chains, $m$ and $n$, the radii of the smallest spheres enclosing $A$ and $B$ and the radii of gyration of the two sets. Proteins tend to have globular shapes packing their atoms around their centroids. We may therefore assume that the radii of the two enclosing spheres are roughly equal to each other and that the two radii of gyration are roughly equal to each other. We further simplify the discussion by assuming $m = n$.

To decide how dense we have to cover the space of rigid motions, we determine the sensitivity of the score function to small motions. We first consider translations. Ignoring penalties for gaps, the score is a sum of $\ell$ terms of the form (VII.1), where $\ell$ is the length of the alignment and the points are re-indexed so that $a_i$ maps to $b_i$, for $1 \le i \le \ell$. The norm of the gradient of a single term in this sum is bounded by a constant, and hence the norm of the gradient of the score is at most $\ell$ times this constant. We cover the space of translations with balls whose radius is chosen accordingly. It follows that having a translation that is not quite the optimum contributes at most $\varepsilon/2$ to the error. The sensitivity of the score to small rotations depends on the radii of gyration. By covering the space of rotations with balls of correspondingly small radius, we get again a contribution of at most $\varepsilon/2$ to the error. By assumption on the shape of the protein, the volume of translations we need to cover is proportional to the volume of the enclosing sphere, and the volume of the rotations is a constant. In each case, the number of balls we need is a constant times a polynomial in $n$ and $1/\varepsilon$. We cover the space of rigid motions by cross-products of these balls and thus get a number of rigid motions that is the product of the two counts. Multiplying this with the running time of the dynamic programming algorithm gives the total running time. This is of course not practical and we need faster alternatives. Improvements of the running time are possible, some of which will be mentioned at the end of this section.
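The three-array recurrence with the geometric match reward can be sketched as follows for a fixed embedding, that is, for one sampled rigid motion already applied. The constants `c1`, `c2`, `gap_open` and `gap_extend` are illustrative values chosen for this sketch, not values taken from the text, and the function names are hypothetical.

```python
def match_score(a, b, c1=20.0, c2=5.0):
    """Reward for matching two atom centers; decreases with their square distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return c1 / (1.0 + d2 / c2)

def align_score(A, B, gap_open=10.0, gap_extend=1.0):
    """Best alignment score of two chains of points with gap penalties, in time O(mn)."""
    neg = float("-inf")
    m, n = len(A), len(B)
    S = [[neg] * (n + 1) for _ in range(m + 1)]  # best score overall
    I = [[neg] * (n + 1) for _ in range(m + 1)]  # best score ending with an insertion
    D = [[neg] * (n + 1) for _ in range(m + 1)]  # best score ending with a deletion
    S[0][0] = 0.0
    for j in range(1, n + 1):  # only insertions can align the empty prefix of A
        I[0][j] = -gap_open - (j - 1) * gap_extend
        S[0][j] = I[0][j]
    for i in range(1, m + 1):  # only deletions can align the empty prefix of B
        D[i][0] = -gap_open - (i - 1) * gap_extend
        S[i][0] = D[i][0]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            I[i][j] = max(S[i][j - 1] - gap_open, I[i][j - 1] - gap_extend)
            D[i][j] = max(S[i - 1][j] - gap_open, D[i - 1][j] - gap_extend)
            S[i][j] = max(S[i - 1][j - 1] + match_score(A[i - 1], B[j - 1]),
                          I[i][j], D[i][j])
    return S[m][n]
```

In the approximation scheme of this section, a routine of this kind would be called once for every sampled rigid motion, with the motion applied to one chain beforehand, and the largest score over all samples would be reported.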

Protein docking. In protein docking, the basic question is how well a protein and its substrate fit to each other. The substrate could be another protein or a small ligand. This question makes sense if we use space-filling representations of the protein and the ligand, but not if we represent them combinatorially or as chains of points in space. We interpret this question as asking how similar the substrate is to a portion of the complement of the protein. This idea is illustrated in Figure VII.12. For protein-protein interactions, the region of local complementarity is frequently fairly large. The geometric fit between the two proteins thus becomes a significant factor in making the interaction possible or, more accurately, in not making that interaction impossible.

Figure VII.12: The shaded local complement of the left shape is similar to the shaded portion of the right shape.

Protein re-docking. Instead of protein docking, we consider the simpler re-docking problem. Here we are given the complexed form of a protein and its substrate and we attempt to reconstruct that form while suppressing any knowledge of the solution. We need some notation to lay out the rules for this problem. Let $A$ and $B$ represent the protein and the substrate in complexed form, and let $A'$ be the protein after applying a random rigid motion. The input to the reconstruction algorithm consists of $A'$ and $B$, and not knowing the solution means we cannot use any information on $A$ and on the random motion. The goal is to find a rigid motion $\mu$ such that $\mu(A')$ and $B$ fit well. After $\mu$ is computed, we can test how well we did by comparing $\mu(A')$ with $A$, which can be done directly or by computing root mean square distances between the reconstructed and the original positions.

We cannot use the root mean square distance to guide our reconstruction of the complexed form and thus need a score function that assesses how well a motion does in generating a good fit. There are many possibilities, and one is the approximation of the van der Waals potential by counting the pairs of spheres at small distance from each other. As mentioned in Section I.4, the van der Waals force is weakly attractive within small distances of maybe up to four Angstrom, and it is strongly repulsive for colliding van der Waals spheres. We think of the $a_i$ and $b_j$ as the centers and write $r_i$ and $s_j$ for the van der Waals radii of the spheres in $A$ and $B$. A pair of spheres collides if the two spheres overlap, and it is close if the spheres do not collide but the distance between them is at most a small positive constant. We thus define the score of a motion as the number of close pairs if there are no colliding pairs, and as 0 otherwise. Experiments show that this score function is a good indicator of good fit, but one weakness is its sensitivity to collisions. Actual proteins are flexible and can avoid minor collisions by small deformations. We may account for this fact by allowing a few collisions in the definition of the score, but to get a good approximation of the reality, we will need to build knowledge about flexibility into the score function.

The general algorithm for re-docking is similar to the one for geometric alignment: we explore the space of rigid motions and evaluate the score function at the centers of the balls used to cover the space. Given a rigid motion $\mu$, we compute the score by comparing all pairs of spheres in time proportional to $mn$.
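Evaluating such a score for one rigid motion amounts to a double loop over the spheres. The Python sketch below is a minimal version under assumptions that are not taken from the text: spheres are given as tuples `(x, y, z, radius)` after the motion has been applied, `close_gap` is the small positive constant bounding the distance between the surfaces of a close pair, and `max_collisions` is a hypothetical knob for tolerating a few collisions.

```python
import math

def docking_score(A, B, close_gap=1.5, max_collisions=0):
    """Count close pairs of spheres; return 0 if more than max_collisions pairs collide."""
    close = 0
    colliding = 0
    for ax, ay, az, ar in A:
        for bx, by, bz, br in B:
            # Signed distance between the two sphere surfaces.
            gap = math.dist((ax, ay, az), (bx, by, bz)) - ar - br
            if gap < 0.0:
                colliding += 1
            elif gap <= close_gap:
                close += 1
    return close if colliding <= max_collisions else 0
```

The loop takes time proportional to the product of the two numbers of spheres, which matches the cost per motion stated in the text.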

Analysis. By choosing the balls in the cover small enough, we can guarantee that the root mean square distances between the reconstructed and the original positions are less than some threshold $\varepsilon$. Note that this does not necessarily imply that the score is large; it could be zero because motions with high score value tend to be right next to motions that generate collisions. In other words, whether or not the algorithm recognizes a sampled motion as close to the solution depends on the shape of the score function in this neighborhood. We can design cases in which the score function has arbitrarily narrow high spikes and our algorithm has little chance to ever recover the complexed form. There is, however, experimental evidence that such configurations do either not exist or are rare for actual proteins.

Let us return to the question how to cover the space of motions to guarantee a root mean square distance of at most $\varepsilon$. As before, we simplify the analysis by setting $m = n$ and assuming that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to $R$. According to the sensitivity analysis in the previous section, we may cover the space of translations with balls of radius proportional to $\varepsilon$ and the space of rotations with balls of radius proportional to $\varepsilon / R$, where $R$ is the radius of gyration of either set. For the translations, we need to cover a volume of about $R^3$ requiring about $(R/\varepsilon)^3$ balls. For the rotations, we need to cover a constant volume also requiring about $(R/\varepsilon)^3$ balls. The total number of rigid motions to be explored is thus proportional to $(R/\varepsilon)^6$, and multiplying this with quadratic running time for evaluating the score function, we get a total running time proportional to $n^2 (R/\varepsilon)^6$. Since $n$ is typically in the thousands, even this is not practical and we need faster alternatives. An improvement by a factor [...] is possible if we compute the score for all translations composed with a single rotation in one sweep. For constant $\varepsilon$, this improves the running time to roughly [...].

Bibliographic notes. The structural alignment problem refers to comparing the backbones modeled as curves or chains of spheres in three-dimensional space. Its importance within structural molecular biology derives from the observation that evolution preserves structure better than amino acid sequences. Among other things, research on this problem has led to the creation of structural databases [6, 8]. There are two main computational approaches to structural alignment: one represents a chain by its matrix of internal distances [5] and the other uses rigid motions to align the chains embedded in space [9]. In this section, we have followed the second approach and presented the work of Kolodny and Linial [7], who explore rigid motions in the outer loop and optimal alignments using dynamic programming [3] in the inner loop of their algorithm. The particular score function given in Equation (VII.1), with specific values for the two constants, was suggested in [9]. It should be mentioned that the presented algorithm is significantly slower than the currently most commonly used DALI software [5], but it is the only algorithm that guarantees a good approximation of the optimal alignment in polynomial time.

The goal of protein docking is the prediction of whether, where and how proteins interact with each other and with other molecules. In many cases, the surface area of the interface during the interaction is substantial, and in these cases the geometric fit is an important factor. However, there are cases with smaller interaction area in which forces unrelated to geometric shape outweigh the importance of shape [2]. We refer to [4] for a recent survey of the extensive literature on computational approaches to protein docking. The material in this section is based on the work described in [1].

[1] S. Bespamyatnikh, V. Choi, H. Edelsbrunner and J. Rudolph. Protein docking by exhaustive search. Manuscript, Duke Univ., Durham, North Carolina, 2003.
[2] A. H. Elcock, D. Sept and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys. Chem. B 105 (2001), 1504–1518.
[3] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge Univ. Press, England, 1997.
[4] I. Halperin, B. Ma, H. Wolfson and R. Nussinov. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47 (2002), 409–443.
[5] L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233 (1993), 123–138.
[6] L. Holm and C. Sander. The FSSP database of structurally aligned protein fold families. Nucleic Acid Res. 22 (1994), 3600–3609.
[7] R. Kolodny and N. Linial. Approximate protein structural alignment in polynomial time. Manuscript, Stanford Univ., Stanford, California, 2002.
[8] A. G. Murzin, S. E. Brenner, T. Hubbard and C. Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247 (1995), 536–540.
[9] S. Subbiah, D. V. Laurents and M. Levitt. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Current Biol. 3 (1993), 141–148.

Exercises

Sizes of spheres. The $d$-dimensional unit sphere consists of all points at unit distance from the origin of the $(d+1)$-dimensional Euclidean space: $\mathbb{S}^d = \{x \in \mathbb{R}^{d+1} \mid \|x\| = 1\}$. We know that the perimeter of $\mathbb{S}^1$ is $2\pi$, the area of $\mathbb{S}^2$ is $4\pi$ and the volume of $\mathbb{S}^3$ is $2\pi^2$. What is the $d$-dimensional volume of $\mathbb{S}^d$?

Sampling the 3-sphere. Prove that the following method picks a point uniformly at random on $\mathbb{S}^3$:
  (i) Pick numbers $x_1, x_2, x_3, x_4$ uniformly at random in $[-1, 1]$.
  (ii) If $x_1^2 + x_2^2 \ge 1$ or $x_3^2 + x_4^2 \ge 1$ then repeat Step (i), else let $w = \sqrt{(1 - x_1^2 - x_2^2) / (x_3^2 + x_4^2)}$ and return $(x_1, x_2, w x_3, w x_4)$.

Biased probability. Suppose Function UNIFORM picks a real number uniformly at random in $[0, 1]$.
  (i) Show that the minimum of two numbers picked by Function UNIFORM is distributed according to the triangle density function $f(x) = 2 - 2x$.
  (ii) How are the minimum, the median and the maximum of three numbers picked by Function UNIFORM distributed?

Random rotation. Let us mark a point $x$ on the unit 2-sphere. For a rotation $R$, let $R(x)$ be the image of $x$ under that rotation. Any density function over the space of rotations implies a density function over the 2-sphere. Prove that the uniform density of quaternions over $\mathbb{S}^3$ implies the uniform density of points over the 2-sphere.

Reflections. The reflection through a plane maps every point $x$ to the point $x'$ such that the plane crosses the line segment $xx'$ orthogonally at its midpoint. The central reflection maps every point $x$ to its antipodal point, $-x$.
  (i) Show that every rigid motion is the composition of two plane reflections.
  (ii) How many plane reflections do you need to represent the central reflection?

Sum of square distances. Consider a collection of $n$ points $a_i$ in $\mathbb{R}^3$ and let $c$ be its centroid.
  (i) Prove that for every point $x$ in space, the root mean square distance to the $a_i$ is the root of the square distance to the centroid plus a constant: $\sqrt{\frac{1}{n}\sum_i \|x - a_i\|^2} = \sqrt{\|x - c\|^2 + C}$. What exactly is the constant?
  (ii) Extend the construction to a collection of planes in $\mathbb{R}^3$. In other words, prove that there are three planes for which a similar formula gives the sum of square distances to the planes.
  (iii) Further extend the construction to a collection of lines in $\mathbb{R}^3$.

Square distance from planes. The square distance from a point $x$ to a point $p$ is also the sum of square distances from $x$ to the three planes parallel to the coordinate planes that pass through $p$.
  (i) Show that the above claim holds for any three planes that pass through $p$ and pairwise enclose a right angle.
  (ii) Are there triplets of planes enclosing non-right angles for which $\|x - p\|^2$ is equal to the sum of square distances from $x$ to the three planes?

Number of alignments. Recall that an alignment between two chains of $m$ and $n$ $\alpha$-carbon atoms that uses $k$ spaces can be represented by a matrix with two rows and $(m + n + k)/2$ columns. Assuming [...], we define [...] and note that we need [...] insertions just to make up for the difference in length. The remaining spaces are distributed over equally many insertions and deletions, so we define [...].
  (i) Show that [...] is a necessary and sufficient condition for the number of spaces in any alignment of the two chains.
  (ii) What is the number of different alignments with a fixed number of spaces?
  (iii) What is the total number of different alignments?


Chapter VIII

Deformation

VIII.1 Molecular Dynamics
VIII.2 Spheres in Motion
VIII.3 Rigidity
VIII.4 Shape Space
Exercises

VIII.1 Molecular Dynamics

Newton's second law.

Numerical integration. [Taylor expansion, different numerical methods (Euler, Verlet, leap-frog, Beeman, predictor-corrector).]

Kinetic data structures. [Close neighbor lists, Delaunay triangulation or dual complex (forward pointer to Section VIII.2 and Chapter IX).]

Hydrophobic surface area. [Weighted area and derivative (forward pointer to Chapter IX).]
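As one concrete instance of the integration schemes listed above, the following sketch implements velocity Verlet and, for contrast, explicit Euler for a single particle on a harmonic spring. The spring, the unit mass and stiffness, and all function names are assumptions made for illustration; these notes do not fix a force field or an implementation.

```python
def euler_step(x, v, force, mass, dt):
    """One explicit Euler step for dx/dt = v, m dv/dt = F(x)."""
    a = force(x) / mass
    return x + v * dt, v + a * dt


def velocity_verlet_step(x, v, force, mass, dt):
    """One velocity Verlet step: second-order Taylor update of the position,
    then a velocity update with the average of old and new accelerations."""
    a_old = force(x) / mass
    x_new = x + v * dt + 0.5 * a_old * dt * dt
    a_new = force(x_new) / mass
    v_new = v + 0.5 * (a_old + a_new) * dt
    return x_new, v_new


def integrate(step, x, v, force, mass, dt, n_steps):
    """Apply the given one-step method n_steps times."""
    for _ in range(n_steps):
        x, v = step(x, v, force, mass, dt)
    return x, v


def spring_force(x, stiffness=1.0):
    """Hypothetical test force: a linear spring pulling toward the origin."""
    return -stiffness * x


def spring_energy(x, v, mass=1.0, stiffness=1.0):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v * v + 0.5 * stiffness * x * x


if __name__ == "__main__":
    x0, v0, dt, n = 1.0, 0.0, 0.01, 10_000
    for name, step in (("Euler", euler_step),
                       ("velocity Verlet", velocity_verlet_step)):
        x, v = integrate(step, x0, v0, spring_force, 1.0, dt, n)
        print(name, "energy after", n, "steps:", spring_energy(x, v))
```

The exact energy of this system stays at 0.5. Explicit Euler multiplies the energy by a factor slightly above one in every step, while velocity Verlet keeps it close to the initial value, which is one reason Verlet-type schemes are common in molecular dynamics.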

VIII.2 Spheres in Motion

[Explain the slack in the Pie Volume Formula (with a forward pointer to Chapter IX).]

[Define cross-sections of the complex of independent simplices and prove that each cross-section gives a different pie formula but the same measurement.]

[This topic relates to the possibility of drawing non-straight Voronoi-like decompositions [2].]

[Predict collisions of spheres. Linear motion in instead of .]

[Dynamic Delaunay triangulations [3].]

Bibliographic notes.

[1] J. Basch, L. J. Guibas and L. Zhang. Proximity problems on moving points. In "Proc. 13th Ann. Sympos. Comput. Geom., 1997", 344–351.
[2] H. Edelsbrunner and E. A. Ramos. Inclusion-exclusion complexes for pseudodisk collections. Discrete Comput. Geom. 17 (1997), 287–306.
[3] M. A. Facello. Geometric techniques for molecular shape analysis. Ph. D. thesis, Report UIUCDCS-R-96-1967, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.

VIII.3 Rigidity

[Discuss the pebble algorithm that analyzes the rigidity of a graph in three dimensions.]

Bibliographic notes.

VIII.4 Shape Space

Similar to two dimensions, we can deform skin surfaces into each other by continuously changing the defining spheres. In this section, we merely illustrate the deformation and mention some of its features in passing. A canonical such method is explained in [1]. That method can be used to mix several skin surfaces and thus create a shape space that encompasses multi-variate deformations. [Explain the mixing of two or more shapes as a generalization of 1-parametrized deformation.] The problems of (1) finding a good basis and (2) finding the best approximation within the spanned space are both difficult. They are similar to fundamental questions on function representation, which are probably discussed in the approximation theory literature.

The main functionality of the Morfi software is that it can smoothly morph one skin curve into another. In other words, it deforms the skin of one set of circles to the skin of another. The details of this deformation will be explained in Section VIII.4, where we discuss notions of similarity between two molecular skins. Figure VIII.1 shows the deformation of a skin curve defined by four circles into one defined by three circles. For each snapshot, we show the skin curve together with the dual complex. Recall that the homotopy types of the body and the dual complex are always the same, which implies that they change their type the same way and at the same time. We note that any two contiguous bodies, except the last three in the sequence, differ by at least one change in homotopy type. For the complex we observe two types of changes caused by adding an edge or a triangle. The corresponding changes in the body are caused by creating a handle or filling a hole. There is a third type of change not seen in Figure VIII.1, which in the complex is caused by adding a vertex and in the body by creating a component.

Bibliographic notes. The Morfi software has been used in [2] to explain two-dimensional skin geometry and to illustrate its use in deforming two-dimensional shapes into each other. We note that these deformations are similar to but also different from the image morphs studied in computer graphics [3]. The goal there is photo realism, and possibly the most difficult problem towards achieving it is the construction of a one-to-one correspondence between features of the initial and the final images, which often does not exist. The Morfi software creates a few-to-few correspondence through geometric considerations rather than working towards a one-to-one correspondence.

[1] H.-L. Cheng, H. Edelsbrunner and P. Fu. Shape space from deformation. Comput. Geom. Theory Appl. 19 (2001), 191–204.
[2] S.-W. Cheng, H. Edelsbrunner, P. Fu and K. P. Lam. Design and analysis of planar shape deformation. Comput. Geom. Theory Appl. 19 (2001), 205–218.
[3] G. Wolberg. Recent advances in image morphing. In "Proc. Comput. Graphics Internat., 1996", 64–71.

Figure VIII.1: Ten snapshots of a deformation with skin and dual complex displayed. The skin in the fifth snapshot is the same as in the figures above.

Figure VIII.2: From left to right and top to bottom: the shapes at times . The sequence is defined by a set of seven spheres forming a question mark at time and a set of eight spheres forming a human-like figure at time .

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let be a triangulation of a set of points in the plane. Let be a line that avoids all points. Prove that intersects at most edges of and that this upper bound is tight for every .

Chapter IX

Measures

There are various reasons why biologists want to measure the size of molecules. Volume is important in the calculation of free energy and in estimates of populations given a bound on the available space. Surface area is a resource consumed by molecular interactions and is probably even more relevant to research in structural biology than volume. This chapter will study three aspects of size: volume, surface area, and arc length for such diagrams. Our general approach to measuring the size begins with indicator functions for convex polyhedra in . From these we will derive short inclusion-exclusion formulas for size measurements.

IX.1 Indicator functions
IX.2 Volume and surface area
IX.3 Void formulas
IX.4 Measuring Software
Exercises

IX.1 Indicator Functions

The Euler relation for convex polyhedra is a special case of the Euler-Poincaré theorem for complexes. There are elementary proofs for this special case, and this section presents one that is inductive. Below we will construct indicator functions of from Euler characteristics of subcomplexes of the boundary complex. The Euler relation will follow from elementary proofs of properties of these indicator functions.

Convex polyhedra. A convex polyhedron is the intersection of finitely many closed half-spaces. It is either bounded or unbounded, and both cases are illustrated in Figure IX.1. In the first case, the polyhedron is the convex hull of finitely many points, and in the second, it extends to infinity. We study polyhedra in -dimensional space, keeping in mind that is the most important dimension since polyhedra in relate to molecules in , as we will see later.

Figure IX.1: A bounded convex polyhedron in to the left and an unbounded one to the right.

Let be a convex polyhedron in and assume it has non-empty interior. A hyperplane supports if it intersects the boundary but not the interior. A face of is the intersection with a supporting hyperplane. The boundary is decomposed into faces of various dimensions, which are usually prefixed for clarity. For example, is a -face of itself and the facets are the -faces. Let be the number of faces. The Euler characteristic of is the alternating sum of faces. Assuming general position, the dual of the boundary complex is a simplicial complex and the Euler-Poincaré Theorem stated in Section IV.3 implies the Euler relation for convex polyhedra: if is bounded, if is unbounded, provided . In the bounded case, the boundary is a -dimensional topological sphere whose only non-zero Betti numbers are . In the unbounded case, the boundary is an open -dimensional topological ball whose only non-zero Betti number is .

Inclusion-exclusion. Let be the finite collection of half-spaces such that . Each face is the intersection of the polyhedron with a subset of the hyperplanes bounding half-spaces in . Particularly, and . For a subset and a point we define if , otherwise. We form an alternating sum of the that leads to an indicator function for the convex polyhedron. The straightforward way of doing this is called the principle of inclusion-exclusion. Specifically, we define

(IX.1)

The sum ranges over all subsets of , including the empty set for which for all points . Note that is outside iff for at least one non-empty subset . Namely if then it sees a facet from the outside and we have for the singleton set containing the half-space whose bounding hyperplane contains that facet. We show that the non-zero terms cancel unless there is only one non-zero contribution to the sum, which comes from the empty set. To see this define and . Note that . This sum is , which is the alternating sum of subsets of . For we get and is an indicator function for .

Truncation. Most of the terms in the exponentially long formula (IX.1) are redundant and can be removed. In words, we only keep the terms that correspond to faces of .

Let be the system of subsets that define non-empty faces. Note that , which we consider an improper face but still a face of . It is convenient to assume general position, which in this context means that there are no two subsets of that define the same face. The restriction of the inclusion-exclusion formula (IX.1) to the system is

(IX.2)

For sets there is an intuitive interpretation of . Consider visible from if sees all facets around from outside . Then iff is visible from . Notice that according to this definition, the faces on the silhouette are not visible. We claim that even though is much shorter than , it is still an indicator function of . This claim is sufficiently important to warrant a complete proof.

PIE THEOREM A. if , if .

PROOF. We use induction over the cardinality of the set , which is again defined as the collection of half-spaces that do not contain . The basis of the induction is covered by , in which case and . Assume . By assumption, , and define as the closed complement of , which is a half-space that contains , as in Figure IX.2. The convex polyhedron is obtained by removing the constraint . The union of and is . Both and have one less half-space not containing than does. The induction hypothesis thus applies, and hence . We distinguish three types of faces of : the ones contained in , the ones crossing the hyperplane shared by and , and the ones contained in . The corresponding systems form the partition . The faces of are defined by sets in , and the faces of are defined by sets in . The corresponding systems are and . Define sets of half-spaces and . The introduced systems partition . We can therefore write their values as sums of values of the subsystems, where we get . We argue that all three terms on the right side of the equation for vanish. Therefore because the values cancel pairwise. The second term vanishes because all sets in contain . The third term vanishes because iff , which implies that iff and therefore , as required.

Figure IX.2: The half-spaces and share the hyperplane and are complementary to each other.

Unbounded convex polyhedra. The Pie Theorem A implies the Euler relation for unbounded polyhedra. To see this, we fix a point outside all half-spaces in , as shown in Figure IX.3, let , and rewrite the formula in the Pie Theorem A in terms of face numbers. We have for all . By assumption of general position, is the number of sets with cardinality . We get

This implies the Euler relation for unbounded convex polyhedra.

Figure IX.3: The point lies in the intersection of the complements of the half-spaces.

Restricting body. We need a slightly stronger version of the Pie Theorem A to prove the Euler relation for bounded convex polyhedra. We first weaken the theorem by restricting the points to lie within a convex body , and then strengthen it by further reducing the set system. Let be the number of -faces of that have non-empty intersection with the interior of . Define and let be the corresponding sum of values.

PIE THEOREM B. For we have if , if .

PROOF. We construct a convex polyhedron that contains and approximates in the sense that , as in Figure IX.4. The system contains exactly all sets for which . Furthermore, every point is contained in all half-spaces of . Hence for all points and therefore also for all points . We show that for points , and use the Pie Theorem A to get . is an indicator function for .

Figure IX.4: Three edges and one vertex of intersect the interior of , and the same edges and vertex intersect the interior of .

Bounded convex polyhedra. We return to the computation of the Euler characteristic, this time for a bounded convex polyhedron . We choose a line not parallel to any face of and points and sufficiently far in opposite directions on the line. By the choice of , this partitions into the set of half-spaces that do not contain and the set of half-spaces that do not contain . Define and define symmetrically. As illustrated in Figure IX.5, each proper face of either belongs to or to or to the silhouette as seen in a view parallel to the chosen line. Let be the number of -faces in the silhouette. By the Pie Theorem B, using the respective other convex polyhedron as the restricting convex body , we have for all and therefore . Observe that this sum counts the -face the same number of times on both sides. On the right side it is counted times, same as on the left side.

Figure IX.5: The boundary of is dotted, that of is solid, and the silhouette is indicated by the two hollow vertices.

We can now argue inductively that the Euler characteristic of is . For , is a closed interval with , which establishes the induction basis. The projection of the silhouette onto a hyperplane normal to the line is a bounded convex polyhedron of dimension . Hence if .

by induction hypothesis. Adding the alternating sums of the and implies , as required.

Bibliographic notes. Most of the material in this section is taken from [2], where the inclusion-exclusion approach to measuring the union of balls is laid out. As demonstrated, this principle also yields the Euler relation for convex polyhedra. The discovery of that relation for convex polyhedra in three dimensions is usually attributed to Leonhard Euler [3, 4], although there is evidence that René Descartes knew about it a century earlier. There are many proofs of that relation, and the historically first one for the general $d$-dimensional case goes back to the work of Ludwig Schläfli [7] in the middle of the nineteenth century. He implicitly assumes that the boundary complex of every convex polyhedron is shellable, which was not established until 1972 by Bruggesser and Mani [1], who thus filled the gap left in Schläfli's proof. We note that all authors of papers referenced in this section are Swiss, except for one who has a Swiss grandmother. Indeed, finding elementary proofs of the Euler relation for convex polyhedra seems to be a favorite topic for Swiss mathematicians [5, 6].

[1] H. Bruggesser and P. Mani. Shellable decompositions of cells and spheres. Math. Scand. 29 (1972), 197–205.
[2] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[3] L. Euler. Elementa doctrinae solidorum. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 109–140.
[4] L. Euler. Demonstratio nonnullarum insignium proprietatum, quibus solida hedris planis inclusa sunt praedita. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 140–160.
[5] H. Hadwiger. Eulers Charakteristik und kombinatorische Geometrie. J. Reine Angew. Math. 194 (1955), 101–110.
[6] W. Nef. Zur Einführung der Eulerschen Charakteristik. Monatsh. Math. 92 (1981), 41–46.
[7] L. Schläfli. Theorie der vielfachen Kontinuität. Written 1850–52 and published in Denkschrift der Schweizerischen naturforschenden Gesellschaft 38 (1901), 1–237.
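The untruncated inclusion-exclusion sum (IX.1) can be checked by brute force on a small example. The sketch below assumes, in line with the caption of Figure IX.3, that the term of a subset is 1 exactly when the point lies outside every half-space in the subset, so that the empty subset always contributes 1. The representation of a half-space as a (normal, offset) pair and all function names are hypothetical choices made for this illustration.

```python
from itertools import combinations
import random


def in_halfspace(h, x):
    """h = (normal, offset) is the closed half-space {y : normal . y <= offset}."""
    normal, offset = h
    return sum(n * xi for n, xi in zip(normal, x)) <= offset


def term(subset, x):
    """1 if x lies outside every half-space in the subset, else 0.
    The empty subset gives 1 for every point."""
    return int(all(not in_halfspace(h, x) for h in subset))


def pie_indicator(halfspaces, x):
    """Alternating sum of term(A, x) over all subsets A of the half-spaces."""
    total = 0
    for k in range(len(halfspaces) + 1):
        for subset in combinations(halfspaces, k):
            total += (-1) ** k * term(subset, x)
    return total


def direct_indicator(halfspaces, x):
    """1 if x lies in the intersection of all half-spaces, else 0."""
    return int(all(in_halfspace(h, x) for h in halfspaces))


# The unit square as the intersection of four half-planes.
SQUARE = [((1, 0), 1), ((-1, 0), 0), ((0, 1), 1), ((0, -1), 0)]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        p = (rng.uniform(-1, 2), rng.uniform(-1, 2))
        print(p, pie_indicator(SQUARE, p), direct_indicator(SQUARE, p))
```

The sum has exponentially many terms, which is the motivation for the truncation to subsets that define faces; the sketch makes no attempt at that truncation.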

IX.2 Volume and Surface Area

In this section, we use the indicator functions developed in Section IX.1 to derive inclusion-exclusion formulas for the volume, area, and total arc length of a space-filling diagram.

Volume by integration. By definition, the indicator function of a geometric set is 1 inside and 0 outside the set. We can therefore compute its volume by integration. Consider for example a bounded convex body and a convex polyhedron . Let be the system of subsets of that appears in the statement of the Pie Theorem B in the last section. The volume of the intersection of the two convex bodies is

where is the closed complement of the half-space . In dimensions, assuming general position, the sets contain or fewer half-spaces each. For measuring molecules, we are mostly interested in the case , in which the volume is a sum of terms each involving four or fewer half-spaces.

Let be a set of three half-spaces whose bounding planes pass through 0. Recall that is the unit sphere centered at the origin . The half-spaces intersect in an unbounded triangular cone, and the intersection with the ball bounded by is a pyramid whose base is a spherical triangle. This is illustrated in Figure IX.6. Let , , and be the dihedral angles between the planes, or equivalently, the angles of the spherical triangle. The volume of the pyramid can now be computed by taking the ball, subtracting three half-balls, adding three sectors, and subtracting the reflected pyramid. It follows that the volume is

The area of the spherical triangle is three times the volume divided by the radius of the sphere. That radius is one, which implies that the area of the spherical triangle is . The above formula gives a proof of the area formula for spherical triangles.

Figure IX.6: A pyramid cut out of a ball by three half-spaces.

Stereographic projection. Let be the unit 3-sphere with center at the origin and identify with the hyperplane . Call the north-pole of . The stereographic projection maps a point to the point collinear with and , as shown in Figure IX.7. The map is bijective and therefore has an inverse. If applied to all points of a ball in , we get a cap of , which is the intersection of the 3-sphere with a half-space . The half-space lies on the side of its hyperplane that does not contain the north-pole, so does contain .

Figure IX.7: Stereographic projection from to .

Union of balls. We now turn to the problem of measuring the union of a finite set of balls in . We transform the question into one about half-spaces in . For each ball we get a half-space . Let be the collection of half-spaces that contain the north-pole, and the intersection of the half-spaces is a convex polyhedron , which contains the north-pole in its interior. Let be the 4-ball bounded by and the system of subsets of that appears in the Pie Theorem B. Then is the stereographic projection of the portion of that is not contained in the interior of . Instead of computing the volume of directly, we compute the volume of . The volume of the portion of outside the polyhedron is

The sets with one or no half-space are redundant because in these cases, . We could now get a formula for by scaling the volume by the distortion factor of . A more straightforward derivation of a formula for the ball union translates the inclusion-exclusion formula from to . Instead of the system of half-spaces we now use a system of balls obtained by substituting for . For convenience, we use the same notation, namely for the system of balls and for a generic set in .

PIE VOLUME FORMULA. The volume of the union of a finite set of balls is

Dual complex, revisited. We observe that the index system in the Pie Volume Formula is an abstraction of the dual complex of . Instead of proving this algebraically, we explain the connection in geometric pictures. Start with and embedded in as suggested in Figure IX.7. Use to project the boundary complex of to . This is the weighted Voronoi diagram of . A subset belongs to iff its corresponding face of has non-empty intersection with the ball bounded by . But this is also the condition for the projection of to have non-empty intersection with the interior of . Hence, a non-empty set of half-spaces is in iff the corresponding set of balls defines a simplex in the dual complex. We have arrived at a simple interpretation of the Pie Volume Formula: construct the dual complex of and do inclusion-exclusion with a term for every simplex in the dual complex. This is illustrated in Figure IX.8.

Figure IX.8: The area of the union is the sum of eight disk areas minus the sum of nine pairwise intersection areas, plus the sum of two triple-wise intersection areas.

Area and length. Similar to volume, we get a Pie Area Formula for the surface area of ,

where is the abstraction of the dual complex of . To prove this formula, we add the contributions of individual spheres. For a single sphere, we use the Pie Volume Formula on the set of caps defined by intersecting balls. Since the caps are two-dimensional, the volume formula becomes an area formula. Letting be the sphere and the set of caps, the area of is the area of minus the alternating sum of the areas of cap intersections. For each set of caps in the system , we have the corresponding set of balls together with the ball of in the system of . For we get and therefore a zero contribution to the area. By summing over all balls, we get the Pie Area Formula given above.

Similarly, we can get a Pie Length Formula that measures the total length of the circular arcs in the boundary of the union of balls. The proof of the formula is similar to the one for area, except that the summation is done over all circles that are intersections of two

spheres forming a pair in . For each such circle, we apply the (one-dimensional) Pie Volume Formula and thus get an expression whose terms correspond to the simplices in the star of the pair.

We might even go one step further and consider the number of vertices of . The inclusion-exclusion formula suggests that this number is the alternating sum of vertex numbers of common intersections of balls. For two or fewer balls we have no vertices. For each triple in we have a three-sided spindle with two vertices, and for each quadruple we have a rounded tetrahedron with four vertices. It follows that in the generic case, the number of vertices of is twice the number of triangles minus four times the number of tetrahedra in the dual complex.

Bibliographic notes. In 1992, Naiman and Wynn proved that the volume of a finite union of congruent balls can be expressed by an inclusion-exclusion formula whose terms correspond to the simplices in the Delaunay triangulation of the centers [4]. Edelsbrunner generalized the formula to allow for different size balls and strengthened it by using the dual complex as the index system [1]. The material in this section is taken from that paper. The proof of the volume formula uses the inverse of the stereographic projection to transform balls in to half-spaces in . That projection is conformal (preserves angles) and has a number of other nice properties, many of which can be found in the book by Thurston [5]. Just as a union of balls in corresponds to a convex polyhedron in , a union of intersections of balls corresponds to a union of intersections of half-spaces. The latter is Hadwiger's notion of a not necessarily convex polyhedron [3]. Inclusion-exclusion formulas for such polyhedra can be found in [2].

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. Edelsbrunner. Algebraic decomposition of nonconvex polyhedra. In "Proc. 36th Ann. IEEE Sympos. Found. Comput. Sci., 1995", 248–257.
[3] H. Hadwiger. Vorlesungen über Inhalt, Oberfläche und Isoperimetrie. Springer, Berlin, 1957.
[4] D. Q. Naiman and H. P. Wynn. Inclusion-exclusion Bonferroni identities and inequalities for discrete tube-like problems via Euler characteristics. Ann. Statist. 20 (1992), 43–76.
[5] W. P. Thurston. Three-Dimensional Geometry and Topology, Volume 1. Edited by S. Levy. Princeton Univ. Press, New Jersey, 1997.
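Two computations from this section can be checked numerically; the following is a sketch under stated assumptions, not the software discussed later in the chapter. First, the pyramid argument: writing V for the pyramid volume and phi_1, phi_2, phi_3 for the three dihedral angles (labels introduced here, since the original symbols are not available), "ball minus three half-balls plus three sectors minus the reflected pyramid" reads V = 4π/3 − 3(2π/3) + (2/3)(phi_1 + phi_2 + phi_3) − V for the unit ball, hence V = (phi_1 + phi_2 + phi_3 − π)/3 and the triangle area is phi_1 + phi_2 + phi_3 − π. Second, a short inclusion-exclusion formula in the plane: for three disks of radius 1.5 centered at (−1, 0), (0, 0), (1, 0), a hypothetical configuration chosen for this example, the intersection of the two outer disks lies inside the middle disk, so the pair term and the triple term cancel and only three vertex terms and two edge terms remain. Which simplices are kept is asserted by hand for this configuration, not computed from a dual complex.

```python
import math


def wedge_angle(n_i, n_j):
    """Interior dihedral angle of {n_i.x >= 0} ∩ {n_j.x >= 0}, unit normals."""
    dot = sum(a * b for a, b in zip(n_i, n_j))
    return math.acos(max(-1.0, min(1.0, -dot)))


def pyramid_volume(normals):
    """Volume of the unit ball cut by three half-spaces through the origin.
    Solves V = ball - 3 half-balls + 3 sectors - V for V."""
    n1, n2, n3 = normals
    ball = 4.0 * math.pi / 3.0
    sectors = sum(ball * wedge_angle(a, b) / (2.0 * math.pi)
                  for a, b in ((n1, n2), (n1, n3), (n2, n3)))
    return (ball - 3.0 * ball / 2.0 + sectors) / 2.0


def spherical_triangle_area(normals):
    """Area of the pyramid's base: three times the volume over the radius 1."""
    return 3.0 * pyramid_volume(normals)


def lens_area(r, d):
    """Area of the intersection of two radius-r disks at center distance d < 2r."""
    return (2.0 * r * r * math.acos(d / (2.0 * r))
            - 0.5 * d * math.sqrt(4.0 * r * r - d * d))


def union_area_short(r=1.5):
    """Short formula: three disk areas minus the two neighboring lens areas."""
    return 3.0 * math.pi * r * r - 2.0 * lens_area(r, 1.0)


def union_area_grid(r=1.5, h=0.01):
    """Independent estimate: count grid cell midpoints covered by a disk."""
    centers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
    nx, ny = round((2.0 + 2.0 * r) / h), round(2.0 * r / h)
    count = 0
    for i in range(nx):
        x = -1.0 - r + (i + 0.5) * h
        for j in range(ny):
            y = -r + (j + 0.5) * h
            if any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for cx, cy in centers):
                count += 1
    return count * h * h


if __name__ == "__main__":
    octant = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print("octant triangle area:", spherical_triangle_area(octant))
    print("short formula:", union_area_short(), "grid:", union_area_grid())
```

For the positive octant the three dihedral angles are right angles, so the area is one eighth of the unit sphere, which serves as a sanity check on the sign conventions.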

IX.3 Void Formulas

This section derives another collection of inclusion-exclusion formulas that express the volume, surface area, and arc length of a union of balls in . The new collection leads to formulas for voids, which are bounded components of the space outside the union.

Angles of revolution. A (one-dimensional) angle is by definition the length of a unit circle arc and can assume any value between 0 and . A two-dimensional angle is the area of a piece of the unit 2-sphere and can assume any value between 0 and . It is convenient to normalize so that in both cases the full angle is 1 and every angle is a fraction of the full angle. This definition can be used in any dimension . For example, the 0-sphere is a pair of points with possible subsets the empty set, a single point, or both points. The only zero-dimensional angles are therefore 0, , and 1.

Consider for example a tetrahedron . For each face , we define the angle as the fraction of directions around along which we enter . Equivalently, is the volume fraction of a sufficiently small ball centered at an interior point of that lies inside the tetrahedron. Figure IX.9 illustrates the definition. In we refer to the two-dimensional angle at a vertex as a solid angle, and the one-dimensional angle at an edge as a dihedral angle. The zero-dimensional angle of a triangle is always , and we will see shortly that this convention makes perfect sense when we compute volume using angles. For convenience, we also define the angles of the improper faces of as and . We use similar conventions for triangles, edges, and vertices.

Figure IX.9: The solid angle at a vertex, the dihedral angle at an edge, and the zero-dimensional angle of a triangle.

Independent triangles and tetrahedra. Recall that a collection of three disks in is independent if for every subset there is a point inside every disk in the subset and outside every disk not in the subset. This condition is equivalent to the three circles decomposing into eight regions in the way shown in Figure IX.10. Let , , and be the angles at the vertices , , and . The left drawing suggests that the area of the triangle is , where we write for the area of the disk with center , for the area of the intersection of the disks with centers and , and so on. If we change the meaning from area to perimeter we get . Both formulas hold whenever the three disks are independent, but the right drawing in Figure IX.10 indicates that there are cases where the formulas are not as obvious as to the left.

Figure IX.10: Both triangles are spanned by the centers of three independent disks.

We generalize the formulas for independent triangles to independent tetrahedra. To simplify the notation, we drop the distinction between abstract and geometric simplices. Specifically, we let denote an independent set of four balls and, at the same time, the tetrahedron spanned by the four ball centers.

INDEPENDENT VOLUME FORMULA. The volume of an independent tetrahedron is

The proof of the formula is somewhat technical and omitted. Similar to the two-dimensional case, we get sums that evaluate to zero if we replace volume by area or length.

Angle weights. We derive a new volume formula for a union of balls by combining the Pie Volume and the Independent Volume Formulas. We need some notation to continue. It is convenient to cover the portion of $\mathbb{R}^3$ outside the Delaunay triangulation with tetrahedra. This can be done by adding four points viewed as degenerate balls to the set $B$. Let $T(K)$ denote the set of tetrahedra in a simplicial complex $K$. For a subcomplex $K \subseteq D$, where $D$ is the Delaunay triangulation of $B$, let $P(K)$ denote the collection of pairs $(\sigma, \tau)$ with $\sigma \in K$ and $\sigma \subseteq \tau \in T(D) - T(K)$.

We first make the Pie Volume Formula more complicated and then simplify by cancelling terms. We start with the Pie Volume Formula,

\[ \mathrm{vol} \bigcup B = \sum_{\sigma \in K} (-1)^{\dim \sigma} \, \mathrm{vol} \bigcap \sigma , \]

and decompose $\bigcap \sigma$ into the parts defined by the tetrahedra that contain $\sigma$ as a face. For example for a tetrahedron $\sigma$, the only coface in $D$ is $\sigma$, and the contributed term is $\mathrm{vol} \bigcap \sigma$. For triangles, edges, and vertices $\sigma$, the contribution is split up into as many pieces as there are angles around $\sigma$. The angle $\varphi_{\sigma,\tau}$ is as before. With this notation we can rewrite the Pie Volume Formula as

\[ \mathrm{vol} \bigcup B = \sum_{\sigma \in K} \; \sum_{\sigma \subseteq \tau \in T(D)} (-1)^{\dim \sigma} \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma . \]

Whenever $\tau$ is a tetrahedron in $K$, we use the Independent Volume Formula to make a substitution. This results in the new volume formula.

ANGLE-WEIGHTED PIE VOLUME FORMULA. The volume of the union of a finite set $B$ of balls in $\mathbb{R}^3$ is

\[ \mathrm{vol} \bigcup B = \sum_{\tau \in T(K)} \mathrm{vol}\, \tau + \sum_{(\sigma,\tau) \in P(K)} (-1)^{\dim \sigma} \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma . \]

The new formula suggests we compute volume in two steps. First we compute the volume of the underlying space of $K$ itself, and second we add the volume of the fringe. Observe that not all pieces considered in the second sum are subsets of the fringe. Furthermore, some might reach into the interior of $|K|$. Nevertheless, the second sum is exactly the volume of the fringe. We get the same formulas for area and length, except that the first sum vanishes.

Voids. As defined earlier, a void of a union of balls is a bounded component of the complement space. Figure IX.11 illustrates the fact that every void of $\bigcup B$ is contained in a void of $|K|$. The corresponding void in $|K|$ is triangulated by a subset of the Delaunay triangulation, missing the simplices that bound the void in $|K|$. Strictly speaking, this subset, the dual set $U$ of the void, is not a triangulation because it is not even a complex. From a point inside the void, the union of balls looks a lot like from a point outside all balls and voids. It is therefore not surprising that we can rewrite the Angle-weighted Pie Volume Formula to get an expression for the volume of a void of $\bigcup B$. The most straightforward translation of the angle-weighted formula suggests we compute the volume of the void by first computing the volume of the corresponding void in $|K|$ and then subtracting the volume of the fringe that reaches into that void. We get

VOID VOLUME FORMULA. The volume of a void $V$ with dual set $U$ is

\[ \mathrm{vol}\, V = \sum_{\tau \in T(U)} \mathrm{vol}\, \tau - \sum_{\sigma \in K, \; \sigma \subseteq \tau \in T(U)} (-1)^{\dim \sigma} \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma . \]

Figure IX.11: Each void in the union of disks is contained in a corresponding void of the dual complex.

Similarly, we get formulas for the area and the total arc length of $V$ by substituting $T(U)$ for $T(D) - T(K)$ in the corresponding formulas of $\bigcup B$.

Proof of void volume formula. The main idea in the proof is to cover the void with small balls and measure the difference between the new and the old union. Let $C$ be the set of balls we add, and consider $B \cup C$. Writing $K(X)$ for the dual complex of a set of balls $X$, we require that

(i) $C$ be finite,
(ii) $K$ be a subcomplex of $K(B \cup C)$, and
(iii) $\bigcup (B \cup C) = \bigcup B \cup V$.

Assuming these three conditions, define $U$ as the set of simplices by which the new dual complex exceeds $K$ inside the void, and note that the underlying space of $U$ is the void in $|K|$ that corresponds to the void in $\bigcup B$. We write the Angle-weighted Pie Volume Formula for each of the two unions, $\bigcup B$ and $\bigcup (B \cup C)$. The difference gives the Void Volume Formula.

Finally, we construct $C$ so that (i), (ii), and (iii) are satisfied. Assuming general position, there exists a positive $\varepsilon$ with $K(B_\varepsilon) = K$, where $B_\varepsilon$ is obtained from $B$ by reducing every ball with radius $r$ to radius $\sqrt{r^2 - \varepsilon^2}$. By the way we changed the radii, $B$ and $B_\varepsilon$ have the same Voronoi diagrams and Delaunay triangulations, and they have the same dual complexes by the choice of $\varepsilon$. Let $C$ be a finite set of balls of radii $\varepsilon$ with centers in the void that covers $V$. For sufficiently small $\varepsilon$, the balls in $C$ are contained in $\bigcup B \cup V$ and thus cannot contribute to the union of balls in any other way than covering $V$, as required by (iii). Let $Z$ be the set of centers and note that the dual complex of $B_\varepsilon \cup Z$ is just $K$ together with finitely many isolated vertices. The first complex in the sequence $K \subseteq K(B_\varepsilon \cup Z) \subseteq K(B \cup C)$ is $K$ and the last is $K(B \cup C)$, where the second containment follows because $B \cup C$ is obtained from $B_\varepsilon \cup Z$ by growing every ball of radius $r$ to radius $\sqrt{r^2 + \varepsilon^2}$. Hence, $K$ is a subcomplex of $K(B \cup C)$, as required by (ii).

Bibliographic notes. The material of this section is taken from [1], which also contains a proof of the $d$-dimensional version of the Independent Volume Formula. The implementation of the formulas is part of the Alpha Shapes software and their use in structural biology has been described in [2]. The Angle-weighted Pie Volume Formula is related to Gram's angle sum formula, which states that the alternating sum of angles in a bounded convex polyhedron always vanishes. In $\mathbb{R}^2$, the angles are $\frac{1}{2}$ for the edges and 1 for the $n$-gon. Hence, the sum of angles at the vertices of a convex $n$-gon is $\frac{n}{2} - 1$. Expressed in radians, this is $(n-2)\pi$. In $\mathbb{R}^3$, the sum of angles at the vertices is no longer determined by the combinatorial structure of the polyhedron, but the sum of solid angles minus the sum of dihedral angles is. A treatment of Gram's angle sum formulas can be found in Grünbaum [3, chapter 14].

[1] H. EDELSBRUNNER. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. EDELSBRUNNER, M. FACELLO, P. FU AND J. LIANG. Measuring proteins and voids in proteins. In "Proc. 28th Ann. Hawaii Internat. Conf. System Sciences, 1995", vol. V: Biotechnology Computing, 256–264.
[3] B. GRÜNBAUM. Convex Polytopes. Wiley, Interscience, London, England, 1967.
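The planar case of Gram's angle sum formula can be checked numerically. The following minimal sketch in Python assumes a convex polygon given by its vertices in order; the function names are hypothetical. It measures every vertex angle as a fraction of the full angle and forms the alternating sum with $\frac{1}{2}$ for each edge and 1 for the polygon.

```python
import math

def normalized_vertex_angles(polygon):
    """Interior angles of a convex polygon, as fractions of the full angle."""
    n = len(polygon)
    angles = []
    for i in range(n):
        px, py = polygon[i]
        ax, ay = polygon[i - 1]           # previous vertex
        bx, by = polygon[(i + 1) % n]     # next vertex
        ux, uy = ax - px, ay - py
        vx, vy = bx - px, by - py
        cos_t = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
        angles.append(math.acos(max(-1.0, min(1.0, cos_t))) / (2.0 * math.pi))
    return angles

def gram_sum_2d(polygon):
    """Alternating angle sum: vertices minus edges plus the polygon itself."""
    n = len(polygon)
    return sum(normalized_vertex_angles(polygon)) - 0.5 * n + 1.0

# Example: an irregular convex pentagon.
pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 2)]
```

For the pentagon, the normalized vertex angles add up to $\frac{5}{2} - 1 = \frac{3}{2}$, which is $3\pi$ in radians, and the alternating sum vanishes up to rounding.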

IX.4 Measuring Software

[Should we add a short discussion of Patrice's new software that also computes derivatives?]

Volbl stands for the volume of a union of balls. It is part of the Alpha Shapes software and can be used to compute the volume, surface area, and total arc length of a ball union and its voids.

Running volbl. The software uses the files generated by delcx and by mkalf that represent the Delaunay triangulation and its filtration, as explained in Sections II.3 and II.4. It is not necessary but a good idea to execute volbl in parallel with visualizing the alpha shapes of the same data, which we do by typing

    > alvis name &
    > volbl name

on the command line. The software will start with a dialogue narrowing down the options of what to compute. As an example consider the measurements of voids in cdk2, which is an enzyme involved in the control of the growth process of a body cell. The voids shown in Figure IX.12 occur for the solvent accessible diagram. In other words, we look at the wireframe of the dual complex defined by the balls with radii enlarged to the solvent accessible radius, where $r_i$ is the van der Waals radius of the $i$-th ball, and we pick the middle of the corresponding interval of $\alpha$-values. After entering the index of the $\alpha$-complex, which we get from alvis, volbl outputs the measurements of all voids. The output for the largest void in this example is

    measurements of void, index 845:
      number of tetrahedra: 26
      tetra volume:         2.504511e+02
      void volume:          1.009809e+01
      surface area:         3.880316e+01
      arc length:           5.776804e+01
      number of corners:    34

The index of the void is a unique but fairly arbitrary integer assigned during the process of collecting the tetrahedra in the dual set. The measurements are in Å, Å$^2$, and Å$^3$, as appropriate. While the largest void is more than ten times as large as any of the others (in volume), it is still only of the order of one van der Waals ball. The corresponding void in the dual complex is more than twenty times as large, which confirms our intuition about the size difference between the two representations. Measuring voids takes on the order of seconds on the author's SGI Indigo II. While measuring the voids, the software calculates for each ball its contribution to the void area and outputs the result in a new file, name.contrib.

Figure IX.12: There are eight voids in the $\alpha$-complex of cdk2. Some of the voids have (open) dual sets that seem connected in the image but are not because of missing triangles.

Algorithms and data structures. Before exploring any of the other options in volbl, we take a brief look at the algorithms used and the data structures these algorithms require. To measure a union of balls using the Pie Volume, Area, and Length Formulas, we need a list of the simplices in the dual complex of $B$. This list is a prefix of the masterlist mentioned in Section II.4. We simplify the actual situation insignificantly by assuming that the simplices in $K$ are stored in an array $\mathrm{ML}[1..m]$. The following pseudo-code is then a direct implementation of the Pie Volume Formula of Section IX.2.

    $V = 0$;
    for $i = 1$ to $m$ do
        $\sigma = \mathrm{ML}[i]$; $V = V + (-1)^{\dim \sigma} \, \mathrm{vol} \bigcap \sigma$
    endfor.

The implementation of the Area and Length Formulas is similarly straightforward. The Angle-weighted Pie and Void Volume Formulas use the masterlist and in addition require a representation of the voids. We use a partition of the Delaunay tetrahedra into the dual complex and the various voids, each represented by a linear list of tetrahedra.
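The Pie Volume loop can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the volbl implementation: balls are hypothetical `(center, radius)` pairs, simplices are tuples of ball indices, and only vertices and edges are supported, using the standard closed form for the volume of the lens shared by two balls. The intersections of three and four balls are left out.

```python
import math

def ball_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3

def lens_volume(r1, r2, d):
    """Volume of the intersection of two balls whose centers are d apart."""
    if d >= r1 + r2:
        return 0.0                        # disjoint balls
    if d <= abs(r1 - r2):
        return ball_volume(min(r1, r2))   # one ball contains the other
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
            / (12.0 * d))

def intersection_volume(balls, simplex):
    if len(simplex) == 1:
        return ball_volume(balls[simplex[0]][1])
    if len(simplex) == 2:
        (c1, r1), (c2, r2) = balls[simplex[0]], balls[simplex[1]]
        return lens_volume(r1, r2, math.dist(c1, c2))
    raise NotImplementedError("triangles and tetrahedra are omitted in this sketch")

def pie_volume(balls, dual_complex):
    """Alternating sum over the simplices of the dual complex."""
    total = 0.0
    for simplex in dual_complex:
        dim = len(simplex) - 1
        sign = -1.0 if dim % 2 else 1.0
        total += sign * intersection_volume(balls, simplex)
    return total

# Example: two unit balls at distance 1; the dual complex is a single edge.
balls = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0)]
dual_complex = [(0,), (1,), (0, 1)]
```

For the two unit balls the lens has volume $5\pi/12$, so the union has volume $2 \cdot \frac{4\pi}{3} - \frac{5\pi}{12} = \frac{9\pi}{4}$.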

We compute the lists of tetrahedra that represent the voids by maintaining a union-find data structure while scanning the masterlist from back to front. Here $n$ is the length of the masterlist and the first $m$ entries are the simplices of $K$.

    for $i = n$ downto $m + 1$ do
        $\sigma = \mathrm{ML}[i]$;
        case $\sigma$ is a tetrahedron: ADD$(\sigma)$;
        case $\sigma$ is a triangle: let $\tau$ and $\upsilon$ be the first and the second Delaunay tetrahedron that has $\sigma$ as a face; UNION(FIND$(\tau)$, FIND$(\upsilon)$)
    endfor.

The only trouble with this algorithm is that tetrahedra in the unbounded component may be scattered in more than one list. We fix this problem by adding a dummy tetrahedron to the system and setting $\upsilon$ to the dummy whenever $\sigma$ is a triangle on the boundary of the Delaunay triangulation. The list that contains the dummy is then the set of tetrahedra in the unbounded component of the complement of $|K|$.

The following pseudo-code is a direct implementation of the Void Volume Formula of Section IX.3.

    $V = 0$;
    forall tetrahedra $\tau \in U$ do
        $V = V + \mathrm{vol}\, \tau$;
        forall faces $\sigma \subseteq \tau$ do
            if $\sigma \in K$ then $V = V - (-1)^{\dim \sigma} \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma$ endif
        endfor
    endfor.

The implementation of the Void Area and Length Formulas is similarly straightforward.

Options. The software computes the volume, surface area, total arc length, and also the number of vertices in the boundary, which we refer to as corners. It does this for the space-filling diagram, its voids, the outside fringe (defined as the portion of the unbounded component of the complement of $|K|$ that is covered by the balls), and the envelope (defined as the space-filling diagram union all voids). Table IX.1 lists the main measurements made.

                             vol     area    lgth    crns
    space-filling diagram    Vsf     Asf     Lsf     Csf
    voids                    Vtv     Atv     Ltv     Ctv
    outside fringe           Vof     Aof     Lof     Cof
    envelope                 Ve      Ae      Le      Ce
    dual complex             Vsh
    dual sets of voids       Vtiv

Table IX.1: Cumulative measurements made by the Volbl software.

As an example consider the van der Waals diagram of cdk2, whose dual complex is shown in Figure IX.13. The complex has no voids. In the considered example, the software reports that there are no voids and it prints the sizes of the space-filling diagram and the outside fringe as

    Vsf = 3.034036e+04    Vof = 2.962563e+04
    Asf = 3.100959e+04    Aof = 3.100959e+04
    Lsf = 1.915391e+04    Lof = 1.915391e+04
    Csf = 6388            Cof = 6388

Note that the volume of the space-filling diagram is insignificantly higher than that of the outside fringe. The difference is the volume of the dual complex, which is apparently rather small. The surface area, total arc length, and number of corners are of course the same for both.

Figure IX.13: The dual complex of the van der Waals diagram of cdk2.

The software also checks a few linear relations that should vanish provided the computations are correct. For example, the sum of volumes of the space-filling diagram and its voids should be equal to the volume of the envelope, which in turn should be equal to the sum of volumes of the dual complex, the voids in the dual complex, and the outside fringe. In the checking option, the software computes all terms in Table IX.1 and prints a summary of the results. The specific relations checked by the software are

    Vsf + Vtv - Vsh - Vtiv - Vof = 0.0
    Asf - Atv - Aof              = 0.0
    Lsf - Ltv - Lof              = 0.0
    Csf - Ctv - Cof              = 0
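The grouping of tetrahedra into voids can be sketched with a small union-find structure. The following Python is a simplified illustration, not the volbl code: instead of scanning the masterlist from back to front, it takes the Delaunay tetrahedra and the tetrahedra and triangles of the dual complex as explicit sets (all hypothetical inputs, given as tuples of vertex indices). A triangle that is not in the dual complex connects the two tetrahedra that share it, and a boundary triangle connects its single tetrahedron to the dummy.

```python
from itertools import combinations

OUTSIDE = "outside"  # dummy tetrahedron standing for the unbounded component

class UnionFind:
    def __init__(self):
        self.parent = {}

    def add(self, x):
        self.parent.setdefault(x, x)

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry

def collect_voids(delaunay_tets, complex_tets, complex_tris):
    """Return the voids of the dual complex as sorted lists of tetrahedra."""
    key = lambda s: tuple(sorted(s))
    k_tets = {key(t) for t in complex_tets}
    k_tris = {key(t) for t in complex_tris}
    uf = UnionFind()
    uf.add(OUTSIDE)
    incident = {}
    for tet in map(key, delaunay_tets):
        if tet not in k_tets:
            uf.add(tet)
        for tri in combinations(tet, 3):
            incident.setdefault(tri, []).append(tet)
    for tri, tets in incident.items():
        if tri in k_tris:
            continue  # a triangle of the dual complex separates its two sides
        first = tets[0]
        second = tets[1] if len(tets) > 1 else OUTSIDE
        if first in uf.parent and second in uf.parent:
            uf.union(first, second)
    groups = {}
    for tet in uf.parent:
        if tet != OUTSIDE:
            groups.setdefault(uf.find(tet), []).append(tet)
    outside_root = uf.find(OUTSIDE)
    return sorted(sorted(g) for root, g in groups.items() if root != outside_root)

# Example: two tetrahedra glued along the triangle (1, 2, 3).
tets = [(0, 1, 2, 3), (1, 2, 3, 4)]
all_tris = {tri for t in tets for tri in combinations(t, 3)}
```

If the dual complex contains every triangle except the shared one, the two tetrahedra form a single void. If it contains all seven triangles, each tetrahedron is a void of its own. Removing a boundary triangle as well connects everything to the dummy, and no void remains.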

Another form of output is the description of the total measurement as a sum of contributions over individual atoms. This makes sense for volume and area but is done only for the latter. Depending on the type of area measurement, the software outputs a file name.contrib that contains the contribution of each individual atom, and it does this for the space-filling diagram, the voids, and the outside fringe. In the checking option, the software compares for each atom the area contribution to the space-filling diagram with the sum of contributions to the voids and the outside fringe. It also checks whether the sum of contributions really adds up to the total area.

Area formula. All analytic formulas needed to measure the common intersection of up to four balls are straightforward, except possibly the area of the intersection of up to three caps. A formula for the area follows from the Gauss-Bonnet theorem in differential geometry, but we prefer to derive it with elementary means. The cap $C_j$ on a sphere $S_i$ consists of the portion of $S_i$ inside the sphere $S_j$. Equivalently, the cap contains all points of $S_i$ whose power distance from $S_i$ is no less than that to $S_j$. Let $r_i$ be the radius of $S_i$ and $\varrho_j$ the radius of the circle bounding $C_j$. We define the width $w_j$ of $C_j$ equal to the distance between the two planes that cut $C_j$ from $S_i$, as shown in Figure IX.14. The area of the cap is then $w_j / (2 r_i)$ times the area of the sphere, namely $2 \pi r_i w_j$.

Figure IX.14: To the left, the shaded cap has radius $\varrho_j$ and width $w_j$. To the right, the shaded bigon has angles $\varphi$ and arc lengths $p_j$ and $p_k$.

Consider now the intersection of two caps, $C_j \cap C_k$. Since all simplices in $K$ are independent, we may assume that the intersection is a bigon, as illustrated in Figure IX.14. We let $\varphi$ be the angle at the two vertices and $p_j$ and $p_k$ the lengths of the two arcs, all measured as fractions of a full circle. We approximate the bigon by a spherical $m$-gon, whose edges are by definition great-circle arcs. The area of that $m$-gon is $\frac{1}{2} \sum \psi - \frac{m-2}{4}$ times the area of the sphere, where the sum adds all angles $\psi$ in the $m$-gon. This is because a triangulation produces $m - 2$ spherical triangles each contributing one half times the sum of the three angles minus one quarter to the area.

To construct the $m$-gon, we approximate each of the two circles by a regular spherical $n$-gon. The points are placed slightly outside the circles so that the areas of the $n$-gons are exactly the areas of the caps. Assuming that $p_j$ and $p_k$ are rational, we can find infinitely many integers $n$ so that the two $n$-gons share two vertices near the vertices of the bigon. The two shared vertices approach the vertices of the bigon as $n$ goes to infinity. Let $\alpha_j$ and $\alpha_k$ be the angles in the two $n$-gons. By construction, the $m$-gon has $p_j n$ vertices with angle $\alpha_j$ and $p_k n$ vertices with angle $\alpha_k$, in addition to the two shared vertices, whose angles approach $\varphi$. To compute $\alpha_j$ we recall that the area of the cap is $w_j / (2 r_i)$ times the area of the sphere, and the area of the approximating $n$-gon is the same. Hence $\frac{n}{2} \alpha_j - \frac{n-2}{4} = \frac{w_j}{2 r_i}$, and symmetrically $\frac{n}{2} \alpha_k - \frac{n-2}{4} = \frac{w_k}{2 r_i}$. We plug the values for $\alpha_j$ and $\alpha_k$ into the formula for the area of the $m$-gon and get, after eliminating the terms that vanish when $n$ goes to infinity,

\[ \mathrm{area}\, (C_j \cap C_k) = 4 \pi r_i^2 \left[ \varphi - \frac{p_j + p_k}{2} + \frac{p_j w_j + p_k w_k}{2 r_i} \right] . \]

Similarly, for the intersection of three caps with angles $\varphi_1$, $\varphi_2$, $\varphi_3$ and arc lengths $p_j$, $p_k$, $p_\ell$, we get

\[ \mathrm{area}\, (C_j \cap C_k \cap C_\ell) = 4 \pi r_i^2 \left[ \frac{\varphi_1 + \varphi_2 + \varphi_3}{2} - \frac{1}{4} - \frac{p_j + p_k + p_\ell}{2} + \frac{p_j w_j + p_k w_k + p_\ell w_\ell}{2 r_i} \right] . \]

Note that the formulas give the precise area of the intersection of two or three caps since the approximating spherical $m$-gon is only a tool in the proof and not used in the formula.

Bibliographic notes. The structural biology literature distinguishes between numerical and analytical approaches to measuring molecules. For the latter approach, we would decompose the molecule into simple pieces and give a formula for the size of each piece. An example is Connolly's work [1] on computing the area of a molecular surface. The lack of an explicit expression occasionally leads to miscalculations [2]. The idea of using inclusion-exclusion for size computations goes back to Kratky [4], who shows that there is a short inclusion-exclusion formula for the area of the intersection of a finite set of disks in the plane. His proof is existential and superseded by explicit formulas that can be derived by the same methods as described in Sections IX.1 and IX.2. Scheraga and coauthors [5] implement an inclusion-exclusion formula for a union of balls based on Kratky's work. A detailed documentation of the Volbl software is given in [3].

[1] M. L. CONNOLLY. Analytical molecular surface calculation. J. Appl. Cryst. 16 (1983), 548–558.
[2] L. R. DODD AND D. N. THEODOROU. Analytic treatment of the volume and surface area of molecules formed by an arbitrary collection of unequal spheres intersected by planes. Molecular Physics 72 (1991), 1313–1345.
[3] H. EDELSBRUNNER AND P. FU. Measuring space filling diagrams and voids. Rept. UIUC-BI-MB-94-01, Beckman Inst., Univ. Illinois, Urbana, Illinois, 1994.
[4] K. W. KRATKY. The area of intersection of $n$ equal circular disks. J. Phys. A: Math. Gen. 11 (1978), 1017–1024.
[5] G. PERROT, B. CHENG, K. D. GIBSON, J. VILA, K. A. PALMER, A. NAYEEM, B. MAIGRET AND H. A. SCHERAGA. MSEED: a program for rapid determination of accessible surface areas and their derivatives. J. Comput. Chem. 13 (1992), 1–11.
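The two building blocks of the area derivation, the cap area and the area of a spherical polygon with great-circle edges, translate directly into code. The following Python is a minimal sketch with hypothetical function names; angles are passed as fractions of a full circle, as in the text.

```python
import math

def cap_area(r, w):
    """Area of a cap of width w on a sphere of radius r.

    The cap covers the fraction w / (2r) of the sphere.
    """
    return (w / (2.0 * r)) * 4.0 * math.pi * r * r

def spherical_polygon_area(r, angles):
    """Area of a spherical m-gon bounded by great-circle arcs.

    A triangulation has m - 2 triangles, each contributing one half times
    the sum of its three angles minus one quarter, as a fraction of the sphere.
    """
    m = len(angles)
    fraction = 0.5 * sum(angles) - 0.25 * (m - 2)
    return fraction * 4.0 * math.pi * r * r
```

As a check, a hemisphere of the unit sphere is a cap of width 1 with area $2\pi$, and the spherical triangle with three right angles is one eighth of the unit sphere with area $\pi/2$. A lune with two right angles is a 2-gon and covers one quarter of the sphere.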

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation. (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.

Chapter X

Derivatives

The derivative of surface area under deformation is an important term in the simulation of molecular and atomic motion. In the case of a van der Waals or solvent accessible diagram, it is related to the length of the circular arcs in the boundary.

X.1 Implicit Solvent Model
X.2 Weighted Area Derivative
X.3 Weighted Volume Derivative
X.4 Derivative Software
Exercises

X.1 Implicit Solvent Model

[Give a general introduction and work out the relationship with area and volume derivatives.]

X.2 Weighted Area Derivative

[Talk about the unweighted and the weighted area derivatives.]

[Explain the results and discuss the continuity issue of the functions.]

[1] R. BRYANT, H. EDELSBRUNNER, P. KOEHL AND M. LEVITT. The area derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2002.

X.3 Weighted Volume Derivative

[Talk about the unweighted and the weighted volume derivatives.]

[Explain the results and discuss the continuity issue of the functions.]

[1] H. EDELSBRUNNER AND P. KOEHL. The weighted volume derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2003.

X.4 Derivative Software

[Discuss Patrice's ProShape software.]

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation. (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.

