INTRODUCTION TO BIO-GEOMETRY

Herbert Edelsbrunner Departments of Computer Science and Mathematics Duke University

Table of Contents
Prologue
I Bio-molecules
II Geometric Models
III Surface Meshing
IV Connectivity
V Shape Features
VI Density Maps
VII Match and Fit
VIII Deformation
IX Measures
X Derivatives
Subject Index
Author Index

Preface
[Mention the pioneers who early on recognized the importance of geometry in structural molecular biology: Fred Richards, Michael Levitt, Michael Connolly.]

[Mention that my book on the "Geometry and Topology for Mesh Generation" is complementary/a prerequisite to this book. In particular, it covers the construction of Delaunay triangulations in detail, and it describes the simulation of simplicity as a general idea to deal with non-generic situations.]

[This book is really about alpha shapes in a broad sense. It might be useful to describe the history of that research in short.
1981. Vancouver. Conception of idea with Kirkpatrick and Seidel.
1985-89. Graz and Urbana. SoS, Delaunay software, Alpha Shape software with Ernst Mücke, Harald Rosenberger, and Patrick Moran.
1990-93. Urbana and Berlin. Surface triangulations, Betti numbers, inclusion-exclusion, CAVE with Ping Fu, Ernst Mücke, Cecil Delfinado, Nataraj Akkiraju, and Jiang Qian.
1994-95. Hong Kong. Morphing, molecular skin, with Ping Fu, Siu-Wing Cheng, Ka-Po Lam, and Ho-Lun Cheng.
1995-98. Urbana. Flow and pockets, skin surfaces with Ho-Lun Cheng, Tamal Dey, Michael Facello, Jie Liang, Shankar Subramaniam, Claire Woodworth.
1999-2001. Duke. Skin triangulation, hierarchy, Morse complexes with Ho-Lun Cheng, Alper Üngör, Afra Zomorodian, David Letscher, John Harer, Vijay Natarajan.
2002-2003. Duke and Livermore. Docking, Reeb graphs, Jacobian manifolds with Johannes Rudolph, Sergei Bespamyatnikh, Vicky Choi, John Harer, Valerio Pascucci, Vijay Natarajan, Ajith Mascarenhas.
2000-2005. ITR Project. Derivatives, interfaces, software with Robert Bryant, Patrice Koehl, Michael Levitt, Andrew Ban, Johannes Rudolph, Lutz Kettner, Rachel Brady, and Daniel Filip.]

[This book is based on notes developed during teaching the courses on "Sphere Geometry" in the Spring of 2000, and on "Bio-geometric Modeling" in the Spring of 2001 and the Fall of 2002, all at Duke University. These courses were either taken for credit or audited at least occasionally by Luis von Ahn, Tammy Bailey, Yih-En (Andrew) Ban, Robert Bryant, Ho-Lun Cheng, Vicky Choi, Anne Collins, Abhijit Guria, Tingting Jiang, Loren Looger, Ajith Mascarenhas, Gopi Meenakshisundaram, Nabil Mustafa, Vijay Natarajan, Xiuwen Ouyang, Anindya Patthak, Ken Roberts, Apratim Roy, Scott Schmidler, Xiaobai Sun, Yusu Wang, Shumin Wu, Alper Üngör, Peng Yin and Afra Zomorodian.]

Herbert Edelsbrunner Durham, North Carolina, 2002

To do or think about (March 15, 2004).

General. Fix the software for creating the index and glossary. Should the Exercise sections be labeled so the page heading is more uniform?

Chapter III. Section III.3: mention new results on scheduling.

Chapter V. Should Section V.2 on Topological Persistence be reorganized by first presenting the algebra and second the algorithm? In Section V.3 on Construction and Simplification: replace 23-... and 23-collapses by 03-... Add the interface software description to Section V.4. Exercises: add a few more questions.

Chapter VI. Write Section VI.4 on Simultaneous Critical Points. Exercises: come up with questions.

Chapter VII. In Section VII.2: find out about finding the best bi-chromatic matching in ... Exercises: come up with questions.

Chapter VIII. Write the introduction to Deformation. Write Section VIII.1 on Molecular Dynamics. Write Section VIII.2 on Spheres in Motion. Write Section VIII.3 on Rigidity. Write Section VIII.4 on Shape Space. Exercises: come up with questions.

Chapter IX. Exercises: come up with questions.

Chapter X. Write a new chapter on area and volume derivatives and related topics. Write a section on the Weighted Area Derivative. Write a section on the Weighted Volume Derivative.

Chapter I

Bio-molecules

This chapter discusses the three main classes of organic macromolecules involved in the hereditary and life maintenance mechanisms of living beings: DNA, RNA, and proteins. All mentioned molecules are between large and huge. They are relatively simple locally but exceedingly complicated in their totality. Because of the complexity and the large variety, it should not be surprising that there are exceptions to almost everything meaningful that can be said about them. Perhaps it is more surprising that anything of broad validity can be said at all.

Each cell is like a society whose members have specialized tasks, which they accomplish in a complicated net of interactions. DNA is the stuff that genetic material is made of. RNA is mostly but not entirely an intermediate product copying portions of the DNA (transcription) and turning this information into working proteins (translation). Proteins act like machines that define the cell cycle as an ongoing process. According to the central dogma of biology, proteins are created in two steps from DNA, which carries the genetic information:

DNA (replication) --transcription--> RNA --translation--> Protein

We talk briefly about the processes indicated by the three arrows and focus on the structure of the players involved. We begin by describing the chemical structure of DNA and RNA in Section I.1. We then explain the translation from RNA to proteins in Section I.2 and talk about the structural organization of proteins in Section I.3. Finally, we present some of the fundamental premises and results of molecular mechanics in Section I.4.

I.1 DNA and RNA
I.2 Proteins and Amino Acids
I.3 Structural Organization
I.4 Molecular Mechanics
Exercises

I.1 DNA and RNA

DNA (or deoxyribonucleic acid) is the material that forms the genome, which is a complete set of the genetic material of a living organism. We begin by looking at the small level and work our way up the multi-scale structure of DNA. Compared to standard genomics texts, the treatment of DNA in this section is coarse and lacks many important details.

Chemical structure of DNA. DNA has three chemical components: phosphate, deoxyribose sugar, and four nitrogenous bases, namely adenine, guanine, cytosine, and thymine. The first two bases are double-ring and the last two are single-ring structures. The chemical components are arranged in groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar, and one of the four bases. A nucleotide is conveniently referred to by the first letter of its base. Figure I.2 sketches the chemical structure of the nucleotide A and shows the chemical structures of the remaining three bases. We obtain the nucleotides G, C and T by substituting the corresponding base for adenine in Figure I.2. We use boldface edges to connect atoms that are joined by two covalent bonds. The covalent bonding in the ring structures of the nitrogenous bases is more interesting. All atoms in the ring share electrons as a group and we draw some double bonds just to indicate the total number of extra shared electrons. For example, the hexagonal ring of cytosine has a total of eight covalent bonds, which we may think of as four thirds of a covalent bond between every contiguous pair.

Figure I.1: A short piece of the DNA double-helix, with atoms shown as tightly packed and partially overlapping spheres.

Figure I.2: The chemical structure of the DNA nucleotide with adenine as the nitrogenous base above, and the chemical structure of the other three nitrogenous bases below.

Double helix. As discovered by Watson and Crick in 1953, DNA consists of two strands of nucleotides twisted into the shape of a double helix, as depicted in Figure I.1. The backbone of each strand is a repeating phosphate-deoxyribose sugar polymer. The phosphate and the sugar groups in the backbone are connected by phosphodiester bonds. The carbons of the sugar group are numbered from 1' to 5'. One part of the phosphodiester bond is between the phosphate and the 5'-carbon, and the other is between the phosphate and the 3'-carbon. The attachment of these bonds to the sugar groups is illustrated in Figure I.3. We think of the backbone as oriented in the direction of the path that starts at the 5'-carbon, passes through the 4'-carbon, and ends at the 3'-carbon. In the double stranded DNA molecule, the two backbones are in opposite, or anti-parallel, orientation. The bases are attached to the 1'-carbons. The two strands of DNA are held together by weak hydrogen bonds between complementary bases. Adenine interacts with thymine and guanine with cytosine. The two bases of a pair are said to be complementary. Interactions between base pairs hold the two strands together, forming the structure of a spiraling staircase. This implies that the sequence of bases along one strand determines the sequence of bases along the other: reverse the reading direction and replace each base by its complement.

5' AATCGCGTACGCG 3'
3' TTAGCGCATGCGC 5'

Replication is based on this simple rule of complementarity and makes essential use of the relatively weak bonds between the two strands. A protein machine builds new DNA strands by separating the two old strands and complementing each by a new anti-parallel strand.

Figure I.3: Chemical structure of a very short segment of DNA. The numbers 1' to 5' order the carbon atoms of each sugar group, giving each strand an orientation. The dotted connections between the nitrogenous bases indicate hydrogen bonds.

Chromosomes. Each cell of an organism contains a copy of the entire genome. In the case of a human cell, this amounts to about two meters of DNA partitioned into twenty-three pairs of chromosomes per cell. The body has about 10^13 cells, totaling about 2·10^13 meters of DNA, which is more than a hundred times the distance between the earth and the sun. Since humans are small relative to that distance, this implies that the DNA must be thin and efficiently packed. Indeed, each chromosome is a long thread (a double-strand) that is densely folded around protein scaffolds. How is a long thread of DNA converted into the relatively thick and worm-like structure visible through the electron microscope? On the lowest level, the DNA is wrapped twice around a configuration of eight histones (a special protein). The beads of wrapped histones assume a coiled structure (a solenoid) stabilized by another type of histone that runs along its central axis. It takes one more level of packaging to convert the solenoid into the three-dimensional structure we call a chromosome. This higher level uses a core scaffold made of another enzyme, topoisomerase II. This enzyme has the ability to pass a strand of DNA through another, which is a much needed operation during packing and unpacking the DNA. The best evidence suggests that the solenoid arranges in loops emanating from the scaffold, which itself assumes the form of a spiral.

A gene is a subsequence of the DNA capable of being transcribed to produce a functional RNA molecule. Note that this definition depends on the rather complicated process of transcription, which can fail for a variety of reasons.

Chemical structure of RNA. We begin by looking at the chemical features of RNA. There are three main differences to DNA.
1. RNA has ribose sugar in its nucleotides, which differs from deoxyribose sugar by one additional oxygen atom.
2. RNA nucleotides carry the bases adenine, guanine, and cytosine, but substitute uracil for thymine found in DNA. Uracil forms hydrogen bonds with adenine just as thymine does.
3. RNA is a single-stranded nucleotide chain and can therefore assume a much greater variety of geometric shapes than DNA.
Figure I.4 illustrates the chemical difference between RNA and DNA by showing a ribonucleotide containing uracil.

Figure I.4: Chemical structure of the RNA nucleotide with uracil as the nitrogenous base.
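The complementarity rule for the double helix (reverse the reading direction and replace each base by its complement) is easy to state as a short program. The following is a minimal sketch; the function names `complement` and `reverse_complement` are our own and not part of the text, and the example strand is the one displayed with the double helix.

```python
# Complement of each DNA base: A pairs with T, and G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Replace every base by its partner, keeping the positions aligned."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Sequence of the opposite strand, read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


top = "AATCGCGTACGCG"              # one strand, read 5' to 3'
print(complement(top))             # partner bases, aligned position by position
print(reverse_complement(top))     # the partner strand, read 5' to 3'
```

Applying `reverse_complement` twice returns the original strand, which mirrors the fact that either strand can serve as the template for rebuilding the other during replication.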

RNA is classified into different types depending on their function. The vast majority is messenger RNA (or mRNA), which acts as an intermediary structure in the synthesis of proteins. There is also functional RNA produced by a small number of genes, which is not translated into protein. Examples are transfer RNA (or tRNA), which brings amino acids to the mRNA during the translation process, and ribosomal RNA (or rRNA), which helps coordinate the assembly of amino acids to proteins.

Transcription. The transcription process, which makes RNA, is similar to the replication process of DNA. During the transcription of a gene, the two strands of DNA are separated locally, and one strand acts as a template for RNA synthesis. Free ribonucleotides align along the DNA template, as sketched in Figure I.5. The resulting RNA sequence is the same as the non-template sequence of the gene, except that U replaces T. The process is catalyzed by another protein machine, the RNA polymerase complex, which moves along the DNA adding ribonucleotides to the growing RNA. Electron microscope pictures show that the transcription of DNA to RNA is a highly parallel process in which a row of RNA polymerase complexes follow each other along the gene and produce RNA concurrently. Each individual transcription works in three steps.

Initiation. RNA polymerase binds to a promoter segment of DNA located in front of the gene. It then unwinds the DNA and begins the synthesis of an RNA molecule.

Elongation. RNA polymerase moves along the DNA, maintaining a transcription bubble to expose the template strand. It compares free ribonucleotides with the next exposed DNA base and adds a complementary match.

Termination. Specific sequences in the DNA signal the chain termination by triggering the release of the RNA strand and the polymerase.

A gene is thus not only marked but indeed defined by the promoter segment preceding and the terminating sequence succeeding it.

Figure I.5: The RNA grows in the 5' to 3' direction, in this case by adding a nucleotide carrying uracil to the chain.

Bibliographic notes. The idea that traits are hereditary is old, but the detailed mechanism how it comes about started to unfold only recently. The groundwork for our current understanding was laid in the nineteenth century by Gregor Mendel, when he discovered the basic rules of the hereditary mechanism [2]. An English translation of this work can be found in [3]. It was long known that DNA is critically involved in that mechanism, but it took until the work of Watson and Crick in 1953 to discover the chemical structure of DNA [5, 6]. The book by Watson [4] is an enjoyable personal account of the years preceding the discovery of that structure. Today there are many books on the subject, and most of the material in this section is taken from [1, Chapters 2 and 3].

[1] A. J. F. Griffiths, W. M. Gelbart, J. H. Miller and R. C. Lewontin. Modern Genetic Analysis. Freeman, New York, 1999.
[2] G. Mendel. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines, Abhandlungen, Brünn 4 (1866), 3–47.
[3] C. Stern and E. R. Sherwood. The Origin of Genetics: A Mendel Source Book. Freeman, New York, 1966.
[4] J. D. Watson. The Double Helix. Atheneum, New York, 1981.
[5] J. D. Watson and F. H. C. Crick. Molecular structure of nucleic acid. A structure for deoxyribose nucleic acid. Nature 171 (1953), 737–738.
[6] J. D. Watson and F. H. C. Crick. Genetic implications of the structure of deoxyribonucleic acid. Nature 171 (1953), 964–967.
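Returning to the transcription rule of Section I.1, the RNA is complementary to the template strand and therefore equals the non-template strand with U in place of T. The sketch below checks this on a short example; the function names and the example strands are illustrative assumptions, and the code ignores promoters, termination signals, and everything else the polymerase actually does.

```python
# Pairing used during transcription: template DNA base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """RNA assembled along a template strand that is given in 3' to 5' order.

    Because the RNA is anti-parallel to the template, the result is
    already written in the 5' to 3' direction in which the RNA grows.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def transcribe_non_template(non_template_5_to_3: str) -> str:
    """Shortcut: copy the non-template strand and let U replace T."""
    return non_template_5_to_3.replace("T", "U")


non_template = "AATCGCGTACGCG"   # 5' to 3'
template = "TTAGCGCATGCGC"       # 3' to 5', aligned with the strand above
print(transcribe_template(template))
print(transcribe_non_template(non_template))
```

Both calls print the same RNA sequence, which is the content of the statement that the transcript matches the non-template strand except that U replaces T.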

I.2 Proteins and Amino Acids

Proteins are polypeptide chains obtained by translation from strands of messenger RNA. In this section, we sketch the translation process and discuss the chemical structure of proteins.

Amino acids. A protein is a linear sequence of amino acids connected to each other by peptide bonds. Each amino acid consists of a central carbon atom, Cα, linked to an amino group, a carboxyl group, one hydrogen atom, and a side-chain. The four neighbors of an α-carbon are at the vertex positions of a tetrahedron around Cα, sketched in Figure I.7. This tetrahedron has two orientations, one being the mirror image of the other. The two oriented forms are referred to as isomers and distinguished by letters L and D. Only L-amino acids occur in nature as building blocks of proteins. As shown in Figure I.6, two amino acids are linked by a peptide bond whose creation releases water. Amino acids that are linked into a polypeptide chain are referred to as residues. The resulting repeating sequence of nitrogen, α-carbon and carbon atoms is the backbone of the protein.

Figure I.6: Two amino acid residues joined by a peptide bond.

Figure I.7: The two isomers of an amino acid.

Chemical structure. Among a much larger variety of amino acids, nature uses only twenty to build proteins. We list their names together with their three-letter codes and single-letter abbreviations in Table I.1. Different residues are distinguished by their side-chains. As can be seen in Figures I.8 and I.9, residues differ widely in size and structure. The fifteen amino acids sketched in Figure I.8 may be viewed as trees rooted at the α-carbon, which is part of the backbone. Most of the internal nodes are carbon atoms, with rare occurrences of oxygen, nitrogen and sulfur atoms. As before, we mark double and partially double bonds by boldface edges. Four of the five amino acids in Figure I.9 have pentagonal and hexagonal ring structures. The fifth amino acid is proline, which forms a cycle by having its chain connect back to the nitrogen next to the α-carbon along the backbone. This unique feature locally restricts the flexibility of the backbone, as will be discussed in Section I.3.

Table I.1: Names, codes and abbreviations of the twenty amino acids that occur as building blocks of natural proteins.

Alanine        Ala  A      Methionine  Met  M
Cysteine       Cys  C      Asparagine  Asn  N
Aspartate      Asp  D      Proline     Pro  P
Glutamate      Glu  E      Glutamine   Gln  Q
Phenylalanine  Phe  F      Arginine    Arg  R
Glycine        Gly  G      Serine      Ser  S
Histidine      His  H      Threonine   Thr  T
Isoleucine     Ile  I      Valine      Val  V
Lysine         Lys  K      Tryptophan  Trp  W
Leucine        Leu  L      Tyrosine    Tyr  Y

Figure I.8: The fifteen amino acids without cycle in their chemical structure. The shaded circle is the α-carbon on the backbone. All unlabeled nodes are either carbon or hydrogen atoms.

Figure I.9: The five amino acids with cyclic chemical structure.

Translation. The translation process is more involved than transcription because it converts information between two languages that use different alphabets. The translation is accomplished by transfer RNA molecules that recognize codons through the same binding mechanism used for replication and transcription. Each tRNA is a short sequence of about 80 nucleotides. Complementary subsequences form double-helix substructures that further fold up to characteristic 'clover leaf' formations, one of which is sketched in Figure I.10. A tRNA molecule matches the exposed codon of the mRNA with its anti-codon and contributes its residue to the polypeptide chain that grows at the other end. The codon and anti-codon are matched in anti-parallel orientation.

Figure I.10: Transfer RNA with anti-codon at the bottom, covalently attached amino acid at the top, and complementary substrings shown.

The translation process is facilitated by the ribosome, which is a large complex made from more than 50 different proteins and several RNA molecules. It consists of a small subunit and a large subunit, which come together around an mRNA strand with the help of the initiator tRNA that contributes the first residue. The ribosome scans through the strand like a tape reader. For each codon, it finds a tRNA with matching anti-codon and appends its amino acid as a residue to the carboxyl end of the growing polypeptide chain. The orientation of the mRNA strand from the 5'- to the 3'-end is thus preserved by the orientation of the polypeptide chain from the amino group of the first to the carboxyl group of the last residue. The translation process ends when a stop codon is read. The protein chain and the mRNA are released and the ribosome dissociates into its two subunits. Similar to transcription, the translation of an mRNA strand into a protein happens in parallel, with several ribosomes working concurrently and in sequence along the strand. In some cases, the translation even starts during transcription, before the mRNA strand is complete.

Genetic code. The sequence of nucleotides is read consecutively in groups of three, called codons. Since there are four different types of nucleotides, we have 4^3 = 64 codons. There are only twenty residues, which implies that the map is not injective but uses redundancy to reduce the number of outcomes. The complete map is shown in Table I.2. Some residues correspond to more codons than others. The redundancy is in part due to multiple tRNA molecules carrying the same residue and in part because there is flexibility in how the tRNA reads the codons. As mentioned above, the tRNA molecules are instrumental in translating codons into residues. In many cases, an accurate match at the first two positions suffices and a mismatch at the third position can be tolerated. This explains the relative uniformity among the four residues in any one slot of Table I.2.

Table I.2: The genetic code. The codon XYZ is mapped to one of the residues in the row of X and the column of Y. The four positions inside that slot correspond to Z = A, G in the first row and Z = C, U in the second row. Empty entries correspond to the stop codons, which are UAA, UAG, and UGA.

         Y = A      Y = G      Y = C      Y = U
X = A    Lys Lys    Arg Arg    Thr Thr    Ile Met
         Asn Asn    Ser Ser    Thr Thr    Ile Ile
X = G    Glu Glu    Gly Gly    Ala Ala    Val Val
         Asp Asp    Gly Gly    Ala Ala    Val Val
X = C    Gln Gln    Arg Arg    Pro Pro    Leu Leu
         His His    Arg Arg    Pro Pro    Leu Leu
X = U                   Trp    Ser Ser    Leu Leu
         Tyr Tyr    Cys Cys    Ser Ser    Phe Phe

Since codons are triplets of nucleotides, there are apparently three possible reading frames, each producing an entirely different residue sequence. The correct reading frame is identified by starting the translation always at a start codon. The start codon is AUG and maps to methionine. The initiator tRNA is a specific transfer RNA that recognizes this sequence and binds to methionine. Incidentally, it differs from the tRNA that binds to the AUG codon in the middle of the sequence, although that one also binds to methionine.

Bibliographic notes. Most of the twenty amino acids that occur in proteins have been identified in the nineteenth century. After the determination of the DNA structure in 1953, it took only a few years for the community to agree on the central dogma, and a few more years to decipher the genetic code on which the dogma is based. The geometric structure of the ribosome has recently been resolved by x-ray crystallography [2]. The material of this section is taken from [1, 3, 6], all three of which are comprehensive texts in their respective fields. Considerably shorter and more focussed descriptions of proteins and protein structures can be found in [4, 5].

[1] B. Alberts, D. Bray, A. Johnson, J. Lewis, M. Raff, K. Roberts and P. Walter. Essential Cell Biology. An Introduction to the Molecular Biology of the Cell. Garland, New York, 1998.
[2] N. Ban, P. Nissen, J. Hansen, P. B. Moore and T. A. Steitz. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289 (2000), 905–920.
[3] T. E. Creighton. Proteins: Structures and Molecular Properties. Second edition, Freeman, New York, 1993.
[4] N. J. Darby and T. E. Creighton. Protein Structure. Oxford Univ. Press, England, 1993.
[5] P. C. E. Moody and A. J. Wilkinson. Protein Engineering. Oxford Univ. Press, England, 1990.
[6] L. Stryer. Biochemistry. Third edition, Freeman, New York, 1988.
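The genetic code and the role of the start and stop codons in Section I.2 can be made concrete with a small table-driven translator. This is a sketch under simplifying assumptions: it uses single-letter residue abbreviations as in Table I.1, writes `*` for a stop codon, starts at the first AUG it finds, and ignores everything about tRNA and the ribosome. The names `CODON_TABLE` and `translate` and the example mRNA string are our own.

```python
from itertools import product

# The standard genetic code, listed with the bases in the order U, C, A, G
# at each of the three codon positions; '*' marks the three stop codons.
BASES = "UCAG"
RESIDUES = ("FFLLSSSSYY**CC*W"   # first base U
            "LLLLPPPPHHQQRRRR"   # first base C
            "IIIMTTTTNNKKSSRR"   # first base A
            "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {"".join(codon): residue
               for codon, residue in zip(product(BASES, repeat=3), RESIDUES)}


def translate(mrna: str) -> str:
    """Translate an mRNA string, read 5' to 3', into a residue sequence.

    The reading frame is fixed by the first start codon AUG; translation
    stops at the first stop codon or when fewer than three bases remain.
    """
    start = mrna.find("AUG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


print(len(CODON_TABLE))                  # 4^3 codons
print(translate("GGAUGGCUUGGAAAUAAGC"))  # codons AUG GCU UGG AAA, then stop UAA
```

Shifting the start position by one or two bases would group the same nucleotides into entirely different codons, which is why the reading frame has to be anchored at a start codon.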

I.3 Structural Organization

We cannot hope to understand proteins without a good grasp of their multi-level structural organization. Most surprisingly, the same proteins fold up to the same shapes, and this is really the reason why geometry plays an important role in their study.

Bond rotation. Consider the three bonds from one α-carbon to the next along a protein backbone. Because of partial double-bond character, there is no freedom to rotate around the peptide bond, which is the link between the carbon and the nitrogen atoms. Figure I.6 shows its chemical and Figure I.11 its geometric structure. As shown in Figure I.11, the atoms around the peptide bond are coplanar, and we refer to this planar group as a peptide unit. There are however two possible planar configurations: the trans form, in which Cα-C-N-Cα is relatively stretched (zig-zag), and the cis form, in which it curves in one direction (zig-zig). The two forms are distinguished by the rotation angle ω along the C-N bond, which by convention is 180° for the trans and 0° for the cis form. In contrast, the links between the α-carbon and the carbon and nitrogen atoms are single bonds with one-dimensional rotational degrees of freedom. The φ and ψ angles measure rotations around the bonds preceding and succeeding every α-carbon atom: φ measures the rotation around the N-Cα bond, and ψ measures the rotation around the Cα-C bond. Again by convention, φ = 180° and ψ = 180° for the two coplanar trans forms. The conformation of the backbone is completely determined when φ, ψ, and ω are specified for each residue in the chain.

Figure I.11: The planarity of a peptide bond is caused by its partial double-bond character.

Ramachandran plot. A given residue prohibits some angles because of steric hindrances, that is, prohibited collisions between atoms, and in this way restricts the rotational degree of freedom to a small region. The realizable angle pairs are visualized as a subset of the square of angle pairs. The side-chain of glycine is only H, which is the reason that a relatively large portion of the square of angle pairs is realizable. This so-called Ramachandran plot for glycine is sketched in Figure I.12. A physically larger residue will generally prohibit a larger range of angles than a smaller one. An interesting residue in this respect is proline, which differs from all others because it binds back to the backbone.

Figure I.12: The square represents all angle pairs and the shading indicates the region of disallowed pairs for glycine.

Two common motifs. A motif that is commonly observed in proteins is the α-helix, whose backbone forms a right-handed helix. A rotation takes about 3.6 residues and produces an axial separation of about 5.4 Å. Contiguous α-carbons are separated by about 100° in the rotation direction and by about 1.5 Å rise, which is measured along the axis. The characteristic dihedral angles for a right-handed α-helix are roughly φ = −57° and ψ = −47°. The structure is stabilized by hydrogen bonds between every CO group and the NH group four residues later. All side-chains lie outside the helix structure. Cartoon representations of protein structures usually draw α-helices as tubes. In Figure I.13 the tubes are visible as spiral sections of the ribbon.

Another recurring motif are β-sheets, which are flat and made up of several strands. A strand can be obtained by stretching the α-helix until the axial distance between two contiguous α-carbons reaches about 3.3 Å. The stabilizing hydrogen bonds are between neighboring strands, which can run in the same direction (parallel) or in opposite directions (anti-parallel). These bonds combine strands to sheets. Both options are illustrated in Figure I.14.

Figure I.13: Ribbon diagrams visualize proteins by emphasizing the backbone as it winds its way through the structure.

Figure I.14: Two parallel β-strands to the left and two anti-parallel ones to the right. The dotted edges represent stabilizing hydrogen bonds.

Protein architecture and function. It is common to distinguish four levels of organization in the description of protein architecture:

Primary structure refers to the sequence of residues along the oriented polypeptide chain.
Secondary structure refers to the spatial arrangement of residues that are near each other along the chain.
Tertiary structure refers to the spatial arrangement of residues that are far from each other along the chain.
Quaternary structure refers to the spatial arrangement of subunits of a protein.

A single protein may indeed contain more than one polypeptide chain. Each chain forms what we call a subunit, and quaternary structure addresses questions about their relative position and interaction. The description of quaternary structure includes the rather weak van der Waals forces, which affect only atoms within a short distance of each other. Although this force is weak compared to others, its accumulated influence is significant if two subunits have geometrically complementary shapes that permit a large number of atom pairs within the reach of the force. This accumulated effect thus prefers interactions between geometrically complementary shapes. In biology, this fact is expressed by saying that the van der Waals force creates specificity in the interaction. That specificity plays a dominant role also in protein-protein and in protein-ligand interactions. A protein typically has a few regions embedded in its surface, so-called active sites, that are specific to interactions with other molecules. While active sites usually occupy only a small fraction of the surface, they decide protein function. Evidence for that claim can be provided by mutating a protein and distinguishing between mutations that preserve and that change the active sites.

Structure determination. Even though proteins are large molecules that typically consist of a few thousand atoms, they are not visible under an electron microscope. How do we then know anything about the structural organization of proteins? The primary source today are x-ray diffractions from protein crystals, but there are others and most notably images generated from nuclear magnetic resonance (or NMR) experiments. Both methods are complicated and laborious. We only scratch the surface by explaining the principal steps in the reconstruction of protein structures from x-ray diffractions:

1. Prepare a protein crystal.
2. Expose the crystal to x-ray beams and collect the diffractions.
3. Compute the electron density and from it derive the structure.

Since there are probably hundreds of thousands of different proteins, it would be desirable to automate the process. It seems that Step 1 is the main obstacle in reaching this goal, in part because some proteins are not known to form crystals at all. Step 2 requires an x-ray source, a device to rotate the crystal by small angles, and a detection device. For each angle, we get a two-dimensional picture of diffractions. The three-dimensional electron density is computed from a whole array of such pictures. A typical level surface of an electron density is shown in Figure I.15. The x-ray experiment does not determine the element identities of the atoms, which have to be obtained from the known chemical structure threaded into the density. The main mathematical tool in the construction is the Fourier transform.
¥     
¡ £   ¤© ¢"¥ ¤©   £ © # §  £   ¤© ¡ § £  ¦© ¤© #   £ ¢¤ ¤¢  ¦# ¤¢  § £ ©  £ ¡  ¤¢  §  £ ¥ ¦# "!  ¡  £ ¥ ¡ "!  ¥ §  £ ¡ ¤¢  § ¡ £ ¡ ¦¦¥ ¤¢ 

I B IO - MOLECULES
§¦ ¤¨  £ ¡ §%§ ¤¨   £ ¡ # ¥ £ ¡ ¦¡ ¤¨ ¦¥ %$¨ ¡ £ § © # £ § ¦¥ %$¨ ¡ ¤¨ £  ¤¨  £ ¥ © £ ¡ ¤¨ ©   £  ¢"¥ ¤¨ ¡ £  § ¤¨ ¥  £      ¨
 

Table I.3: Incomplete records of the atoms that belong to an arginine residue. CA is the -carbon atom, CB the -carbon, etc.
¢

Figure I.15: The so-called chicken wire representation of a level surface of a three-dimensional density.

Bibliographic notes. The Ramachandran plot for realizable bond rotations goes back to work by Ramachandran and Sasisekharan [6]. The -helix has been suggested as a common motif in proteins by Pauling and collaborators in 1951 [4], and in the same year they also identified the -sheet [3]. This was a few years before these motifs had been observed in x-ray experiments. In the late 1950s, Max Perutz reconstructed the structure of hemoglobin from x-ray diffraction data [5], and John Kendrew did the same for myoglobin. A classic text on the x-ray crystallography method is [2]. The material on x-ray crystallography and PDB files presented in this section is taken from [1].
[1] L. J. BANASZAK . Foundations of Structural Biology. Academic Press, San Diego, California, 2000. [2] T. B LUNDELL AND L. J OHNSON . Protein Crystallography. Academic Press, New York, 1976. [3] L. PAULING AND R. B. C OREY. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37 (1951), 729–740. [4] L. PAULING , R. B. C OREY AND H. R. B RONSON . The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37 (1951), 205–211. [5] M. F. P ERUTZ . X-ray analysis of hemoglobin. Lex Prix Nobel, Stockholm, 1963. [6] G. N. R AMACHANDRAN AND V. S ASISEKHARAN . Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7 (1963), 95–99.
 

of the electron density is the Fourier transform. A fundamental difficulty in this step is that only the amplitudes (intensities) of the waveforms are observable, while the phase information must be obtained by different means. Protein data banks. After completing the structural study of a crystallized protein, investigators usually send their results to the Protein Data Base, which is a public repository of protein structures described in so-called PDB files. At the beginning of each file we find ancillary information, including the header, the name of the protein, the author, the reference to the corresponding journal article, etc. There is also information about non-standard components and about secondary structure elements. The main body of the file lists the coordinates of the observed atoms. They are always given in an orthonormal coordinate system, in which the length unit is one angstrom. Table I.3 illustrates the format by showing a small portion of a PDB file for hemoglobin, listing the coordinates of the atoms of an arginine residue. Note that there are no hydrogen atoms, since they are too small to be resolved by an x-ray experiment.

¡ ¥ £ # ¦# ¤¨  ¤§ ¨ © £   ©   £ © ¢¤¨    £   ¢¤¡ ¨ § ¨ £   ¡¦¡ ¤¨ ¥ £ ©   £ © ¦¥ ¤¨ ¦ ¤¨ ¥ £ # ¡  £ #  ¤¨  § £ © ¦ ¤¨    £ © ¢¤¨

ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM ATOM

N CA C O CB CG CD NE CZ NH1 NH2

ARG ARG ARG ARG ARG ARG ARG ARG ARG ARG ARG

0
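The fixed-width layout of the coordinate records in a PDB file lends itself to a few lines of code. The following is a minimal sketch of a reader for ATOM records, assuming the standard PDB column positions (serial number in columns 7 to 11, atom name in 13 to 16, residue name in 18 to 20, chain in 22, residue number in 23 to 26, and the x, y, z coordinates in columns 31 to 54). The names `Atom`, `parse_atom_line`, `parse_atoms` and `format_atom_line` are hypothetical, and the coordinates in the sample are made up for illustration; they are not the values of Table I.3.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Slice one fixed-width ATOM record (0-based slices of 1-based columns)."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def parse_atoms(text: str) -> List[Atom]:
    """Keep only the ATOM records and ignore header and other lines."""
    return [parse_atom_line(l) for l in text.splitlines() if l.startswith("ATOM")]


def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z) -> str:
    """Write a record in the same layout (hypothetical helper for the sample)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain:1s}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Atom names of an arginine residue; the coordinates below are invented.
ARGININE = ["N", "CA", "C", "O", "CB", "CG", "CD", "NE", "CZ", "NH1", "NH2"]
SAMPLE = "\n".join(
    format_atom_line(i + 1, name, "ARG", "A", 1, 10.0 + i, 20.0 - 0.5 * i, 0.125 * i)
    for i, name in enumerate(ARGININE)
)

if __name__ == "__main__":
    for atom in parse_atoms("HEADER    made-up sample\n" + SAMPLE):
        print(atom.name, atom.res_name, atom.x, atom.y, atom.z)
```

Slicing by column rather than splitting on whitespace is the safer choice here, because neighboring fields in a PDB record may touch without a separating blank.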

I.4 Molecular Mechanics

After a protein has been created by translation, it folds into a shape, or conformation, that is determined by its sequence of residues. The folding process is a reaction to a multitude of forces that simultaneously act on every part of the protein. This section presents some of the current knowledge and efforts to model these forces. We begin by studying atoms and discuss covalent and non-covalent forces.

Atoms. Each atom has a positively charged massive nucleus, which is surrounded by a cloud of negatively charged electrons. The nucleus consists of protons, each contributing a unit positive charge, and of electrically neutral neutrons. The electrons are held in orbit by electrostatic attraction to the nucleus. Each electron has one unit of negative charge, which exactly neutralizes the positive charge of one proton. In total, we have the same number of protons and electrons and thus an electrically neutral atom, as illustrated in Figure I.16. Different elements consist of atoms with different numbers of protons. The atomic number is by definition the number of protons, which is also the number of electrons. The number of neutrons is usually about the same because too few or too many neutrons destabilize the nucleus. The atomic weight of an atom is the ratio of its mass to the mass of a single hydrogen atom. Because the mass of an electron is negligible, the atomic weight is almost exactly the number of protons plus the number of neutrons. Avogadro's number is useful in translating from the minuscule world of single atoms into a humanly more accessible scale. It is the number of hydrogen atoms in one gram of hydrogen, which is roughly $6.02 \times 10^{23}$. The mass of one hydrogen atom is therefore about $1.66 \times 10^{-24}$ gram which, by definition, is one dalton. One mole of an element is Avogadro's number of its atoms. In other words, if the mass of one atom of that element is $m$ daltons then the mass of one mole is $m$ grams. Table I.4 lists properties of elements that are commonly found in organic matter.

Figure I.16: A schematic picture of a hydrogen atom to the left and a carbon atom to the right.

element      symbol  #p  #n  electrons in shell 1, 2, 3, 4
Hydrogen     H        1   0   1
Carbon       C        6   6   2  4
Nitrogen     N        7   7   2  5
Oxygen       O        8   8   2  6
Sodium       Na      11  12   2  8  1
Magnesium    Mg      12  12   2  8  2
Phosphorus   P       15  16   2  8  5
Sulfur       S       16  16   2  8  6
Chlorine     Cl      17  18   2  8  7
Potassium    K       19  20   2  8  8  1
Calcium      Ca      20  20   2  8  8  2

Table I.4: Some elements together with their numbers of protons, neutrons and electrons distributed in the shells around the nucleus.

Covalent bonds. According to the Bohr model, electrons live in shells around the nucleus and populate inner shells before using outer ones. The first three shells from inside out can hold up to 2, 8 and 8 electrons, as indicated in Table I.4. The chemical properties of an atom are defined by the tendency to either empty or complete its partially filled shell, if any. One way of doing that is by sharing electrons. The shared electrons complete the outermost non-empty shells of both atoms involved. According to Table I.4, carbon, nitrogen and oxygen need four, three and two electrons to fill their outer shells. As illustrated in Figure I.17, this can for example be done by covalently binding to the same number of hydrogen atoms.

Figure I.17: The geometry of covalent bonding for carbon, nitrogen, and oxygen.

We can now define a molecule as a connected component of the graph whose vertices are the atoms and whose edges are the covalent bonds. When an atom covalently bonds to more than one other atom, then there is a preferred angle between pairs of bonds. For example for carbon, this angle is what we get by connecting the centroid of a regular tetrahedron with two of the vertices. Using elementary geometry we find this angle is $\arccos(-1/3) \approx 109.5^\circ$. Two atoms can also form a covalent double bond, which forces the nuclei closer together and is stronger than the corresponding single bond. It also prevents any torsional rotation around that bond, which is possible for single bonds. We need a sequence of four atoms and three covalent bonds to define the torsional angle of the middle bond. It is generally parametrized such that a fixed reference value corresponds to the trans (zig-zag) coplanar configuration. For example, for ethane, H$_3$C-CH$_3$, we have three bonds on each side of the middle bond. There is an energetic preference for staggering the covalent bonds on the two sides, which corresponds to three torsional angles spaced $120^\circ$ apart.

When two atoms that covalently bond are of different type then they generally attract the shared electrons to different degrees. The shared electrons will therefore have a bias towards one end of the structure or another. We then have a polar structure in which the positive charge is concentrated on one end and the negative charge on the other. Examples of polar covalent bonds are between hydrogen and oxygen and between hydrogen and nitrogen, as illustrated in Figure I.17. In contrast, the bond between hydrogen and carbon has the electrons attracted much more equally and is relatively non-polar.
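The two angles discussed for covalent bonds can be checked numerically. The sketch below computes the bond angle at the centroid of a regular tetrahedron and the torsional angle defined by four atoms. The function names are hypothetical, and the sign and zero point of the torsional angle follow one common convention (0° for the cis and ±180° for the trans coplanar configuration), which is an assumption and may differ from the parametrization intended in the text by a constant offset.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_angle_deg(a, center, b):
    """Angle at `center` between the bonds to `a` and to `b`, in degrees."""
    u, v = sub(a, center), sub(b, center)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def torsion_deg(p0, p1, p2, p3):
    """Torsional angle of the middle bond p1-p2 (assumed convention: cis = 0)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    return math.degrees(math.atan2(norm(b2) * dot(b1, n2), dot(n1, n2)))


# Regular tetrahedron with centroid at the origin.
TETRA = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
tetra_angle = bond_angle_deg(TETRA[0], (0, 0, 0), TETRA[1])

# Four atoms with the middle bond along the z-axis.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = torsion_deg(p0, p1, p2, (1.0, 0.0, 1.0))
gauche = torsion_deg(p0, p1, p2, (math.cos(math.pi / 3), math.sin(math.pi / 3), 1.0))
trans = torsion_deg(p0, p1, p2, (-1.0, 0.0, 1.0))

if __name__ == "__main__":
    print(f"tetrahedral angle: {tetra_angle:.4f} degrees")
    print(f"cis {cis:.1f}, gauche {gauche:.1f}, trans {abs(trans):.1f}")
```

Using `atan2` instead of `acos` for the torsional angle keeps the sign of the rotation and stays numerically stable near the coplanar configurations.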

Non-covalent bonds. An atom can also donate an electron to another atom and thus create a complete outer shell. An example is sodium donating the only electron in its third shell to chlorine, which uses it to complete its third shell. As a result we get positively charged sodium cations and negatively charged chloride anions. Both are attracted to each other by electrostatic force and form a regular grid packing, in which each sodium cation is surrounded by six chloride anions, and vice versa. This arrangement is known as table salt. A weaker interaction, also based on electrostatic force, is generated by polar molecules. A prime example is water, which is partially positively charged at the two hydrogen ends. Water molecules thus tend to aggregate in small semi-regular structures, but this force is weak and bonds of this kind are constantly formed and broken. The polarity of water molecules is the basis for the difference between hydrophilic molecules, which are polar and therefore attract water, and hydrophobic molecules, which are non-polar and do not attract water.

Another non-covalent force is responsible for the van der Waals interaction. Experimental observations point to a potential energy function roughly as graphed in Figure I.18. The corresponding force is the negative derivative, which is interpreted as a balance between an attractive and a repulsive force. The attraction is due to a dispersive force that can be explained using quantum mechanics. The repulsion also has a quantum mechanical explanation in terms of the Pauli principle, which prohibits any two electrons from having the same set of quantum numbers.

Figure I.18 (energy as a function of distance): The van der Waals force is obtained by adding the attractive force (derivative of the dashed curve) and the repulsive force (derivative of the dotted curve).

It is useful to keep the relative strengths of the various forces in mind. Table I.5 gives estimates of the amount of energy necessary to break one mole of bonds.
bond type        strength in vacuum   strength in water
covalent                90.0                 90.0
ionic                   90.0                  3.0
hydrogen                 4.0                  1.0
van der Waals            0.1                  0.1

Table I.5: Relative strength measured in kilo-calories per mole necessary to break the bonds. Water molecules interfere with ionic and hydrogen bonds, which are therefore considerably weaker in a solution than in a vacuum.
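The balance between attraction and repulsion in the van der Waals interaction can be made concrete with a small numerical sketch. It assumes the Lennard-Jones 12-6 form, $V(r) = 4\varepsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$, which is one common model of the curve in Figure I.18. The function names and the parameter values (a well depth of 0.1 kilo-calories per mole, in line with the order of magnitude in Table I.5, and a collision distance of 3.4 Å) are illustrative assumptions, not fitted constants.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """Potential energy: repulsive r^-12 term plus attractive r^-6 term."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def lennard_jones_force(r, epsilon, sigma):
    """Negative derivative of the potential; positive values push the atoms apart."""
    s6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r


EPSILON = 0.1   # assumed well depth in kcal/mol
SIGMA = 3.4     # assumed collision distance in angstrom
R_MIN = 2.0 ** (1.0 / 6.0) * SIGMA   # distance at which the force vanishes

if __name__ == "__main__":
    for r in (3.0, SIGMA, R_MIN, 5.0, 8.0):
        print(f"r = {r:5.2f}  V = {lennard_jones(r, EPSILON, SIGMA):8.4f}  "
              f"F = {lennard_jones_force(r, EPSILON, SIGMA):8.4f}")
```

Below the minimum the repulsive term dominates and the force is positive, beyond it the attractive term dominates and the force is negative, and both fade quickly with distance.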

Force field. To get a handle on how molecules move, we define the potential energy of a system of atoms. The general assumption is that the system develops towards a minimum. To model the potential energy accurately, we would have to work with quantum mechanics, which is beyond the scope of this book and also beyond the capabilities of current computations for large organic molecules. The alternative is molecular mechanics, which uses classical mechanics to model the forces that act on atoms. The simplest such model sums five contributions to the potential energy, three accounting for covalent bonds and two for non-covalent bonds. We use a vector $x \in \mathbb{R}^{3n}$ to describe the state of a system of $n$ atoms and define the potential energy as a function $V : \mathbb{R}^{3n} \to \mathbb{R}$. In its simplest form, that energy is written as

$$V(x) = \sum_{\text{bonds}} \frac{k_i}{2} (\ell_i - \ell_{i,0})^2 + \sum_{\text{angles}} \frac{h_i}{2} (\theta_i - \theta_{i,0})^2 + \sum_{\text{torsions}} \frac{v_i}{2} \bigl( 1 + \cos(n_i \omega_i - \gamma_i) \bigr) + \sum_{\text{atoms } i<j} \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}} + \sum_{\text{atoms } i<j} 4 \varepsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right].$$

This formula contains various constants that depend on the type of atom or interaction involved. We briefly look at each one of the five terms.

Bond length. The first sum approximates the energy penalty for differing from the reference length, $\ell_{i,0}$, by a quadratic function. The strength, $k_i$, is relatively large, namely several hundred kilo-calories per mole.

Bond angle. The second sum approximates the energy penalty for differing from the reference angle, $\theta_{i,0}$, again by a quadratic function. The strength, $h_i$, is considerably less than for bond length, namely about one one-hundredth or even less.

Torsional rotation. The third sum approximates the energy for different torsional angles around a bond. Angles that lead to staggered arrangements of bonds at both sides are energetically preferred. This preference is modeled by a cosine function with $n_i$ minima and the same number of maxima.

Electrostatic interaction. The fourth sum adds the electrostatic potential between every pair of atoms in the system. The constants $q_i$ and $q_j$ are the charges, $\varepsilon_0$ is the dielectric constant of the medium, and $r_{ij}$ is the distance between the two atoms.

Van der Waals interaction. The fifth sum approximates the van der Waals potential by the Lennard-Jones 12-6 function. As before, $r_{ij}$ is the distance between the two atoms. The collision constant, $\sigma_{ij}$, marks where the function crosses the zero line, and $-\varepsilon_{ij}$ is the value at the unique minimum.

It is clear that $V$ as defined is only a rough approximation of the real potential energy that drives the behavior of the system. Whether or not that approximation suffices depends on what we use it for.

Molecular dynamics. One of the applications of force fields is the simulation of molecular motion. Recall Newton's three laws of motion:

1. A body continues to move in a straight line at constant velocity unless a force acts upon it.
2. The rate of change of the momentum equals the force.
3. To every action there is an equal and opposing reaction.

Let $x(t)$ be the trajectory of a point with mass $m$. Its location at time $t$ is $x(t)$, its velocity is $\dot{x}(t)$, and its momentum is $m \dot{x}(t)$. The rate of change of the velocity is also referred to as the acceleration, $\ddot{x}(t)$. Using this notation, Newton's second law is expressed by the differential equation $F = m \ddot{x}$, where $F$ is the force acting upon $x$. A trajectory is a solution to this equation. In simple cases, the trajectory can be computed analytically. Suppose we write the force as the negative gradient of a potential function: $F = -\nabla V$. Newton's second law can now be written as $m \ddot{x} = -\nabla V(x)$. For example, if the potential is stationary and a negative multiple of one over the norm, $V(x) = -c / \|x\|$ for some $c > 0$, then the generic trajectory is an ellipse with one focus at the origin, as illustrated in Figure I.19. Both the gravitational and the electrostatic potentials have this form.

Figure I.19: A generic trajectory when the magnitude of the attraction to the origin decreases with the square distance.

The problem in molecular dynamics is significantly more involved. We have $n$ bodies (atoms) and the energy potential and force depend on the momentary locations of all bodies. As before, we represent the collection of atoms by a point $x \in \mathbb{R}^{3n}$. The energy potential is the function $V$ defined earlier, and the force acting on $x$ is $F(x) = -\nabla V(x)$. Newton's second law of motion can now be written as

$$M \ddot{x} = -\nabla V(x),$$

where the mass vector $M$ multiplies each component of the acceleration vector with the mass of the corresponding atom. The classic two-body problem is the special case in which $n = 2$ and $V$ is the sum of the two corresponding gravitational potentials. In this case, the generic trajectories are again ellipses. Already for three bodies, there is no analytic solution and one has to resort to numerical methods to approximate the trajectories. The problem in molecular dynamics is even more difficult because the potential function is considerably more complicated than a sum of gravitational potentials. The currently available numerical solutions are inadequate to simulate the entire folding process even for small proteins. One of the difficulties in the simulation is the near cancellation of large forces so that relatively weak residuals gain a decisive influence. Even small inaccuracies in the model or the computation can lead to false decisions and possibly spoil the entire remainder of the simulation.

Bibliographic notes. The first half of this section is a highly simplified introduction of atoms and bonds, and we refer to physics texts such as [1, Chapters 19 and 20] for further details. The van der Waals potential derives its name from the work of van der Waals, who quantified the deviation of rare gas from ideal gas behavior. The origin of the force is a fluctuation of electrostatic charge in atoms. The explanation of the dispersive contribution in terms of quantum mechanics is due to London [5]. The material on force fields is taken from Leach [4]. To determine the constants needed to parametrize the mathematical formulation of a force field is far from trivial. The definition of the van der Waals radii used to parametrize the Lennard-Jones functions is just one example. There are various approaches to determine these radii. Bondi [2] looks for the distances of closest approach between atoms to determine van der Waals radii. Tsai et al. [7] analyse the most common distances between atoms in small molecule crystals in the Cambridge Structural Database. Finally, Jorgensen and Tirado-Rives [3] derive parameters in an attempt to reproduce thermodynamic properties in computer simulations. Simulating motion with molecular dynamics is an important topic in computational biology. Numerical algorithms for molecular dynamics can be found in Leach [4] and Schlick [6].

[1] N. W. Ashcroft and N. D. Mermin. Solid State Physics. Harcourt Brace, Orlando, Florida, 1976.
[2] A. Bondi. Molecular Crystals, Liquids and Glasses. Wiley, New York, 1968.
[3] W. L. Jorgensen and J. Tirado-Rives. The OPLS potential functions for proteins. Energy minimization for crystals of cyclic peptides and crambin. J. Amer. Chem. Soc. 110 (1988), 1657–1666.
[4] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[5] F. London. Zur Theorie und Systematik der Molekularkräfte. Zeitschrift für Physik 63 (1930), 245–279.
[6] T. Schlick. Molecular Modeling and Simulation. Springer-Verlag, New York, 2002.
[7] J. Tsai, R. Taylor, C. Chothia and M. Gerstein. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 290 (1999), 253–266.

Exercises

1. Amino Acids. Draw the graph whose nodes are the acyclic amino acids that has an arc connecting two nodes iff one amino acid can be obtained from the other by the replacement or addition of a single atom.
(i) Is the graph connected?
(ii) Does every connected component have a path that passes through every node exactly once?

2. Regular Tetrahedron. A regular tetrahedron has four equilateral triangles as faces, which meet along six equally long edges.
(i) Determine the dihedral angle formed by two faces meeting along a common edge.
(ii) Determine the solid angle formed by three faces meeting at a common vertex.
As usual, the full dihedral angle is $2\pi$, which is the length of the unit circle, and the full solid angle is $4\pi$, which is the area of the unit sphere.

3. Palindromic Sequences. Call a single strand of DNA a palindromic sequence if it is the same as the complementary strand read backwards. [By convention, we read the strand in the 5' to 3' direction.]
(i) Given a strand, how would you determine whether or not it is a palindromic sequence?
(ii) Give an algorithm that finds the longest subsequence that is palindromic.

4. Counting Strings. A double-strand of DNA has no preferred direction, but we can orient it so one direction is forward and the other is backward. Call two linear or cyclic pieces of double-stranded DNA the same if they can be oriented so we read the same string of nucleotides in the two forward directions.
(i) How many different linear pieces of double-stranded DNA of length $n$ are there?
(ii) How many different cyclic pieces of double-stranded DNA of length $n$ are there? [Beware of palindromic sequences.]

5. Structure Repositories. Descriptions of protein structures are publicly available at the Protein Data Bank (www.rcsb.org/pdb) and the Swiss Bioinformatics Center (expasy.hcuge.ch).
(i) Download a PDB file from either data base and extract the string of single-letter abbreviations describing the amino acid sequence.
(ii) Is the relative frequency of amino acids you observe related to the relative number of codons that encode them?

6. Ramachandran Plot. Download a PDB file and extract the sequence of $\varphi$ and $\psi$ angles along the backbone. Draw the result in form of a Ramachandran plot.

7. Lattices. The arrangement of atoms in a folded protein is often compared to that in a crystal lattice. Sketch two such lattices by drawing the atoms as points and connecting neighboring atoms by straight edges.
(i) The face-centered cube (or FCC) lattice consisting of all points with integer coordinates whose sum is even: all $(x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 + x_2 + x_3 \equiv 0 \pmod 2$.
(ii) The body-centered cube (or BCC) lattice consisting of all points with all even or all odd integer coordinates: all $(x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 \equiv x_2 \equiv x_3 \equiv 0 \pmod 2$ or $x_1 \equiv x_2 \equiv x_3 \equiv 1 \pmod 2$.

8. Elliptic Trajectory. Let the energy potential be defined by $V(x) = \frac{1}{2} \|x\|^2$. The force it exerts on a point $x$ is $F(x) = -\nabla V(x) = -x$. Prove that the generic trajectory in this force field is an ellipse centered at the origin.
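As a starting point for sketching the two lattices, the following snippet enumerates lattice points near the origin and lists the nearest neighbors of the origin. It is a small sketch that assumes the coordinate descriptions given in the exercise (even coordinate sum for FCC, all even or all odd coordinates for BCC); the function names are hypothetical.

```python
from itertools import product


def in_fcc(p):
    """Integer point whose coordinates have an even sum."""
    return sum(p) % 2 == 0


def in_bcc(p):
    """Integer point whose coordinates are all even or all odd."""
    return len({c % 2 for c in p}) == 1


def nearest_neighbors(member, reach=2):
    """Lattice points closest to the origin, with their squared distance."""
    points = [p for p in product(range(-reach, reach + 1), repeat=3)
              if member(p) and p != (0, 0, 0)]
    d2 = min(sum(c * c for c in p) for p in points)
    return [p for p in points if sum(c * c for c in p) == d2], d2


fcc_neighbors, fcc_d2 = nearest_neighbors(in_fcc)
bcc_neighbors, bcc_d2 = nearest_neighbors(in_bcc)

if __name__ == "__main__":
    print(len(fcc_neighbors), "FCC neighbors at squared distance", fcc_d2)
    print(len(bcc_neighbors), "BCC neighbors at squared distance", bcc_d2)
```

Connecting every lattice point to its nearest neighbors in this way gives the edges to draw in the sketch.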


Chapter II

Geometric Models

A surprising finding in the research on proteins is the importance of geometric shape in their functioning. By and large, the shape seems to determine how proteins interact with each other and with other molecules. This finding is usually expressed as a causal chain of responsibilities:

SEQUENCE → SHAPE → FUNCTION

A protein is a peptide chain of amino acids that folds up and forms a shape. In a natural environment, like proteins fold up to the same shapes, but this might be a result of evolutionary selection. The details of that shape in terms of its cavities, protrusions, dynamics, and energetics determine how it interacts with other molecules.

At the current stage of our biological knowledge, there is an overwhelming accumulation of sequence information, which is due, in part, to the near completion of several large-scale genome projects. Although the number of proteins for which the three-dimensional structure has been resolved and is stored in the Protein Data Bank is in the thousands, this is only a small fraction of the wealth of available sequence information. The goal of studying the geometry of proteins is therefore two-fold: the development of new computational tools to help determine or refine structure information and understanding the relationship between shape and function.

In Chapter I, we have seen the view of the bio-chemist, who aims at pruning the immense variety by limiting attention to physically or chemically likely configurations. The rest of this book takes a complementary view by concentrating on mathematical models and computational data structures that arise in the study of proteins, and in doing so, we develop a language suitable for studying details of our models. In this chapter, we introduce some of the basic geometric models useful in representing molecular shape. In Section II.1, we introduce space-filling diagrams as the primary geometric model of molecules. In Section II.2, we use Voronoi diagrams to decompose space-filling diagrams. In Section II.3, we introduce alpha shapes, which are dual to space-filling diagrams and are our preferred computational representation. Finally in Section II.4, we talk about the Alpha Shape software and discuss how it can be used.

II.1 Space-filling Diagrams
II.2 Power Diagrams
II.3 Alpha Shapes
II.4 Alpha Shape Software
Exercises

II.1 Space-filling Diagrams

A space-filling diagram associates a molecule with a portion of the three-dimensional space it occupies. An atom is represented by a ball (a solid sphere) and a molecule is the union of balls of its atoms. The tacit assumption in constructing such a diagram is that the locations of the atoms in three-dimensional space are known. We study such unions first in the plane and then in space.

Union of disks. Let $B$ be a finite set of disks in the Euclidean plane, which we denote as $\mathbb{R}^2$. We specify each disk $b_i$ by its center $z_i$ and its radius $r_i$. The union of the disks, $\bigcup B$, has a boundary that consists of circular arcs meeting at common vertices. An example is shown in Figure II.1. A single disk can contribute any non-negative number of arcs. It is also possible that an arc is an entire circle, which has no endpoints. The total number of arcs is however rather limited. If there are $n$ disks whose union is a simply connected region, then the number of arcs cannot exceed a constant times $n$. Even if we allow more general configurations, we cannot get more than a constant times $n$ arcs. The upper bound is a consequence of the relationship between arcs in the boundary of the union and angles in the Delaunay triangulation, which will be explained in Section II.2. Hints towards proving the upper bound can be found among the exercises at the end of this chapter.

Figure II.1: Union of disks in the plane. Four of the eight disks contribute two arcs each to the boundary.

Rolling circle. We can make the boundary of the disk union smoother by substituting blending curves for the vertices where the circular arcs meet. To this end we roll a circle of radius $r$ on the outside about the boundary. At any moment during the motion, the circle touches the boundary but never intersects the interior. The construction is illustrated in Figure II.2. The center of the circle thus traces out a curve at distance $r$ away from the boundary. This curve is the boundary of $\bigcup B_r$, the union obtained by growing every disk $b_i$ to radius $r_i + r$. The front of the rolling circle describes the rounded boundary, which consists of convex and reflex circular arcs. More formally, this new curve is the boundary of the portion of $\mathbb{R}^2$ that is not covered by any placement of the open disk bounded by the rolling circle. We can imagine creating that portion with a milling machine whose material removing stylus has the shape of the rolling circle.

Figure II.2: On the outside, the boundary of the union of uniformly grown disks, and on the inside the rounded boundary of the original union.

We note that the rounded boundary of $\bigcup B$ is by and large tangent continuous but can have cusps at places where the rolling circle cannot quite squeeze through two disks. There are no cusps in Figure II.2, but there would be if the two disks to the lower left were just a little smaller. In cases where tangent continuity is important, we may turn the cusps into crossings by adding arcs connecting the cusps. We thus obtain a tangent continuous immersion of a curve in $\mathbb{R}^2$.

Union of balls. Let now $B$ be a finite set of balls (solid spheres) in three-dimensional Euclidean space, which we denote as $\mathbb{R}^3$. Similar to the two-dimensional case, we specify each ball $b_i$ by its center $z_i$ and its radius $r_i$. Figure II.3 shows the union of balls that represent gramicidin, which is a small protein of barely more than 300 atoms. To understand the structure of the boundary of the union, we study the portion contributed by a single sphere. The sphere bounding $b_i$ intersects the other balls in a finite collection of caps. The interior of each cap lies in the interior of the union, and the portion of the sphere not covered by any cap is the contribution of the sphere to the boundary of the union. The caps form the same structure as the disks discussed earlier, only that they live on a (two-dimensional) sphere instead of $\mathbb{R}^2$. The structural description of a finite union of balls is thus recursive in the dimension. The same type of symmetry can also be observed in dimensions beyond three.

Figure II.3: A union of balls representation of the gramicidin protein.

To count the faces contributed by our sphere, we recall that these are the connected components of the complement of the union of caps. We will see that these components are related to the triangles of the Delaunay triangulation. We first note that a single sphere intersects the other balls in fewer than $n$ caps. By analogy to disks in the plane, the number of arcs in the boundary of the union of caps is less than a constant times $n$. Since each arc has at most two endpoints (if it is a full circle then it has no endpoints) and each endpoint belongs to two arcs, we also have no more vertices than arcs, which implies that the number of faces on this one sphere is again less than a constant times $n$. To get bounds on the total number of faces, arcs and vertices, we multiply by $n$ and note that each arc belongs to at least two and each vertex belongs to at least three spheres. We conclude that the numbers of faces, arcs and vertices are each less than a constant times $n^2$. The number of arcs and vertices in the boundary of a union of balls in $\mathbb{R}^3$ can thus be quite a bit higher than the same numbers for a union of disks in $\mathbb{R}^2$. It can be shown that for each value of $n$, there are configurations of $n$ balls with at least some constant times $n^2$ faces, arcs and vertices. This shows that the upper bounds are asymptotically tight. However, the numbers for well packed sets of spheres, which are common for proteins, are much smaller and typically only a constant times $n$.

Rolling sphere. We can again get a smoother boundary by rolling a sphere of radius $r$ about $\bigcup B$. The center of that sphere moves along the boundary of the union of grown balls, and its front sweeps out blending surfaces that cover cusps and crevices of the original boundary. Figure II.4 shows such a rounded surface representation of gramicidin. There are convex sphere patches that correspond to faces of $\bigcup B_r$, reflex torus patches that correspond to arcs of $\bigcup B_r$, and reflex sphere patches that correspond to vertices of $\bigcup B_r$. The union of convex patches is sometimes referred to as the contact surface because that is where the rolling sphere touches $\bigcup B$. Similarly, the union of reflex patches (tori and spheres) is referred to as the re-entrant surface. When we look carefully, we can detect a self-intersection of the surface in Figure II.4. There is a hole whose rounded surface penetrates through the outer surface roughly in the middle of the picture. This happens because the tunnel connecting the hole to the outside is slightly too narrow for the rolling sphere to squeeze through.

Figure II.4: A molecular surface representation of the gramicidin protein.

In the application of space-filling diagrams to biology, the radii of the balls are usually the van der Waals radii of the atoms, and the boundary of $\bigcup B$ is referred to as the van der Waals surface. The radius $r$ is chosen so that the rolling sphere approximates a water molecule, and the boundary of $\bigcup B_r$ is referred to as the solvent accessible surface.
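The definitions of this section translate directly into code. The sketch below tests whether a point lies in a union of balls, optionally grown by a probe radius, and estimates which fraction of one grown sphere lies on the boundary of the grown union, that is, on the solvent accessible surface. It is a sampling estimate under stated assumptions, not the method used for the figures: the points on the sphere come from a golden-angle spiral, the probe radius of 1.4 Å is a common approximation of a water molecule, and the function names and the two-ball configuration are made up for illustration.

```python
import math


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-angle spiral)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2 * k + 1) / n
        rho = math.sqrt(1.0 - z * z)
        points.append((rho * math.cos(golden * k), rho * math.sin(golden * k), z))
    return points


def in_union(p, balls, probe=0.0):
    """True if p lies in some ball after growing every radius by `probe`."""
    return any(dist(p, c) <= r + probe for c, r in balls)


def exposed_fraction(i, balls, probe=0.0, n=2000):
    """Fraction of the grown sphere of ball i that is not inside another grown ball."""
    center, radius = balls[i]
    grown = radius + probe
    exposed = 0
    for u in sphere_points(n):
        p = tuple(center[k] + grown * u[k] for k in range(3))
        if all(dist(p, c) >= r + probe
               for j, (c, r) in enumerate(balls) if j != i):
            exposed += 1
    return exposed / n


# Two overlapping unit balls, one unit apart (made-up configuration).
BALLS = [((0.0, 0.0, 0.0), 1.0), ((0.0, 0.0, 1.0), 1.0)]
PROBE = 1.4  # assumed probe radius in angstrom

if __name__ == "__main__":
    print("exposed on the original union:", exposed_fraction(0, BALLS))
    print("exposed on the grown union:   ", exposed_fraction(0, BALLS, PROBE))
```

Growing the balls lets the neighbor cover a larger share of the sphere, which is why atoms that are partly exposed on the van der Waals surface can contribute little or nothing to the solvent accessible surface.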

The rounded surface is usually referred to as the molecular surface.

Uniform growth. The boundaries of the original union and of the union of grown balls do not necessarily have the same combinatorial structure. We can understand structural changes by observing how they are introduced while we continuously grow the balls. For a ball $b_i$ with center $z_i$ and radius $r_i$, define the weighted distance of a point $x$ from $b_i$ equal to the Euclidean distance minus the weight: $\|x - z_i\| - r_i$. Growing $b_i$ by $t$ gives the set of points whose weighted distance from $b_i$ is at most $t$. The cell of $b_i$, denoted $V_i$, is the set of points at least as close to $b_i$ as to any other weighted point. We describe the resulting complex as a Voronoi diagram of the set of points $z_i$ with weights $r_i$. Figure II.5 illustrates the definition in two dimensions.

Figure II.5: Two-dimensional Voronoi diagram generated by uniformly growing the disks.

Consider the case of two weighted points. If one ball is contained in the interior of the other then its cell is empty. Otherwise, we have two non-empty cells separated by a two-dimensional membrane. The points of this membrane satisfy $\|x - z_i\| - r_i = \|x - z_j\| - r_j$, which is the equation of one sheet of a two-sheeted hyperboloid. Since $V_i$ is the common intersection of the regions in which $b_i$ is at least as close as one other ball, the boundary of $V_i$ consists of patches of such hyperboloids. Observe that for every point $x$ in $V_i$, the line segment connecting $x$ and $z_i$ lies entirely in $V_i$. In geometry, this property is expressed by saying that $V_i$ is star-shaped and that $z_i$ lies in its kernel. It follows in particular that $V_i$ is a connected cell. Since each ball is convex and contains its center, this implies that the portion of the ball inside its cell is also star-shaped and that $z_i$ lies also in its kernel. The same is true for every grown ball.

We get the boundary of the union by drawing the sphere bounding each ball only inside its own Voronoi cell. All these patches are visible in their entirety if viewed from $z_i$. Since the membranes bounding the cells are all sheets of two-sheeted hyperboloids, the arcs of the patches meet up in pairs along the membranes and in triplets along the curved edges of the Voronoi diagram. Each face of the boundary sweeps out a (three-dimensional) cell in $\mathbb{R}^3$, each arc sweeps out a (two-dimensional) membrane separating two cells, and each vertex sweeps out a curved edge in the common boundary of generically three membranes and three cells. We can now see how structural differences between the two unions arise: when we grow the balls, the boundary of the union sweeps out the Voronoi diagram, and we get a structural re-arrangement whenever we sweep over a vertex of the Voronoi diagram.

Bibliographic notes. Space-filling diagrams have a long tradition in biochemistry and are similar to the CPK mechanical models named after Corey, Pauling and Koltun [5, chapter 1]. The variations of these models discussed in this section have been introduced by Lee and Richards [6, 7]. The molecular surface is sometimes referred to as the Connolly surface, named after Michael Connolly who wrote early software constructing this surface [3]. The solvent accessible surface in Figure II.3 and the molecular surface in Figure II.4 are computed using the software described in [1]. Increasing all radii of a set of circles or spheres continuously and at the same rate is referred to as the Johnson-Mehl model of growth [4]. It leads to the Voronoi diagram of this section, which is sometimes referred to as the additively weighted Voronoi diagram. We refer to Aurenhammer [2] for a survey of Voronoi diagrams, their algorithms and applications. An algorithm that computes cells of the additively weighted Voronoi diagram in $\mathbb{R}^3$ has been developed and implemented by Will [8].

[1] N. AKKIRAJU, H. EDELSBRUNNER, P. FU AND J. QIAN. Viewing geometric protein structures from inside a CAVE. IEEE Comput. Graphics Appl. 16 (1996), 58–61.
[2] F. AURENHAMMER. Voronoi diagrams — a study of a fundamental geometric data structure. ACM Comput. Surveys 23 (1991), 345–405.
[3] M. L. CONNOLLY. Analytic molecular surface calculation. J. Appl. Crystallogr. 16 (1983), 548–558.

[4] W. A. JOHNSON AND R. F. MEHL. Reaction kinetics in processes of nucleation and growth. Trans. Am. Inst. Mining Metall. Engin. (AIMME) 135 (1939), 416–458.
[5] A. R. LEACH. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[6] B. LEE AND F. M. RICHARDS. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55 (1971), 379–400.
[7] F. M. RICHARDS. Areas, volumes, packing and protein structures. Ann. Rev. Biophys. Bioeng. 6 (1977), 151–176.
[8] H.-M. WILL. Computation of Additively Weighted Voronoi Cells for Applications in Molecular Biology. Diss. ETH 13188, ETH Zürich, Switzerland, 1999.

II.2 Power Diagrams

If we grow the square radii of a finite collection of spheres or balls, we get a decomposition of space into convex polyhedra. This decomposition is known as the power diagram and has a variety of applications in molecular modeling.

Growing square radii. As in Section II.1, we let $B$ be a finite set of balls $b_i$ with centers $z_i$ and radii $r_i$. We grow each ball to radius $\sqrt{r_i^2 + t}$ at time $t$. The set of balls at time $t$ is denoted as $B_t$. The Taylor series expansion of the radius as a function of time is

$$\sqrt{r_i^2 + t} = r_i + \frac{t}{2 r_i} - \frac{t^2}{8 r_i^3} + \ldots$$

The first order approximation of the growth is one half the inverse of the radius. In words, larger balls grow slower than smaller ones. Of course, smaller balls never really catch up except in the limit: for $r_i < r_j$ we have $\lim_{t \to \infty} \big( \sqrt{r_j^2 + t} - \sqrt{r_i^2 + t} \big) = 0$.

We are interested in the surface swept out by the intersection of the spheres bounding $b_i$ and $b_j$ and claim it is a plane. The points that belong to both spheres at time $t$ satisfy $\|x - z_i\|^2 - r_i^2 - t = 0$ and $\|x - z_j\|^2 - r_j^2 - t = 0$. Varying $t$ has the same effect as dropping the requirement that the two expressions vanish. Instead we just require that they both be equal. We have

$$\|x - z_i\|^2 - r_i^2 = \|x - z_j\|^2 - r_j^2, \quad \text{that is,} \quad 2 \langle x, z_j - z_i \rangle = \|z_j\|^2 - r_j^2 - \|z_i\|^2 + r_i^2,$$

which is linear in $x$. We see the circle at which the two spheres intersect sweeps out a plane. As indicated in Figure II.6, this plane may separate the two bounding spheres, intersect both, or lie on the same side of both. Think of the three configurations as snap-shots in an animation in which the center of the small circle moves towards the center of the large circle. At first, the line moves in the same direction but then comes to a halt and reverses its direction moving away from the center of the large circle.

Figure II.6: The line of equal power distance separates the two circles if they are disjoint and not nested, it passes through their intersection if that is non-empty, and it passes outside if the two circles are nested.

Power distance. We can describe the decomposition of space implied by the square radius growth model as a Voronoi diagram for yet another weighted distance function. The appropriate function in this case is the power distance of a point $x$ from a ball $b_i$ defined as the square distance from the center minus the weight, $\pi_i(x) = \|x - z_i\|^2 - r_i^2$. The square of the radius, $r_i^2$, is sometimes referred to as the weight of the point $z_i$. The power distance is negative if $x$ lies inside $b_i$, zero if $x$ lies on the boundary of $b_i$, and positive if $x$ lies outside $b_i$. If $x$ lies outside $b_i$, the power distance of $x$ is the square length of a tangent line segment from $x$ to the bounding sphere. Using the same algebraic manipulations as above, we can show that the set of points with equal power distance from two balls form a plane. The two planes are indeed the same. It follows that the membranes swept out by the arcs of the boundary of the union are pieces of planes.

Power diagram. The power or (weighted) Voronoi cell of a ball $b_i$ under the power distance is the set of points at least as close to $b_i$ as to any other ball. If we denote by $H_{ij}$ the set of points whose power distance from $b_i$ is at most as large as the power distance from $b_j$ then $V_i = \bigcap_j H_{ij}$. Hence, $V_i$ is the intersection of a finite number of half-spaces and thus a convex polyhedron. This polyhedron may be bounded or unbounded, and it is even possible that it is empty. The power or (weighted) Voronoi diagram of $B$ is the collection of cells together with the polygons, edges, and vertices shared by the cells. Every polygon is shared by two cells, and in the generic case every edge is shared by exactly three and every vertex is shared by exactly four cells. Figure II.7 illustrates the definitions in two dimensions by showing the Voronoi diagram of the same eight disks used in earlier figures.
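The power distance and the plane of equal power distance can be checked numerically. The following is a minimal sketch, not part of the Alpha Shape software; the function names `power_distance` and `bisector_plane` are hypothetical, and the weight is assumed to be the square radius as defined above. It also illustrates that adding the same time $t$ to both weights leaves the plane unchanged.

```python
def power_distance(x, center, weight):
    """Square distance from x to the center minus the weight (square radius)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - weight


def bisector_plane(c1, w1, c2, w2):
    """Plane of equal power distance, returned as (normal, offset).

    A point x has equal power distance from both balls iff
    <normal, x> == offset, obtained by expanding both squares.
    """
    normal = tuple(2 * (b - a) for a, b in zip(c1, c2))
    offset = (sum(b * b for b in c2) - w2) - (sum(a * a for a in c1) - w1)
    return normal, offset


# Two disks in the plane: radius 2 at the origin and radius 1 at (3, 0).
normal, offset = bisector_plane((0, 0), 4, (3, 0), 1)
x = (2, 5)  # a point on the line 6 * x1 = 12
same = power_distance(x, (0, 0), 4) == power_distance(x, (3, 0), 1)
```

The sign of `power_distance` tells whether a point lies inside (negative), on (zero) or outside (positive) the ball.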

Figure II.7: Power or weighted Voronoi diagram of eight disks in the plane.

Delaunay triangulation. The (weighted) Delaunay triangulation of $B$ is dual to the (weighted) Voronoi diagram. It is obtained by connecting $z_i$ and $z_j$ by an edge if the cells $V_i$ and $V_j$ share a common polygon. Similarly, $z_i$, $z_j$ and $z_k$ are connected by a triangle if $V_i$, $V_j$ and $V_k$ share a common edge, and $z_i$, $z_j$, $z_k$ and $z_l$ are connected by a tetrahedron if $V_i$, $V_j$, $V_k$ and $V_l$ share a common vertex. Assuming the balls in $B$ are in general position, this exhausts all possible types of overlap among the Voronoi cells. If the balls are not in general position, we can perturb them ever so slightly to move them into general position. We refer to an element of a Delaunay triangulation as a simplex, which can be a vertex, an edge, a triangle or a tetrahedron. Since complexes of tetrahedra are difficult to draw, we illustrate the definitions by showing a two-dimensional Delaunay triangulation in Figure II.8.

Figure II.8: Delaunay triangulation drawn over the dual Voronoi diagram of eight disks in the plane. The Delaunay triangles are transparent so they do not obstruct the structure of the Voronoi diagram underneath.

Observe that we reverse dimensions when we go from the Voronoi diagram to the Delaunay triangulation: cells become vertices, polygons become edges, edges become triangles, and vertices become tetrahedra. Similarly, we reverse the inclusion direction. For example, a Voronoi polygon belongs to a Voronoi cell iff the corresponding Delaunay edge contains the corresponding Delaunay vertex.

Number of simplices. We can count the simplices using the Euler relation, which says that the alternating sum of simplices is always equal to 1. Before counting the simplices in three dimensions, let us warm up to the challenge by counting the simplices of a two-dimensional Delaunay triangulation. Writing $v$, $e$ and $f$ for the numbers of vertices, edges and triangles, we have $v - e + f = 1$. The number of vertices is at most the number of disks, hence $v \leq n$. Observe that every triangle has three edges and every edge belongs to at most two triangles. Hence $3f \leq 2e$. Combining this inequality with the Euler relation implies $e \leq 3v - 3$ and $f \leq 2v - 2$.

In three dimensions, we write $v$, $e$, $f$ and $t$ for the numbers of vertices, edges, triangles and tetrahedra. The Euler relation here is $v - e + f - t = 1$. The number of vertices is at most the number of balls, $v \leq n$, and the number of edges is at most the number of pairs of vertices, $e \leq \binom{n}{2}$. Similarly, we note that each tetrahedron has four triangles and each triangle belongs to at most two tetrahedra, hence $4t \leq 2f$. Combining this with the Euler relation implies $t \leq e - v + 1$ and $f \leq 2(e - v + 1)$, both of which are less than $n^2$. There are Delaunay triangulations that have almost this many simplices, but they require a placement of the balls that would be rather unlike the configurations we observe for proteins. Typically, each atom is surrounded by its neighbors in the Delaunay triangulation. The neighbors are near the central atom and are therefore packed in a small amount of space, implying there can only be a small constant number of them. It follows that the number of edges in the Delaunay triangulation is at most some constant times $n$, and as a consequence, also the number of triangles and tetrahedra are at most some constant times $n$.
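The two-dimensional count can be checked on a small example. The sketch below is illustrative only; the helper `count_simplices` and the sample triangulation (a square with one interior point) are assumptions made for this example, with $v$, $e$, $f$ denoting the numbers of vertices, edges and triangles.

```python
from itertools import combinations


def count_simplices(triangles):
    """Return (v, e, f) for a triangulation given as vertex-index triples."""
    vertices = {p for tri in triangles for p in tri}
    edges = {frozenset(pair) for tri in triangles for pair in combinations(tri, 2)}
    return len(vertices), len(edges), len(triangles)


# Square with corners 0..3 and center 4, cut into four triangles.
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
v, e, f = count_simplices(triangles)

euler_holds = (v - e + f == 1)
bounds_hold = (3 * f <= 2 * e) and (e <= 3 * v - 3) and (f <= 2 * v - 2)
```

Here $v = 5$, $e = 8$ and $f = 4$, so the alternating sum is 1 and both linear bounds hold with room to spare.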

Orthospheres. Suppose for a moment that the balls all have zero radius. Then each Voronoi vertex is equally far from four points and coincides with the center of the circumsphere of these points. We will use the concept of orthogonality to generalize this property to the case where the balls have not necessarily zero and not necessarily equal radii. Two spheres or balls $b_i$ and $b_j$ are orthogonal if $\|z_i - z_j\|^2 = r_i^2 + r_j^2$. The name is justified because the two tangent planes defined at any point common to the bounding spheres of $b_i$ and $b_j$ form a right angle between them. The two balls are further than orthogonal if $\|z_i - z_j\|^2 > r_i^2 + r_j^2$. Let now $x$ be a vertex of the Voronoi diagram of $B$. Assuming the generic case, $x$ has equal power distance from four balls, and larger power distance from all others. Let $s$ be the sphere with center $x$ and weight equal to that common power distance. That sphere is orthogonal to the four balls, and we refer to it as the orthosphere of the four balls. Algebraically, there is no difficulty at all if the weight is negative and the radius is therefore imaginary. If the four balls had zero radius, $s$ would be their circumsphere. Note that $s$ is further than orthogonal from all other balls. This property can be used to characterize Delaunay tetrahedra for a generic set of balls. Specifically, a tetrahedron connecting points $z_i$, $z_j$, $z_k$ and $z_l$ belongs to the Delaunay triangulation of $B$ iff the orthosphere of $b_i$, $b_j$, $b_k$ and $b_l$ is further than orthogonal from all other balls in $B$.

Acyclicity. Given a fixed viewpoint, we can order two tetrahedra if one lies in front of the other one, as seen from the viewpoint. We need some notation. Let $x$ be the viewpoint and write $\sigma \prec \tau$ if there is a half-line that emanates from $x$ and passes through the interior of the Delaunay tetrahedron $\sigma$ before it passes through the interior of the Delaunay tetrahedron $\tau$. We call this the visibility ordering with respect to the given viewpoint. It turns out that this relation can in general have cycles but is acyclic for Delaunay triangulations. We use orthospheres to prove that the relation is acyclic.

ACYCLICITY LEMMA. The visibility ordering of the Delaunay tetrahedra with respect to any fixed viewpoint is acyclic.

PROOF. Let $L$ be a half-line that emanates from $x$ and passes through the interiors of $\sigma$ and $\tau$. We may assume that $L$ does not intersect any edge of the Delaunay triangulation. The half-line passes through a sequence of Delaunay tetrahedra, $\sigma = \sigma_0, \sigma_1, \ldots, \sigma_k = \tau$. Any two consecutive tetrahedra share a triangle, and we have $\sigma_i \prec \sigma_{i+1}$ for all $i$. It follows that the orthospheres of $\sigma_i$ and of $\sigma_{i+1}$ are orthogonal to the three balls whose centers span that triangle. The plane of points with equal power distance from the two orthospheres thus contains the shared triangle. The viewpoint is on $\sigma_i$'s side of that plane, which implies that the power distance of $x$ from the orthosphere of $\sigma_i$ is less than that from the orthosphere of $\sigma_{i+1}$. By transitivity, the power distance of $x$ from the orthosphere of $\sigma$ is less than its power distance from the orthosphere of $\tau$. In other words, the power distance increases along chains of the relation $\prec$. Since real numbers are totally ordered, we conclude that $\prec$ is acyclic.

Bibliographic notes. Power diagrams of discrete sets of weighted points have been studied by Carl Friedrich Gauss more than 150 years ago in the context of quadratic forms [6]. In reference to subsequent work by Dirichlet [3] and Voronoi [8], these diagrams are often referred to as Dirichlet tessellations or Voronoi diagrams. The dual triangulations have been introduced considerably later by Boris Delaunay (also Delone) [2]. It is common to reserve the name Delaunay triangulation for unweighted points and to refer to the duals of power diagrams as regular triangulations [1] or coherent triangulations [7]. We prefer to be economical with terms and refer to them as (weighted) Delaunay triangulations. Algorithms for constructing weighted Delaunay triangulations in $\mathbb{R}^2$ and $\mathbb{R}^3$ are discussed in [4, Chapters I and V]. That reference also explains how to computationally cope with ambiguities in the construction caused by non-generic input sets. Upper bounds on the number of Delaunay simplices for "well-spaced" points in $\mathbb{R}^3$ can be found in [5].

[1] L. J. BILLERA AND B. STURMFELS. Fiber polytopes. Ann. Math. 135 (1992), 527–549.
[2] B. DELAUNAY. Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskii i Estestvennyka Nauk 7 (1934), 793–800.
[3] P. G. L. DIRICHLET. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew. Math. 40 (1850), 209–227.
[4] H. EDELSBRUNNER. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.

[5] J. ERICKSON. Dense point sets have sparse Delaunay triangulations. In "Proc. 13th Ann. ACM-SIAM Sympos. Discrete Alg., 2002", 125–134.
[6] C. F. GAUSS. Recursion der Untersuchungen über die Eigenschaften der positiven ternären quadratischen Formen von Ludwig August Seeber. J. Reine Angew. Math. 20 (1840), 312–320.
[7] I. M. GELFAND, M. M. KAPRANOV AND A. V. ZELEVINSKY. Discriminants, Resultants and Multidimensional Determinants. Birkhäuser, Boston, 1994.
[8] G. VORONOI. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine Angew. Math. 133 (1907), 97–178, and 134 (1908), 198–287.

II.3 Alpha Shapes

Recall that the Delaunay triangulation is the dual of the Voronoi diagram. In this section, we generalize this construction and consider the dual of the Voronoi diagram restricted to within the union of the defining balls.

Dual complex. Observe that the Voronoi cells decompose the union of balls in $B$ into convex cells $V_i \cap b_i$. The dual complex records the non-empty common intersections among these cells. Let $X$ be a subset of the index set. The dual complex is

$$K = \Big\{ \sigma_X \;\Big|\; \bigcap_{i \in X} (V_i \cap b_i) \neq \emptyset \Big\},$$

where $\sigma_X$ is the convex hull of the centers of the balls with index in $X$. Note that this is just a more formal way of explaining the duality transformation we used in the last section to construct the Delaunay triangulation from the Voronoi diagram. The underlying space is the set of points contained in simplices of $K$. In this context, we refer to it as the dual shape of $B$. Figure II.9 illustrates the definition for the set of disks used in many of the previous figures. The nine edges correspond to the pairwise intersections and the two triangles to the triplewise intersections of the clipped Voronoi cells. In the special case, in which the balls have non-empty pairwise but no non-empty triple-wise intersections, the dual complex looks like the ball-and-stick diagram common in chemistry and biology. There, each stick represents a covalent bond, while here, it represents the geometric overlap between two balls.

Figure II.9: The dual complex is drawn on top of the Voronoi decomposition of the union of disks.

Independence. Recall that a simplex belongs to the dual complex iff the corresponding clipped balls (the $V_i \cap b_i$) have a non-empty common intersection. Equivalently, $\sigma_X \in K$ iff the common intersection of Voronoi cells has a non-empty intersection with the union of balls: $\bigcap_{i \in X} V_i \cap \bigcup B \neq \emptyset$. This condition has an interesting consequence on how the $b_i$ themselves may intersect. We first discuss this pattern for general sets that are not necessarily balls. Call a collection of sets independent if for every subcollection there is a point inside every set in the subcollection and outside every set not in the subcollection. A collection of size $k$ has $2^k$ subcollections. For this collection to be independent, there must be $2^k$ points whose patterns of inclusion in the sets are pairwise different.

We use the pigeonhole principle to show that the maximum number of independent disks in the plane is three. Let $c_k$ be the maximum number of regions we can get by drawing $k$ circles in the plane. We have $c_1 = 2$ and $c_{k+1} \leq c_k + 2k$ because the $(k+1)$-st circle intersects the other $k$ circles in at most two points each. These points cut the $(k+1)$-st circle into at most $2k$ arcs, and each arc cuts at most one region into two. The number of regions is therefore $c_k \leq k^2 - k + 2$. Hence, $c_4 \leq 14 < 16 = 2^4$, which implies that at most three disks can be independent. For each $k \leq 3$ there is a (combinatorially) unique independent configuration shown in Figure II.10. The same argument also works in three dimensions, where it can be used to show that the maximum number of independent balls is four. Again, there is only one possible intersection pattern for four independent balls. In a nut-shell, there can be at most four balls (one more than the dimension of the space), and they can form only one combinatorially distinct intersection pattern.

Figure II.10: The independent configurations of one, two, and three disks in the plane.
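The pigeonhole argument for disks can be replayed in a few lines. This is a small sketch under the counting rule stated above (each new circle is cut into at most twice as many arcs as there are earlier circles, and each arc splits at most one region); the function name `max_regions` is made up for the illustration.

```python
def max_regions(k):
    """Upper bound on the number of regions formed by k circles in the plane."""
    regions = 2  # a single circle separates inside from outside
    for earlier in range(1, k):
        # the next circle meets each earlier circle in at most two points,
        # so it is cut into at most 2 * earlier arcs
        regions += 2 * earlier
    return regions


# k circles can be independent only if there is room for 2**k inclusion patterns
feasible = [k for k in range(1, 8) if max_regions(k) >= 2 ** k]
```

The list `feasible` contains exactly the sizes for which the region count does not rule out independence.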

Independent simplices. Recall that each simplex in the Delaunay triangulation is spanned by the centers of a small collection of balls, for example four for a tetrahedron, three for a triangle, and so on. In discussions of combinatorial properties, we sometimes forget the difference and think of the simplex as this collection of balls. In this spirit, we call the simplex independent if the collection of balls is independent. We will prove shortly that all simplices in the dual complex are independent. This is a fairly strong statement since it limits the balls to a single intersection pattern. The following lemma is the key to proving that all simplices in the dual complex are independent.

INDEPENDENCE LEMMA. A collection of four balls in $\mathbb{R}^3$ is independent iff the (unique) vertex of the corresponding Voronoi diagram is contained in the union of the four balls.

The lemma holds in any dimension, and can be proved by induction over the dimension. To avoid the complications of a discussion for general dimensions, we assume the lemma for disks (or rather for caps on a sphere) and prove it for balls in $\mathbb{R}^3$.

PROOF. Let $x$ be the Voronoi vertex, and assume first that $x$ lies outside the union. The sphere bounding one of the balls intersects the other balls in three caps. The circles bounding these caps lie in the three planes bounding the Voronoi cell of that ball. The three planes meet at $x$, and because $x$ lies outside the sphere, the three caps are not independent. So there exists a subset not represented by any point on the sphere. It can still be that a point off the sphere represents this subset, but even then the collection of four balls is not independent. A particular such configuration is illustrated in Figure II.11. To prove the reverse, we assume that the collection is not independent. Then one of the balls intersects the other three balls in three non-independent caps. But this implies that the Voronoi vertex lies outside the sphere bounding that ball. The same holds for every ball in the collection, that is, the Voronoi vertex lies outside the union.

Figure II.11: The planes bounding the Voronoi cell intersect the sphere in three circles.

Given three balls, we get three disks of maximum size by intersecting them with the plane that passes through the centers. This plane intersects the Voronoi diagram of the balls in the Voronoi diagram of the disks. As mentioned above, the Independence Lemma also holds for three disks in the plane. But this implies that three balls are independent iff the (unique) line in the corresponding Voronoi diagram has a non-empty intersection with the union of the three balls. Similarly, two balls are independent iff the (unique) plane in the corresponding Voronoi diagram has a non-empty intersection with their union. But this is exactly the criterion for a simplex to belong to the dual complex. It follows that each simplex in $K$ is independent, as claimed.

Filtration. We return to the idea of growing the balls continuously and watch how the union changes. We let time go from $-\infty$ to $\infty$ and grow the weight of each ball to $r_i^2 + t$ at time $t$. Each ball has zero weight at time $-r_i^2$ and negative weight and therefore imaginary radius before that time. Let $B_t$ be the collection of balls and $K_t$ the dual complex of $B_t$ at time $t$. By construction, the Voronoi cells of the balls are unchanged at all times. It follows that the dual complexes that arise throughout time are subcomplexes of one and the same Delaunay triangulation. Furthermore, since the portions of the Voronoi cells covered by the balls can only grow, the dual complexes can also only get larger in time. For small enough (large enough negative) time, all radii are imaginary, and the dual complex is empty. For large enough time, the union covers all Voronoi vertices, and the dual complex is equal to the Delaunay triangulation. There are only finitely many simplices and therefore only finitely many subcomplexes of the Delaunay triangulation $D$ that arise as dual complexes during the growth process. We thus have a sequence of complexes that begins with the empty complex and ends with the Delaunay triangulation, $\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = D$. We refer to this sequence as a filtration of the Delaunay triangulation. Figure II.12 illustrates the construction by showing three complexes in the filtration generated by eight disks in the plane.

Instead of time, we use the square root, $\alpha = \sqrt{t}$, as the index for time varying sets. The main reason for this convention is that for $r_i = 0$ the radius of the ball at time $t$ is $\alpha$. We refer to $K_\alpha$ as the $\alpha$-complex and to its underlying space as the $\alpha$-shape of $B$. To translate between continuous time and the discrete rank of a complex in the filtration, we define a function that maps each time to the rank of the dual complex at that time.

Figure II.12: Three unions of disks and the corresponding dual complexes. The first complex contains all vertices but only two edges and no triangles. From the first to the third complex, the edges become thinner and the triangles become lighter.

Ordering simplices. We can sort the Delaunay simplices in the order in which they enter the dual complex. Define the birth-time of a simplex $\sigma$ as the minimum time $t$ such that $\sigma \in K_s$ for all $s \geq t$. The difference between two contiguous complexes in the filtration consists of all simplices whose birth-time coincides with the creation of the second complex. Often two contiguous complexes $K_i$ and $K_{i+1}$ differ by only one simplex, $\sigma$. In the generic case, the birth-time of $\sigma$ coincides with the time it becomes independent. Let the orthosphere of $\sigma$ be the smallest sphere orthogonal to all balls whose centers are vertices of $\sigma$. The time $\sigma$ becomes independent is also the time the orthosphere of $\sigma$ dies or shrinks to a point. Geometrically, this case is characterized by a non-empty common intersection between the affine hull of $\sigma$ and the Voronoi cells of its vertices.

Sometimes, however, the difference between $K_i$ and $K_{i+1}$ consists of two or more simplices. In the absence of any degeneracy, all these simplices are faces of a single simplex, $\tau$, that also belongs to the difference. All these simplices are born at the same time. In this case, their orthospheres die at different times, with the orthosphere of $\tau$ dying last, at the common birth-time. Figure II.13 illustrates this case. The triangle connecting all three centers and the edge connecting the centers of the two larger disks are born at the same time, namely when all three disks reach the shared Voronoi vertex. This is also the time when the three disks become independent, but the pair of larger disks became independent earlier.

Figure II.13: The two larger disks are independent, but the dual edge does not belong to the dual complex because their common intersection is disjoint from the corresponding Voronoi edge.

We represent the filtration by sorting the Delaunay simplices by birth-time, and in case of a tie by dimension. Remaining ties are broken arbitrarily. Every dual complex is a prefix of this ordering, and because of the tie breaking rule, every prefix is a complex, even if it does not coincide with a dual complex. This property of the ordering will be crucial for the algorithm in Chapter IV that computes connectivity.

Bibliographic notes. Alpha shapes and alpha complexes have been introduced by Edelsbrunner, Kirkpatrick and Seidel [3] in 1983 for finite sets of points in the plane. About a decade later, the concept has been generalized to three dimensions and made available as a software package with graphical user interface [4]. That generalization benefitted from adopting the language of simplicial complexes, which has been developed decades earlier in the area of combinatorial topology [1, 5]. The unexpected popularity of that software in structural biology triggered the development of further geometric concepts useful in structural biology, some of which are explained in this book. The main reason for the popularity is the duality between space-filling diagrams and alpha shapes as explained in this and the two preceding sections. To fully develop that duality, alpha shapes had to be extended to take into account weights, and this has been described in complete generality in [2].

[1] P. S. ALEXANDROV. Combinatorial Topology. Dover, New York, 1998 (republication of translation of the original Russian edition from 1947).
[2] H. EDELSBRUNNER. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.

[3] H. EDELSBRUNNER, D. G. KIRKPATRICK AND R. SEIDEL. On the shape of a set of points in the plane. IEEE Trans. Inform. Theory IT-29 (1983), 551–559.
[4] H. EDELSBRUNNER AND E. P. MÜCKE. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] P. J. GIBLIN. Graphs, Surfaces and Homology. Second edition, Chapman and Hall, London, 1981.
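The sorting rule described under "Ordering simplices" (by birth-time, ties broken by dimension) can be illustrated with a toy filtration. The sketch below is an assumption-laden illustration and not the data structure of the Alpha Shape software: simplices are represented as frozensets of vertex indices, the birth-times are invented numbers mimicking the situation of Figure II.13, and the function names are hypothetical.

```python
from itertools import combinations


def filtration_order(birth_time):
    """Sort simplices by birth-time, then dimension, then vertex indices."""
    return sorted(birth_time, key=lambda s: (birth_time[s], len(s), sorted(s)))


def every_prefix_is_complex(order):
    """Check that each simplex appears only after all of its proper faces."""
    seen = set()
    for simplex in order:
        for size in range(1, len(simplex)):
            for face in combinations(simplex, size):
                if frozenset(face) not in seen:
                    return False
        seen.add(simplex)
    return True


# Three disks; edge {0, 1} and triangle {0, 1, 2} are born at the same time.
birth_time = {
    frozenset({0}): -4, frozenset({1}): -4, frozenset({2}): -1,
    frozenset({0, 2}): 3, frozenset({1, 2}): 4,
    frozenset({0, 1}): 5, frozenset({0, 1, 2}): 5,
}
order = filtration_order(birth_time)
ok = every_prefix_is_complex(order)
```

Breaking the tie by dimension places the edge before the triangle, which is what makes every prefix of the ordering a complex.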

II.4 Alpha Shape Software

This section introduces the basic Alpha Shape software and explains how to go from a standard description of protein structures to the visualization of their alpha shapes. The discussion is more descriptive and less analytical than in the previous three sections. Specifically, we take four steps to construct and visualize alpha shapes in an interactive graphical user interface:

    > pdb2alf name.pdb name
    > delcx name
    > mkalf name
    > alvis name

The details of the discussion apply to Version 4.1 of the Alpha Shape software executed on an SGI workstation running under the UNIX operating system and may differ for other versions and platforms.

Data format. The main public source for structural protein data is the Protein Data Bank (pdb) mentioned in Section I.3. Only a fraction of the information is needed to construct alpha shapes. Specifically, for each atom we only need its coordinates in three-dimensional space and its radius. The coordinates are explicitly given in the file, but the radius must be inferred from the atom type. This is done according to published translation tables that map atoms to van der Waals radii. Unfortunately, there is no universally agreed upon table. Some differences are due to different methods used to derive radii, including measurements of closest approach, molecular mechanics calculations, etc. One of the most problematic elements is hydrogen (H), which accounts for almost 50% of the number of atoms found in organic matter. Hydrogen atoms sometimes donate their electrons to complete the shells of other atoms and thus can exist without any shell and radius to speak of. Hydrogen atoms are generally not represented in pdb-files, but can be inferred to some accuracy from the types and relative positions of the other atoms in the protein. In the common unified atom model, the van der Waals radii of larger atoms are adjusted to include the bonded hydrogen atoms.

We can extract the coordinates and the radii using software that is part of the Alpha Shapes distribution. Given a pdb-file, name.pdb, we call

    > pdb2alf -r 1.4 name.pdb name

to read name.pdb and create a new file name that contains a line for each atom listing its three coordinates and the van der Waals radius. The -r option allows for the specification of a radius increment that is applied to every atom in the file. In our example, this radius increment is 1.4 Å, which is the most common approximation used for the size of water molecules. The resulting set of balls thus defines the solvent accessible diagram representing the interaction with the surrounding water, see Section II.1.

Delaunay triangulation. The first step towards computing alpha shapes is to construct the Delaunay triangulation of the set of balls. This is accomplished by the command

    > delcx name

The Delaunay complex program creates a file name.dt that represents the Delaunay triangulation. We briefly mention the algorithmic ingredients used. The basic strategy is incremental, adding one ball at a time to the triangulation. Using an arbitrary ordering of the balls, we write $B_i$ for the set of the first $i$ balls and $D_i$ for the Delaunay triangulation of $B_i$. With this notation, the algorithm can be written as follows.

    for i = 1 to n do
        INSERT b_i into D_{i-1} to obtain D_i
    endfor.

The $i$-th ball is inserted through a sequence of flip operations. The flips are performed depending on the outcomes of only two types of primitive tests needed in the construction of the Delaunay triangulation:

ORTHOGONALITY: decide whether a ball is closer or further than orthogonal to the orthosphere of four other balls.
ORIENTATION: decide whether a ball center is on the positive or negative side of the oriented plane spanned by three other ball centers.

Both tests reduce to the sign of the determinant of a small matrix and can be decided without computing intermediate geometric information.

The efficient and robust construction of the Delaunay triangulation in $\mathbb{R}^3$ is not entirely straightforward. The operations are ambiguous if the balls are in non-generic position, and so is the Delaunay triangulation. To cope with the related robustness problem, we use exact arithmetic and simulated perturbation. Exact arithmetic guarantees the correct execution of flips in all generic and therefore unambiguous cases.
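The ORIENTATION test can be sketched in a few lines. This is a minimal illustration rather than the code in the Alpha Shapes distribution: the function name `orientation` and the choice of which side counts as positive are assumptions made here, and Python's `fractions.Fraction` stands in for the exact arithmetic discussed above.

```python
from fractions import Fraction


def orientation(a, b, c, d):
    """Return +1, -1, or 0: the sign of det[b - a; c - a; d - a].

    Assumed convention (not taken from the text): +1 means d lies on the
    positive side of the oriented plane through a, b, c; 0 means the four
    points are coplanar, which is the non-generic case.
    All arithmetic is exact, so the sign is never corrupted by round-off.
    """
    rows = [[Fraction(q[i]) - Fraction(a[i]) for i in range(3)]
            for q in (b, c, d)]
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = rows
    det = (x1 * (y2 * z3 - z2 * y3)
           - y1 * (x2 * z3 - z2 * x3)
           + z1 * (x2 * y3 - y2 * x3))
    return (det > 0) - (det < 0)


if __name__ == "__main__":
    a, b, c = (0, 0, 0), (1, 0, 0), (0, 1, 0)
    print(orientation(a, b, c, (0, 0, 1)))   # above the plane
    print(orientation(a, b, c, (1, 1, 0)))   # coplanar, needs perturbation
```

The coplanar outcome is exactly the ambiguous case that simulated perturbation resolves; the sketch only reports it and does not attempt to break the tie.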

Simulated perturbation reduces ambiguous cases in a consistent manner to unambiguous ones. The use of exact rather than floating-point arithmetic poses a challenge to the efficiency of the code. A common remedy is to use so-called floating-point filters: calculate in floating-point arithmetic, bound the error, and redo the computation in exact arithmetic if the error is too large to guarantee a correct decision.

Another challenge to the efficiency of the code is the inherent size of the Delaunay triangulation. As mentioned in Section II.2, the Delaunay triangulation in $\mathbb{R}^3$ can have a number of simplices that is quadratic in $n$. For example, if the centers of the balls lie on the moment curve and all radii are equal, then every pair of vertices forms an edge in the Delaunay triangulation, as shown in Figure II.14. Fortunately, the balls of organic molecules are usually well packed and have Delaunay triangulations of size at most proportional to $n$. The danger remains that one of the intermediate triangulations is large. Then we spend a lot of time constructing that triangulation, only to destroy most of it before arriving at the final triangulation. This danger is quite real as systematic enumerations of the data tend to generate subconfigurations with relatively large Delaunay triangulations. The remedy here is to add the balls in a random sequence. In other words, we apply a random permutation to the input sequence and construct the Delaunay triangulation following this permutation.

Figure II.14: Edge-skeleton of the Delaunay triangulation of twenty one points on the moment curve in $\mathbb{R}^3$.

Filtration. As explained in Section II.3, dual complexes obtained by growing the square radii form a nested sequence of subcomplexes of the Delaunay triangulation. This is the filtration of $\alpha$-complexes. We represent the filtration by the sequence of Delaunay simplices ordered by birth-time. The sequence is generated by calling

    > mkalf name

The make alpha shape filtration program reads the Delaunay triangulation in name.dt and generates a new file, name.alf, that stores the filtration along with some auxiliary data structures. The software refers to the sorted sequence of simplices as the 'masterlist'. It stores each simplex $\sigma$ several times, marking when $\sigma$ is born, when $\sigma$ becomes a face of another simplex, and when $\sigma$ becomes interior to the alpha complex. Suppose the three events happen at times $\alpha_1 \le \alpha_2 \le \alpha_3$. Then

$$\sigma \text{ is } \begin{cases} \text{not in } K_\alpha & \text{if } \alpha < \alpha_1, \\ \text{singular} & \text{if } \alpha_1 \le \alpha < \alpha_2, \\ \text{regular} & \text{if } \alpha_2 \le \alpha < \alpha_3, \\ \text{interior} & \text{if } \alpha_3 \le \alpha. \end{cases}$$

The combinatorial topology term for being singular is principal and means that $\sigma$ is not a face of any other simplex. The simplex is regular if it belongs to the boundary but is not principal, and it is interior if it is completely surrounded by other simplices. Some of the three events may coincide. For example, a simplex whose orthosphere dies strictly before the simplex is born is never singular, so $\alpha_1 = \alpha_2$. A simplex in the boundary of the Delaunay triangulation can never become interior, so $\alpha_3 = \infty$. Finally, a tetrahedron is interior as soon as it is born, so $\alpha_1 = \alpha_2 = \alpha_3$.

The main reason for recording all this information is to determine how to draw $\sigma$ in the graphical interface. Given a value of $\alpha$, we need quick access to the simplices of the various types in $K_\alpha$. For this purpose, we store the existence intervals in a number of interval trees. Each such tree stores some number $m$ of intervals in space O($m$), and for a given moment $\alpha$, it enumerates the $k$ simplices whose intervals contain $\alpha$ in time O($\log m + k$).
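The classification of a simplex by its three event times can be sketched as follows. The function name `classify`, the argument names, and the half-open intervals are assumptions made for this illustration; the layout of the masterlist in the name.alf file is not modelled.

```python
import math


def classify(alpha, born, attached, interior=math.inf):
    """Classify a simplex at time alpha from its three event times.

    born     : time at which the simplex enters the complex
    attached : time at which it becomes a face of another simplex
    interior : time at which it becomes interior (inf if it never does)
    Assumption of this sketch: each state holds on a half-open interval
    [start, next_start).
    """
    if not (born <= attached <= interior):
        raise ValueError("event times must be non-decreasing")
    if alpha < born:
        return "not in complex"
    if alpha < attached:
        return "singular"
    if alpha < interior:
        return "regular"
    return "interior"


if __name__ == "__main__":
    # A triangle with three distinct events.
    for alpha in (0.5, 1.0, 2.5, 3.0):
        print(alpha, classify(alpha, 1.0, 2.0, 3.0))
    # A tetrahedron: all three events coincide.
    print(classify(1.0, 1.0, 1.0, 1.0))
```

Leaving `interior` at its default of infinity models a simplex on the boundary of the Delaunay triangulation, which stays singular or regular for all later times.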

the software reached version 4. in wireframe. complexes are shown in the first but which complex is shown and how it is shown is decided in the other two panels.16: Signature panel of the Alpha Shape visualizer. shaded. 1-skeleton of the Delaunay triangulation shown in Figure II. Specifically. as shown in Figure II. By default. For example. The buttons in the middle of the scene panel provide control over how simplices are drawn: colored. the three default signatures map each index to the number of singular edges. The visualized complex is selected in the signature panel. The best documentation of the algorithm and data structures used in the software are still his thesis [6] and the original paper on the topic [4]. Different settings can be used to highlight different aspects of an alpha complex.15: Four alpha complexes of gramicidin. edges. All signatures that count rather than measure are displayed in log-scale. £  ¤ Bibliographic notes. After a period of rapid development directed by Ping Fu at the National Center for Supercomputing Applications. the panel displays a variety of functions (or signatures) that illustrate how the complexes change with time. Instead of mapping the time to a property of . which is still the most recent version distributed on the web [7]. the area of the boundary. A survey of geometric measure- £ ¡ ¤   ¤  ¡ ¡ ¤ ¨ . triangles and the regular triangles are shown.16 shows the signathe underlying space of ture panel and the three default signatures for gramicidin. A particular index.17: Scene panel of the Alpha Shape visualizer. The Alpha Shape software was created by Ernst M¨ ucke as part of his doctoral work at Urbana-Champaign. To facilitate the reconstruction of the map from time. only the singular vertices. is selected by the position of a vertical bar in the signature panel and by clicking the Alpha Shape button in the scene panel. the panel contains a signature that maps the index to time. 
the signatures map the index to the property of . the largest resource for structural protein data is the Protein Data Bank [1]. Figure II. For example. Figure II. The matrix on the right hand side can be used to select the types of displayed simplices. or with gaps created through a slow explosion.17. The interval tree used for fast retrieval of simplices is explained in [2]. As mentioned earlier.32 II G EOMETRIC M ODELS Figure II. the D #D ¨   £  ¤ Figure II.1 in 1996. The Delaunay triangulation software in the Alpha Shapes distribution is based on a variety of algorithmic techniques described in a recent text by Edelsbrunner [3].14 is obtained by drawing all edges of the last alpha complex while suppressing the display of all triangles and tetrahedra. it shows the log-scale graph of . . To support that selection. seamless. which can be accessed via the web [8]. and the volume of .

W ESTBROOK . J. Shapes and Implementations in Threedimensional Geometry. The Protein Data Bank. Math. [8] Protein Data Bank web-site at www. Cambridge Univ. England. Graphics 13 (1994). ¨ [4] H. [1] H. I. areas. B OURNE . 235–242. N. F ENG . 33 . ACM Trans.edu. Sci. M. Comput. Kluwer. Protein geometry: distances. and volumes. Press. A new approach to rectangle intersections – part I. [7] Alpha Shapes web-site at www.).4 Alpha Shape Software ments of proteins including a discussion of different tables for van der Waals radius assignment can be found in [5]. 1993. 2001. B ERMAN . Internat. M UCKE . Nucleic Acids Res. [2] H. Illinois. M. 28 (2000). Chapter 22 in The International Tables for Crystallography. UIUCDCS-R-93-1836. the Netherlands. [5] M. Vol. Rossmann and E.duke. Arnold (eds. see also the software collection in biogeometry. G ERSTEIN AND F. E DELSBRUNNER . G. 2001. M UCKE . P. F. H. 531–539. E.org/pdb. J. Univ. E DELSBRUNNER AND E. T.org. G ILLI LAND ..rcsb. S HINDYALOV AND P. E DELSBRUNNER . G.alphashapes. Comput. Dept. Rept. Dordrecht. Three-dimensional alpha shapes. 13 (1983). P. M. R ICHARDS . Urbana. 209– 219.II. ¨ [6] E. Geometry and Topology for Mesh Generation. 43–72. Z. W EISSIG . N. [3] H. B HAT.

Exercises

Binomial coefficients. Let $k$ and $n$ be two positive integers and recall that the binomial coefficient $\binom{n}{k}$ is the number of ways we can choose $k$ elements from a collection of $n$ elements. Recall also that $\binom{n}{k} = 0$ unless $0 \le k \le n$.
(i) Show that […].
(ii) Show that […].
[We note that the relation in (ii) neatly generalizes the formula […]. The generalization is not quite as neat if we sum powers rather than binomial coefficients.]

Sphere arrangements. Consider the maximum number of cells we get by drawing $n$ spheres in $\mathbb{R}^3$.
(i) Show that […].
(ii) Give a formula for […] that works for all positive […].
[You might consider answering question (ii) before question (i).]

Independent half-spaces. A half-plane is the set of points on or on one side of a line in $\mathbb{R}^2$. Similarly, a half-space is the set of points on or on one side of a plane in $\mathbb{R}^3$, and a cap is the intersection of a sphere with a half-space. […] What is the maximum number of independent
(i) half-planes in $\mathbb{R}^2$,
(ii) half-spaces in $\mathbb{R}^3$,
(iii) caps on a sphere in $\mathbb{R}^3$, disks in $\mathbb{R}^2$?

Tree-like sequences. Given an alphabet of $n$ letters, form a sequence but refrain from placing any letter twice in a row. The sequence is tree-like if there are no two letters that alternate more than twice. In other words, subsequences of the form […] and […] are prohibited. Examples of tree-like sequences of four letters are […] and […].
(i) Prove that a tree-like sequence over an alphabet of $n$ letters has length at most […]. Is this bound tight?
(ii) Define a tree-like cyclic sequence by prohibiting cyclic subsequences of the form […]. Prove that a tree-like cyclic sequence over an alphabet of $n$ letters has length at most […]. Is this bound tight?

Number of arcs. Let $B$ be a set of $n$ disks in the plane. The boundary of the union of the disks consists of circular arcs contributed by the circles.
(i) Assuming the boundary of the union is a single closed curve, use tree-like cyclic sequences to prove that it consists of at most […] (maximal) circular arcs. Is this bound tight?
(ii) Prove that in general the number of (maximal) circular arcs in the boundary of the union is at most […]. Is this bound tight?

Empty Voronoi cell. Call a disk in a finite collection of disks redundant if its Voronoi cell is empty.
(i) Prove that if there are disks […] in the collection such that (a) […] for the orthocenter of […], and (b) […] lies in the triangle […], then the disk is redundant.
(ii) Prove that the sufficient conditions given in (i) are also necessary. In other words, prove that if the disk is redundant then there exist disks […] that satisfy Conditions (a) and (b).

Barycentric subdivision. The barycentric subdivision of a simplex is obtained by adding the barycenter of the simplex (also known as the centroid or center of mass) as a new vertex and connecting it to the simplices in the barycentric subdivisions of the faces.
(i) How many vertices, edges, triangles and tetrahedra are in the barycentric subdivision of a tetrahedron?
(ii) Use the Alpha Shape software to create the barycentric subdivision of a regular tetrahedron. [You will need to use weights to make the barycentric subdivision of the tetrahedron the Delaunay triangulation of the points.]

The filtration of water. A water molecule consists of one oxygen and two hydrogens: H$_2$O.
(i) Look up the standard geometric model (determined by radii, bond length and bond angle).
(ii) Describe the Voronoi diagram and the sequence of alpha complexes of the model.

Chapter III

Surface Meshing

Recall the different types of space-filling diagrams we discussed in Chapter II. The van der Waals and the solvent accessible models are both unions of finitely many balls in three-dimensional space and differ only in the radii. We have also discussed the molecular surface model that is obtained by rolling a sphere about the van der Waals model. Corners and crevices are filled up and the surface consists of spheres connected by blending torus patches and inverted sphere patches. In this chapter, we introduce a model that is similar to the molecular surface. We call this the molecular skin model. Its surface consists of spheres connected by blending hyperboloid patches and inverted sphere patches. The surface is piecewise quadratic and has a number of attractive properties not shared by the other space-filling models. One is the continuity of the normal direction, another the continuity of the maximum principal curvature. Both properties are crucial for the construction of good quality meshes, which may be used to support numerical computations over the surface. Another interesting property is an inside-outside symmetry that implies the existence of locally perfectly complementary molecular skin models. In other words, for each cavity we may construct a molecular skin representation whose boundary matches that of the molecule. The molecular skin also lends itself to represent deformations, and some of the possibilities along these lines will be discussed in Chapter VIII.

This chapter is organized in four sections. In Section III.1, we give the geometric definition of the molecular skin and show how it can be decomposed into quadratic patches. In Section III.2, we discuss various notions of curvature of a surface, and we show that the maximal principal curvature is a continuous map over the molecular skin. In Section III.3, we describe the algorithm that constructs a molecular skin in terms of a triangle mesh. Finally in Section III.4, we present software for constructing molecular skin in two- and three-dimensional space, and we use that software to illustrate some of the properties of these curves and surfaces.

III.1 Molecular Skin
III.2 Curvature
III.3 Adaptive Meshing
III.4 Skin Software
Exercises

III.1 Molecular Skin

Almost everything we will say in this section applies equally well to spheres of any fixed dimension. Even though the case of spheres in $\mathbb{R}^3$ is most relevant for the study of molecules, there is sufficient pedagogical advantage to first talk about circles in $\mathbb{R}^2$.

Circles and paraboloids. Recall that the weighted square distance function of a circle with center $z$ and radius $r$ is the map $\pi: \mathbb{R}^2 \to \mathbb{R}$ defined by $\pi(x) = \|x - z\|^2 - r^2$. As illustrated in Figure III.1, its graph is a paraboloid of revolution in $\mathbb{R}^3$ that intersects $\mathbb{R}^2$ in the circle. In other words, the circle is the zero-set of the weighted square distance function.

Figure III.1: A circle in $\mathbb{R}^2$ is the zero-set of its weighted square distance function.

All paraboloids that arise as weighted square distance functions have the form $\pi(x) = \|x\|^2 - 2\langle z, x\rangle + \|z\|^2 - r^2$. The three parameters correspond to the three degrees of freedom represented by the center and the radius. Functions $\mathbb{R}^2 \to \mathbb{R}$ form a vector space under the usual notions of scaling and addition. We will use only a subspace of that vector space, namely the one consisting of functions of the above form. Given a collection of such functions $\pi_i$, we can generate another such function by affine combination, $\pi = \sum_i \lambda_i \pi_i$, where the $\lambda_i$ are real numbers with $\sum_i \lambda_i = 1$. The new function is a convex combination of the $\pi_i$ if all $\lambda_i$ are non-negative. We compute the center and radius of the zero-set of $\pi$. We have

$$\pi(x) = \sum_i \lambda_i \left( \|x\|^2 - 2\langle z_i, x\rangle + \|z_i\|^2 - r_i^2 \right) = \|x\|^2 - 2\Big\langle \sum_i \lambda_i z_i, x \Big\rangle + \sum_i \lambda_i \left( \|z_i\|^2 - r_i^2 \right).$$

The center is therefore $z = \sum_i \lambda_i z_i$ and the square radius is $r^2 = \|z\|^2 - \sum_i \lambda_i (\|z_i\|^2 - r_i^2)$.

Pencils. It is possibly easier to develop an intuition for combining circles than for combining paraboloids. Given a collection of circles, the affine hull is the set of zero-sets of affine combinations of the corresponding weighted square distance functions, and similarly the convex hull is the subset of zero-sets of convex combinations. Given two intersecting circles $a$ and $b$, the affine hull consists of all circles that pass through the same two intersection points. Indeed, if $\pi_a(x) = \pi_b(x) = 0$ then $\lambda_a \pi_a(x) + \lambda_b \pi_b(x) = 0$ for all coefficients $\lambda_a$ and $\lambda_b$. The centers of the circles in the affine hull are therefore the points on the line that passes through $z_a$ and $z_b$. We call the resulting family a pencil of circles, like the vertical family sketched in Figure III.2. If instead of the affine hull we take the convex hull, then we get the subset of circles whose centers are the points on the line segment with endpoints $z_a$ and $z_b$. If $a$ and $b$ are disjoint then the affine hull is again a pencil but this time of pairwise disjoint circles.

Figure III.2: Circles sampled from a coaxal system consisting of two orthogonal pencils.

Recall that a circle $a$ is orthogonal to $c$ if $\|z_a - z_c\|^2 = r_a^2 + r_c^2$, or equivalently, if $\pi_a(z_c) = r_c^2$. If $c$ is orthogonal to $a$ and to $b$ then it is also orthogonal to every circle in the affine hull of $a$ and $b$. To see this elementary fact, note that for $\pi = \lambda_a \pi_a + \lambda_b \pi_b$ with $\lambda_a + \lambda_b = 1$ we have

$$\pi(z_c) - r_c^2 = \lambda_a \left( \pi_a(z_c) - r_c^2 \right) + \lambda_b \left( \pi_b(z_c) - r_c^2 \right),$$

and thus $\pi(z_c) - r_c^2$ vanishes as required.
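The center and square radius of an affine combination of circles can be checked numerically. The sketch below assumes circles given as (center, radius) pairs in the plane; the function name `affine_combination` is a choice made here for illustration.

```python
import math


def affine_combination(circles, weights):
    """Center and square radius of the zero-set of sum_i w_i * pi_i.

    circles : list of ((zx, zy), r) pairs
    weights : real numbers summing to 1
    The square radius may come out negative, in which case the zero-set
    is a circle with imaginary radius.
    """
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights of an affine combination must sum to 1")
    zx = sum(w * c[0][0] for c, w in zip(circles, weights))
    zy = sum(w * c[0][1] for c, w in zip(circles, weights))
    # Weighted sum of the constant terms ||z_i||^2 - r_i^2.
    const = sum(w * (c[0][0] ** 2 + c[0][1] ** 2 - c[1] ** 2)
                for c, w in zip(circles, weights))
    return (zx, zy), zx * zx + zy * zy - const


if __name__ == "__main__":
    # Two circles of radius sqrt(2) that intersect in (0, 1) and (0, -1).
    pair = [((-1.0, 0.0), math.sqrt(2.0)), ((1.0, 0.0), math.sqrt(2.0))]
    for weights in [(0.25, 0.75), (-1.0, 2.0)]:
        (zx, zy), rr = affine_combination(pair, weights)
        # Every circle in the pencil passes through (0, 1).
        print(weights, (zx, zy), rr, (0.0 - zx) ** 2 + (1.0 - zy) ** 2)
```

In both cases the squared distance from the new center to the intersection point (0, 1) agrees with the new square radius, which is the pencil property stated above.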

Suppose we are now given two circles $a$ and $b$ and two more circles $c$ and $d$ both orthogonal to $a$ and $b$. Then every circle in the affine hull of $c$ and $d$ is orthogonal to both $a$ and $b$ and thus to every circle in the affine hull of $a$ and $b$. In other words, we have two pencils in which each circle in the first pencil is orthogonal to each circle in the second pencil. Such a configuration is illustrated in Figure III.2 and is referred to as a coaxal system.

Envelopes. The convex hull of two circles is an infinite family of circles, but the union of their disks is just the union of the two original disks. We introduce a shrinking operation that reduces small circles less than big ones and this way generates a smooth envelope. Specifically, for a circle $a$ with center $z_a$ and radius $r_a$ we define $\sqrt{a}$ as the circle with the same center and radius $r_a / \sqrt{2}$. Similarly, for a family of circles $\mathcal{F}$ we define $\sqrt{\mathcal{F}} = \{ \sqrt{a} \mid a \in \mathcal{F} \}$. An example can be seen in Figure III.3, which sketches a shrunken pencil of circles.

Figure III.3: The dotted circles belong to the affine hull and the solid circles are reduced.

We are interested in the envelope of a shrunken pencil. Suppose $\mathcal{F}$ is a pencil and all its circles pass through two common points, which we may take to be $(0, 1)$ and $(0, -1)$. We parametrize $\mathcal{F}$ by the coordinate $t$ of the circle centers, $(t, 0)$. The corresponding radius is $\sqrt{t^2 + 1}$. The same parametrization of the family of reduced circles gives radius $\sqrt{(t^2+1)/2}$. The reduced circle with center $(t, 0)$ is the zero-set of

$$F(x_1, x_2, t) = (x_1 - t)^2 + x_2^2 - \frac{t^2 + 1}{2}$$

for fixed value of $t$. The collection of all reduced circles is the projection of the entire zero-set of $F$. It can be visualized as a leaning hour-glass of circles, as shown in Figure III.4. The envelope of $\sqrt{\mathcal{F}}$ is the projection of the silhouette of $F^{-1}(0)$ as viewed along the $t$ direction. It is the set of points for which $\partial F / \partial t$ vanishes. From $\partial F / \partial t = t - 2 x_1 = 0$ we get $t = 2 x_1$. The envelope is therefore the zero-set of $x_2^2 - x_1^2 - \frac{1}{2}$, which is a hyperbola.

Figure III.4: Sections of the zero-set of $F$, viewed from the positive $t$ direction.

Skin and body. More general curves than just hyperbolas can be constructed by taking the convex hull of a finite collection of circles, then shrinking every circle in the family, and finally taking the envelope. Formally, we define $\mathrm{skn}\,\mathcal{F} = \mathrm{env}\,\sqrt{\mathrm{conv}\,\mathcal{F}}$. In other words, the skin of the collection of circles is the envelope of the reduced circles. The smallest non-trivial example is the skin of two circles. If these circles intersect in two points then the skin is a dumbbell, as in Figure III.5. It consists of two circles connected by a blending hyperbola arc.

Figure III.5: The skin of two intersecting circles is the envelope of a reduced line segment of circles.

The body is the union of disks bounded by circles in $\sqrt{\mathrm{conv}\,\mathcal{F}}$. It is the region in $\mathbb{R}^2$ bounded by the skin, and symmetrically, the skin is the boundary of the body. The skin of three circles is already more difficult to understand, at least directly. We thus take an indirect approach and first study what happens when orthogonal circles shrink.
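The envelope of a shrunken pencil can be checked numerically. The sketch assumes coordinates in which all circles of the pencil pass through (0, 1) and (0, -1), so that the circle centered at (t, 0) has square radius t² + 1 and its reduced version has square radius (t² + 1)/2. This coordinate choice and the function names are assumptions of the illustration.

```python
import math


def reduced_power(x, t):
    """Power of the point x with respect to the reduced circle centered
    at (t, 0): zero on the circle, negative inside, positive outside."""
    return (x[0] - t) ** 2 + x[1] ** 2 - (t * t + 1.0) / 2.0


def envelope_point(t):
    """Point at which the reduced circle with parameter t touches the
    upper branch of the hyperbola x2^2 - x1^2 = 1/2 (from t = 2 * x1)."""
    x1 = t / 2.0
    return x1, math.sqrt(0.5 + x1 * x1)


if __name__ == "__main__":
    for t in (-2.0, 0.0, 0.5, 3.0):
        p = envelope_point(t)
        on_own_circle = abs(reduced_power(p, t)) < 1e-12
        # No other reduced circle of the pencil contains p in its interior.
        never_inside = all(reduced_power(p, s / 10.0) > -1e-12
                           for s in range(-100, 101))
        print(t, p, on_own_circle, never_inside)
```

Both checks succeed because, on the hyperbola, the power of a point with respect to the reduced circle with parameter s equals (s - 2 x1)²/2, which is non-negative and vanishes only for s = 2 x1.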

Orthogonality and complementarity. Let $a$ and $b$ be two orthogonal circles. We thus have

$$\|z_a - z_b\|^2 = r_a^2 + r_b^2 \ge \frac{(r_a + r_b)^2}{2}.$$

Furthermore, we have equality iff $r_a = r_b$. Taking roots left and right implies that the radii of $\sqrt{a}$ and $\sqrt{b}$ add up to at most the distance between the two centers. In other words, the reduced versions of any two orthogonal circles touch if they are of the same size and they are disjoint in all other cases.

We apply this result to the coaxal system consisting of orthogonal pencils $\mathcal{F}$ and $\mathcal{G}$, where $\mathcal{F}$ is the affine hull of two intersecting circles. As shown earlier, the envelope of $\sqrt{\mathcal{F}}$ is a hyperbola. We claim that the envelope of $\sqrt{\mathcal{G}}$ is the exact same hyperbola. To see this, we first note that a circle in $\sqrt{\mathcal{G}}$ can at most touch the hyperbola, for if it crossed, we would have two crossing reduced circles contradicting the orthogonality of the two corresponding original circles. Furthermore, every circle in $\sqrt{\mathcal{G}}$ for which there is an equally large circle in $\sqrt{\mathcal{F}}$ touches the hyperbola because it touches that circle. The two envelopes are therefore the same hyperbola. As shown in Figure III.6, the two asymptotic lines of the hyperbola intersect at a right angle. The smallest separating circle that touches both branches belongs to $\sqrt{\mathcal{F}}$ and has the same size as the two osculating circles that both belong to $\sqrt{\mathcal{G}}$. These circles touch the hyperbola and have the same curvature as the hyperbola at that point.

Figure III.6: Hyperbola with orthogonal asymptotic lines, smallest separating circle, and two osculating circles.

The complementarity of the bodies extends from the case of two orthogonal pencils to the case in which $\mathcal{F}$ consists of a single circle and $\mathcal{G}$ contains all circles orthogonal to it. The skin of $\mathcal{F}$ is trivially a circle. The set $\mathcal{G}$ is a two-parameter family spanned by three circles. Suppose $\mathcal{G}$ contains only circles with real radii. As before, the reduced version of every circle in $\mathcal{G}$ touches or avoids the reduced version of the single circle in $\mathcal{F}$, which implies that the skin of $\mathcal{G}$ is the same circle.

Decomposition. The skin of any finite set of circles can be decomposed into simple pieces, each defined by at most three of the circles. A single circle defines a (smaller) circle, a pair of circles defines a hyperbola, and a triplet of circles defines an inverted circle. We thus claim that the skin of $\mathcal{F}$ consists of circles, connected to each other by blending hyperbola and inverted circle arcs. We will not prove this claim and instead give an explicit construction of the decomposition, which is facilitated by a complex assembled from Voronoi and Delaunay polyhedra. As usual, we let $X$ be an index set and use it to denote the Voronoi polyhedron $\nu_X$. The corresponding Delaunay simplex is $\sigma_X$. The corresponding mixed cell is the Minkowski sum of shrunken copies of both, $\mu_X = \frac{1}{2}(\nu_X + \sigma_X)$. If $|X| = 1$ the mixed cell is the shrunken and translated copy of a two-dimensional Voronoi cell. If $|X| = 2$ then $\mu_X$ is the Minkowski sum of two orthogonal edges and therefore a rectangle. If $|X| = 3$ then $\mu_X$ is a shrunken and translated copy of a Delaunay triangle. The mixed complex consists of all mixed cells and their faces. Figure III.7 illustrates the construction by showing the mixed complex decomposing the skin into circle and hyperbola arcs.

Figure III.7: The mixed complex and the skin of four circles.

A rather intuitive explanation of the construction can be obtained by drawing the Voronoi diagram and the Delaunay triangulation on two parallel planes in $\mathbb{R}^3$, as sketched in Figure III.8. We decompose the slab between the two planes into pyramids and tetrahedra, which are the convex hulls of corresponding Voronoi polyhedra and Delaunay simplices. The mixed complex is then obtained by intersecting the pyramids and tetrahedra with the plane parallel to and halfway between the other two planes.
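The mixed cell of a pair of circles can be computed directly from its definition as the Minkowski sum of the Voronoi edge and the Delaunay edge, both scaled by one half. The sketch below assumes a bounded Voronoi edge given by its two endpoints; the input coordinates and the function names are hypothetical.

```python
def mixed_cell_of_pair(voronoi_edge, delaunay_edge):
    """Corners of (voronoi_edge + delaunay_edge) / 2, listed in cyclic order.

    Each edge is a pair of points in the plane. The Minkowski sum of two
    segments is a parallelogram whose corners are the pairwise sums of
    the endpoints.
    """
    (v0, v1), (z0, z1) = voronoi_edge, delaunay_edge

    def half_sum(p, q):
        return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

    return [half_sum(v0, z0), half_sum(v0, z1),
            half_sum(v1, z1), half_sum(v1, z0)]


def is_rectangle(corners, tol=1e-12):
    """True if consecutive sides of the quadrilateral are perpendicular."""
    n = len(corners)
    for i in range(n):
        p, q, r = corners[i], corners[(i + 1) % n], corners[(i + 2) % n]
        dot = (q[0] - p[0]) * (r[0] - q[0]) + (q[1] - p[1]) * (r[1] - q[1])
        if abs(dot) > tol:
            return False
    return True


if __name__ == "__main__":
    # Two equal circles centered at (0, 0) and (4, 0); their Voronoi edge
    # lies on the bisector x = 2 (endpoints chosen for illustration).
    cell = mixed_cell_of_pair(((2.0, -1.0), (2.0, 3.0)),
                              ((0.0, 0.0), (4.0, 0.0)))
    print(cell, is_rectangle(cell))
```

The rectangle test succeeds precisely because the Voronoi edge is orthogonal to the Delaunay edge; feeding in a non-orthogonal pair of segments yields a general parallelogram.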

Figure III.8: The top, middle, and bottom planes carry the Delaunay triangulation, the mixed complex, and the Voronoi diagram.

Symmetry. Note that the construction of the mixed complex is symmetric in the Voronoi diagram and the Delaunay triangulation. In other words, the mixed complex of $\mathcal{F}$ is the same as the mixed complex of the collection $\mathcal{G}$ of circles introduced in Section V.1. [The order of the chapters on skin and pockets has changed now, which requires a local rewrite here and in Section III.4.] As explained there, $\mathcal{G}$ contains a circle centered at each Voronoi vertex (including those at infinity) with the radius chosen so that it is orthogonal to the circles that define the vertex. The Voronoi diagram of $\mathcal{F}$ is then the Delaunay triangulation of $\mathcal{G}$. Similarly, the Delaunay triangulation of $\mathcal{F}$ is the Voronoi diagram of $\mathcal{G}$, and the mixed complexes of $\mathcal{F}$ and $\mathcal{G}$ are the same. We have seen that the skins of two orthogonal pencils are the same hyperbola. Similarly, the skins of one circle and the affine hull of three orthogonal circles are the same circle. Since the mixed complex decomposes the entire skin of $\mathcal{F}$ into such cases, it follows that the skin of $\mathcal{F}$ is the same as that of $\mathcal{G}$. Note however that the two bodies are not the same but rather complementary.

Bibliographic notes. The material of this section is taken from [3], where skin surfaces are introduced as orientable manifolds in $\mathbb{R}^3$. That paper also proves that the body of a finite collection of spheres has the same homotopy type as the dual complex. There is another interpretation of the vector space of circles exploited in this section. It identifies each circle in $\mathbb{R}^2$ with a point in $\mathbb{R}^3$. It has been discovered in the nineteenth century and published at more or less the same time in three different languages by Clifford [1], Darboux [2], and Frobenius [4]. This interpretation is prominently used in the geometry text by Pedoe [5]. Under this interpretation, the convex hull of a set of circles corresponds to the usual convex hull of points in $\mathbb{R}^3$, and the symmetry between $\mathcal{F}$ and $\mathcal{G}$ can be explained as a polarity between two convex polyhedra.

[1] W. K. CLIFFORD. Problem 1748. Mathematical Questions and Solutions from the Educational Times 44 (1865), 144.
[2] M. G. DARBOUX. De points, de cercles et de spheres. Annales de L'Ecole Normale, Series 2 (1872), 323–392.
[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] G. FROBENIUS. Anwendungen der Determinantentheorie auf die Geometrie des Masses. J. Reine Angew. Math. 79 (1875), 185–247.
[5] D. PEDOE. Geometry: a Comprehensive Course. Dover, New York, 1988.

III.2 Curvature

The skin curves introduced in Section III.1 generalize straightforwardly to surfaces in $\mathbb{R}^3$. In this section, we study the curvature of these surfaces. The Curvature Variation Lemma proved at the end of this section will play a major role in the meshing algorithm to be discussed in Section III.3.

Curves. A closed space curve is a map of a circle to three-dimensional space, $\gamma: \mathbb{S}^1 \to \mathbb{R}^3$. It is smooth if the derivatives of all orders exist. Usually we need only a small number of derivatives, and the assumption of the existence of infinitely many is convenient but not necessary. Note that a curve has a parametrization and the counter-clockwise orientation of the circle gives a sense of direction. The velocity vector at the point $\gamma(s)$ is $\gamma'(s)$ and the speed is the length of that vector, $\|\gamma'(s)\|$. The tangent vector is the normalized velocity vector, $T(s) = \gamma'(s) / \|\gamma'(s)\|$, which is defined as long as the speed is non-zero. We can think of $T$ as the Gauss map from $\mathbb{S}^1$ to $\mathbb{S}^2$, as illustrated in Figure III.9.

Figure III.9: A closed space curve to the left and its Gauss map to the right.

It is often convenient to assume unit speed. In this case $T = \gamma'$ and the second derivative, $\gamma''$, is normal to the first. The curvature is the length of that second derivative, $\kappa(s) = \|\gamma''(s)\|$. The normal vector is the normalized second derivative, $N(s) = \gamma''(s) / \|\gamma''(s)\|$, which is defined as long as $\kappa(s) \ne 0$. Geometrically, the curvature is one over the radius of the osculating circle at $\gamma(s)$, which is the circle in the plane spanned by the tangent vector and the normal vector.

Surfaces. Let $\mathbb{M}$ be a smooth surface or 2-manifold in $\mathbb{R}^3$. For a point $x \in \mathbb{M}$, we let $U \subseteq \mathbb{M}$ be a neighborhood, $V$ an open set in $\mathbb{R}^2$, and $\varphi: V \to U$ a parametrization. Derivatives are taken along curves on the surface. For each curve $\gamma$ in the plane we consider the space curve $\varphi \circ \gamma$. For example, to compute the tangent plane at $x$, we take the tangent vectors of two curves that cross at $x$, as illustrated in Figure III.10. Let $t_1$ and $t_2$ be the corresponding tangent directions. They span the tangent plane.

Figure III.10: Construction of tangent plane from two tangent vectors.

There are several notions of curvature of a surface, and all are obtained by considering the curvature of curves drawn on the surface. The curvature of a curve on $\mathbb{M}$ consists of a portion forced by how the surface curves in space and another portion accounting for how the curve curves within the surface. The second contribution vanishes for geodesics. The curve is a geodesic at $x$ if its normal agrees with the surface normal at $x$, and if it does we call its curvature the normal curvature of $\mathbb{M}$ at $x$ in the direction of the tangent vector. There is a circle of tangent vectors, and for each one we get a normal curvature. The principal curvatures at $x$ are the minimum and maximum normal curvatures, $\kappa_1 \le \kappa_2$. By a result of Euler, the principal curvatures determine all other normal curvatures at $x$.

EULER'S THEOREM. The directions $t_1$ and $t_2$ of the principal curvatures can be chosen orthogonal, and if $t_1$ and $t_2$ are orthogonal unit vectors and $t = t_1 \cos\theta + t_2 \sin\theta$ then the normal curvature in the direction $t$ is $\kappa(t) = \kappa_1 \cos^2\theta + \kappa_2 \sin^2\theta$.

This implies that if $\kappa_1 \ne \kappa_2$ then all other normal curvatures are strictly between the two principal curvatures, and the principal directions are therefore unique. If $\kappa_1 = \kappa_2$ then all normal curvatures are the same and the point is an umbilic point of the surface.

Two other common notions of curvature are the mean curvature, which is the average of the two principal curvatures, and the Gaussian curvature, $\kappa_1 \kappa_2$. In contrast to the other notions, the Gaussian curvature is intrinsic. In other words, it is preserved by isometries, which are transformations that preserve the distance between points measured as lengths of connecting paths. This is a famous result of Gauss.

THEOREMA EGREGIUM. The Gaussian curvature is an isometric invariant.
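Euler's formula and the two derived curvatures are easy to evaluate. The sketch below takes the principal curvatures as given numbers; the function names are chosen here for illustration, and the mean curvature is taken as the average of the principal curvatures, which is one of two conventions in use (the other is their sum).

```python
import math


def normal_curvature(k1, k2, theta):
    """Normal curvature in the tangent direction at angle theta from the
    first principal direction (Euler's formula)."""
    return k1 * math.cos(theta) ** 2 + k2 * math.sin(theta) ** 2


def mean_curvature(k1, k2):
    """Average of the principal curvatures (assumed convention)."""
    return (k1 + k2) / 2.0


def gaussian_curvature(k1, k2):
    """Product of the principal curvatures."""
    return k1 * k2


if __name__ == "__main__":
    k1, k2 = -0.5, 2.0            # a saddle-like point
    samples = [normal_curvature(k1, k2, i * math.pi / 12) for i in range(13)]
    print(min(samples), max(samples))
    print(mean_curvature(k1, k2), gaussian_curvature(k1, k2))
```

Every sampled normal curvature lies between the two principal curvatures, and at an umbilic point (k1 equal to k2) the formula returns the same value for every angle.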

Skin surfaces. Recall that the skin defined by a finite set of circles in $\mathbb{R}^2$ is the envelope of the infinite family of circles in the convex hull, each reduced by a factor $1/\sqrt{2}$. Furthermore, the mixed complex defined by the circles decomposes the skin into circle and hyperbola arcs. Similarly, the skin of a finite set of spheres in $\mathbb{R}^3$ is the envelope of the infinite family of spheres in the convex hull, each reduced by a factor $1/\sqrt{2}$. The mixed complex that decomposes the surface consists of the four types of cells illustrated in Figure III.11. Within each mixed cell, we have a sphere or a hyperboloid patch. The hyperboloid can either be one-sheeted (an hour-glass) or two-sheeted. The cases are summarized in Table III.1.

Figure III.11: Typical mixed cells. From left to right we have $\mathrm{card}\, X = 1, 2, 3,$ and 4.

Table III.1: The cardinality of $X$ listed in the first column determines the dimensions of the corresponding Voronoi polyhedron and Delaunay simplex as well as the type of the mixed cell and of the skin patch.

    card X   dim Voronoi   dim Delaunay   mixed cell          skin patch
    1        3             0              convex polyhedron   sphere
    2        2             1              polygonal prism     hyperboloid
    3        1             2              triangular prism    hyperboloid
    4        0             3              tetrahedron         sphere

The two sphere cases are symmetric and differ from each other by the surface orientation: in the case $\mathrm{card}\, X = 1$, the body lies locally inside, and in the case $\mathrm{card}\, X = 4$, it lies locally outside the sphere. Similarly, the two hyperboloid cases are symmetric and differ from each other by the surface orientation. In the case $\mathrm{card}\, X = 2$, the symmetry axis of the hyperboloid is the affine hull of the Delaunay edge and the (orthogonal) symmetry plane is the affine hull of the Voronoi polygon. We have a one-sheeted hyperboloid if the two spheres intersect in a circle and a two-sheeted one if they are disjoint. The common limiting case is a double-cone defined by two touching spheres. Either way, the body is on the side of the infinite ends of the symmetry axis. In the case $\mathrm{card}\, X = 3$, the symmetry plane is the affine hull of the Delaunay triangle and the symmetry axis is the affine hull of the Voronoi edge. Whether the hyperboloid is one-sheeted, a double-cone, or two-sheeted depends on whether the two spheres orthogonal to the three spheres with indices in $X$ intersect in a circle, touch in a point, or are disjoint. Either way, the body lies on the side of the infinite circle in the symmetry plane.

Figure III.12: The sphere, the one-sheeted hyperboloid, and the two-sheeted hyperboloid.

Maximum normal curvature. We can translate and rotate every sphere and hyperboloid to standard form, which we define as

$$x_1^2 + x_2^2 + x_3^2 = R^2,$$
$$x_1^2 + x_2^2 - x_3^2 = \pm R^2.$$

The second equation defines a hyperboloid with the apex at the origin, the symmetry axis along the $x_3$-axis, and the symmetry plane $x_3 = 0$. We have a one-sheeted hyperboloid for $+R^2$ and a two-sheeted one for $-R^2$. For the sphere, the normal curvature at every point is $1/R$ in every tangent direction. The situation is more complicated for the hyperboloid. Consider the hyperbola $x_1^2 - x_2^2 = \pm R^2$ in standard form in $\mathbb{R}^2$, and note that both the one-sheeted and the two-sheeted hyperboloid can be obtained by rotating the hyperbola about a symmetry axis. As shown in Figure III.13, every point of the hyperbola is sandwiched between two equally large circles. Similarly, the maximum normal curvature at a point $x$ of the hyperboloid, $\kappa(x)$, is one over the radius of the largest sphere that passes through $x$ and touches but does not cross the hyperboloid.

Figure III.13: Every point of the hyperbola is sandwiched between two equally large circles.
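The relation between maximum normal curvature and distance from the apex can be checked numerically. The sketch below assumes the one-sheeted standard form $x_1^2 + x_2^2 - x_3^2 = R^2$ and uses the classical curvature formulas for a surface of revolution with profile $r(z) = \sqrt{R^2 + z^2}$; the function names are hypothetical. It also measures how far the distance from the origin, taken as the reciprocal of the maximum normal curvature, can deviate from being 1-Lipschitz on random pairs of surface points.

```python
import math
import random

def hyperboloid_point(R, z, phi):
    """Point on the one-sheeted hyperboloid x1^2 + x2^2 - x3^2 = R^2."""
    r = math.sqrt(R * R + z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def principal_curvatures(R, z):
    """Absolute principal curvatures at height z for the profile r(z) = sqrt(R^2 + z^2)."""
    r = math.sqrt(R * R + z * z)
    d = math.sqrt(r * r + z * z)   # distance of the surface point from the origin
    meridian = R * R / d ** 3      # curvature along the generating hyperbola
    parallel = 1.0 / d             # normal curvature along the circle of revolution
    return meridian, parallel

def norm(x):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in x))

def max_lipschitz_excess(R, trials, seed=0):
    """Largest observed value of | |x| - |y| | - |x - y| over random pairs of surface points."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        x = hyperboloid_point(R, rng.uniform(-3.0, 3.0), rng.uniform(0.0, 2.0 * math.pi))
        y = hyperboloid_point(R, rng.uniform(-3.0, 3.0), rng.uniform(0.0, 2.0 * math.pi))
        worst = max(worst, abs(norm(x) - norm(y)) - math.dist(x, y))
    return worst

if __name__ == "__main__":
    for z in (0.0, 0.5, 2.0):
        x = hyperboloid_point(1.0, z, 0.3)
        print(z, max(principal_curvatures(1.0, z)), 1.0 / norm(x))
    print("largest Lipschitz excess:", max_lipschitz_excess(1.0, 1000))
```

The excess never becomes positive beyond rounding error, which is just the triangle inequality applied to distances from the origin.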

As shown in Figure III.13, this radius is the same as the distance of $x$ from the origin. In short, for every point $x$ of a sphere or hyperboloid in standard form, $1/\kappa(x) = \|x\|$ is simply the distance to the center.

Curvature variation. We have seen that within a mixed cell, $1/\kappa$ is simply the distance to the center. By the definition of the mixed complex, this is a continuous function on the skin. In other words, the maximum normal curvature varies continuously over the skin because the common radius of the sandwiching spheres varies continuously. We strengthen the result by showing that $1/\kappa$ varies rather slowly.

CURVATURE VARIATION LEMMA. For all points $x, y$ of the skin we have $|1/\kappa(x) - 1/\kappa(y)| \leq \|x - y\|$.

In fact, we extend $1/\kappa$ to a function defined on all of $\mathbb{R}^3$ and show that it has Lipschitz constant one. Within the mixed cell, the triangle inequality gives the Lipschitz bound. By applying this to the pieces of the line segment from $x$ to $y$ contained in different mixed cells, we obtain the result. We note that the extension of $1/\kappa$ to a function $\mathbb{R}^3 \to \mathbb{R}$ describes the maximum normal curvature of all skin surfaces in the family defined by the power growth model of the spheres, as introduced in Section II.2.

Bibliographic notes. The books by Bruce and Giblin [1] and by O'Neill [4] are good introductory texts to curves and surfaces and other topics in differential geometry. The skin surfaces in $\mathbb{R}^3$ are obtained by extending the results of Section III.1 by one dimension. A more direct treatment of the general-dimensional case can be found in [3]. The specific results on the curvature and the curvature variation of skin surfaces are taken from [2].

[1] J. W. BRUCE AND P. J. GIBLIN. Curves and Singularities. Second edition, Cambridge Univ. Press, England, 1992.
[2] H.-L. CHENG, T. K. DEY, H. EDELSBRUNNER AND J. SULLIVAN. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] B. O'NEILL. Elementary Differential Geometry. Second edition, Academic Press, San Diego, 1997.

III.3 Adaptive Meshing

In this section, we focus on constructing an explicit representation of a molecular skin surface. We choose a triangle mesh realized in $\mathbb{R}^3$ that is a good approximation of the surface and has good numerical properties.

Triangulations. Recall that a triangulation of a surface $F$ is a simplicial complex whose underlying space is homeomorphic to $F$. Since $F$ is a 2-manifold, it follows that the simplicial complex is the closure of its triangle set, every edge belongs to exactly two triangles, and the star of every vertex forms a disk. Note that the last property implies the first two. We construct a triangulation by first selecting points on $F$ and second connecting these points with edges and triangles. Given the Delaunay triangulation of the set of spheres $B$, we have sufficient information to sample points and to compute their maximum normal curvature values. Specifically, for each Delaunay simplex $\delta_X$ we construct the mixed cell $\mu_X$. The center of this cell is the point at which the affine hull of $\delta_X$ intersects the affine hull of the corresponding Voronoi polyhedron $\nu_X$. It is also the center of the corresponding sphere or the apex of the corresponding hyperboloid. Next, we translate the mixed cell so its center moves to the origin. Furthermore, if $\delta_X$ or $\nu_X$ is an edge then we rotate it into vertical position. The sphere or hyperboloid defined by $X$ is then in standard form, which can be sampled. For each sampled point we compute the maximum normal curvature from its distance to the origin and we obtain the corresponding point on $F$ by the inverse motion.

Let $P$ be the set of points sampled on $F$. We use it as the vertex set of the triangulation, which we construct as the dual of a decomposition of $F$. Specifically, for each point $p \in P$, the restricted Voronoi cell is

$$\nu_p^F = \{ x \in F \mid \|x - p\| \leq \|x - q\| \text{ for all } q \in P \},$$

where distance is measured in $\mathbb{R}^3$, as usual. It is the intersection of $F$ with the Voronoi polyhedron of $p$ in $\mathbb{R}^3$, $\nu_p^F = F \cap \nu_p$. The restricted cells decompose $F$ into closed regions that overlap along common pieces of their boundaries. Locally the picture is rather similar to that of a Voronoi diagram in $\mathbb{R}^2$. The restricted Delaunay triangulation, $D_F$, is the collection of simplices with non-empty common intersection of the corresponding restricted Voronoi cells. The construction is illustrated in Figure III.14. We note that $D_F$ is a subcomplex of the (unrestricted) Delaunay triangulation of $P$ in $\mathbb{R}^3$.

Figure III.14: Local decomposition into restricted Voronoi cells and dotted dual restricted Delaunay triangulation.

Closed ball property. One trouble with the restricted Delaunay triangulation is that it may not be homeomorphic to $F$ and thus not triangulate the surface. Indeed, it is easy to come up with cases where $D_F$ is not even a 2-manifold. A sufficient condition for $D_F$ to triangulate $F$ is what we call the closed ball property. It requires that each common intersection of restricted Voronoi cells is topologically a closed ball of the appropriate dimension. We formulate this condition in terms of the three-dimensional Voronoi polyhedra defined by $P$. Assuming general position, the Voronoi polyhedron $\nu_X$ of a subset $X \subseteq P$ has dimension $4 - \mathrm{card}\, X$, and we require that $F \cap \nu_X$ is either empty or homeomorphic to a closed ball of dimension $3 - \mathrm{card}\, X$. Depending on the cardinality of $X$, we have a closed disk, a closed interval, or a single point.

Proving that the closed ball property implies $D_F$ triangulates $F$ is not difficult. Decompose the restricted Voronoi diagram by adding a point in the middle of each arc and inside each cell and connect each point to the points on the boundary. The star of every point inside a restricted cell is a triangular decomposition of that cell. The star of every restricted Voronoi vertex consists of six triangular regions that can be homeomorphically mapped to the six triangles in the barycentric subdivision of the dual restricted Delaunay triangle. By construction of $D_F$, the triangles in the two barycentric subdivisions are connected the same way so we have a homeomorphism between $F$ and the underlying space of $D_F$, which is illustrated in Figure III.15.

Figure III.15: To the left a barycentric subdivision of a portion of a Voronoi diagram drawn with solid lines. To the right the isomorphic barycentric subdivision of the corresponding portion of the dual Delaunay triangulation drawn with dashed lines.

$\varepsilon$-sampling. The question remains how we sample the points such that the restricted Voronoi diagram has the closed ball property. Since $F$ is smooth, small neighborhoods are fairly flat and the restricted Voronoi diagram behaves locally similar to the (unrestricted) Voronoi diagram of a set of points in the plane. In other words, a dense enough sample of points should have the closed ball property. This intuition can be made precise by formalizing the concept of density. Recall that $\kappa(x)$ is the maximum normal curvature at a point $x \in F$. Around $x$ we spread points at distance roughly proportional to $1/\kappa(x)$. We therefore define $\varrho(x) = 1/\kappa(x)$ and call it the length scale at $x$. The Curvature Variation Lemma of Section III.2 states that for any two points $x, y \in F$, the difference in length scale is at most the distance between them in $\mathbb{R}^3$, $|\varrho(x) - \varrho(y)| \leq \|x - y\|$. An $\varepsilon$-sampling is a subset $P \subseteq F$ such that for each point $x \in F$ there exists a point $p \in P$ at distance $\|x - p\| \leq \varepsilon \varrho(x)$. Showing that a sufficiently small $\varepsilon$ implies the closed ball property for the restricted Voronoi diagram is rather tedious and we omit the proof.

HOMEOMORPHISM THEOREM. If $P$ is an $\varepsilon$-sampling of $F$ with $0 \leq \varepsilon \leq \varepsilon_0$, then the restricted Delaunay triangulation of $P$ is homeomorphic to $F$.

The precise upper bound $\varepsilon_0$ is a root of a function that arises in the proof of the Homeomorphism Theorem.
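The sampling condition can be tested approximately on a simple surface. The sketch below is an illustration under stated assumptions, not the sampling procedure of the text: it uses a sphere, for which the length scale is constant and equal to the radius, generates points with a Fibonacci spiral (a choice made here for convenience), and checks the condition $\|x - p\| \leq \varepsilon \varrho(x)$ only at a finite set of proxy points rather than at every point of the surface. The function names and the value of $\varepsilon$ are hypothetical.

```python
import math

def fibonacci_sphere(n, radius=1.0):
    """Roughly even sample of n points on a sphere centered at the origin."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = golden_angle * i
        points.append((radius * r * math.cos(phi), radius * r * math.sin(phi), radius * z))
    return points

def is_eps_sampling(surface_points, length_scale, sample, eps):
    """Check the sampling condition at every proxy point of the surface.

    surface_points: finite proxy for the surface (an approximation)
    length_scale:   function mapping a surface point to its length scale
    sample:         candidate sample, a list of points on the surface
    eps:            sampling parameter
    """
    for x in surface_points:
        nearest = min(math.dist(x, p) for p in sample)
        if nearest > eps * length_scale(x):
            return False
    return True

if __name__ == "__main__":
    proxy = fibonacci_sphere(1000)
    sample = fibonacci_sphere(200)
    unit_scale = lambda x: 1.0   # on the unit sphere the length scale is the radius
    print(is_eps_sampling(proxy, unit_scale, sample, eps=0.5))
    print(is_eps_sampling(proxy, unit_scale, sample[:1], eps=0.5))
```

A single sample point fails the test because the antipodal proxy points are much farther away than the allowed distance, while the 200-point sample passes for this generous value of $\varepsilon$.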

Even sampling. The points of an $\varepsilon$-sampling can locally not be too far apart, but they can be arbitrarily close together. In other words, on a microscopic scale, the points can be placed any way one likes and the mesh can be arbitrarily ugly. To improve the mesh, we impose conditions on the size of edges and triangles that imply both upper and lower bounds on the spacing between sampled points. Let the size of an edge be half its length, $R_{ab} = \|a - b\|/2$, and the size of a triangle be the radius of its circumcircle, $R_{abc}$. For edges we worry about them getting too short, so we compare size with the larger length scale at the endpoints, $\varrho_{ab} = \max\{\varrho(a), \varrho(b)\}$. For triangles we worry about them getting too large, so we compare size with the minimum length scale at the vertices, $\varrho_{abc} = \min\{\varrho(a), \varrho(b), \varrho(c)\}$. We use two constants, $C$ and $Q$, to express the conditions on the size. The constant $C$ controls how closely the triangulation approximates $F$, and $Q$ controls the quality of the triangles. We refer to the two conditions as the Lower and Upper Size Bounds,

    [L]  $R_{ab} / \varrho_{ab} > C/Q$  for every edge $ab$,
    [U]  $R_{abc} / \varrho_{abc} < CQ$  for every triangle $abc$.

It is not necessary to bound the edge lengths from above because an edge $ab$ with $R_{ab} / \varrho_{ab} \geq CQ$ would belong to two triangles that both violate [U]. Symmetrically, we do not need to bound the triangle sizes from below because a triangle $abc$ with $R_{abc} / \varrho_{abc} \leq C/Q$ would have three edges that violate [L].

Mesh quality. The constants $C$ and $Q$ have to be chosen judiciously. For example, $Q \leq 1$ would immediately lead to irreconcilable requirements on edge and triangle sizes. Furthermore, $C$ cannot be too large, else we would contradict the $\varepsilon$-sampling condition stated in the Homeomorphism Theorem. Without going into details, we state that feasible choices of $C$ and $Q$ exist. In particular, these constants imply that $P$ is an $\varepsilon$-sampling for a sufficiently small value of $\varepsilon$. More precisely, they imply that $P$ is either an $\varepsilon$-sampling or it grossly violates the condition for $\varepsilon$-sampling. An example of such a gross violation is four points close together on a sphere. The points form a tetrahedron whose edges and triangles may very well satisfy the Size Bounds, but the boundary of the tetrahedron is a miserable approximation of the much larger sphere. Fortunately, such a gross violation of the condition cannot be created from an $\varepsilon$-sampling without the intermediate generation of triangles that grossly violate [U]. The algorithm discussed below is unable to generate such triangles. The two Size Bounds together imply a reasonably large lower bound on the angles inside triangles of the restricted Delaunay triangulation.
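The two Size Bounds are easy to evaluate for a given edge or triangle. The sketch below assumes the conditions take the form (edge size)/(larger endpoint length scale) > C/Q and (circumradius)/(smallest vertex length scale) < CQ. The numerical values of C and Q are illustrative assumptions made here and are not taken from the text, as are the function names.

```python
import math

def edge_size(a, b):
    """Size of an edge: half its length."""
    return 0.5 * math.dist(a, b)

def circumradius(a, b, c):
    """Size of a triangle: radius of its circumcircle, computed with Heron's formula."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    s = 0.5 * (la + lb + lc)
    area_squared = s * (s - la) * (s - lb) * (s - lc)
    if area_squared <= 0.0:
        return math.inf   # degenerate triangle, treated as infinitely large
    return la * lb * lc / (4.0 * math.sqrt(area_squared))

def satisfies_lower(a, b, rho_a, rho_b, C, Q):
    """Lower Size Bound [L]: the edge is not too short for the larger length scale."""
    return edge_size(a, b) / max(rho_a, rho_b) > C / Q

def satisfies_upper(a, b, c, rho_a, rho_b, rho_c, C, Q):
    """Upper Size Bound [U]: the triangle is not too large for the smallest length scale."""
    return circumradius(a, b, c) / min(rho_a, rho_b, rho_c) < C * Q

def equilateral(side):
    """Equilateral triangle with the given side length in the plane x3 = 0."""
    return (0.0, 0.0, 0.0), (side, 0.0, 0.0), (0.5 * side, 0.5 * math.sqrt(3.0) * side, 0.0)

if __name__ == "__main__":
    C, Q = 0.08, 1.65   # illustrative values, assumed for this sketch
    for side in (0.05, 0.15, 0.30):
        a, b, c = equilateral(side)
        print(side,
              satisfies_lower(a, b, 1.0, 1.0, C, Q),
              satisfies_upper(a, b, c, 1.0, 1.0, 1.0, C, Q))
```

With a unit length scale at all vertices, the smallest triangle fails [L], the largest fails [U], and the middle one satisfies both, which shows how the two bounds bracket the admissible sizes.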

   


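MINIMUM ANGLE LEMMA. A triangle $abc$ that satisfies [U] and whose edges satisfy [L] has minimum angle larger than $\arcsin(1/Q^2)$.

PROOF. Let $abc$ be the triangle and $R = R_{abc}$ its circumradius. Assuming the angle $\gamma$ at $c$ is the smallest angle, we have $ab$ of length $2R_{ab} = 2R \sin\gamma$ as the shortest edge. We have $\varrho_{abc} \leq \varrho_{ab}$ by definition of length scale. Using [L] and [U] we thus get

$$\sin\gamma = \frac{R_{ab}}{R_{abc}} > \frac{(C/Q)\,\varrho_{ab}}{CQ\,\varrho_{abc}} \geq \frac{1}{Q^2}.$$

Hence $\gamma > \arcsin(1/Q^2)$. The minimum angle is thus larger than $\arcsin(1/Q^2)$, and the maximum angle is smaller than $\pi - 2\arcsin(1/Q^2)$.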
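The angle bound can be probed with random triangles. The sketch below assumes the Size Bounds in the form $R_{ab}/\varrho_{ab} > C/Q$ and $R_{abc}/\varrho_{abc} < CQ$, under which the law of sines gives the bound $\arcsin(1/Q^2)$. The constants are illustrative assumptions made here, all vertices get a unit length scale for simplicity, and the helper names are hypothetical.

```python
import math
import random

def circumradius(a, b, c):
    """Circumradius of a triangle, computed with Heron's formula."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    s = 0.5 * (la + lb + lc)
    area_squared = s * (s - la) * (s - lb) * (s - lc)
    if area_squared <= 0.0:
        return math.inf
    return la * lb * lc / (4.0 * math.sqrt(area_squared))

def min_angle(a, b, c):
    """Smallest angle in radians: it is opposite the shortest edge (law of sines)."""
    shortest = min(math.dist(b, c), math.dist(a, c), math.dist(a, b))
    return math.asin(min(1.0, shortest / (2.0 * circumradius(a, b, c))))

def min_angle_bound(Q):
    """Lower bound on the smallest angle implied by the two Size Bounds."""
    return math.asin(1.0 / (Q * Q))

def check_random_triangles(C, Q, trials, seed=0):
    """Count random planar triangles that satisfy both bounds and record the smallest angle margin."""
    rng = random.Random(seed)
    checked, worst_margin = 0, math.inf
    for _ in range(trials):
        a, b, c = [(rng.uniform(0.0, 0.3), rng.uniform(0.0, 0.3)) for _ in range(3)]
        edges_ok = all(0.5 * math.dist(p, q) > C / Q for p, q in ((a, b), (b, c), (a, c)))
        triangle_ok = circumradius(a, b, c) < C * Q
        if edges_ok and triangle_ok:
            checked += 1
            worst_margin = min(worst_margin, min_angle(a, b, c) - min_angle_bound(Q))
    return checked, worst_margin

if __name__ == "__main__":
    C, Q = 0.08, 1.65   # illustrative values, assumed for this sketch
    print("bound in degrees:", math.degrees(min_angle_bound(Q)))
    print(check_random_triangles(C, Q, 2000))
```

Every triangle that passes both filters keeps a positive margin over the bound, in line with the argument of the lemma.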

Density modification. Given an $\varepsilon$-sampling, we can enforce the Size Bounds by contracting short edges and inserting points near the circumcenters of large triangles. Given a triangle $abc$ that violates [U], we add the dual restricted Voronoi vertex $x$ as a new point to $P$. The insertion may cause new violations of [U] and thus trigger new point insertions.

    void VERTEX INSERTION:
      while there is a triangle abc violating [U] do
        add the dual restricted Voronoi vertex x of abc to P
      endwhile.

The details of the algorithm that modifies the restricted Delaunay triangulation to reflect the addition of $x$ are omitted. A vertex insertion may cause other vertex insertions, but this cannot go on forever because we will eventually violate the Lower Size Bound. Given an edge $ab$ that violates [L], we contract it by removing one of its endpoints. We are not able to exclude the possibility that the removal creates new violations of [L], and it certainly can create new violations of [U].

    void EDGE CONTRACTION:
      while there is an edge ab violating [L] do
        if ... then ... endif;
        remove one endpoint of ab from P;
        VERTEX INSERTION
      endwhile.

The details of the algorithm are again omitted. An edge contraction may perhaps cause other edge contractions, but this cannot go on forever because we will eventually violate the Upper Size Bound. It is possible that an edge contraction causes a vertex insertion, but a vertex insertion cannot create edges of size below the allowed threshold. This is what prevents infinite loops in spite of the algorithm's partially conflicting efforts to simultaneously avoid short edges and large triangles. To prove this claim, we consider a triangle $abc$ that causes the addition of its dual restricted Voronoi vertex $x$.

NO-SHORT-EDGE LEMMA. Every edge $xy$ created during the addition of $x$ has ratio $R_{xy} / \varrho_{xy} > C/Q$.

PROOF. We have $R_{abc} / \varrho_{abc} \geq CQ$. The sphere with center $x$ that passes through $a$, $b$, and $c$ has radius $R \geq R_{abc}$ and contains no other vertices than $x$ inside. Every new edge $xy$ has therefore length $\|x - y\| \geq R$. Assume without loss of generality that $\varrho(a) = \varrho_{abc}$. We use the Curvature Variation Lemma to derive upper bounds for the length scales at $x$ and $y$:

$$\varrho(x) \leq \varrho(a) + \|x - a\| \leq \left(\frac{1}{CQ} + 1\right) R,$$
$$\varrho(y) \leq \varrho(x) + \|x - y\| \leq \left(\frac{1}{CQ} + 2\right) \|x - y\|.$$

Hence

$$\frac{R_{xy}}{\varrho_{xy}} \geq \frac{\|x - y\|/2}{\left(\frac{1}{CQ} + 2\right)\|x - y\|} = \frac{CQ}{2 + 4CQ}.$$

For constants with $Q^2 > 2 + 4CQ$ we therefore have $R_{xy} / \varrho_{xy} > C/Q$, as claimed.

Scheduling. [Summarize the results on scheduling edge contractions and vertex insertions described in [5].]

Bibliographic notes. The restricted Delaunay triangulation is a generalization of the dual complex of a ball union. It can be used to triangulate surfaces and other spaces embedded in a Euclidean space. Besides the dual complex literature, there are several other partially dependent roots of the idea, namely the surface meshing method by Chew [3], the neural net work by Martinetz and Schulten [6], the formulation of the closed ball property by Edelsbrunner and Shah [4], and the surface reconstruction algorithm by Amenta and Bern [1]. The last of the four papers also introduces $\varepsilon$-samplings of surfaces, although in a slightly different formulation in which the distance to the medial axis replaces the length scale. All results that are specific to skin surfaces are taken from [2]. The algorithm in that paper is more general than what is explained in this section and maintains the surface mesh while it moves in space.
[1] N. AMENTA AND M. BERN. Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22 (1999), 481–504.
[2] H.-L. CHENG, T. K. DEY, H. EDELSBRUNNER AND J. SULLIVAN. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] L. P. CHEW. Guaranteed-quality mesh generation for curved surfaces. In "Proc. 9th Ann. Sympos. Comput. Geom., 1993", 274–280.
[4] H. EDELSBRUNNER AND N. R. SHAH. Triangulating topological spaces. Internat. J. Comput. Geom. Appl. 7 (1997), 365–378.
[5] H. EDELSBRUNNER AND A. ÜNGÖR. Relaxed scheduling in dynamic skin triangulation. In "Japanese Conf. Comput. Geom., 2002", to appear.
[6] T. MARTINETZ AND K. SCHULTEN. Topology representing networks. Neural Networks 7 (1994), 507–522.

III.4 Skin Software

In this section, we use two pieces of software to visualize the various geometric concepts introduced earlier in this chapter. The Morfi software is two-dimensional and constructs skin curves from finite sets of circles. Using the Morfi software, we can visualize concepts that are difficult if not impossible to show in $\mathbb{R}^3$. The Skin Meshing software is three-dimensional and computes triangulated skin surfaces.

Skin curves. In Figure III.16 we see seven disks whose union is decomposed into convex regions by the Voronoi diagram. Superimposed on this decomposition is the skin curve with shaded body and the dual complex. Note that the disk union contains the body and the body contains the dual complex. This is always true. Furthermore, the disk union, the body, and the dual complex all have the same homotopy type. The skin shrinks the arcs in the boundary of the disk union and smoothly blends between the shrunken arcs using pieces of hyperbolas and inverted circles. Most striking is the blending for the quadrangular hole roughly in the middle of the figure, which is converted into an almost entirely circular hole in the body.

Figure III.16: Voronoi decomposition of disk union with superimposed skin, body, and dual complex.

Mixed complex. As explained in Section III.1, the mixed complex decomposes the skin into circular and hyperbolic arcs. An example is the mixed complex illustrated in Figure III.17. It consists of shrunken Voronoi polygons, rectangles, and shrunken Delaunay polygons. The collection of circles generating the diagram in Figure III.17 is degenerate, which can be seen from the fact that there are three shrunken Delaunay triangles but also two shrunken Delaunay quadrangles. One of the quadrangles contains most of the hole in the body. The portion of the hole boundary inside that quadrangle is circular while the portions outside the quadrangle are hyperbolic. Observe also that the five Delaunay polygons visible within the mixed complex apparently have eight vertices (not double-counting the shared ones). We see only seven of them in Figure III.16 because one of the eight radii is imaginary. Where is its center in Figure III.16?

Figure III.17: Decomposition of the skin and body by the mixed complex.

Simulated smoothing. We return to an issue left open in Section V.1, where we considered the minimum weighted square distance function $f$ of a collection of circles $B$. In Section V.1 we claimed that there is an infinite family of smooth approximations of $f$ that all have the same critical points, namely the points where dually corresponding Voronoi and Delaunay polyhedra intersect. One function in this family is the trajectory of the skin curve. Following the notation in Section II, we think of $t$ as time and denote the collection of circles at time $t$ by $B_t$. The zero-set of $f$ is the envelope of the circles $B$, and the preimage of any real value $t$ is the envelope of the circles $B_t$. The trajectory of the skin curve is the function $g$ that maps each point $x$ to the moment in time $t$ at which $x$ belongs to the skin of $B_t$. We generalize this construction to any $s$ by letting $g_s$ be the trajectory of the modified skin curves. Specifically, the $s$-skin is the envelope of the circles in the convex hull that are reduced by a factor $\sqrt{s}$. The function $g_s$ maps every point $x$ to the moment in time $t$ at which $x$ belongs to the $s$-skin of $B_t$. We construct the family such that $g_{1/2} = g$ and $g_s$ approaches $f$ as $s$ goes to 1. Figure III.18 illustrates the construction by showing the modified skins for several values of $s$. Note that the $s$-skin for $s = 1/2$ is the skin as defined in Section III.1, and that the $1$-skin is the envelope of the original disks. The innermost $s$-skin shown is also the envelope of the orthogonal circles as defined in Section III.1. Observe that the bodies bounded by the $s$-skins are nested. As it turns out, the height function $g_s$ is differentiable and, assuming non-degeneracy of the input circles, it is twice differentiable at the critical points. This is sufficient to justify the Morse theoretic reasoning about the non-smooth function $f$ used in Section V.1 to define pockets.

Figure III.18: From inside out the sequence of $s$-skins for several values of $s$.

Meshed skin surfaces. We compute triangulated skin surfaces using the Skin Meshing software. Figure III.19 shows a portion of this mesh for a small molecule. The image is created by slicing the surface with a plane and removing the front portion of the surface. Only the edges of the mesh and the cut boundary are shown. The complete surface has genus one, and the slicing plane is chosen to cut right through the narrow part of the tunnel. The image of the mesh in Figure III.19 should be compared with the rendering of the same surface in Figure III.20. The apparent smoothness is an illusion created by Gouraud shading, which is a graphics technique that interpolates between normal directions to generate the smooth impression. Note that highly curved areas detectable in Figure III.20 correspond to high density regions in Figure III.19.

Figure III.19: Cut-away view of the mesh of a small molecule of about forty atoms.

Figure III.20: Smoothly shaded rendering of the mesh in Figure III.19.

Growing the mesh. As mentioned earlier, the mesh is constructed by maintaining it while growing the spheres. The software takes as input a set of spheres $B$ and constructs a mesh by maintaining a triangulation of the skin of the set of spheres $B_t$, with the time $t$ continuously increasing from minus infinity to zero. At the beginning, all spheres are imaginary, the skin is the empty surface, and the mesh is the empty complex. As time increases, the surface moves and the software updates the mesh accordingly. At time $t = 0$ we have the mesh of the skin of $B$. The algorithm thus reduces to executing a sequence of elementary operations. We classify the operations according to the adaptation purpose they serve.

Shape adaptation. The growth of the spheres implies a deformation of the surface, which is facilitated by a motion of the mesh vertices in $\mathbb{R}^3$. The algorithm moves vertices normal to the surface, along the integral lines of the skin trajectory. We use edge flips to maintain the mesh as the restricted Delaunay triangulation of the moving vertices.

Curvature adaptation. Recall that the conditions [L] and [U] given in Section III.3 guarantee that the mesh adapts its local density to the maximum normal curvature. We use edge contractions to eliminate edges that violate [L] and vertex insertions to eliminate triangles that violate [U].

Topology adaptation. There are four types of topological changes that occur, and they correspond to the four types of generic critical points of three-dimensional Morse functions. A component is born at a minimum, a handle is created at an index-1 saddle, a tunnel is closed at an index-2 saddle, and a void is filled at a maximum. We use metamorphoses to change the mesh connectivity accordingly. Two of the four types of metamorphoses can be seen at work in Figure III.21. From the first snapshot to the second, we see two new handles appear. Each handle creates a tunnel in the complement. From the second snapshot to the third, we see a tunnel disappear again. By closing a tunnel we also remove the handle that forms it. Observe that the surface around a handle is the same as that around a tunnel, namely a two-sheeted hyperboloid that flips over to a one-sheeted hyperboloid, or vice versa. The only difference is the reversal of inside and outside.

Figure III.21: Three snap-shots of the deforming triangulation of a molecular skin defined by continuously growing spheres. From left to center, we note two metamorphoses that each add a handle in the front. From center to right, we note a metamorphosis that closes a tunnel on the left.

Quantification. The Skin Meshing software comes with a quantification panel that displays parameters used in the meshing algorithm, provides various measurements of mesh quality, and indicates the number of operations executed during the construction. [This panel needs to be updated to fit the text.] Figure III.22 shows the panel after the construction of a mesh. The two most important parameters are $C$, which controls the numerical approximation of the surface, and $Q$, which controls the size of the angles. The three other parameters shown in the panel control how the metamorphoses are performed. The correctness of the algorithm is guaranteed only if the inequalities referred to as Conditions (I) to (V) are all satisfied. The software permits other parameter settings since a violation of the inequalities does not necessarily imply a failure of the algorithm. In our experience, the software works fine for small violations but breaks down for moderate ones. The panel displays measurements of mesh quality, including size versus length scale ratios of edges and triangles and the angles inside and between triangles. The quality measures do not include the special edges and triangles that facilitate topological changes and purposely violate some of the properties required for the rest of the mesh. Note that in Figure III.22, the ratios all lie inside the allowed interval. As proved in Section III.3, the algorithm guarantees that the smallest angle inside any (non-special) triangle in the mesh is larger than $\arcsin(1/Q^2)$, and for the standard setting of $Q$ the smallest angle observed in the mesh indeed exceeds this bound.

Figure III.22: The quantification panel of the Skin Meshing software.

Bibliographic notes. The two-dimensional Morfi software has been developed by Ka-Po (Patrick) Lam, and is described in his master thesis [4]. The software has been used in [2] to explain two-dimensional skin geometry and its application to deforming two-dimensional shapes into each other. The three-dimensional Skin Meshing software has been developed by Ho-Lun Cheng [1, 5]. Computer graphics techniques used in displaying shapes, including Gouraud shading, can be found in [3].

[1] H.-L. CHENG. Dynamic and Adaptive Surface Meshing under Motion. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 2001.
[2] S.-W. CHENG, H. EDELSBRUNNER, P. FU AND K. P. LAM. Design and analysis of planar shape deformation. Internat. J. Comput. Geom. Appl. 19 (2001), 205–218.
[3] J. D. FOLEY, A. VAN DAM, S. K. FEINER AND J. F. HUGHES. Computer Graphics. Principles and Practice. Second edition, Addison-Wesley, Reading, Massachusetts, 1990.
[4] K. P. LAM. Two-dimensional geometric morphing. Master thesis, Dept. Comput. Sci., Hong Kong University of Science and Technology, 1996.
[5] Molecular Skin web-site in the software collection at biogeometry.duke.edu.

Exercises

1. Something about triangles. Let $abc$ be a triangle in the plane. We write $h_a$ for the height of $a$, defined as the distance of $a$ from the closest point on the line passing through $b$ and $c$. Similarly, we write $h_b$ and $h_c$ for the heights of $b$ and $c$. Prove that the radius of the circumcircle satisfies ...

2. Curvature in the plane. Note that the curvature $\kappa$ of a molecular skin curve in $\mathbb{R}^2$ is not continuous.
(i) Give an example illustrating that $\kappa$ is not continuous.
(ii) Introduce a new function (perhaps similar to $\kappa$) that is continuous over the skin curve.

3. Total curvature. Define the total curvature of a surface $\mathbb{M}$ as the integral of the maximum principal curvature, $\int_{\mathbb{M}} \kappa \, dA$.
(i) Calculate the total curvature for a sphere.
(ii) Calculate the total curvature for the portion of a double-cone within a unit-sphere around its apex.

4. Total square curvature. Define the total square curvature of a surface $\mathbb{M}$ as the integral of the maximum principal curvature squared, $\int_{\mathbb{M}} \kappa^2 \, dA$.
(i) Calculate the total square curvature for a sphere.
(ii) Let $\mathbb{M}$ be the portion of a hyperboloid of revolution within a unit sphere around the apex. Show that the total square curvature goes to infinity as the hyperboloid approaches its asymptotic double-cone.
(iii) Prove that the number of points in a minimal $\varepsilon$-sampling of $\mathbb{M}$ (as defined in Section III.3) is proportional to the total square curvature.

5. Pencils of spheres. Let us extend the concept of a coaxal system of circles to three dimensions. For this purpose assume $A$ and $B$ are two spheres that are both orthogonal to the spheres $X$, $Y$, and $Z$.
(i) Prove that every affine combination of $A$ and $B$ is orthogonal to $X$, $Y$, and $Z$.
(ii) Prove that every affine combination of $X$, $Y$, and $Z$ is orthogonal to $A$ and $B$.
(iii) In the light of (i) and (ii), what is the analog of a coaxal system in $\mathbb{R}^3$?


Chapter IV

Connectivity

Given a shape or a space, we can ask whether or how it is connected. It might not be immediately obvious what this question means. However, we can draw from precise definitions developed in topology to answer the question. In doing so, we need to be aware that there are perfectly well-defined and reasonable but different precise notions that correspond to the intuitive idea of connectivity. For example, for two spaces $\mathbb{X}$ and $\mathbb{Y}$ to be "connected the same way" could mean they are topologically equivalent ($\mathbb{X} \approx \mathbb{Y}$), they are homotopy equivalent ($\mathbb{X} \simeq \mathbb{Y}$), or they have isomorphic homology groups ($H_k(\mathbb{X}) \cong H_k(\mathbb{Y})$ for all $k$). The three notions are progressively weaker:

$$\mathbb{X} \approx \mathbb{Y} \;\Longrightarrow\; \mathbb{X} \simeq \mathbb{Y} \;\Longrightarrow\; H_k(\mathbb{X}) \cong H_k(\mathbb{Y}) \text{ for all } k.$$

In words, the classification of spaces by homology groups is coarser than that by homotopy equivalence, which in turn is coarser than that defined by topological equivalence. In spite of the apparent weakness, homology is the most important tool to study connectivity. [We should stress that homology in this topological context has a precise algebraic meaning, which is in sharp contrast to how the term is used in biology (e.g. homology modeling of proteins), where it indicates a vague notion of similarity.] Given two triangulated spaces, there is a polynomial-time algorithm that computes and compares their homology groups. If the groups are not isomorphic then we know that the two spaces are different, meaning they are neither homotopy equivalent nor topologically equivalent. However, if their homology groups are isomorphic then we still do not know whether the two spaces are the same also under the two stricter definitions of sameness.

In this chapter, we focus on algorithms computing the homology groups of molecules represented by space-filling diagrams. In Section IV.1, we prove that space-filling diagrams are homotopy equivalent to their dual alpha shapes, which implies the two have isomorphic homology groups. In Section IV.2, we formally define homology groups and their ranks, the Betti numbers. In Section IV.3, we describe an incremental algorithm for Betti numbers, which is fast but limited to complexes in three dimensions. In Section IV.4, we present the classic matrix algorithm for Betti numbers, which is significantly slower but not limited to three-dimensional space.

IV.1 Equivalence of Spaces
IV.2 Homology Groups
IV.3 Incremental Algorithm
IV.4 Matrix Algorithm
Exercises

IV.1 Equivalence of Spaces

The space-filling diagram of a molecule is a subset of $\mathbb{R}^3$, and with induced subspace topology it is a topological space. We study the connectivity of this space by considering equivalence classes defined by continuous maps between spaces.

Topological spaces. Let $\mathbb{R}^3$ be the three-dimensional Euclidean space. An open ball is the set of points at distance less than some $\varepsilon > 0$ from a fixed point, and an open set is a union of open balls. Recall that a map $f : \mathbb{X} \to \mathbb{Y}$ is continuous if for every $x \in \mathbb{X}$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that if $x, y$ have distance less than $\delta$ then the points $f(x), f(y)$ have distance less than $\varepsilon$. To check whether or not $f$ is continuous, we thus have to be able to measure the distance between points in both sets. According to a more general definition, $f$ is continuous if the preimage of every open set in $\mathbb{Y}$ is open in $\mathbb{X}$. Here we only need to distinguish between open and non-open sets. This distinction is the motivation for the following definition. A topological space is a set $\mathbb{X}$ together with a system $\mathcal{X}$ of subsets of $\mathbb{X}$ such that

(i) $\emptyset \in \mathcal{X}$ and $\mathbb{X} \in \mathcal{X}$,
(ii) $\bigcup \mathcal{Z} \in \mathcal{X}$ for every subsystem $\mathcal{Z} \subseteq \mathcal{X}$, and
(iii) $\bigcap \mathcal{Z} \in \mathcal{X}$ for every finite subsystem $\mathcal{Z} \subseteq \mathcal{X}$.

The system $\mathcal{X}$ is called the topology of $\mathbb{X}$ and the sets in $\mathcal{X}$ are the open sets of $\mathbb{X}$. Note that the common intersection of finitely many open sets is again open, but this is not necessarily true for infinitely many open sets. For example, the common intersection of the open balls of points at distance less than $1/i$ from the origin, for $i = 1, 2, \ldots$, is just the origin itself, which is not an open set. We thus see that the restriction to finite subsystems in condition (iii) is necessary.

If $\mathbb{Y} \subseteq \mathbb{X}$, we can induce the subspace topology, which is the system $\mathcal{Y} = \{\mathbb{Y} \cap A \mid A \in \mathcal{X}\}$. The space $\mathbb{Y}$ together with the system $\mathcal{Y}$ is a topological subspace of the pair $\mathbb{X}, \mathcal{X}$. For example, the two-dimensional sphere, $\mathbb{S}^2$, is a subset of $\mathbb{R}^3$, and if we choose its intersections with open sets in $\mathbb{R}^3$ as the open sets in its topology, then it is a topological subspace of $\mathbb{R}^3$. Another topological subspace of $\mathbb{R}^3$ is the two-dimensional Euclidean plane, $\mathbb{R}^2$.

Topological equivalence. Now that we know what a topological space is, we can define when two are the same. A homeomorphism is a bijective map that is continuous and whose inverse is continuous. We write $\mathbb{X} \approx \mathbb{Y}$ if a homeomorphism exists and say that $\mathbb{X}$ and $\mathbb{Y}$ are homeomorphic, topologically equivalent, and that they have the same topological type. Note that the identity is a homeomorphism, the inverse of a homeomorphism is a homeomorphism, and the composition of two homeomorphisms is a homeomorphism. In other words, being homeomorphic is reflexive, symmetric and transitive, so $\approx$ is indeed an equivalence relation for topological spaces.

To get comfortable with these abstract ideas requires a number of concrete examples. As suggested by Figure IV.1, there are spaces that have the same topological type and look vastly different, and there are spaces that look quite similar and do not have the same topological type.

Figure IV.1: The circle on the left is topologically equivalent to the trefoil knot in the middle, but both are not topologically equivalent to the annulus on the right.

An interesting example of a pair of non-homeomorphic spaces are the sphere and the plane. After embedding both in $\mathbb{R}^3$, we can map points from the sphere to the plane by stereographic projection from the north-pole, $N$, as illustrated in Figure IV.2. This map between $\mathbb{S}^2 - \{N\}$ and $\mathbb{R}^2$ is indeed a homeomorphism, but there is no homeomorphism between $\mathbb{S}^2$ and $\mathbb{R}^2$.

Figure IV.2: The stereographic projection maps the sphere (minus the north-pole) to the plane. The lower hemisphere maps to the shaded disk and the upper hemisphere to the complement of that disk.
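The stereographic projection of Figure IV.2 can be written down explicitly. The following is a minimal sketch, assuming the unit sphere centered at the origin, the north-pole $N = (0, 0, 1)$, and the plane $z = 0$ as the target; these coordinates and the function names are illustrative choices, not taken from the text.

```python
import math

def stereographic(p):
    """Project a point p = (x, y, z) of the unit sphere, other than the
    north-pole (0, 0, 1), from the north-pole onto the plane z = 0."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the north-pole has no image in the plane")
    return (x / (1.0 - z), y / (1.0 - z))

def inverse_stereographic(q):
    """Map a point q = (u, v) of the plane back to the unit sphere."""
    u, v = q
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))

# A point on the lower hemisphere lands inside the unit disk,
# and the inverse map recovers the original point.
p = (0.6, 0.0, -0.8)
q = stereographic(p)
print(q, inverse_stereographic(q))
```

Both maps are given by continuous formulas away from the north-pole and are inverse to each other, which is what makes the projection a homeomorphism between the punctured sphere and the plane.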

Homotopy equivalence. Next we introduce an equivalence relation that is less sensitive to the local dimension of spaces than topological equivalence. We begin by comparing maps between the same spaces. Two continuous maps $h, k : \mathbb{X} \to \mathbb{Y}$ are homotopic if there is a continuous map $H : \mathbb{X} \times [0,1] \to \mathbb{Y}$ with $H(x, 0) = h(x)$ and $H(x, 1) = k(x)$, for all $x \in \mathbb{X}$. We write $h \simeq k$ and call $H$ a homotopy between $h$ and $k$. This definition is illustrated in Figure IV.3. We may think of the parameter $t \in [0,1]$ as time and sweep out the image of $H$ by the images of the maps $x \mapsto H(x, t)$. The only requirements $H$ has to satisfy are that it starts with $h$, ends with $k$, and that it is a map. Furthermore, $H$ is not required to be injective, which is the same as saying that the image of $H$ may be self-intersecting.

Figure IV.3: In this example, $h$ and $k$ both map the circle into three-dimensional space, and $H$ maps the circle times $[0,1]$ to three-dimensional space. The image of $H$ is the cylinder connecting the two images of the circle.

Two spaces $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent if there are continuous maps $f : \mathbb{X} \to \mathbb{Y}$ and $g : \mathbb{Y} \to \mathbb{X}$ such that $g \circ f$ is homotopic to the identity on $\mathbb{X}$ and $f \circ g$ is homotopic to the identity on $\mathbb{Y}$. We write $\mathbb{X} \simeq \mathbb{Y}$ and say that the two spaces have the same homotopy type. Note that $\simeq$ is reflexive, symmetric, and transitive and is therefore indeed an equivalence relation for topological spaces. It is easy to show that two topologically equivalent spaces are also homotopy equivalent. To see that the reverse is not true we note that the annulus in Figure IV.1 is homotopy equivalent to the circle, but the two are not topologically equivalent. The simplest homotopy type is that of a point. A space is contractible if it is homotopy equivalent to a point. For example, a disk is contractible but a circle is not. Similarly, a ball is contractible but a sphere is not.

Deformation retraction. If $\mathbb{Y}$ is a topological subspace of $\mathbb{X}$ then we may prove that the two spaces are homotopy equivalent by constructing a map that retracts $\mathbb{X}$ to $\mathbb{Y}$. A deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ is a continuous map $R : \mathbb{X} \times [0,1] \to \mathbb{X}$ with

(i) $R(x, 0) = x$, for all $x \in \mathbb{X}$,
(ii) $R(x, 1) \in \mathbb{Y}$, for all $x \in \mathbb{X}$, and
(iii) $R(y, t) = y$, for all $y \in \mathbb{Y}$ and all $t \in [0,1]$.

Note that $R$ is a homotopy between the identity on $\mathbb{X}$ and the map $x \mapsto R(x, 1)$, which maps $\mathbb{X}$ to $\mathbb{Y}$ and is the identity on $\mathbb{Y}$. As illustrated in Figure IV.4, there is a deformation retraction from the double annulus to the figure-8 curve, but there is no deformation retraction to the circle. (Why not?)

Figure IV.4: The arrows indicate a deformation retraction from the double annulus to the figure-8 curve.

DEFORMATION RETRACTION LEMMA. If $R$ is a deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ then $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent.

PROOF. We construct maps $f : \mathbb{X} \to \mathbb{Y}$ and $g : \mathbb{Y} \to \mathbb{X}$ with the required properties. Define $f(x) = R(x, 1)$ and $g(y) = y$. Then $g \circ f$ is homotopic to the identity on $\mathbb{X}$ because $R$ is a homotopy between the two maps. Similarly, $f \circ g$, which maps $\mathbb{Y}$ to $\mathbb{Y}$, is equal to the identity on $\mathbb{Y}$ and therefore certainly homotopic to it.

Decomposition into joins. We construct a deformation retraction between a union of balls and its dual complex using a decomposition into joins. In general, a join between two sets $A$ and $B$ in some Euclidean space is the union of closed line segments that connect points in $A$ with points in $B$, and it is defined iff any two such line segments are either disjoint or meet at a common endpoint. For example, Figure IV.5 uses two kinds of joins to decompose the difference between the union and the dual complex of a set of disks, namely triangles and disk sectors. A triangle is the join between a point and an edge and a sector is the join between a circular arc and a vertex.

Figure IV.5: The union of disks is decomposed into the underlying space of the dual complex and two types of joins connecting that complex to the boundary of the union.

Let $B$ be a finite collection of closed balls in $\mathbb{R}^3$. We assume general position and construct a deformation retraction from the union, $\bigcup B$, to the underlying space of the dual complex. Recall that the boundary of $\bigcup B$ consists of sphere patches separated by circular arcs connecting corners. To be specific, we define a patch as the contribution of the sphere bounding a ball in $B$ to the boundary of $\bigcup B$. It does not have to be connected or simply connected. Similarly, we define an arc and a corner as the contribution of the intersection of two and of three spheres to the boundary of $\bigcup B$. An arc may be a full circle, or any number of intervals along the circle. A corner may be empty, a point, or a pair of points. The decomposition is constructed by forming the join between every patch, arc, and corner and its dual vertex, edge, and triangle, which belongs to the dual complex.

Figure IV.5 illustrates the construction in the plane. There are four corners that are point pairs, and they correspond to the four principal edges of the dual complex. (As defined in Chapter II, an edge is principal if it is not a face of any other simplex in the complex. In the Alpha Shape software, such an edge is referred to as singular.) There are also four arcs that consist of more than one component each, and they correspond to the vertices on the boundary of the dual complex that are exposed to the outside in more than one interval of directions.

Shrinking joins. We get a deformation retraction from the union to the underlying space of the dual complex by shrinking joins from outside in. Each join is the union of line segments $xy$ with $x$ on the boundary of the union and $y$ on the boundary of the dual complex. We shrink every such line segment towards its endpoint $y$, and the deformation retraction is obtained by shrinking all joins simultaneously. A triangle in the decomposition shrinks from its outer vertex towards the opposite edge. It turns into a trapezium whose height decreases and reaches zero at time $1$. A disk sector shrinks from its outer arc towards its center, which is a vertex of the dual complex. It maintains its shape while getting smaller until it reaches the size of a point. The process is illustrated in Figure IV.6, which shows the image of the retraction at time $1/2$.

Figure IV.6: The decomposition after shrinking the joins half way to zero.

There is a technical problem at the very beginning of the shrinking process that arises already in two dimensions. Specifically, the outer vertex of each triangle join belongs to more than one line segment and thus retracts towards more than one point of the dual complex. To finesse this difficulty, we choose a small $\varepsilon > 0$ and move the points differently in the time interval $[0, \varepsilon]$. This initial motion needs to bridge the non-zero gap between the boundary of the union and the boundary of the image of the retraction at time $\varepsilon$. By choosing $\varepsilon$ small, we can make the gap arbitrarily small and easy to bridge. Figure IV.7 shows an entire sequence of shapes during the deformation retraction visualized for the model of gramicidin also shown in Chapter II.

Figure IV.7: Six snap-shots of the deformation retraction from the union of balls representation of gramicidin to the dual complex.

Bibliographic notes. Subtleties of the definitions of a topology and of a topological space are discussed in texts on general topology, including Kelley [2] and Munkres [4]. Homeomorphisms, homotopies, and deformation retractions are covered in most texts of algebraic topology, including Seifert and Threlfall [6] and Munkres [5]. The particular deformation retraction used to prove the homotopy equivalence between a union of balls and its dual complex is taken from Edelsbrunner [1]. That equivalence can also be derived from general theorems about coverings. The Nerve Lemma says that a space is homotopy equivalent to the nerve of a finite open cover whose sets have either empty or contractible common intersections. We can turn the Voronoi cells of a union of balls into such a cover and get the homotopy equivalence result from that lemma. The history of the Nerve Lemma is complicated because different versions have been discovered independently by different people. Maybe the paper by Leray [3] is the first publication on that topic.

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] J. L. Kelley. General Topology. Springer-Verlag, New York, 1955.
[3] J. Leray. Sur la forme des espaces topologiques et sur les points fixes des représentations. J. Math. Pure Appl. 24 (1945), 95–167.
[4] J. R. Munkres. Topology. A First Course. Prentice Hall, Englewood Cliffs, New Jersey, 1975.
[5] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[6] H. Seifert and W. Threlfall. A Textbook of Topology. Academic Press, San Diego, California, 1980.
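The three conditions in the definition of a deformation retraction in Section IV.1 can be checked numerically on a simple case. The sketch below is an illustration under stated assumptions, not the retraction used for unions of balls: it takes the planar annulus $1 \le \|x\| \le 2$ as the space and the unit circle as the subspace, and moves every point radially towards the circle. The name `retract` is hypothetical.

```python
import math

def retract(x, t):
    """R(x, t) for a point x of the annulus 1 <= |x| <= 2 and a time t in [0, 1].
    The point moves along its ray from the origin towards the unit circle."""
    norm = math.hypot(x[0], x[1])
    scale = (1.0 - t) + t / norm      # equals 1 at t = 0 and 1/|x| at t = 1
    return (scale * x[0], scale * x[1])

x = (1.2, -0.9)                       # a point of the annulus, |x| = 1.5
y = (0.6, 0.8)                        # a point of the unit circle

print(retract(x, 0.0))                # condition (i): the point itself
print(math.hypot(*retract(x, 1.0)))   # condition (ii): a point at distance one
print(retract(y, 0.5))                # condition (iii): y does not move
```

By the Deformation Retraction Lemma, the existence of such a map is one way to see that the annulus and the circle are homotopy equivalent.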

IV.2 Homology Groups

This section introduces homology groups as an algebraic means to characterize the connectivity of a topological space. To keep the discussion reasonably elementary, we restrict it to triangulated spaces and to addition modulo 2.

Triangulations. In the preceding chapters, we have talked about triangulations in an intuitive geometric sense. In topology, the term has a precise meaning, which we now develop. A simplex is the convex hull of an affinely independent point set, $\sigma = \mathrm{conv}\, S$. If $S$ has cardinality $k+1$ then $\sigma$ has dimension $k$ and is also referred to as a $k$-simplex. A face of $\sigma$ is the convex hull of a subset $T \subseteq S$, and we write $\tau \le \sigma$. Since $S$ has $2^{k+1}$ subsets, $\sigma$ has the same number of faces, including the empty set and $\sigma$ as its two improper faces. A simplicial complex is a finite collection $K$ of simplices with pairwise proper intersections that is closed under the face relation, that is,

(i) if $\sigma \in K$ and $\tau \le \sigma$ then $\tau \in K$, and
(ii) if $\sigma, \sigma' \in K$ then $\sigma \cap \sigma'$ is either empty or a face of both.

Recall that the underlying space of $K$ is the union of all simplices, denoted as $|K|$. A simplicial complex can be used to represent a topological space, and we have seen an example in Section II.3, where the dual complex of a space-filling diagram was used to represent a molecule. We proved in Section IV.1 that the underlying space of the dual complex is homotopy equivalent to the space-filling diagram. A topologically more accurate representation would have a homeomorphic underlying space. We thus define a triangulation of a topological space $\mathbb{X}$ as a simplicial complex $K$ whose underlying space is topologically equivalent, $|K| \approx \mathbb{X}$. The remainder of this section introduces the algebraic concepts we will use to define homology groups of triangulated spaces.

Abelian groups. A group is a set $G$ together with an associative operation $+$ for which there is a zero and an inverse for every group element. The group is abelian if the operation is commutative. Examples are the infinite group of integers with addition, $\mathbb{Z}$, and the finite cyclic group of $n$ elements, $\mathbb{Z}_n$, that is, the integers with addition mod $n$. A subset $H \subseteq G$ forms a subgroup if $(H, +)$ is a group.

Suppose $G$ is abelian and $H$ is a subgroup. The coset of $x \in G$ is $x + H = \{x + h \mid h \in H\}$. Observe that $y \in x + H$ implies $y + H = x + H$. Hence, two cosets are either disjoint or the same, as sketched in Figure IV.8. By definition, there is a bijection between $H$ and each coset. If $G$ is finite this implies that all cosets have the same cardinality and that their number is $|G| / |H|$. The quotient, $G$ divided by $H$, denoted as $G/H$, is the collection of cosets. Addition in the quotient group is defined by $(x + H) + (y + H) = (x + y) + H$. We note that it does not matter which representatives we choose in computing the sum of the two cosets. So if $x' \in x + H$ and $y' \in y + H$ then $(x' + y') + H = (x + y) + H$. The resulting coset is always the same, so addition is indeed well defined.

Figure IV.8: Partition of $G$ into cosets defined by $H$, for the case in which $H$ contains a quarter of the elements.

A homomorphism between groups $G$ and $G'$ is a function $f : G \to G'$ that commutes with addition, that is, $f(x + y) = f(x) + f(y)$. The kernel of $f$ is the subset of $G$ whose elements map to $0 \in G'$, and the image is the subset of $G'$ whose elements have preimages in $G$:

$$\ker f = \{x \in G \mid f(x) = 0\}, \qquad \mathrm{im}\, f = \{y \in G' \mid \text{there is } x \in G \text{ with } f(x) = y\}.$$

An isomorphism is a bijective homomorphism. Its kernel is the zero element of $G$ and its image is the entire $G'$.

Chain complex. Let $K$ be a simplicial complex. We construct groups by defining what it means to add sets of simplices. Call a set of $k$-simplices a $k$-chain. The sum of two $k$-chains $c$ and $d$ is $c + d = (c \cup d) - (c \cap d)$. In words, the sum of two $k$-chains is the symmetric difference of the two sets. This is like adding modulo 2 where $1 + 1 = 0$, since a simplex is absent from $c + d$ iff it belongs to neither or to both chains. $C_k$ is the set of $k$-chains and $(C_k, +)$ is the group of $k$-chains. The zero of this chain group is the empty set. If $k < 0$ or $k > \dim K$ then $C_k$ is the trivial group consisting only of one element.

We connect chain groups of different dimensions by homomorphisms that map chains to their boundary. The boundary of a $k$-simplex is the set of its $(k-1)$-dimensional faces, and the boundary of a chain is the sum of boundaries of its simplices. Observe that the boundary of the sum of two chains is the sum of their boundaries, $\partial_k (c + d) = \partial_k c + \partial_k d$. This assumes of course that $c$ and $d$ have the same dimension, else $c + d$ would not be defined. We thus have a boundary homomorphism $\partial_k : C_k \to C_{k-1}$, for every $k$. The sequence of chain groups connected by boundary homomorphisms is the chain complex of $K$:

$$\ldots \xrightarrow{\partial_{k+2}} C_{k+1} \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \xrightarrow{\partial_{k-1}} \ldots$$

Figure IV.9 illustrates the sequence but contains information about subgroups that will be introduced shortly.

Figure IV.9: The chain complex and the groups of cycles and boundaries contained in the chain groups.

Cycles and boundaries. There are two types of chains that are particularly important to us: the ones without boundary and the ones that bound. A $k$-cycle is a $k$-chain $c$ with $\partial_k c = \emptyset$. The set of $k$-cycles is the kernel of the $k$-th boundary homomorphism, $Z_k = \ker \partial_k$. Two $k$-cycles add up to another $k$-cycle, which implies that $Z_k$ is a subgroup of $C_k$. A $k$-boundary is a $k$-chain $c$ for which there exists a $(k+1)$-chain $d$ with $\partial_{k+1} d = c$. The set of $k$-boundaries is the image of the $(k+1)$-st boundary homomorphism, $B_k = \mathrm{im}\, \partial_{k+1}$. Two $k$-boundaries add up to another $k$-boundary, which implies that $B_k$ is a subgroup of $C_k$. We prove that $B_k$ is a subgroup of $Z_k$. Equivalently, the boundary of every boundary is empty.

FUNDAMENTAL LEMMA OF HOMOLOGY. $\partial_{k-1} (\partial_k c) = \emptyset$, for every $k$ and every $k$-chain $c$.

PROOF. Note that $\partial_{k-1} (\partial_k \sigma) = \emptyset$ for every $k$-simplex $\sigma$. This is because every $(k-2)$-dimensional face of $\sigma$ belongs to exactly two $(k-1)$-dimensional faces of $\sigma$. The rest follows because taking boundary commutes with adding:

$$\partial_{k-1} (\partial_k c) = \sum_{\sigma \in c} \partial_{k-1} (\partial_k \sigma),$$

which is the empty set, as required.

We can therefore draw the relationship between the sets of chains, cycles, and boundaries as sketched in Figure IV.9.

Homology groups. The $k$-th homology group is the quotient of the $k$-th cycle group divided by the $k$-th boundary group, $H_k = Z_k / B_k$. The cosets are the elements of $H_k$ and are referred to as homology classes. The size of $H_k$ is a measure of how many $k$-cycles are not $k$-boundaries. As an example consider a triangulated torus. All 0-chains are 0-cycles and half of them are 0-boundaries, namely the ones with even cardinality. Hence $H_0$ has two elements and is isomorphic to $\mathbb{Z}_2$. The two non-bounding 1-cycles labeled $a$ and $b$ generate a first homology group of four elements, as shown in Figure IV.10. It is isomorphic to $\mathbb{Z}_2^2$, which is the group of pairs $(i, j)$, with $i, j \in \{0, 1\}$, with component-wise addition modulo 2. There is only one non-empty 2-cycle, namely the set of all triangles, and no non-empty 2-boundary. Hence $H_2$ is again isomorphic to $\mathbb{Z}_2$.

Figure IV.10: The curves $a$ and $b$ represent the homology classes $a + B_1$ and $b + B_1$, which generate the homology group $H_1$.

An important property of homology groups is that they are the same for triangulations of homeomorphic and of homotopy equivalent spaces. In particular, we get the same homology groups for different triangulations of a topological space. In other words, the homology groups are properties of the space and not artifacts of the complexes used to represent that space. Proving that this is really the case is beyond the scope of this book. Similarly, the homology groups of (any triangulation of) a union of balls are the same as the homology groups of the dual complex.
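Addition of chains and the boundary homomorphism of Section IV.2 translate directly into set operations. The following is a minimal sketch, assuming that a simplex is stored as a sorted tuple of vertex labels and a chain as a frozenset of such tuples; the function names are illustrative.

```python
from itertools import combinations

def add(c, d):
    """Sum of two k-chains: the symmetric difference of the two sets."""
    return c ^ d

def boundary_of_simplex(simplex):
    """The (k-1)-dimensional faces of a k-simplex; empty for a vertex."""
    if len(simplex) == 1:
        return frozenset()
    return frozenset(simplex[:i] + simplex[i + 1:] for i in range(len(simplex)))

def boundary(chain):
    """Boundary of a chain: the sum of the boundaries of its simplices."""
    result = frozenset()
    for simplex in chain:
        result = add(result, boundary_of_simplex(simplex))
    return result

triangle = frozenset({(0, 1, 2)})
print(sorted(boundary(triangle)))          # the three edges of the triangle
print(boundary(boundary(triangle)))        # empty: the boundary of a boundary

# The four triangles bounding a tetrahedron form a 2-cycle.
sphere = frozenset(combinations(range(4), 3))
print(boundary(sphere))                    # empty
```

The second print statement is an instance of the Fundamental Lemma of Homology, and the third shows a chain that is a cycle because every edge belongs to exactly two of its triangles.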

Betti numbers. The most useful aspects of homology groups are their ranks, which have intuitive interpretations in terms of the connectivity of the space. All these groups are idempotent, that is, $x + x = 0$ for every element $x$. Given a subset $S$ of such a group $G$, we can form all sums of elements in $S$ and thus generate a subgroup. This operation can also be expressed in the terminology of linear algebra, where the subgroup is known as the linear hull. The subset is a basis if it is minimal and generates the entire group. Even though there is no unique basis, all bases have the same size. By definition, the rank of $G$ is the size of a basis, and because $G$ is idempotent, that size is the binary logarithm of the number of group elements, $\mathrm{rank}\, G = \log_2 |G|$. The concept of a rank applies equally well to chain, cycle, boundary and homology groups. If the group is the $k$-th homology group of a space, the rank is known as the $k$-th Betti number of that space: $\beta_k = \mathrm{rank}\, H_k$. Since $H_k = Z_k / B_k$ we have $|H_k| = |Z_k| / |B_k|$ and therefore $\beta_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k$.

Revisiting the example above, we see that the Betti numbers of the torus are $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$; all other Betti numbers vanish. In general, the 0-th Betti number is the number of connected components. To see this remember that a 0-cycle bounds iff it contains an even number of vertices in each component. Note also that exactly half of the subsets of a finite set have even cardinality. If there are $\ell$ components and $n$ vertices then $\mathrm{rank}\, Z_0 = n$ and $\mathrm{rank}\, B_0 = n - \ell$ and therefore $\beta_0 = \ell$. Similarly, the 1-st and 2-nd Betti numbers have intuitive interpretations as the number of independent non-bounding loops and the number of independent non-bounding shells. For example, the closed disk has one component, no non-bounding loop, and no shell. Hence $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Similarly for the two-dimensional sphere we have $\beta_0 = 1$, $\beta_1 = 0$, and $\beta_2 = 1$. In both cases, the homology groups of dimensions $k \ge 3$ are all trivial and the corresponding Betti numbers are all zero.

Euler characteristic. Consider a simplicial complex $K$ and let $s_k$ be the number of its $k$-simplices. By definition, the Euler characteristic is the alternating sum of these numbers: $\chi = \sum_{k \ge 0} (-1)^k s_k$. We show that $\chi$ is also the alternating sum of Betti numbers. The number of $k$-simplices in the complex is also the rank of the $k$-th chain group. Note that if $f : G \to G'$ is a homomorphism, then the rank of $G$ is equal to the sum of ranks of the kernel and the image. Since $\partial_k : C_k \to C_{k-1}$ is a homomorphism with kernel $Z_k$ and image $B_{k-1}$, we have $\mathrm{rank}\, C_k = \mathrm{rank}\, Z_k + \mathrm{rank}\, B_{k-1}$. Using corresponding lowercase letters for ranks, we rewrite this relation as $c_k = z_k + b_{k-1}$. Earlier we derived $\beta_k = z_k - b_k$. It follows that

$$\chi = \sum_{k \ge 0} (-1)^k (z_k + b_{k-1}) = \sum_{k \ge 0} (-1)^k (z_k - b_k) = \sum_{k \ge 0} (-1)^k \beta_k.$$

We state this result because it is important and so we can use it for later reference.

EULER-POINCARÉ THEOREM. $\chi = \sum_{k \ge 0} (-1)^k \beta_k$.

This relation can often be used to quickly find the Euler characteristic of a space without constructing a triangulation and counting simplices. For example, the Euler characteristic of the closed disk is $1$, that of the two-dimensional sphere is $1 - 0 + 1 = 2$, and that of the torus is $1 - 2 + 1 = 0$. Note that this implies that the disk, the sphere and the torus are pairwise non-homeomorphic. Indeed, two spaces with different Euler characteristics have homology groups that are different in at least one dimension, that is, the spaces are neither homotopy equivalent nor topologically equivalent. This is hardly surprising but not easy to prove with elementary means.

Bibliographic notes. Homology groups have been developed at the end of the nineteenth and the beginning of the twentieth centuries. The French mathematician Henri Poincaré is usually credited with the conception of the idea [4]. He named the ranks of the homology groups after the Italian mathematician Betti, who introduced a slightly different version of the numbers years earlier. The beginning of the twentieth century witnessed parallel developments of homology groups that differed in the elements they added (simplices, cubes, general cells, ...) and the coefficient groups they used. Eventually, all this work was unified by axiomatizing the assumptions under which homology groups exist [1]. Today, homology is a general method within algebraic topology. We refer to Giblin [2] for an intuitive introduction to that area and to Munkres [3] and Rotman [5] for more comprehensive sources.

[1] S. Eilenberg and N. Steenrod. Foundations of Algebraic Topology. Princeton Univ. Press, New Jersey, 1952.
[2] P. J. Giblin. Graphs, Surfaces and Homology. Chapman and Hall, London, 1981.
[3] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[4] H. Poincaré. Complément à l'analysis situs. Rendiconti del Circolo Matematico di Palermo 13 (1899), 285–343.
[5] J. J. Rotman. An Introduction to Algebraic Topology. Springer-Verlag, New York, 1988.
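The relations $\beta_k = z_k - b_k$ and $c_k = z_k + b_{k-1}$ from Section IV.2 can be checked on small complexes by computing the ranks of the boundary homomorphisms with Gaussian elimination modulo 2. The sketch below is one way to do this and is not claimed to be the matrix algorithm of Section IV.4. Simplices are assumed to be tuples of vertex labels, rows of a boundary matrix are stored as integer bitmasks, and the 7-vertex triangulation of the torus used in the example (triangles $\{i, i+1, i+3\}$ and $\{i, i+2, i+3\}$ with indices mod 7) is brought in from outside the text.

```python
from itertools import combinations

def closure(top_simplices):
    """All non-empty faces of the given simplices, grouped by dimension."""
    faces = set()
    for s in top_simplices:
        for size in range(1, len(s) + 1):
            faces.update(combinations(sorted(s), size))
    dim = max(len(f) for f in faces) - 1
    return [sorted(f for f in faces if len(f) == k + 1) for k in range(dim + 1)]

def rank_mod2(rows):
    """Rank over Z_2 of a matrix whose rows are integer bitmasks."""
    pivots = {}                       # highest set bit -> reduced row
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]       # cancel the leading bit
    return len(pivots)

def betti_numbers(top_simplices):
    """Return (simplex counts, Betti numbers) of the generated complex."""
    levels = closure(top_simplices)
    counts = [len(level) for level in levels]          # c_k = rank C_k
    b = []                                             # b[k] = rank B_k
    for k in range(len(levels)):
        if k + 1 == len(levels):
            b.append(0)                                # no (k+1)-simplices
            continue
        index = {f: i for i, f in enumerate(levels[k])}
        rows = []
        for s in levels[k + 1]:                        # boundary of each (k+1)-simplex
            mask = 0
            for i in range(len(s)):
                mask |= 1 << index[s[:i] + s[i + 1:]]
            rows.append(mask)
        b.append(rank_mod2(rows))
    betti = []
    for k, c in enumerate(counts):
        z = c - (b[k - 1] if k > 0 else 0)             # z_k = c_k - b_{k-1}
        betti.append(z - b[k])                         # beta_k = z_k - b_k
    return counts, betti

disk = [(0, 1, 2)]
sphere = list(combinations(range(4), 3))
torus = [(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)] + \
        [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)]

for name, space in (("disk", disk), ("sphere", sphere), ("torus", torus)):
    counts, betti = betti_numbers(space)
    chi_simplices = sum((-1) ** k * c for k, c in enumerate(counts))
    chi_betti = sum((-1) ** k * beta for k, beta in enumerate(betti))
    print(name, counts, betti, chi_simplices, chi_betti)
```

In each of the three cases the two alternating sums agree, as the Euler-Poincaré Theorem asserts, and the Betti numbers match the values derived in the text.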

IV.3 Incremental Algorithm

The Betti numbers of a simplicial complex can be computed incrementally, by adding one simplex at a time. In this section, we describe the details of this algorithm, which is particularly well-suited for filtrations.

Adding a simplex. To compute the Betti numbers of a complex $K$, we form a filtration that ends with that complex:

$$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = K.$$

All $K_j$ are complexes, and it is convenient to assume that any two contiguous complexes differ by only one simplex: $K_j - K_{j-1} = \{\sigma_j\}$. For example, we may sort the simplices in non-decreasing order of dimension and take all prefixes of that sequence. Alternatively, we may use the filtration of a Delaunay triangulation introduced in Section II.3. In the latter case, the filtration contains all alpha complexes and we get the Betti numbers of all of them in one sweep.

We analyze what happens to the Betti numbers when we add a simplex $\sigma$ to a complex $K$. Let $K' = K \cup \{\sigma\}$ and assume that all proper faces of $\sigma$ belong to $K$, so $K'$ is also a complex. By observing how $\sigma$ fits into $K$, we can determine the Betti numbers of $K'$ from those of $K$. In the case analysis, we mention only the Betti numbers that change.

Case $\sigma$ is a vertex. Being a vertex, $\sigma$ cannot connect to anything in $K$ and thus forms a component by itself. Hence, $\beta_0$ increases by one.

Case $\sigma$ is an edge. There are two sub-cases depending on whether the endpoints of $\sigma$ belong to the same component or to two different components. Both cases are illustrated in Figure IV.11. In the first case, $\beta_1$ increases by one, and in the second case, $\beta_0$ decreases by one.

Case $\sigma$ is a triangle. Again we have two sub-cases, both illustrated in Figure IV.12. If $\sigma$ completes a 2-cycle then $\beta_2$ increases by one. Otherwise, $\sigma$ closes a tunnel and $\beta_1$ decreases by one.

Case $\sigma$ is a tetrahedron. Assuming $K$ is a complex in $\mathbb{R}^3$, it cannot have any 3-cycle. Adding $\sigma$ can therefore only turn a non-bounding 2-cycle (its boundary) into a 2-boundary. Therefore, $\beta_2$ decreases by one.

Figure IV.11: The edge $uv$ closes a loop on the left and connects two components on the right.

Figure IV.12: To the left, the triangle $\sigma_j$ completes a surface, while to the right, it just closes a tunnel formed by the surface holes.

Observe that the four cases follow one and the same rule: if $\sigma$ belongs to a non-bounding cycle in $K'$ then we increment the Betti number of the dimension of $\sigma$ and, otherwise, we decrement the Betti number of dimension one less than that of $\sigma$. This is justified by the equation $\mathrm{rank}\, C_k = \mathrm{rank}\, Z_k + \mathrm{rank}\, B_{k-1}$ developed in Section IV.2: adding a $k$-simplex always increments the rank of the $k$-th chain group, and it does this by either incrementing the rank of the $k$-th cycle group or that of the $(k-1)$-st boundary group.

Algorithm. The algorithm is but a simple scan along the filtration.

    integer BETTI:
      β_0 = β_1 = β_2 = 0;
      for j = 1 to m do
        k = dim σ_j;
        if σ_j belongs to a k-cycle in K_j then β_k++
          else β_{k-1}--
        endif
      endfor;
      return (β_0, β_1, β_2).

The only difficult part of the algorithm is deciding whether or not $\sigma_j$ belongs to a $k$-cycle. We study this problem after illustrating the algorithm for a small example.

Betti numbers of the dunce cap. The dunce cap is best created from a triangular piece of soft cloth. As illustrated in Figure IV.13, all three sides are equally long and are glued to each other with matching orientations. To run our algorithm, we need a triangulation of the dunce cap. It is not difficult to construct one, but we have to avoid pitfalls such as creating edges that share more than one endpoint and triangles that share more than one edge. A valid triangulation is shown in Figure IV.14. When we run our algorithm, we first add all vertices, then all edges, and finally all triangles.
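The scan described under "Algorithm" can be sketched in a few lines once a test for "belongs to a $k$-cycle" is available. The version below is an illustration under stated assumptions: it decides the test by elimination modulo 2 on the boundaries of the simplices added so far, which is a generic substitute and not the union-find based classification that the text develops for complexes in three dimensions. Simplices are assumed to be sorted tuples of vertex labels, listed so that every simplex comes after its faces.

```python
from itertools import combinations

def incremental_betti(filtration):
    """Scan a filtration and return the list of Betti numbers at the end."""
    betti = []      # betti[k] = current k-th Betti number
    index = {}      # simplex -> bit position among the simplices of its dimension
    count = []      # count[k] = number of k-simplices added so far
    basis = []      # basis[k] = {pivot bit: reduced boundary of a destroying k-simplex}
    for simplex in filtration:
        k = len(simplex) - 1
        if k == len(betti):                    # first simplex of a new dimension
            betti.append(0)
            count.append(0)
            basis.append({})
        index[simplex] = count[k]
        count[k] += 1
        mask = 0                               # boundary of the simplex as a bitmask
        if k > 0:
            for i in range(len(simplex)):
                mask ^= 1 << index[simplex[:i] + simplex[i + 1:]]
        while mask:                            # reduce against earlier boundaries
            pivot = mask.bit_length() - 1
            if pivot not in basis[k]:
                basis[k][pivot] = mask
                break
            mask ^= basis[k][pivot]
        if mask == 0:
            betti[k] += 1                      # simplex belongs to a k-cycle
        else:
            betti[k - 1] -= 1                  # simplex destroys a (k-1)-cycle
    return betti

# Boundary of a tetrahedron, sorted by non-decreasing dimension.
sphere = [s for size in (1, 2, 3) for s in combinations(range(4), size)]
print(incremental_betti(sphere))               # [1, 0, 1]
```

A simplex whose boundary reduces to zero bounds together with earlier simplices of the same dimension, so it lies on a cycle and increments $\beta_k$; otherwise its boundary is new and $\beta_{k-1}$ is decremented, in line with the rule stated before the algorithm.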

IV.3 Incremental Algorithm

63 Classifying vertices and edges. We now return to the problem of deciding whether the addition of a simplex increases the rank of a cycle group or that of a boundary group. In the former case, we say the simplex creates, and in the latter case it destroys. All vertices create, but edges in Figcan create or destroy. For example, the edge ure IV.11 creates on the left and destroys on the right. To distinguish between the two cases, we maintain the components of the complex throughout the filtration using a union-find data structure, which represents a system of pairwise disjoint sets: the elements are the vertices and the sets are components of the complex at any moment in time. The data structure supports three types of operations:

Figure IV.13: In the first step, we glue two sides of the triangle, thus forming a cone with a seam. In the second step, we glue the seam along the rim of the cone (not shown).
1

3 8 2

2 4 9 A B 2 C 3 1 D

1

Figure IV.14: A triangulation of the dunce cap.

The algorithm scans the filtration from left to right and classifies each vertex and each edge as either creating or destroying: for to do case is a vertex : creates; A DD ; case is an edge : F IND ; F IND ; if then creates else destroys; U NION endif endfor.

Classifying triangles and tetrahedra. In three-dimensional Euclidean space, every tetrahedron destroys but triangles can destroy or create. Deciding whether or not a triangle belongs to a cycle is not quite as straightforward

£ 

¡

¤

£

triangulation, each closing a tunnel and thus decrementing . Indeed, no collection of triangles has zero boundary, which can be proved by observing that three edges belong to three triangles each and all other edges belong to two triangles each. The final result is therefore and . Indeed, the dunce cap is connected, all its closed curves bound, and the surface formed by the triangles does not enclose any volume in .
 

£

£

£ 

¡

£

 

¡

0

¢ ¡

 

Table IV.1: Evolution of and triangulation in Figure IV.14.

while adding the edges of the

Standard implementations of the union-find data structure take barely more than constant time per operation. To be more precise, let be the extremely fast growing Ackermann function. Its inverse is extremely slow growing. To get a faint idea of how slow the inverse grows, we note that cannot be bounded from above by any constant, but unless is larger than the estimated number of electrons in the universe. Any sequence of operations takes time at most proportional to . For all practical purposes, this means that each operation takes only constant time. 

¤

£

¤

12 12 0 28 3 1 3C 2 10 56 1 19

13 11 0 29 3 2 45 1 10 5D 1 20

16 10 0 2A 3 3 46 1 11 67 1 21

17 9 0 2B 2 3 47 1 12 78 1 22

19 8 0 2D 2 4 48 1 13 89 1 23

1A 7 0 35 2 5 49 1 14 9A 1 24

1C 6 0 36 2 6 4A 1 15 AB 1 25

1D 5 0 37 2 7 4B 1 16 BC 1 26

23 5 1 38 2 8 4C 1 17 CD 1 27

25 4 1 3B 2 9 4D 1 18

#

©

§

§    

¨ 

¨

 

 

#

£¤

  

¡

#

¤

  © 

¨

  © 

¨

and finally all triangles. After adding the thirteen vertices, we have , and . The evolution of Betti numbers while adding the edges in lexicographic order is shown in Table IV.1. There are 27 triangles in the 

£ 

¨

 

  ¨

 

A DD

add

as a new singleton set to the system.

£

substitute U NION the system. 

 ¡

§ 

§ 

5

for the sets

and

#

©

 

#

©

#

 

F IND

§©

7

6

3

return the set that contains vertex . in

§ 

  

0

¥ 

 

¢ £

  

0

¥   

 

 

¡

0 

0

 

§ § § §
   

  ¡¥   ¡¥

¥ ¥ ¥ ¥ ¥ ¥ 


0 0

64 as it is for an edge. However, with an extra assumption on the filtration, we can use the dual graph of the complement to classify triangles and tetrahedra the same way as we classified edges and vertices. The most convenient version of this assumption is that the last complex in the filtration, , is a triangulation of . Think of as the one-point compactification of . Given a Delaunay triangulation in , we can construct such a triangulation by adding a dummy vertex and connecting it to all boundary simplices of the Delaunay triangulation. In and also in , every closed surface bounds a volume. In other words, a triangle completes a 2-cycle iff it decomposes a component of the complement into two. We keep track of the connectivity of the complement through its dual graph, whose nodes are the tetrahedra and whose arcs are the triangles. Figure IV.15 illustrates this construction in two dimensions. Adding a triangle to the

IV C ONNECTIVITY tetrahedra, but this is exactly what compactification does for us when it adds tetrahedra outside the boundary triangles of the Delaunay triangulation. The running time for classifying all triangles and tetrahedra is again propor. tional to Summary. The entire algorithm consists of three passes over the filtration: 1. a forward pass to classify all vertices and edges, 2. a backward pass to classify all triangles and tetrahedra, 3. a forward pass to compute the Betti numbers. Figure IV.16 illustrates the result of the algorithm. In the first two passes, we maintain a union-find data structure, which takes time proportional to . The third pass does only a constant amount of work per step, namely incrementing or decrementing a counter. The total running . time is therefore at most proportional to

Figure IV.15: A subcomplex of the Delaunay triangulation and the dual graph of the complement. The region outside the Delaunay triangulation is represented by a single node.

complex effectively removes an arc from the dual graph of the complement. Deciding whether removing an arc splits a component is more difficult than deciding whether adding an arc connects two components. We therefore scan the filtration backward, from right to left:
 

for downto do case is a tetrahedron: destroys, unless , in which case it creates; A DD ; case is a triangle: let and be the tetrahedra that share ; F IND ; F IND ; if then destroys else creates; U NION endif endfor. The algorithm requires that each triangle is shared by two

Figure IV.16: The evolution of the Betti number $\beta_1$ (the number of tunnels) in the filtration of gramicidin, which is shown in Figures II.3 and II.15.
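The forward pass over vertices and edges can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Alpha Shape implementation: the function name `betti_of_graph_filtration` and the encoding of simplices as strings ("1" for a vertex, "12" for an edge) are hypothetical, and the union-find uses path halving without union by rank, so it does not attain the $n \alpha(n)$ bound. The edge list is the lexicographic sequence of Table IV.1, so the final state should agree with the last row of that table.

```python
def betti_of_graph_filtration(filtration):
    """Scan a filtration of vertices and edges (strings of length 1 or 2).

    Returns the list of (beta0, beta1) pairs after each added simplex.
    """
    parent = {}  # union-find forest: vertex -> parent vertex

    def find(u):
        # FIND with path halving: return the representative of u's set
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    beta0 = beta1 = 0
    history = []
    for simplex in filtration:
        if len(simplex) == 1:
            # a vertex always creates a component (ADD)
            parent[simplex] = simplex
            beta0 += 1
        elif len(simplex) == 2:
            a, b = find(simplex[0]), find(simplex[1])
            if a == b:
                beta1 += 1      # the edge creates: it closes a loop
            else:
                parent[a] = b   # the edge destroys: UNION merges two components
                beta0 -= 1
        else:
            raise ValueError("only vertices and edges are handled in this sketch")
        history.append((beta0, beta1))
    return history


# The thirteen vertices and the 39 edges of Table IV.1, in lexicographic order.
vertices = list("123456789ABCD")
edges = ("12 13 16 17 19 1A 1C 1D 23 25 28 29 2A 2B 2D 35 36 37 38 3B "
         "3C 45 46 47 48 49 4A 4B 4C 4D 56 5D 67 78 89 9A AB BC CD").split()
history = betti_of_graph_filtration(vertices + edges)
print(history[-1])
```

Adding the 27 triangles of the dunce cap, each of which destroys, would then bring $\beta_1$ from 27 down to 0, in agreement with the example in the text.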

Bibliographic notes. The incremental algorithm for computing Betti numbers described in this section is taken from [2]. It exploits the fact that the connectivity of the complex determines the connectivity of the complement. This relation is a manifestation of Alexander duality, which is studied in algebraic topology [3, Chapter 3]. This algorithm has been implemented as part of the Alpha Shape software, which computes the Betti numbers of typically thousands of complexes in the filtration of a protein structure in less than a second. The key to achieving this performance is a fast implementation of the union-find data structure, namely one with running time proportional to $n \alpha(n)$ for $n$ operations. The details of such an implementation can be found in most algorithm texts, including [1, Chapter 22]. A proof that the running time cannot be improved from $n \alpha(n)$ to $n$ has been given by Tarjan [4].

[1] T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, Massachusetts, 1990.
[2] C. J. A. Delfinado and H. Edelsbrunner. An incremental algorithm for Betti numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Design 12 (1995), 771–784.
[3] A. Hatcher. Algebraic Topology. Cambridge Univ. Press, England, 2002.
[4] R. E. Tarjan. A class of algorithms which require nonlinear time to maintain disjoint sets. J. Comput. System Sci. 18 (1979), 110–127.

IV.4 Matrix Algorithm

In this section, we develop the linear algebra view of homology and formulate a matrix algorithm for computing Betti numbers. After explaining the algorithm for addition modulo two, we extend it to integer addition.

Incidence matrices. Let $K$ be a simplicial complex with $k$-simplices $\sigma_1, \sigma_2, \ldots, \sigma_{c_k}$ and $(k-1)$-simplices $\tau_1, \tau_2, \ldots, \tau_{c_{k-1}}$. The $k$-th incidence matrix is $(a_i^j)$, where $a_i^j = 1$ iff $\tau_j$ is a face of $\sigma_i$. Recall that the $\sigma_i$ form a basis of the $k$-th chain group, $C_k$, and similarly the $\tau_j$ form a basis of $C_{k-1}$. Using this notation, we can write the $k$-th boundary homomorphism in matrix form:

    $\partial_k \sigma_i = \sum_j a_i^j \tau_j$.

The above formula thus expresses the boundary of every basis element of $C_k$ as a sum of basis elements of $C_{k-1}$. To make this interpretation of the incidence matrix useful for computing Betti numbers, we need to consider more general bases, $g_i$ of $C_k$ and $h_j$ of $C_{k-1}$. These can be generated by performing elementary row and column operations:

    exchange row $r$ with row $i$;
    exchange column $s$ with column $j$;
    add row $r$ to row $i$;
    add column $s$ to column $j$.

After a few elementary row and column operations, the matrix is no longer the $k$-th incidence matrix, but it still describes a correspondence between $C_k$ and $C_{k-1}$. Exchanging two rows or columns is equivalent to reindexing the $g_i$ or $h_j$. As illustrated in Figure IV.17, adding row $r$ to row $i$ has the effect of replacing $g_i$ by $g_i + g_r$. Adding column $s$ to column $j$ has the effect of replacing $h_s$ by $h_s - h_j$. (Since we deal with idempotent groups, subtraction is the same as addition.) Note that the effect is not symmetric: the basis of $C_k$ changes at the modified row, while the basis of $C_{k-1}$ changes at the modifying column.

Figure IV.17: The effect of elementary row and column operations on the bases of $C_k$ and $C_{k-1}$.

Normal form algorithm. The matrix is in normal form if its non-zero entries are lined up along an initial segment of the main diagonal. We can use Gaussian elimination to transform the incidence matrix into normal form. The algorithm uses a boolean function NONZERO that makes sure that during the $l$-th iteration the $l$-th diagonal entry, $a_l^l$, is non-zero. It does this by exchanging rows and columns. The function fails to make $a_l^l$ non-zero iff all entries in the remaining sub-matrix are zero.

    for $l = 1$ to $\min\{c_k, c_{k-1}\}$ do
      if NONZERO($l$) then
        forall rows $i > l$ do
          if $a_i^l = 1$ then row $i$ = row $i$ + row $l$ endif
        endfor;
        forall columns $j > l$ do
          if $a_l^j = 1$ then col $j$ = col $j$ + col $l$ endif
        endfor
      endif
    endfor.

    boolean NONZERO($l$):
      find row $i \geq l$ and col $j \geq l$ with $a_i^j = 1$;
      if there is no such entry then return FALSE endif;
      exchange row $l$ with row $i$; exchange column $l$ with column $j$;
      return TRUE.

Footnote: We use the phrase "assume without loss of generality" (w.l.o.g.) as a short-form for expressing that there is another case that can be handled symmetrically.

The algorithm consists of three nested loops. Letting $n$ be the larger of the number of rows and the number of columns, the running time is therefore at most proportional to $n^3$.

Figure IV.18: The normal form of the $k$-th incidence matrix.

Deriving the Betti numbers. Suppose we have transformed all incidence matrices of $K$ into normal form. As illustrated in Figure IV.18, the $k$-th matrix has $c_k$ rows and $c_{k-1}$ columns. The zero-rows correspond to $k$-cycles, of which we have $z_k = \mathrm{rank}\, Z_k$ many. It follows that the number of non-zero entries along the main diagonal is $c_k - z_k = b_{k-1} = \mathrm{rank}\, B_{k-1}$. The $k$-th Betti number is the rank of the $k$-th cycle group minus the rank of the $k$-th boundary group:

    $\beta_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k = z_k - b_k$.

We can thus derive the Betti numbers from the sizes and numbers of non-zero entries in the normal form matrices. We note that the ranks of the incidence matrices suffice for computing the Betti numbers and it is not necessary to go all the way to normal form. Either way, the running time of the algorithm is cubic in the number of simplices in the complex.

Integer coefficients. The matrix algorithm can be extended to coefficients in $\mathbb{Z}$ instead of $\mathbb{Z}_2$. Before discussing the necessary modifications, we talk about what this means in terms of adding simplices and chains. We start at the beginning. An ordered $k$-simplex is an ordering of the vertices of a $k$-simplex, and we write $\sigma = [u_0, u_1, \ldots, u_k]$. Two ordered simplices have the same orientation if their orderings differ by an even number of transpositions. Each simplex has two orientations, except if it is a vertex, in which case it has only one. To set the stage, we give each simplex in $K$ an arbitrary but fixed orientation, and for a given oriented simplex $\sigma$, we write $-\sigma$ for the other orientation of the otherwise same simplex. A $k$-chain is a function from the $k$-simplices to the integers. It is convenient to write this function as a formal polynomial:

    $c = \sum_i a_i \sigma_i$,

where $a_i$ is the function value of $\sigma_i$. We add two chains componentwise, by adding the coefficients of like simplices:

    $\sum_i a_i \sigma_i + \sum_i b_i \sigma_i = \sum_i (a_i + b_i) \sigma_i$.

By definition, the boundary of $\sigma = [u_0, u_1, \ldots, u_k]$ is the alternating sum of ordered $(k-1)$-simplices obtained by dropping one vertex at a time:

    $\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [u_0, \ldots, \hat{u}_i, \ldots, u_k]$,

where the hat marks the deleted vertex. We can check that the boundary is independent of the ordering, as long as it belongs to the same orientation, and that it is the negative boundary for an ordering of the opposite orientation: $\partial_k (-\sigma) = -\partial_k \sigma$. As before, we define the group of $k$-chains, $C_k$, the group of $k$-cycles, $Z_k$, and the group of $k$-boundaries, $B_k$. Similarly, we can check that the Fundamental Lemma of Homology still holds: $\partial_{k-1} \partial_k c = 0$ for every $k$-chain $c$. The $k$-th homology group is again $H_k = Z_k / B_k$, and the $k$-th Betti number is the rank of that homology group: $\beta_k = \mathrm{rank}\, H_k$.

Torsion. A curious new phenomenon that arises with the use of integer addition is algebraic torsion. It does not occur for spaces that can be embedded in $\mathbb{R}^3$, so it is not part of people's immediate experience. Maybe the simplest topological space whose homology groups have torsion is the Klein bottle. It can be constructed from a rectangular piece of paper by gluing opposite sides as shown in Figure IV.19. Since it has torsion, we know that the Klein bottle cannot be embedded in $\mathbb{R}^3$, and when we draw it, we have to allow for a self-intersection. The 1-cycle marked around the neck of the bottle does not bound, but twice that 1-cycle bounds. This is what causes torsion. We thus get different Betti numbers for addition modulo 2 and for integer addition. Specifically, for the Klein bottle we have $\beta_0 = 1$, $\beta_1 = 2$ and $\beta_2 = 1$ for addition modulo 2, and $\beta_0 = 1$, $\beta_1 = 1$ and $\beta_2 = 0$ for integer addition. The Betti numbers obtained for $\mathbb{Z}$ and $\mathbb{Z}_2$ (or other coefficient groups) are not necessarily the same, but their differences are predictable and described by the Universal Coefficient Theorem of Homology [2, Chapter 7]. For the Klein bottle, the Betti numbers differ, but their alternating sums are both equal to the Euler characteristic: $\chi = 1 - 2 + 1 = 1 - 1 + 0 = 0$. Indeed, the Euler-Poincaré Theorem is true independent of the type of coefficients we choose to define homology groups and Betti numbers.

Figure IV.19: A triangulated rectangular piece of paper glued to form a Klein bottle.

To describe the phenomenon more generally, we need the fact that every finitely generated abelian group is isomorphic to a direct sum (Cartesian product) of copies of $\mathbb{Z}$ and of cyclic groups:

    $G \cong \mathbb{Z}^r \oplus \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \ldots \oplus \mathbb{Z}_{t_m}$.

Furthermore, we may require that all $t_i$ are larger than one and that $t_i$ divides $t_{i+1}$ for each $i$. This extra condition fixes $r$ and the $t_i$. The abelian group is thus the direct sum of a free subgroup, namely $\mathbb{Z}^r$, and the rest, namely $\mathbb{Z}_{t_1} \oplus \ldots \oplus \mathbb{Z}_{t_m}$, which is referred to as its torsion subgroup. The rank of the group is the number of copies of $\mathbb{Z}$, which is $r$. The $t_i$ are the torsion coefficients.

Algorithm revisited. We modify the above algorithm to transform the incidence matrix into normal form. The normal form of a bases transition matrix is the same as before, except that we now allow entries in the main diagonal that are neither zero nor one. Specifically, the initial sequence of ones is followed by integers $t_1, t_2, \ldots, t_m$, all larger than one, such that $t_i$ divides $t_{i+1}$ for each $i$. As for $\mathbb{Z}_2$ coefficients, we can determine the homology groups directly from the normal forms of all incidence matrices. We get the rank of the $k$-th homology group from the $k$-th and the $(k+1)$-st normal form matrices: $\beta_k = z_k - b_k$. We get the torsion coefficients from the $(k+1)$-st normal form matrix: they are the diagonal entries that exceed one.

First we extend the elementary row and column operations by allowing the multiplication of entire rows or columns by non-zero integers. A more substantial modification is needed within the function NONZERO, which now attempts to turn the next diagonal entry, $a_l^l$, into the smallest positive entry achievable by row and column operations. Unless the entire remaining sub-matrix is zero, this attempt will be successful and $a_l^l$ will divide every entry in the sub-matrix. To see this property, assume there is an entry, $a_i^j$, that is not an integer multiple of $a_l^l$:

    $a_i^j = q \, a_l^l + r$, with $0 < r < a_l^l$.

If $i = l$ we get a positive integer smaller than $a_l^l$ in a single column operation. Symmetrically, if $j = l$ we get such a positive integer in a single row operation. Otherwise, we may assume that $a_l^l$ divides both $a_i^l$ and $a_l^j$, and we can make $a_i^l$ zero with a row operation. By adding row $i$ to row $l$ we keep $a_l^l$ unchanged and we change $a_l^j$ to $a_l^j + a_i^j$, which is not an integer multiple of $a_l^l$. Now we get a positive integer smaller than $a_l^l$ in a single column operation. Since $a_l^l$ divides every entry in the remaining sub-matrix, it will also divide the future non-zero diagonal entries. Hence, the algorithm generates the torsion coefficients with the required properties.

The running time of the algorithm is no longer guaranteed to be at most cubic in the number of simplices, and it is not even clear whether or not it is polynomial in the input size. Indeed, the sequence of operations is sensitive to the size of the integers that arise.

Bibliographic notes. The matrix algorithm presented in this section is taken from [2, Chapter 1]. The normal form it uses is sometimes referred to as the Smith normal form [3], and the algorithm is sometimes called the Smith normal form algorithm. For integer coefficients, it is unclear whether or not its running time is polynomial in the input size. However, it is possible to modify the algorithm to guarantee polynomial running time [1, 4].

[1] R. Kannan and A. Bachem. Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8 (1979), 499–507.
[2] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[3] H. J. S. Smith. On systems of indeterminate equations and congruences. Philos. Trans. 151 (1861), 293–326.
[4] A. Storjohann. Near optimal algorithm for computing Smith normal forms of integer matrices. In "Proc. Internat. Sympos. Symbol. Algebraic Comput., 1997", 267–274.
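Since the ranks of the incidence matrices suffice for the Betti numbers, the modulo-2 case can be sketched without going all the way to normal form. The following Python sketch is an illustration under stated assumptions, not the book's pseudo-code: the helper names `rank_mod2` and `betti_mod2` are hypothetical, simplices are given as tuples of vertex labels, and the input is assumed to be a complex (closed under taking faces, no repeated simplices). It uses $\beta_k = c_k - \mathrm{rank}\, \partial_k - \mathrm{rank}\, \partial_{k+1}$, which follows from $z_k = c_k - b_{k-1}$.

```python
from itertools import combinations


def rank_mod2(rows):
    """Rank over Z_2 of a 0/1 matrix given as a list of equally long rows."""
    rows = [r[:] for r in rows]          # work on a copy
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # look for a pivot in this column among the not yet used rows
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                # adding rows modulo 2 is a componentwise XOR
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def betti_mod2(simplices):
    """Betti numbers over Z_2 of a complex given as vertex tuples."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))
    top = max(by_dim)
    ranks = {0: 0, top + 1: 0}           # the boundary maps at both ends are zero
    for k in range(1, top + 1):
        index = {f: j for j, f in enumerate(by_dim[k - 1])}
        matrix = []                      # rows: k-simplices, columns: (k-1)-simplices
        for s in by_dim[k]:
            row = [0] * len(index)
            for face in combinations(s, k):
                row[index[face]] = 1
            matrix.append(row)
        ranks[k] = rank_mod2(matrix)
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]


# Boundary of a tetrahedron (a triangulated sphere): 4 vertices, 6 edges, 4 triangles.
sphere = [s for d in (1, 2, 3) for s in combinations(range(4), d)]
print(betti_mod2(sphere))
```

Adding the tetrahedron `(0, 1, 2, 3)` itself fills the enclosed void, so the second Betti number of the solid tetrahedron drops to zero.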

Exercises

Amino acids. Take the graphs drawn in Figures I.8 and I.9 as definitions of the amino acids as (one-dimensional) topological spaces. Here an atom is a vertex and a bond is an edge, no matter whether or not it has (partial) double bond character. (i) Are there any two amino acids with isomorphic graphs? If yes, which ones? (ii) Calculate the Betti numbers and Euler characteristics of the graphs. (iii) Partition the collection of graphs into classes of the same homotopy type.

Protein structure. Download a protein structure from the pdb database and use the Alpha Shape software to compute the Betti numbers of its van der Waals and its solvent accessible diagrams.

Simple graphs. A simple graph is a simplicial complex that consists of vertices and edges but has no triangles or higher-dimensional simplices. Let $n$ be the number of vertices and $m$ the number of edges. Use the language of homology groups to re-confirm the following formulas, which are well-known for simple graphs: (i) ... if the graph is a tree, (ii) ... if the graph is connected, (iii) ... in general.

Equivalence classes. Consider the following topological spaces: a circle, a Möbius strip, a trefoil knot, a sphere with north-pole and south-pole removed, and a plane with origin removed. (i) Partition the collection into classes of same topological type. (ii) Partition the collection into classes of same homotopy type.

Torus and projective plane. Take a rectangular piece of paper and orient the left and right sides from top to bottom and the top and bottom sides from left to right. You get a torus if you glue the left side to the right side and the top side to the bottom side, each time with matching orientations. You get a projective plane if you glue again the left to the right and the top to the bottom sides but now with opposing orientations. (i) Triangulate the rectangle such that you get a valid triangulation for both ways of gluing its sides. (ii) Compute the Betti numbers of the torus and the projective plane by running either the incremental or the matrix algorithm (by hand) on your triangulations.

Joins and simplices. A tetrahedron can be defined as the join of two skew line segments in space. The halfway plane is parallel to both line segments and lies exactly halfway between them. Since the line segments are skew, the halfway plane separates the two line segments. (i) Show that the halfway plane intersects the tetrahedron in a parallelogram. (ii) Decomposing the line segments into $i$ and $j$ pieces implies a decomposition of the tetrahedron into $ij$ joins, which are smaller tetrahedra. Draw the decomposition and highlight the intersection with the halfway plane.

Stars and links. Let $K$ be the dual complex of a finite collection of balls in $\mathbb{R}^3$. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the link as the collection of faces of simplices in the star that do not belong to the star:

    $\mathrm{St}\, u = \{\sigma \in K \mid u \in \sigma\}$,
    $\mathrm{Lk}\, u = \{\tau \leq \sigma \mid \sigma \in \mathrm{St}\, u, \ \tau \notin \mathrm{St}\, u\}$.

(i) Show that $\mathrm{Lk}\, u$ is a complex, that is, every face of a simplex in the link also belongs to the link. (ii) Assume $u$ is the center of the ball $b$. The sphere bounding $b$ intersects all other balls in caps. Show that $\mathrm{Lk}\, u$ is isomorphic to the dual complex of that collection of caps.

Chapter V

Shape Features

The topological analysis of spaces, as discussed in Chapter IV, is an important first step, but by itself is insufficient to appropriately characterize the shape of protein structures. To decide what is appropriate, we need to have a purpose. The goal we have in mind is understanding how proteins interact with each other and with other molecules. There is overwhelming evidence that interesting events in such interactions happen preferably in cavities, which are partially protected regions in the protein or molecular assembly, and that local shape complementarity plays a significant role in making such events happen. It appears that organic life is based on computations performed by dynamically matching the (changing) pieces of a three-dimensional puzzle. A statement like this needs to be accompanied by a series of disclaimers: not every interaction is based on shape complementarity, interactions that are based on shape complementarity are not entirely so, and the relevant shape complementarity is local and imperfect. In other words, the situation is hopelessly complicated.

Our goal in this chapter is to introduce mathematical and computational methods that allow us to start talking about the real problem in more precise terms. We do this by introducing three essentially new concepts. In Section V.1, we make an attempt to give a precise meaning to cavities in proteins. The main idea here is to combine the topological concept of a hole with a minimum amount of geometric information, and this information is the evolution of the shape under growth. In Section V.2, we return to homology groups and introduce the concept of topological persistence. It is a measure of how important a topological feature is during the evolution. We see this as a tool to cope with imperfections as it permits us to distinguish topological features from topological noise. In Section V.3, we make an attempt to give a precise meaning to interfaces between interacting molecules. We define such an interface as a two-dimensional sheet separating the molecules. While this idea seems simple enough, the details are tricky and require that we use what we learned about pockets and topological persistence. Finally, in Section V.4, we illustrate the concepts using the Alpha Shape software and extensions.

V.1 Pockets
V.2 Topological Persistence
V.3 Molecular Interfaces
V.4 Software for Shape Features
Exercises

V.1 Pockets

In this section, we formalize the idea of a cavity in a protein by introducing the concept of a pocket in a space-filling diagram.

Voids. The simplest type of pocket is a void, which we define as a bounded connected component of the complement. Suppose, for example, that $B$ is a finite collection of closed balls in $\mathbb{R}^3$ and $F = \bigcup B$ is the space-filling representation of a molecule. Since $B$ is finite, the balls cannot cover the entire space, which implies that the complement, $\mathbb{R}^3 - F$, consists of one or more connected components. Exactly one component is unbounded (infinitely large), which we refer to as the outside, and all other components are voids. See Figure V.1 for an illustration of the definition in two dimensions.

Figure V.1: The union of disks has a single (shaded) void. The corresponding void in the dual complex consists of five triangles.

Recall that in Chapter II, we described a deformation retraction from the space-filling diagram, $F$, to the dual complex, $K$. The plain existence of that retraction implies that for each void in $F$ we have a void in $K$ that contains the void in $F$. Indeed, we can reverse the deformation retraction to show that the two voids have the same homotopy type. Since the dual complex is a subcomplex of the Delaunay triangulation, $D$, we may think of each void in $K$ as a collection of tetrahedra. The boundary is a collection of triangles in $K$. This collection bounds in $D$ but not in $K$. It follows that it represents a homology class in the second homology group of $K$. Indeed, the boundaries of the voids form a basis of that homology group. Hence, $\beta_2$ is the number of voids in $K$, which is the same as the number of voids in $F$.

Definition of pockets. A pocket generalizes the concept of a void by relaxing the requirement that it be disconnected from infinity. All we require is that a pocket be wider on the inside than at possible entrances from the outside. To formalize this intuition, we grow the space-filling diagram and observe how it changes: the relatively narrow entrances close before the inside disappears. In other words, a pocket is a maximal portion of space outside the space-filling diagram that turns into a void before it is subsumed by the growing diagram.

To make this idea concrete, we need to settle on a growth model. It is convenient to use the one that gave rise to the sequence of alpha complexes, but we should keep in mind that this choice does affect what we do and do not call a pocket. According to this model, the center of the ball $b_i$ in $B$ remains fixed and the radius at time $t$ is equal to the square root of $r_i^2 + t$, where $r_i$ is the original radius. We may think of the growth as pushing the points on the boundary of the space-filling diagram outwards, in normal direction. Figure V.2 illustrates this view in two dimensions. In the interior of the Voronoi cells, the vector field is defined by the sweeping spheres, in the direction normal to the surface. We extend it to the rest of space by using the circles that sweep out the Voronoi polygons and the intervals that sweep out the Voronoi edges.

Figure V.2: The growing disks push the points on the boundary outwards. Following the vectors, the points in the shaded region have paths that end at Voronoi vertices.

Starting at a point outside the space-filling diagram, we follow vectors and thus form a path that may or may not go to infinity. The points that flow to infinity form a single component. We define a pocket as a connected component of the set of points whose paths do not go to infinity. Each pocket is open where it borders the space-filling diagram and closed where it borders the outside. The latter set of points may formally be defined as the intersection of the pocket with the closure of the outside. Its connected components are open two-dimensional sets, which we refer to as the mouths of the pocket. Note that voids are pockets without mouths.

Similar to voids, we may associate a pocket of the space-filling diagram with a pocket of the dual complex. The latter is defined combinatorially, again by observing how the space-filling diagram changes as it grows.

Evolution of dual complex. The dual complex changes only at discrete moments, namely when the space-filling diagram encounters a new vertex, edge, polygon or cell of the Voronoi diagram. There are ten cases distinguished by the dimension of the dual Delaunay simplex, $\sigma$, and the relative position of its orthocenter, $z$. We recall that $z$ is the point at which the affine hull of $\sigma$ intersects the affine hull of its dual in the Voronoi diagram.

Case $\sigma$ is a vertex.
  Case $M_0$: $\sigma$ is a vertex and the orthocenter lies in the interior of the corresponding Voronoi cell. This cell is encountered at time $-r_i^2$, which is the moment when the $i$-th ball changes from imaginary to real radius.

Case $\sigma$ is an edge and $z$ lies in the interior of the corresponding Voronoi polygon. There are two generic sub-cases, both illustrated in Figure V.3.
  Case $M_1$: $z \in \sigma$. The two balls approach the Voronoi polygon from both sides, eventually touching it at $z$.
  Case $C_1$: $z \notin \sigma$. The two balls approach the polygon from the same side. At the moment they touch, the smaller ball breaks through the outer sphere and starts sweeping out the Voronoi cell on the other side of the polygon.

Case $\sigma$ is a triangle and $z$ lies in the interior of the corresponding Voronoi edge. There are three generic sub-cases, all illustrated in Figure V.4.
  Case $M_2$: $z \in \sigma$. The three balls completely surround the Voronoi edge before they touch at $z$.
  Case $C_2$: $z \notin \sigma$. Here we have two sub-cases depending on whether $z$ sees one or two edges from the outside. In the first sub-case, the three balls touch the Voronoi edge at the same moment they encounter the Voronoi polygon dual to the visible edge. In the second sub-case, the balls touch the edge at the same moment they encounter the two polygons and one cell dual to the two visible edges and the vertex they share.

Case $\sigma$ is a tetrahedron. Its orthocenter is necessarily the corresponding Voronoi vertex.
  Case $M_3$: $z \in \sigma$. The four balls completely surround the Voronoi vertex before they reach it.
  Case $C_3$: $z \notin \sigma$. Here we have three sub-cases depending on whether $z$ sees one, two or three triangles from the outside. The four balls touch the Voronoi vertex at the same moment they touch the Voronoi edges, polygons and cells that correspond to the triangles, edges and vertices visible from $z$.

In Case $C_1$ and in the last sub-case each of Cases $C_2$ and $C_3$, $z$ sees a vertex of $\sigma$ from the outside. Assuming $z$ lies outside the space-filling diagram, this is only possible if the ball centered at that vertex is contained inside the union of the balls centered at the other vertices of $\sigma$. This is unlikely to happen for molecular data and usually indicates a measurement or modeling mistake.

Figure V.3: The vertical lines are side views of polygons in space. The solid dot marks the orthocenter of the Delaunay edge. On the left, this edge intersects its dual Voronoi polygon, while on the right, it lies on one side of the polygon.

Figure V.4: The thin solid lines represent polygons that meet along a common edge in space. That edge appears as a solid dot, which marks the orthocenter of the triangle. From left to right: in the first case, the orthocenter lies inside the triangle; in the second case, it lies outside and sees one edge; in the third case, it lies outside and sees two edges and their shared vertex.

Metamorphoses and collapses. In four of the ten cases, namely in Cases $M_0$, $M_1$, $M_2$ and $M_3$, only one simplex is added to the dual complex. Consistent with the discussion in Chapter III, we call these operations metamorphoses, since they change the homotopy type. We will see shortly that the remaining six cases do not affect the homotopy type. They can be understood as inverses of the six types of collapses illustrated in Figure V.5. Recall that a principal simplex is not a face of any other simplex in the complex. A proper face $\tau$ of a principal simplex $\sigma$ is free if all simplices that contain $\tau$ are faces of $\sigma$. Such a pair defines a collapse, which is the operation that removes all simplices between and including $\tau$ and $\sigma$. Formally, the complex obtained from $K$ by collapsing the pair $\tau \leq \sigma$ is $K - \{\upsilon \mid \tau \leq \upsilon \leq \sigma\}$. It is convenient to specify the type using the dimensions $i = \dim \tau$ and $j = \dim \sigma$ and to talk about $(i,j)$-collapses. With this notation, the changes in the dual complex described in Case $C_j$ are caused by inverses of $(i,j)$-collapses, for $0 \leq i < j$. Each collapse can be realized as a deformation retraction that pushes a portion of $\sigma$'s boundary through $\sigma$ toward the remaining portion of the boundary. In the process, the retraction removes $\sigma$ and all faces of $\sigma$ that contain $\tau$. Being a deformation retraction, the operation does not affect the homotopy type of the complex, and neither does its inverse.

Figure V.5: From left to right, top to bottom: collapsing a tetrahedron from a triangle, an edge and a vertex (23-, 13- and 03-collapse), collapsing a triangle from an edge and a vertex (12- and 02-collapse), and collapsing an edge from a vertex (01-collapse). In each case, the collapse removes the tetrahedron, the transparent triangles, the dashed edges, and the dotted vertices, if any.

Partial order. Using the classification into ten different operations, we may introduce a partial order on the Delaunay simplices, which we think of as a discretization of the flow along normal vectors. We are only interested in tetrahedra. As noted in Case $C_3$, if the orthocenter $z$ of a Delaunay tetrahedron $\sigma$ lies outside $\sigma$ then it sees either one, two or three of the triangles. For each triangle visible from $z$, we define $\sigma \prec \tau$, where $\tau$ is the tetrahedron on the other side of the shared triangle. To cover the case in which the triangle lies on the boundary of the Delaunay triangulation, we introduce a dummy tetrahedron, $\tau_\infty$, that represents the space outside the triangulation.

Note that $\sigma \prec \tau$ implies that the square radius of the orthosphere of $\sigma$ is less than that of the orthosphere of $\tau$. If $\tau = \tau_\infty$, this is true because the orthoradius of $\tau_\infty$ is infinity, by definition. If $\sigma$ and $\tau$ are both (finite) Delaunay tetrahedra, this is true because their orthocenters are Voronoi vertices that lie on the same side of the plane separating $\sigma$ and $\tau$. As illustrated in Figure V.6, the two orthospheres intersect in a circle that lies in the separating plane and the orthocenter of $\tau$ is further from that plane than the orthocenter of $\sigma$. This implies that the square radius increases along every chain of the relation. Hence, $\prec$ is acyclic and its transitive closure is a partial order.

Figure V.6: Think of the triangles as projections of tetrahedra and of the circles as projections of spheres. The centers of both (dotted) orthospheres lie on the right of the separating plane.

Pockets of dual complex. We are now ready to define and compute the pockets of the dual complex using the partial order over the tetrahedra. By definition, the orthocenter of $\tau_\infty$ is at infinity, so $\tau_\infty$ can only be a successor but not a predecessor of other tetrahedra. This is what we call a sink of the relation. The other sinks are the tetrahedra that contain their orthocenters. The ancestor set of a tetrahedron $\tau$ contains $\tau$, its predecessors, the predecessors of the predecessors, and so on:

    $A(\tau) = \{\tau\} \cup \bigcup_{\sigma \prec \tau} A(\sigma)$.

We have seen that a tetrahedron can have more than one successor. It is also possible that it belongs to more than one ancestor set, although this is not the common case. See Figure V.8 for a two-dimensional illustration.

Figure V.8: The eight disks form one pocket, which connects to the outside along one mouth. The corresponding pocket in the dual complex consists of four triangles and a single mouth edge.

The pockets in the dual complex $K$ are defined by the tetrahedra that belong neither to the dual complex nor to the ancestor set of the dummy tetrahedron $\omega$. Note that this is more conservative than collecting all tetrahedra outside $K$ that belong to ancestor sets of finite sinks. We may do the computation for individual pockets or for all pockets at once. We compute the pockets in two steps:

Step 1. Collect the tetrahedra that belong neither to $K$ nor to the ancestor set of $\omega$.
Step 2. Partition this collection into components.

To collect the tetrahedra, we assume the Delaunay simplices are given in a list ordered by birth-time. As illustrated in Figure V.7, the relation over the tetrahedra is acyclic and goes monotonically from left to right.

Figure V.7: Ordered list of simplices with relation over the tetrahedra indicated by arrows.

In Step 1, we mark the tetrahedra in the ancestor set of $\omega$ by searching backward from $\omega$ along the pairs of the relation. Next, we mark the tetrahedra in the dual complex, which form a prefix of the sub-list of tetrahedra. To complete Step 1, we collect all unmarked tetrahedra in a single scan through the list. The resulting collection contains the tetrahedra of all pockets. In Step 2, we call two tetrahedra in this collection adjacent if they share a triangle that is not in the dual complex. Based on this adjacency information, we can compute the connected components using standard graph algorithms, such as depth-first search or union-find.

Computing mouths is similar to computing pockets, only one dimension lower.

Step 1. Collect the boundary triangles not in $K$.
Step 2. Partition this collection into components.

In Step 1, we collect the triangles not in $K$ that belong to exactly one pocket tetrahedron. In Step 2, we call two triangles adjacent if they share an edge that does not belong to $K$. Finally, we use the same standard graph algorithms to compute components.

Bibliographic notes. In everyday language we barely make any difference between pockets and other holes, such as the ones counted by the Betti numbers. The definition of a pocket is not purely topological and requires a crucial geometric component, namely the growth model of the input balls. This growth model forms the basis of the partial order over the Delaunay tetrahedra. That pockets need a geometric component has also been noticed by the philosophers Casati and Varzi [1], who introduce a concept they call a hollow, which is similar at least in spirit to our formal notion of a normal pocket. The importance of cavities in drug design and discovery has been known for a while [4]. The formalization as pockets introduced in this section has been described in [3] and implemented as part of the Alpha Shapes software. An extension to include simplices of all dimensions has been used for reconstructing the surface of scanned point sets [2] and might have further applications in the analysis of protein shape.

[1] R. Casati and A. C. Varzi. Holes and Other Superficialities. MIT Press, Cambridge, Massachusetts, 1994.
[2] H. Edelsbrunner. Surface reconstruction by wrapping finite sets in space. Discrete and Computational Geometry: The Goodman-Pollack Festschrift, eds. B. Aronov, S. Basu, J. Pach and M. Sharir, Springer-Verlag, Berlin, to appear.
[3] H. Edelsbrunner, M. A. Facello and J. Liang. On the definition and the construction of pockets in macromolecules. Discrete Appl. Math. 88 (1998), 83–102.
[4] I. D. Kuntz. Structure-based strategies for drug design and discovery. Science 257 (1992), 1078–1082.
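The two steps for computing pockets described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the input encoding (a dictionary of predecessors, a dictionary of shared triangles) and the names `ancestors` and `pockets` are hypothetical and not taken from the Alpha Shapes software.

```python
from collections import defaultdict

def ancestors(sink, predecessors):
    """Search backward from a sink along the pairs of the relation."""
    seen, stack = {sink}, [sink]
    while stack:
        t = stack.pop()
        for p in predecessors.get(t, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def pockets(tets, in_dual, predecessors, shared, omega="omega"):
    """tets: tetrahedra ordered by birth-time (omega excluded).
    in_dual[t]: True if t belongs to the dual complex.
    predecessors[t]: tetrahedra that have t as a successor.
    shared[(a, b)]: True if the triangle shared by a and b is in the dual complex."""
    # Step 1: mark ancestors of omega and dual complex tetrahedra, keep the rest.
    marked = ancestors(omega, predecessors) | {t for t in tets if in_dual[t]}
    pocket_tets = [t for t in tets if t not in marked]

    # Step 2: union-find over tetrahedra sharing a triangle outside the dual complex.
    parent = {t: t for t in pocket_tets}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), triangle_in_dual in shared.items():
        if a in parent and b in parent and not triangle_in_dual:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for t in pocket_tets:
        groups[find(t)].append(t)
    return sorted(sorted(g) for g in groups.values())
```

Mouths could be computed the same way one dimension lower, with triangles in place of tetrahedra and shared edges in place of shared triangles.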

Hence . the two components merge twice.3. and the second merge creates a void that eventually disappears. any two contiguous complexes differ either by a metamorphosis or an anti-collapse.9 as an example. it is possible to decide in an unambiguous manner whether or not the tetrahedron destroys what the triangle created. We can thus write the Betti numbers of in terms of the ranks of various groups defined for as follows:   ¥ ¨  ¡ ¨ ¡ M1 ¡ ¥ ¤   ¡ ¤  M0 M0    ¤ M2 of is zero because is not a face of any simplex in . The only matrices affected by adding to the complex are the ones of and of . then we are talking about a void with positive life-time.9: The region grows from two vertices. We will formalize the idea of pairing creations with destructions by revisiting the incremental algorithm for Betti numbers presented in Section IV. a component gets destroyed. is the same for hand.76 V S HAPE F EATURES . Incremental algorithm revisited.10: The addition of to the complex appends a column to the matrix of and a row to the matrix of . Consider the it destroys if its addition decreases evolving two-dimensional space illustrated in Figure V.  ¡  ¤  ¡   ¨ ¦  ¡§     £ ¤  ¢     ¡© ¨ ¥  ¨ ¨¡   £ ¦   ¡   ¡§       ¤       ¤ for the corresponding filtration. There are three events at which homology classes are created. When the components merge the first time. We will see that even if a triangle and a tetrahedron are added at different moments. and we may interpret that life-time as a measure of significance of the void.  ¥ ¨ ¢ ¨ £ ¥ ¤¨ ¢ M1 Figure V. For example. The measure can be used to distinguish between pockets with relatively wide and narrow entrances and they are essential in the definition of molecular interfaces discussed in the next section. 
Each anti-collapse may be viewed as a sequence of metamorphoses in which the later simplices destroy the topological features created by the earlier simplices.3 and depends on the effect on the Betti numbers: a -simplex creates if its addition increases and  ¤  ¡  ¤ Figure V. The are the complexes that arise during the evolution and. We may also interpret it as a shape measure of the corresponding pocket. which are displayed in Figure V. As before. may remain the same or it may increase. If it does. The labels indicate the types of metamorphoses that correspond to the topological changes. Let the dimension of be . we measure the life-time or persistence of a topological feature in an evolving topological space. Nobody destroys the component created by the left M . in the generic case. a 1-cycle gets destroyed. Then belongs to a -cycle. It should be clear that M destroys what the upper M created. Case creates. The life-time of this void is zero because the triangle and the tetrahedron are added at the same moment. and that the lower M destroys what the right M created. we write ¡  In this section. and when the hole gets filled. which implies that its row in the matrix of can be zeroed out. We study the algorithm in terms of matrices of boundary homomorphisms.10. the rank of the -th boundary as it is for . A prime example of an evolving topological space is a space-filling diagram that grows in the way discussed in the preceding section.2 Topological Persistence 0 )   ¡¤  ¤ ¢    ¢  ¤ ¢¡ ¤   ¥ . ¡ ¡  Ck C k −1 C k +1 0 Ck   ¡  ¡ ¥ ¡  ¡ ¡     ¡  ¡   ¥  ¡ ¥ ¥       ¤  0  ¤ ¡   0 The idea of creation and destruction is the same as in Section IV. a 23-collapse consists of a triangle creating a void and a tetrahedron filling the same. namely when the two components get born at the points labeled M and when the components merge the second time at the upper point labeled M . ¡ 0  V. 
Recall that a single step in that algorithm computes the Betti numbers of a complex from the Betti numbers of . On the other group. The new column of the matrix   The intuition.

In words, the $(k-1)$-st Betti number remains unchanged and the $k$-th Betti number increases by one.

Case $\sigma$ destroys. Then $\sigma$ does not belong to a $k$-cycle. Its row in the matrix of $\partial_k$ can therefore not be zeroed out and we get a new non-zero entry in the normal form of that matrix. In words, the $(k-1)$-st Betti number decreases by one and the $k$-th Betti number remains unchanged.

The case analysis confirms that the incremental algorithm as described in Section IV.3 computes the Betti numbers correctly.

Recognizing creations. Besides re-proving the correctness of the incremental algorithm, the above analysis points the way to an alternative procedure for distinguishing creating from destroying simplices. Instead of a union-find data structure, we use elementary row operations, which are slower but more general. Since we only use row operations, columns in the matrix of $\partial_k$ correspond to individual $(k-1)$-simplices and rows represent cycles. To describe how this is done, we call the column of the rightmost non-zero entry in a row its last column. Clearly, each row has at most one last column. Conversely, we maintain inductively that each column is last for at most one row. For example, this property is satisfied by the matrix in Figure V.11 before the shaded last row is added. After that addition, we use row operations to reinstate the property before adding the next row.

Figure V.11: The shaded rightmost non-zero entries identify last columns of rows.

When we add $\sigma$, we attempt to zero out its row from right to left. To explain the algorithm, we let $j$ be the index of the row that corresponds to the new simplex, and we assume a function LASTCOL that returns the index of the last column of a row; it returns zero if that last column does not exist. Given a column $\ell$, we also assume a function ROW that returns the index $i$ of the row, among the first $j-1$ rows, for which $\ell$ is the index of the last column. It returns zero if the row is not defined.

    boolean DOESCREATE (int j)
      while (l = LASTCOL(j)) ≠ 0 do
        if (i = ROW(l)) ≠ 0 then row j = row j + row i
        else return FALSE
        endif
      endwhile;
      return TRUE.

After running Function DOESCREATE for the $j$-th row, that row is either zero, in which case the corresponding simplex creates, or it has a unique last column, in which case $\sigma$ destroys. We argue below that Function DOESCREATE computes more than just Betti numbers: it also determines how long a homological feature lasts along the filtration.

Persistent homology. To make this precise, we return to the situation in which the filtration represents meaningful information, such as scale in the case of alpha shapes. In general, we define persistence so it depends on the time when simplices are added to the complex in the filtration, but to simplify matters here, we re-define time equal to the index. In other words, we say $\sigma_j$ is added at time $j$. Keeping this convention in mind, we now define the $p$-persistent $k$-th homology group of $K_j$ as the cycle group divided by the boundary group $p$ positions later in the filtration:

$$H_k^{j,p} = Z_k^j / (B_k^{j+p} \cap Z_k^j).$$

Taking the intersection of the boundary group with the cycle group is necessary for technical reasons to define the quotient group. Figure V.12 illustrates the difference between the $p$-persistent homology group and the usual or 0-persistent homology group.

Figure V.12: The cycle group and its decompositions into solid $p$-persistent homology classes and dotted 0-persistent homology classes.
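The row-reduction test for creating versus destroying simplices can be sketched in Python as follows. This is a minimal illustration rather than the author's implementation: it assumes coefficients modulo 2, stores each row as a set of column indices, uses zero-based indices (so an empty row plays the role of "LASTCOL returns zero"), and the names `pair_simplices` and `row_with_last` are hypothetical.

```python
def pair_simplices(filtration):
    """filtration: simplices as sorted vertex tuples, in order of addition,
    every simplex listed after all of its faces.
    Returns (pairs, unpaired) where pairs holds (i, j) with simplex i creating
    and simplex j destroying, and unpaired lists creators that are never destroyed."""
    index = {s: i for i, s in enumerate(filtration)}
    row_with_last = {}   # last column -> reduced row having that last column
    pairs, creators = [], set()

    for j, s in enumerate(filtration):
        # Row of the new simplex: its codimension-one faces (empty for a vertex).
        row = set()
        if len(s) > 1:
            for k in range(len(s)):
                row ^= {index[s[:k] + s[k + 1:]]}

        # Zero out the row from right to left using earlier rows.
        while row:
            last = max(row)
            if last not in row_with_last:
                break                      # DOESCREATE would return FALSE
            row ^= row_with_last[last]     # row j = row j + row i (mod 2)

        if not row:
            creators.add(j)                # row is zero: the simplex creates
        else:
            last = max(row)
            row_with_last[last] = row      # column `last` is now last for row j
            creators.discard(last)
            pairs.append((last, j))        # simplex j destroys what `last` created
    return pairs, sorted(creators)
```

For a filled triangle added vertex by vertex, edge by edge, and finally the triangle itself, the sketch pairs two vertices with the first two edges, the third edge with the triangle, and leaves one vertex unpaired.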

The $p$-persistent $k$-th Betti number is the rank of the $p$-persistent $k$-th homology group: $\beta_k^{j,p} = \mathrm{rank}\, H_k^{j,p}$.

Pairing. We develop an intuitive picture of persistence using the distinction between creating and destroying simplices. Note that the number of creating $k$-simplices until position $j$ in the filtration is the rank of the cycle group, $\mathrm{rank}\, Z_k^j$. Similarly, the number of destroying $(k+1)$-simplices is the rank of the boundary group, $\mathrm{rank}\, B_k^j$. The Betti number is the surplus of creating versus destroying simplices: $\beta_k^j = \mathrm{rank}\, Z_k^j - \mathrm{rank}\, B_k^j$. Because Betti numbers are non-negative, every prefix contains at least as many creating $k$-simplices as destroying $(k+1)$-simplices. In other words, the creating $k$-simplices and destroying $(k+1)$-simplices are arranged like opening and closing parentheses in an expression, except that some closing parentheses may be missing at the end. We can therefore pair them up and form vertex disjoint intervals, each starting at the position of a creating $k$-simplex and ending at the position of a destroying $(k+1)$-simplex (or extending to infinity if there are no destroying simplices left). We use intervals that are closed to the left and open to the right. This is the convention we used to generate Figure IV.16. The Betti number at position $j$ is then the number of intervals that contain $j$. Any arbitrary pairing creating vertex disjoint intervals has this property for Betti numbers. (Can you prove that?) In contrast, there is exactly one pairing that has the following stronger property for persistent Betti numbers:

INTERVAL PROPERTY. The $p$-persistent $k$-th Betti number at position $j$ is the number of intervals that simultaneously contain $j$ and $j+p$.

We illustrate this property by drawing a right-angled isosceles triangle below every interval, as shown in Figure V.13. Each triangle is closed along the top and left edges but open along the hypotenuse. The $p$-persistent $k$-th Betti number of $K_j$ is represented by the point $(j, p)$ in the index-persistence plane. According to the Interval Property, it is the number of right-angled isosceles triangles that contain this point. Figure V.14 shows the persistent first Betti numbers of the space-filling diagram modeling the gramicidin protein.

Figure V.13: Each right-angled isosceles triangle in the index-persistence plane represents a non-bounding cycle that persists over the complexes covered by its interval. (Axes: index, persistence.)

Figure V.14: Graph of the number of tunnels in log scale for gramicidin. The index in the filtration varies from left to right and the persistence from back to front. Observe the large triangular plateau, which corresponds to the dominant tunnel that passes through gramicidin.

The pairing of simplices to obtain intervals satisfying the Interval Property is done using Function DOESCREATE explained above. Specifically, each destroying $(k+1)$-simplex corresponds to a non-zero row in the matrix of $\partial_{k+1}$ and is paired with the $k$-simplex that corresponds to the last column in that row. Note that this simplex indeed creates, as is witnessed by the cycle represented by the row. The persistence of a pair is the time-lag between the additions of the two simplices to the complex in the filtration. In the assumed simplified case in which $\sigma_j$ is added at time $j$, the persistence is the difference between the indices of the two simplices.

The running time of the pairing algorithm is roughly the same as that of the normal form algorithm described in Section IV.4, namely cubic in the number of simplices. Indeed, for each simplex, Function DOESCREATE performs at most a linear number of row operations, each taking at most linear time in the number of simplices.
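The Interval Property translates directly into a counting rule. The following Python sketch assumes the intervals are already known and are given as half-open pairs `(start, end)`, closed to the left and open to the right, with `math.inf` for a creating simplex that is never destroyed; the function name `persistent_betti` is hypothetical.

```python
import math

def persistent_betti(intervals, j, p=0):
    """Number of intervals [start, end) that contain both j and j + p.
    With p = 0 this is the ordinary Betti number at position j."""
    return sum(1 for start, end in intervals if start <= j and j + p < end)

# Intervals of components for three vertices added at positions 0, 1, 2,
# followed by edges at positions 3 and 4 that merge them.
components = [(0, math.inf), (1, 3), (2, 4)]
betti_at_2 = persistent_betti(components, 2)            # three components
betti_at_2_p1 = persistent_betti(components, 2, p=1)    # two survive one more step
```

In the index-persistence plane, each call evaluates how many triangles contain the point $(j, p)$.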

Bibliographic notes. The material for this section is taken from [1], where we find the definition of persistent Betti numbers, the algorithm and its correctness proof. The algorithm has been implemented and experimental results suggest it is considerably faster than the obvious cubic time bound. We should note, however, that the implementation in [1] differs in two possibly significant aspects from the algorithm described in this section. First, the implementation uses a union-find data structure to classify simplices as creating or destroying, and second, it uses a sparse matrix representation that permits row operations in time proportional to the number of non-zero entries. Persistent Betti numbers have been defined independently by Robins [3], who uses them to study the fractal nature of two-dimensional point patterns. Persistent homology groups are embedded in spectral sequences, which are special tables of related homology groups [2]. It might be interesting to explore the other groups in that table and to find meaningful interpretations in the context of alpha complexes.

[1] H. Edelsbrunner, D. Letscher and A. Zomorodian. Topological persistence and simplification. Discrete Comput. Geom. 28 (2002), 511–533.
[2] J. McCleary. A User's Guide to Spectral Sequences. Second edition, Cambridge Univ. Press, England, 2001.
[3] V. Robins. Toward computing homology from finite approximations. Topology Proceedings 24 (1999), 503–532.

V.3 Molecular Interfaces

The interface between two or more interacting molecules is the location of that interaction. In this section, we present a proposal for a surface or complex of surfaces that geometrically represents that interface. One of its applications is to display functions defined over the interface.

Interfaces without boundary. Our definition of a molecular interface is a formalization of two intuitions, namely that the best separation of two or more molecules is part of the Voronoi diagram and that the interesting portion of that separation is protected by a relatively tight seal. We will come back to the second intuition later and formalize the first intuition now. Consider an assembly of molecules, each represented by a collection of balls $B_i$ in $\mathbb{R}^3$, and let $B$ be the collection of all balls. We use colors to keep track of the correspondence between balls and molecules. Specifically, if a ball $b$ belongs to $B_i$ then we say $b$ and its Voronoi cell have the color $i$. Recall that the Voronoi diagram of $B$ consists of a polyhedral cell for each ball and of the polygons, edges and vertices shared by the cells. The polygons, edges and vertices get their colors from the cells they belong to. While all cells are mono-chromatic, a polygon can be mono-chromatic or bi-chromatic depending on whether the two cells that share the polygon have the same or different colors. The interface between the colors is the subcomplex of the Voronoi diagram consisting of all bi-chromatic polygons and their edges and vertices. Figure V.15 illustrates the definition by showing the interface of two collections of disks in the plane.

Figure V.15: The solid bi-chromatic edges form the interface of the two collections of disks. The dotted mono-chromatic edges show the rest of the Voronoi diagram.

Local structure. In the generic case, every edge belongs to three and every vertex to four Voronoi cells. For two colors, the interface then has a particularly simple local geometric structure. An interface edge belongs to two cells of one and to one cell of the other color, and exactly two of the three polygons sharing the edge are bi-chromatic and thus belong to the interface. There are two types of interface vertices: those that belong to three cells of one and one cell of the other color and those that belong to two cells of each color. As illustrated in Figure V.16, the local neighborhood of both types of vertices is a topological disk. We conclude that in the generic case the interface for two colors is a 2-manifold, which is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$. By construction, that 2-manifold is orientable, with the cells of one color on one side and the cells of the other color on the other side.

Figure V.16: The shaded polygons and their edges belong to the interface. On the left, we have three cells of one and one cell of the other color. On the right, we have two cells of each color.

For three or more colors, the local structure of the interface can be more complicated because we may have tri-chromatic edges and tri- and four-chromatic vertices. For any two colors, we get a 2-manifold, but now these 2-manifolds meet along curves formed by tri-chromatic edges. This implies that for three or more colors, the interface is a two-dimensional complex of sheets, curves and vertices. Every sheet is a maximal component consisting of bi-chromatic polygons, edges and vertices of a given color pair. Similarly, every curve is a maximal component consisting of tri-chromatic edges and vertices of a given color triplet. Finally, every interface vertex is a four-chromatic vertex in the Voronoi diagram. Together, the sheets, curves and vertices form a complex in the sense that the boundary of every sheet consists of finitely many pairwise disjoint curves and vertices, and the boundary of every curve consists of finitely many interface vertices.
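The coloring rules of this section can be sketched in a few lines of Python. The sketch assumes each Voronoi polygon is given by the pair of balls whose cells share it and each Voronoi vertex by the four balls whose cells meet there (the generic case); the function names and the labels returned by `vertex_type` are hypothetical.

```python
from collections import Counter

def interface_polygons(polygons, color):
    """Keep the bi-chromatic polygons: those shared by cells of different colors."""
    return [(a, b) for a, b in polygons if color[a] != color[b]]

def vertex_type(cells, color):
    """Classify a generic Voronoi vertex by how its four cells split into colors."""
    split = tuple(sorted(Counter(color[c] for c in cells).values()))
    labels = {
        (4,): "mono-chromatic",   # not on the interface
        (1, 3): "three-one",      # three cells of one color, one of the other
        (2, 2): "two-two",        # two cells of each color
    }
    return labels.get(split, "more than two colors")

color = {0: "red", 1: "red", 2: "blue", 3: "blue", 4: "red", 5: "red"}
polygons = [(0, 1), (1, 2), (2, 3), (0, 3)]
bichromatic = interface_polygons(polygons, color)
```

For two colors, only the "three-one" and "two-two" vertices lie on the interface, matching the two cases of Figure V.16.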

we delete principal triangles. we retract the interface back to the multi-chromatic dual of the dual complex and its pockets. we clip the polygon before adding it to the interface.    ¨   ¨   ¤  ¤   ¨ void C OLLAPSE if and forall faces endif. We use 23-collapses to remove these tetrahedra. Figure V. In the second step. the interface may go to infinity. More specifically. we connect the cut points in contiguous pairs and retain the portions of the polygon with vertices of the first type. We will return to the second step later. The interface is now obtained as the dual of . we take pairs from the stack and add new pairs whenever we create new boundary triangles by collapsing. in other words. There are. C OLLAPSE endwhile. It seems natural to do this with a distance threshold. We further remove all mono-chromatic tetrahedra and let denote the remaining collection of multi-chromatic tetrahedra. In other words. which happens in rare cases. During the process. We therefore shrink from outside in and use relative rather than absolute distance measurements to decide where to stop the process. we clip at the endpoint that is closer to the plane. This is equivalent to saying that the effect of the 23-collapse is the inverse of that anticollapse. complications because such a bi-chromatic edge may either be completely or only partially surrounded by tetrahedra in . Finally.  In this context. We have mono-chromatic vertices and mono. Initially. We clip the polygon by cutting each edge connecting vertices of different types with the plane of the corresponding boundary triangle. which is sometimes a disadvantage. but this would most certainly lead to the deletion of interior portions and produce fractured surfaces. In the latter case.17 illustrates this idea in two dimensions. which represents the space outside the Delaunay triangulation. As defined above. triangles and tetrahedra. however. In the implementation of this operation. 
The result of the retraction is the collection of tetrahedra in the dual complex together with the tetrahedra in the pockets. To describe the shrinking process. We define a retraction as a maximal sequence of collapses. we consider the Delaunay triangulation of the collection of balls . edges and vertices as soon as they arise. Clipping. In the first step. we add the dual polygon to the interface. edge corresponds to a polygon with two types of vertices: those dual to tetrahedra in and the others.3 Molecular Interfaces Retraction.   © ¢ ¨ ¦ ¤ ¢   §§£§¥£¡ 81 We may think of a retraction as successively removing sinks from an acyclic directed graph. The boldface interface is dual to and clipped at the boundary of this collection. If that plane does not intersect the dual Voronoi edge. The interface as defined above is dual to the subset of multi-chromatic simplices in . we use topological persistence to shrink the interface even further. It follows that the result of the operation is independent of the sequence in which the collapses are performed. We simplify the algorithm by ignoring principal triangles. this stack contains all boundary triangles of the Delaunay triangulation together with their incident tetrahedra. edges and vertices.as well as multichromatic edges. we maintain a stack of candidate pairs. for each bi-chromatic edge of the tetrahedra in . : is collapsible then do delete endfor     ¨   ) ¨ ¨ ¤ ¤ ¢¡    ¨ ) Figure V. A partially surrounded bi-chromatic     Complex R ETRACT : while the stack is non-empty do P OP. but we should keep in mind that the situation in three dimensions is more complicated.V. Note that the first step of the shrinking process is equivalent to removing all tetrahedra outside the dual complex that belong to the ancestor set of the dummy tetrahedron. 
Our goal here is to shrink the interface back to where the molecules are sufficiently close to interact.17: The triangles drawn with solid edges are the bichromatic triangles constructed by the contraction algorithm.     ¨ . we collapse as long as we can. we consider collapsible if the pair is part of an anti-collapse in the construction of the filtration and the collapse of and renders the other simplices in this anti-collapse principal. Let denote the dual complex.

Further retraction. We now take the shrinking process beyond the retraction from the dummy tetrahedron. Recall that the topological persistence algorithm of Section V.2 generates simplex pairs $(\sigma, \tau)$ with the property that $\tau$ destroys what $\sigma$ created. The dimension of $\tau$ is one larger than that of $\sigma$, but we are only interested in the case in which $\sigma$ is a triangle and $\tau$ is a tetrahedron. We think of the operation that removes $\sigma$ and $\tau$ as a generalization of a 23-collapse, but it is more complicated because $\sigma$ is generally not a face of $\tau$, although it can be. We do the operation only if $\sigma$ is a boundary triangle of the current set and does not belong to the dual complex. We first delete […] and then retract from […]. If the retraction from […] reaches far enough, […] gets deleted just because it becomes principal. However, it can happen that the retraction does not reach all the way, in which case we recurse for other pairs of simplices before deleting […]. As before, we remove principal triangles, edges and vertices. This is done implicitly during the retraction.

To decide whether or not to remove $\sigma$ and $\tau$ in the first place, we compare their persistence with a constant threshold and remove only if […]. Here, […] and […] are the moments when $\sigma$ and $\tau$ are born. Note that for […] we have

  […]   (V.1)

This monotonicity property is important for the correctness of the algorithm because if the retraction from […] does not reach […] then this can only be because there is a triangle between $\sigma$ and $\tau$ that split the void created by $\sigma$ before it was destroyed by $\tau$. But then the other part of the void must have been destroyed by a tetrahedron preceding $\tau$ in the filtration. The monotonicity guarantees that the simplices between $\sigma$ and $\tau$ are removed by recursive deletions so that […] can eventually be deleted.

We now restate the algorithm and simplify its description by declaring a 23-collapse as a special case of a removal. Because of our policy to delete principal triangles, edges and vertices as soon as they get created, all other collapses can be ignored. We may start with the set of all Delaunay tetrahedra. The algorithm maintains a stack of triangle-tetrahedron pairs formed by the topological persistence algorithm. Initially, the stack contains all pairs with […] on the boundary of the current set.

  Complex RETRACTMORE
    while the stack is non-empty do
      POP […];
      if […] then REMOVE […] endif
    endwhile.

  void REMOVE […]:
    if […] then
      delete […];
      forall triangles […] do PUSH […] endfor;
      RETRACT […]
    endif.

Here, […] is the tetrahedron that shares […] with […]. Finally, we get the interface by duality from the computed collection of tetrahedra. The running time is dominated by the topological persistence algorithm, which takes cubic time to form the triangle-tetrahedron pairs. With some care, we can implement the rest of the algorithm so it takes only constant time per simplex in the Delaunay triangulation.

Note that we may get different interfaces for different values of the threshold. Since a smaller threshold permits as many or more removals than a larger threshold, the interface shrinks with decreasing threshold. In other words, we get a filtration that is parametrized in a way similar to the sequence of alpha shapes. For a sufficiently large threshold, the interface is the original surface or complex defined by the set of bi-chromatic Voronoi polygons, unless the dual complex contains bi-chromatic triangles, which would remain. However, we can further decrease the interface by making the threshold negative, but we have to modify the retraction to allow for collapses of simplices in the dual complex. Eventually, for a sufficiently small threshold, the interface is guaranteed to be empty.

We note that it is possible to use other functions that satisfy the monotonicity property (V.1). For example, we may bias the shrinking process against large triangles and tetrahedra by using […], where […] and […] are the moments when $\sigma$ and $\tau$ are born. A second potential advantage of this function over the inverse of the persistence is that it is dimensionless and thus amenable to the use of universally meaningful constant thresholds.

Global structure. For a fixed threshold, the interface is a two-dimensional complex. Its two-dimensional elements are sheets defined by bi-chromatic Voronoi polygons. There are two kinds of one-dimensional elements: the original tri-chromatic curves and the new bi-chromatic curves outlining the sheet boundary created by shrinking. Similarly, there are two kinds of zero-dimensional elements, namely the original four-chromatic vertices and the new tri-chromatic vertices forming the curve boundary created by shrinking. We take all sheets and curves as open sets so the complex is a collection of pairwise disjoint open elements.

Note, however, that the elements are not necessarily simply connected. To explore this further, we excise thin strips along the curves to turn each sheet into a connected 2-manifold with boundary. Each component of the boundary is a closed curve outlining a hole in the 2-manifold. In topology, 2-manifolds with and without boundary have been studied for more than a century. A classic result in topology says that two orientable 2-manifolds with boundary are homeomorphic if and only if they have the same genus and the same number of holes. We may think of such a manifold as obtained by punching $h$ holes into a $g$-fold torus. Furthermore, the Euler characteristic of a 2-manifold with genus $g$ and $h$ holes is

  $\chi = 2 - 2g - h = v - e + t$,

where $v$, $e$ and $t$ are the numbers of vertices, edges and triangles of an arbitrary triangulation of the 2-manifold. Given a sheet, it is easy to compute its Euler characteristic and to determine its number of holes. We then get the genus as $g = (2 - h - \chi)/2$.

Bibliographic notes. The material in this section is taken from the recent manuscript by Ban et al. [1]. A competing proposal for a geometric definition of molecular interfaces can be found in [3], where two independent real parameters are used to define the interface as a portion of the molecular surfaces of the two or more molecules. There is evidence that the geometric interfaces shed new light on the hot-spot theory of protein-protein interaction [4]. The fact that the topological type of a connected orientable 2-manifold is determined by the genus and the number of holes can be found in a number of texts, including [2].

[1] Y.-E. A. Ban, H. Edelsbrunner and J. Rudolph. A definition of interfaces for protein oligomers. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[2] W. S. Massey. Algebraic Topology: an Introduction. Springer-Verlag, New York, 1967.
[3] A. Varshney, F. P. Brooks, Jr., D. C. Richardson, W. V. Wright and D. Manocha. Defining, computing and visualizing molecular interfaces. In “Proc. IEEE Visualization, 1995”, 36–43.
[4] J. A. Wells. Binding in the growth hormone receptor complex. Proc. Natl. Acad. Sci. 93 (1996), 1–6.
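The genus computation for a sheet described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sheet is given as a list of vertex-index triples that triangulate a connected, orientable 2-manifold with boundary, and the function name `sheet_topology` is hypothetical, not part of any software mentioned in the text. The number of holes is counted as the number of closed curves formed by edges that belong to exactly one triangle.

```python
from collections import defaultdict


def sheet_topology(triangles):
    """Return (chi, holes, genus) for a connected, orientable, triangulated
    2-manifold with boundary, given as a list of vertex-index triples."""
    vertices = set()
    edge_count = defaultdict(int)  # edge -> number of incident triangles
    for a, b, c in triangles:
        vertices.update((a, b, c))
        for u, v in ((a, b), (b, c), (c, a)):
            edge_count[frozenset((u, v))] += 1

    # Euler characteristic: vertices - edges + triangles.
    chi = len(vertices) - len(edge_count) + len(triangles)

    # Boundary edges belong to exactly one triangle. In a 2-manifold with
    # boundary they form disjoint closed curves, one per hole.
    neighbors = defaultdict(list)
    for edge, count in edge_count.items():
        if count == 1:
            u, v = tuple(edge)
            neighbors[u].append(v)
            neighbors[v].append(u)
    holes = 0
    seen = set()
    for start in neighbors:
        if start in seen:
            continue
        holes += 1  # a new boundary curve
        stack = [start]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(neighbors[u])

    # chi = 2 - 2g - h, solved for the genus g.
    genus = (2 - holes - chi) // 2
    return chi, holes, genus


# Annulus between the boundary cycles 0-1-2 and 3-4-5.
annulus = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4), (2, 0, 5), (0, 3, 5)]
# Seven-vertex triangulation of the torus (no boundary).
torus = [(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)] + \
        [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)]
print(sheet_topology(annulus))
print(sheet_topology(torus))
```

The annulus has two holes and genus zero, while the closed torus has no holes and genus one, so the same routine covers sheets with and without boundary.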

V.4 Software for Shape Features

In this section, we explore extensions of the Alpha Shape software that are concerned with connectivity information and shape features. We begin with signatures, then proceed to pockets, and finally look at molecular interfaces.

Betti number signatures. As explained in Section IV.2, the components, tunnels and voids of a complex in $\mathbb{R}^3$ are counted by the Betti numbers $\beta_0$, $\beta_1$, and $\beta_2$. They are computed by the algorithm explained in Section IV.3 and displayed to the right of the correspondingly labeled buttons in the signature panel shown in Figure V.19. To the left of each button we can toggle the display of the evolution of the number as a function of the index in the filtration. We refer to these functions as signatures of the data set.

As an example consider the zeolite data shown in Figure V.18. Two of the three views are taken along tunnel systems that intersect orthogonally and give rise to a rather complicated cave system. Note that the tunnels shown in the second view are smaller in diameter than those shown in the third view. It follows that there are complexes in the filtration that have the tunnels in the first system closed while the tunnels in the second system are still open. The two systems can be detected in the tunnel signature shown in Figure V.19. The index 2,354 belongs to the higher of the two plateaus, which implies that both tunnel systems are open in the displayed complex.

The persistence of the tunnels is formally defined in Section V.2. Figure V.20 shows the two-dimensional tunnel signature with filtration index increasing from left to right and persistence increasing from back to front. The noise in the signature decreases from back to front. The two persistent tunnel systems are visible as plateaus that escape the noise removal the longest.

Figure V.18: Three axis-parallel views of the 2,354-th dual complex in the filtration of a periodic zeolite molecule consisting of 1,296 atoms.

Figure V.19: The signature panel with the tunnel signature displayed in log-scale.

Figure V.20: The graph of the number of tunnels, in log-scale, of the zeolite data.

Displaying pockets. Prior to developing and implementing pockets, we have experimented with other and more simple-minded ideas aimed at getting a handle on cavities in molecular data. One such idea was to display the difference between the Delaunay triangulation and the dual complex, or more generally the difference between two dual complexes in the filtration.

This difference between two dual complexes can be computed in the Alpha Shape software by first selecting the two complexes and second pushing the ‘Difference’ button in the scene panel. The results are not encouraging because a typically large number of inessential simplices clutters the view of important cavities. In contrast, the dual set of a pocket usually gives a clear indication of the cavity. We should keep in mind that the pocket in the dual complex is geometrically considerably larger than the pocket in the corresponding space-filling diagram. This effect is the reverse of that for the molecule, whose dual complex is considerably smaller than a corresponding space-filling diagram, as shown in Figure II.17.

Remember that pockets in the dual complex are not closed under the face relation. The software indicates the presence or absence of boundary triangles by the choice of color. The mouth regions are therefore visually easily identifiable. However, the internal connectivity of the pockets is not immediately visible, which may lead to confusion. For example, two pockets may appear connected but are not because of missing shared triangles. We observe the same phenomenon for the mouths of a pocket. Two boundary triangles that share a common edge may or may not belong to the same mouth depending on which shared edges belong to the pocket. It is possible to visually inspect the connectivity by turning on the display of simplices of all dimensions in the scene panel, and using the explosion function to separate all simplices.

Figure V.21: All pockets in the dual complex of the zeolite data for index 2,926.

Figure V.22: Side view of the largest pocket of the collection shown in Figure V.21.

Pocket panel. Pockets can be computed without opening the pocket panel, but a more detailed exploration requires interaction with the software, which is facilitated by that panel. The main design of the pocket panel, shown in Figure V.23, is similar to that of the signature panel. It contains a window for its own signatures, which start after the index of the first chosen complex. The panel also provides a means to step through the sequence of individual pockets and to select pockets by their number of mouths. The interface also supports the display of individual pockets, and Figure V.22 shows the largest of the pockets in Figure V.21 from a different angle. A useful feature is the ‘Shapewire’ button, which can be used to display the edge skeleton of the dual complex together with the pockets, as in Figure V.21. The skeleton does not block the view and helps positioning the pockets relative to the complex.

The second index can be chosen anywhere between the first index and the maximum. It is used to eliminate ancestor sets of tetrahedra whose indices are larger than or equal to the second index. In other words, all tetrahedra with such indices are treated like […] in the computation of pockets. This elimination of large pockets helps in the exploration of detailed structures, such as side pockets of larger pockets. An example is shown in Figure V.24, which shows the pockets filling the system of narrow tunnels visible in the second view in Figure V.18, but with the second index set such that the system of wider tunnels visible in the third view of Figure V.18 is still open. Both systems are shown as holes in Figure V.18. The pockets thus only fill the remains of the narrow tunnels, and as can be seen in the first view, these remains are not connected.

Figure V.23: Pocket panel of the Alpha Shape software.

Figure V.24: Three axis-parallel views of the pockets representing the narrow tunnel system decomposed into pieces by opening up the wide tunnel system.

Displaying interfaces. The interface software has been developed by Yih-En (Andrew) Ban but is not yet complete. It is built on top of the Alpha Shapes software but requires a variety of additional features to be useful to biologists. Some of these features can be seen in visualizations of interfaces presented in this section. [The input is a complexed collection of proteins.] [Mention the issue of water molecules, which we remove for simplicity.] [Talk about the weighted square distance function over the interface.] [Show one figure with iso-lines of that function.] [Show the sequence of figures illustrating the interface filtration.]

A human growth hormone example. [Say a few words about the particular two proteins.]

Bibliographic notes. The pocket software has been developed by Michael Facello and is described in his dissertation [1]. Using this software, Liang and collaborators [3] studied fifty-one proteins and their cavity structure. The most interesting outcome of that study is perhaps that in about 80% of the cases, the pocket with the largest volume is also the biologically active site of the molecule. In many instances, the largest pocket is assisted in its function by smaller auxiliary pockets in the vicinity. In another application, Liang and Dill [2] provide numerical evidence that proteins are packed tighter in the core than near the outside. The persistence software has been developed by Afra Zomorodian and is described in his dissertation [4]. It is currently not part of the Alpha Shape software.

[1] M. A. Facello. Geometric Techniques for Molecular Shape Analysis. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, 1996.
[2] J. Liang and K. A. Dill. Are proteins well-packed? Biophysics J. 81 (2001), 751–766.
[3] J. Liang, H. Edelsbrunner and C. Woodward. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand binding. Protein Science 7 (1998), 1884–1897.
[4] A. Zomorodian. Analyzing and Comprehending the Topology of Spaces and Morse Functions. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, 2001.

Exercises

Sperner’s Lemma. Let $abc$ be a triangle and $K$ a triangulation of $abc$. The label of a vertex in that triangulation that lies on an edge of $abc$ is one of the two endpoints of that edge, and the label of a vertex in the interior of $abc$ is either $a$, $b$, or $c$.
(i) Prove that there exists at least one triangle in $K$ whose vertices have three different labels.
(ii) Strengthen the result in (i) by proving that the number of triangles with three different labels is odd.
(iii) What would be a natural generalization of these results from a triangle to a tetrahedron?

2-manifolds. Recall that a 2-manifold is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$.
(i) Show that a two-dimensional simplicial complex in which every edge belongs to exactly two triangles is not necessarily a 2-manifold.
(ii) Show that a simplicial complex in which the closed star of every vertex is the triangulation of a disk is necessarily a 2-manifold.

Barycentric subdivision. Let $K$ be a simplicial complex and let $\mathrm{Sd}\,K$ denote its barycentric subdivision.
(i) Show that each $k$-simplex in $K$ gives rise to $(k+1)!$ $k$-simplices in $\mathrm{Sd}\,K$.
(ii) Prove that the Euler characteristics of $K$ and $\mathrm{Sd}\,K$ are the same.

Collapsible complexes. We call a simplicial complex $K$ collapsible if there is a sequence of collapses that reduces it to a single vertex. Recall that a contractible topological space has the homotopy type of a point. Clearly, if $K$ is collapsible then its underlying space is contractible.
(i) Prove that if $K$ is embedded in $\mathbb{R}^2$ then $K$ is collapsible iff its underlying space is contractible.
(ii) Give an example of a simplicial complex embedded in $\mathbb{R}^3$ that is not collapsible but whose underlying space is contractible.

Ancestor sets in the plane. Consider the Delaunay triangulation of a finite point set in $\mathbb{R}^2$. Write $\sigma \prec \tau$ if the two Delaunay triangles $\sigma$ and $\tau$ share an edge and both orthocenters lie on $\tau$’s side of that edge.
(i) Prove that $\prec$ is a partial order.
(ii) Prove that the ancestor sets of any two different sinks in the order are disjoint.

Gabriel graph. Let $S$ be a finite set of points in $\mathbb{R}^2$. The Gabriel graph of $S$ consists of all edges $ab$, with $a, b \in S$, for which $\|p - a\|^2 + \|p - b\|^2 \geq \|a - b\|^2$ for all points $p \in S$.
(i) Prove that all edges in the Gabriel graph belong to the Delaunay triangulation of $S$.
(ii) Prove that the Gabriel graph is connected.
(iii) Explain how the Gabriel graph relates to the ancestor sets of the sinks.

Connectivity of voids. A void of a space-filling diagram is by definition connected but can have handles and islands.
(i) How would you define the Betti numbers of a void?
(ii) Following your definition, can the Euler characteristic of a void be any integer or are there restrictions?

Paired parentheses. Consider a sequence of parentheses of a well-formed expression. Each parenthesis has an integer position in the sequence. A pairing is a perfect matching between the opening and closing parentheses such that the opening parenthesis precedes the closing parenthesis in every pair, and the length of a pair is the position of the closing minus the position of the opening parenthesis.
(i) Given a pairing, let $L$ be the sum of lengths of the pairs. Prove that […].
(ii) Prove that $L$ depends on the given sequence but not on the pairing.


Chapter VI

Density Maps

Morse theory grew out of the study of the variational methods in analysis. The initial interest focused on high- and possibly infinite-dimensional settings. In this chapter, we introduce Morse theory with an emphasis on the two- and three-dimensional cases. While Morse theory requires differentiable spaces and thus seems to be built on rather specialized assumptions, we will see that many themes are familiar from Chapter IV. The differentiability assumption allows the introduction of otherwise undefined concepts. Together with suitable non-degeneracy assumptions, it brings order into the complicated world of geometric form. Possibly the best known result in Morse theory is the relation between the critical points of a smooth real-valued function over a manifold and the Euler characteristic of that manifold. Because of this relation, Morse theory is sometimes also referred to as critical point theory. In some ways, Morse theory is but a different language or framework to talk about connectivity.

We use two sections to introduce the basic setting of Morse theory and one to explain the concept of molecular pockets in Morse theoretic terms. In the second section, we make an effort to relate the Morse theoretic concepts with the discussion on connectivity. [The material will have to be partially rearranged according to the following plan of sections:]

VI.1 Morse Functions
VI.2 Critical Points
VI.3 Morse-Smale Complexes
VI.4 Jacobian Submanifolds
Exercises

VI.1 Smooth vs. Piecewise Linear

Morse theory talks about manifolds and smooth functions over these manifolds. The primary goal is to find out about the topological type of the manifolds through a differential analysis of the functions. A Morse function is a smooth real-valued map over a manifold that satisfies certain non-degeneracy assumptions. This section introduces Morse functions as a crucial piece in the basic mathematical framework of Morse theory.

Sweeping a torus. The standard introductory example is the torus $\mathbb{M}$ embedded in upright position in $\mathbb{R}^3$ and the height function this embedding defines. Formally, $h: \mathbb{M} \to \mathbb{R}$ is defined by mapping each point to its distance from the plane. For each $a \in \mathbb{R}$, we consider the set $\mathbb{M}_a$ of points with height less than or equal to $a$. It is instructive to look at the evolution of the homotopy type of $\mathbb{M}_a$. As illustrated in Figure VI.1, $\mathbb{M}_a$ changes its topology only at certain critical values of $a$. Each time the homotopy type of $\mathbb{M}_a$ changes, we can interpret this event as attaching a cell of some dimension.

A $k$-cell is a space homeomorphic to the $k$-dimensional ball. To define what attaching a cell exactly means, note that the boundary of a $k$-cell is a $(k-1)$-sphere. The attachment of a $k$-cell $e$ to a space $\mathbb{X}$ requires a continuous map $g$ from the boundary of $e$ to $\mathbb{X}$, which we refer to as the gluing map. Then $\mathbb{X}$ with $e$ attached by $g$ is the space obtained by identifying every point $x$ in the boundary of $e$ with $g(x)$. All interior points […] do not belong to […]. For $k = 0$ we have a point, which is empty of boundary, so attaching a point or 0-cell is the same as taking the disjoint union. The evolution of the torus during the sweep and the interpretation of attaching cells is illustrated in Figure VI.1.

Figure VI.1: Evolution of the torus in the sweep from bottom to top and the corresponding construction by attaching a 0-cell, two 1-cells, and a 2-cell. The critical heights are $h(p)$, $h(q)$, $h(r)$, and $h(s)$.

Smooth manifolds. In order to relate the topological type to differential properties, we need to restrict ourselves to sets for which such properties are defined. We need some basic definitions from differential geometry to express these restrictions. A map $f: U \to V$ from an open set $U \subseteq \mathbb{R}^k$ to another open set $V \subseteq \mathbb{R}^\ell$ is smooth if the partial derivatives of all orders exist and are continuous. For general and not necessarily open sets $X \subseteq \mathbb{R}^k$ and $Y \subseteq \mathbb{R}^\ell$, the map $f: X \to Y$ is smooth if for every $x \in X$ there exists an open set $U$ containing $x$ and a smooth map $F: U \to \mathbb{R}^\ell$ that coincides with $f$ throughout $U \cap X$. Note that the composition of two smooth maps is smooth. A diffeomorphism is a smooth homeomorphism whose inverse is also smooth, and two spaces are diffeomorphic if there is a diffeomorphism between them.

A subset $\mathbb{M} \subseteq \mathbb{R}^n$ is a smooth manifold of dimension $d$ if each $x \in \mathbb{M}$ has a neighborhood that is diffeomorphic to an open subset of $\mathbb{R}^d$. A particular diffeomorphism from the open subset to the neighborhood is called a parametrization of the neighborhood, and its inverse is called a coordinate system on the neighborhood. As an example we may consider the 2-sphere $\mathbb{S}^2 \subseteq \mathbb{R}^3$. We can cover $\mathbb{S}^2$ with six open hemispheres defined by $x_i > 0$ and $x_i < 0$, for $i = 1, 2, 3$. As shown in Figure VI.2, each hemisphere can be parametrized by orthogonal projection to one of the coordinate planes.

Figure VI.2: The upper open hemisphere is parametrized by projection to the $x_1 x_2$-plane.

For a point $x \in \mathbb{M}$, we can construct a $d$-dimensional hyperplane in $\mathbb{R}^n$ that best approximates $\mathbb{M}$ near $x$. The tangent space at $x$ is the $d$-dimensional hyperplane through the origin of $\mathbb{R}^n$ that is parallel to this best approximating hyperplane. The elements of this vector space are called tangent vectors to $\mathbb{M}$ at $x$.

Note that for every smooth curve in $\mathbb{M}$ passing through $x$, the tangent vector of the curve at $x$ is a tangent vector to $\mathbb{M}$ and thus an element of the tangent space at $x$. A 1-dimensional manifold is a closed curve. A connected open subset is an open interval, which is homeomorphic to $\mathbb{R}$.

Critical points. Assuming a local coordinate system in a neighborhood of a point $x$, the point is a critical point of $f$ if all partial derivatives vanish, that is, $\partial f / \partial x_i = 0$ at $x$ for all $i$. If $x$ is a critical point then $f(x)$ is a critical value. Non-critical points and non-critical values are also referred to as regular points and regular values. Consider again the height function $h$ of the upright torus. The homotopy type of the partial torus $\mathbb{M}_a$ changes when $a$ passes the height value of the points $p$, $q$, $r$, and $s$ marked in Figure VI.1. These are the points with horizontal tangent planes.

Just like the first derivative can be used to compute the best linear approximation to $f$, the second derivative can be used to compute the best quadratic approximation. Assuming a local coordinate system in a neighborhood, the Hessian of $f$ at $x$ is the matrix of second derivatives, whose entry in row $i$ and column $j$ is $\partial^2 f / \partial x_i \partial x_j$ evaluated at $x$. A critical point is non-degenerate if the Hessian is non-singular. We call $f$ a Morse function if all critical points are non-degenerate. Non-degenerate critical points are isolated, which means there is an open neighborhood without other critical points. This fact is also expressed in the lemma of Morse.

MORSE LEMMA. Let $p$ be a non-degenerate critical point with index $\lambda$ of $f$. There is a neighborhood $U$ of $p$ and a local coordinate system $(x_1, \ldots, x_d)$ in $U$ with $x_i(p) = 0$ for all $i$ and

  $f = f(p) - x_1^2 - \ldots - x_\lambda^2 + x_{\lambda+1}^2 + \ldots + x_d^2$

throughout $U$, where $d$ is the dimension of the manifold $\mathbb{M}$.

Index. The Hessian is symmetric and we can compute its eigenvalues. Recall that the eigenvectors define an orthogonal coordinate system in the tangent space of $\mathbb{M}$ at the critical point. Assuming the Hessian is non-singular, all eigenvalues are non-zero. The index of $f$ at a non-degenerate critical point is the number of negative eigenvalues. The index is then the number of eigenvector directions along which $f$ decreases. For example, the indices of the critical points $p$, $q$, $r$, and $s$ in Figure VI.1 are 0, 1, 1, and 2. Note that the dimensions of the cells attached to the evolving torus in Figure VI.1 are equal to the indices of the corresponding critical points. This is generally the case because a critical point with index $\lambda$ connects to the past along $\lambda$ directions. These directions span a $\lambda$-dimensional cell needed to realize the connections.

A quadratic function in two variables has only three types of critical points, maxima, saddles, and minima. Consider $f(x_1, x_2) = \pm x_1^2 \pm x_2^2$. The origin is a critical point for every possible assignment of signs to the two terms, and it is a maximum for two minus signs, a saddle for one minus and one plus sign, and a minimum for two plus signs. The saddle is the most interesting case of the three because a circle drawn around it has two peaks alternating with two pits. In contrast, a circle drawn around a regular point has only one peak and one pit. Critical points with small circles that oscillate more often than twice are necessarily degenerate.

Degenerate critical points. Geometrically, the degeneracy is manifested by the fact that an arbitrarily small perturbation can remove the critical point or turn it into two non-degenerate ones, a maximum and a minimum. Figure VI.3 illustrates the instability of the degenerate critical point. The middle function has a degenerate critical point at 0, which is unfolded in different ways by the other two functions. Specifically, the derivative of the middle function vanishes at 0. The second derivative vanishes too, which identifies 0 as a degenerate critical point.

Figure VI.3: From left to right, graphs of the function for three different parameter values. Critical points are marked.

A similar degenerate critical point exists for the monkey saddle shown in Figure VI.4. It may be specified as the graph of the function $f(x_1, x_2) = x_1^3 - 3 x_1 x_2^2$.
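The classification of critical points by the eigenvalues of the Hessian can be illustrated with a minimal Python sketch for functions of two variables. The function name `morse_index_2d`, the tolerance `eps`, and the sample Hessians are assumptions made for this illustration. The eigenvalues of a symmetric 2-by-2 matrix are available in closed form, so no linear algebra library is needed, and a singular Hessian is reported as a degenerate critical point.

```python
import math


def morse_index_2d(hxx, hxy, hyy, eps=1e-12):
    """Index of a critical point of a function of two variables, computed
    from the entries of its symmetric 2x2 Hessian. Returns None when the
    Hessian is (numerically) singular, i.e. the point is degenerate."""
    mean = 0.5 * (hxx + hyy)
    radius = math.hypot(0.5 * (hxx - hyy), hxy)
    eigenvalues = (mean - radius, mean + radius)
    if any(abs(ev) <= eps for ev in eigenvalues):
        return None
    # The index counts the directions along which the function decreases.
    return sum(1 for ev in eigenvalues if ev < 0)


KIND = {0: "minimum", 1: "saddle", 2: "maximum", None: "degenerate"}

# Hessians at the origin of x^2 + y^2, x^2 - y^2, -x^2 - y^2, and of the
# standard monkey saddle x^3 - 3xy^2, whose second derivatives all vanish.
for hessian in [(2, 0, 2), (2, 0, -2), (-2, 0, -2), (0, 0, 0)]:
    print(hessian, KIND[morse_index_2d(*hessian)])
```

The three non-degenerate cases correspond to the three sign assignments of a quadratic function in two variables, while the vanishing Hessian of the monkey saddle is flagged as degenerate.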

The function defining the monkey saddle is the real part of $(x_1 + \mathrm{i} x_2)^3$. Its only critical point is the origin. The matrix of second derivatives is

  $\begin{pmatrix} 6 x_1 & -6 x_2 \\ -6 x_2 & -6 x_1 \end{pmatrix}$,

which is zero at 0. As we go around a circle centered at the origin, the function has three peaks at the angles $0$, $2\pi/3$, and $4\pi/3$, and three pits at the angles $\pi/3$, $\pi$, and $5\pi/3$.

Figure VI.4: Monkey saddle with degenerate critical point.

All critical points in the above examples are isolated, but there are others that are not. For example, there are functions for which an entire coordinate axis is critical, so none of its points are isolated. Similarly, if we lay down the torus on its side, the height function has a circle of minima and another circle of maxima.

Euler characteristic. Let $\mathbb{M}$ be a compact and smooth manifold without boundary and $f: \mathbb{M} \to \mathbb{R}$ a Morse function. Let $c_i$ be the number of critical points of index $i$. We will see in Section VI.2 that we can construct a $k$-cell for each index-$k$ critical point so that $\mathbb{M}$ can be constructed by successive attachment of these cells. As always, the Euler characteristic is the alternating sum of cells, which is also the alternating sum of critical points, $\chi = \sum_i (-1)^i c_i$. For the sphere we get $c_0 - c_1 + c_2 = 2$. This implies that every Morse function of the sphere has at least two (non-degenerate) critical points. A minimum example is the ordinary height function, which has a minimum at the south-pole and a maximum at the north-pole. For the torus we get $c_0 - c_1 + c_2 = 0$. In words, for every minimum and maximum we get exactly one (non-degenerate) saddle point, no matter what Morse function we use.

Bibliographic notes. The original development of Morse theory from its variational background is described by Morse [3] and by Seifert and Threlfall [4]. Milnor’s later book [2] emphasizes the topological analysis of manifolds and has since become a standard reference in Morse theory. Good introductory texts to the related subject of differential topology are the books by Guillemin and Pollack [1] and by Wallace [6]. A good introduction to linear algebra including an intuitive discussion of eigenvalues and eigenvectors is the book by Strang [5].

[1] V. Guillemin and A. Pollack. Differential Topology. Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
[2] J. Milnor. Morse Theory. Princeton Univ. Press, New Jersey, 1963.
[3] M. Morse. The Calculus of Variations in the Large. Amer. Math. Soc., New York, 1934.
[4] H. Seifert and W. Threlfall. Variationsrechnung im Großen. Published in the United States by Chelsea, New York, 1951.
[5] G. Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
[6] A. Wallace. Differential Topology. First Steps. Benjamin, New York, 1968.

VI.2 Morse-Smale Complexes

In this section, we introduce the gradient of a Morse function and use it to construct the $k$-cells whose inductive attachment reproduces the evolution of the homotopy type of $\mathbb{M}_a$, for continuously increasing real threshold $a$.

Gradient flow. The gradient of a linear map $f(x) = \langle u, x \rangle$ is the vector $u$. The same concept can also be defined for a Morse function $f: \mathbb{M} \to \mathbb{R}$. Assuming an orthonormal local coordinate system at $x$, the gradient of $f$ is $\nabla f(x) = (\partial f / \partial x_1, \ldots, \partial f / \partial x_d)$. It is the projection of a normal vector of the graph of $f$ and points in the direction of the steepest ascent. We can define it also without reference to a coordinate system. A vector field, $X$, maps every point $x \in \mathbb{M}$ to a tangent vector $X(x)$. The gradient is the particular vector field that satisfies $\langle X, \nabla f \rangle = X[f]$, for every vector field $X$, where $X[f]$ is the directional derivative of $f$ along $X$. For example, if we have a smooth curve $\gamma$ with velocity vector $\dot{\gamma}$ then the derivative of $f \circ \gamma$ can be computed using the gradient as $\langle \dot{\gamma}, \nabla f \rangle$, same as for linear maps. The gradient vanishes precisely at all critical points of $f$.

If we start at a regular point and follow the gradient we trace out a path. This path is called an integral line, which is a solution to the ordinary differential equation $\dot{\gamma} = \nabla f(\gamma)$ with the regular starting point as initial condition. It depends smoothly on the initial condition. Two integral lines can therefore not cross. Neither can an integral line fork, and because we can reverse the gradient vector field by considering $-f$, two integral lines can also not merge. Every regular point belongs to an integral line, and two maximal integral lines are either disjoint or the same. Every maximal integral line is open at both ends and thus a map of an open interval or, equivalently, of the real line. It approaches two critical points, which we refer to as its origin and destination. It is convenient to consider each critical point as an integral line by itself so that the collection of integral lines partitions $\mathbb{M}$. The patterns of integral lines in the neighborhoods of a regular and several critical points on a smooth 2-manifold are shown in Figure VI.5.

Figure VI.5: From left to right, the flow in the neighborhoods of a regular point, a minimum, a saddle, and a maximum.

Stable manifolds. The stable manifold of a critical point $p$ is the union of integral lines with destination $p$ and, symmetrically, the unstable manifold is the union of integral lines with origin $p$. In a 2-manifold $\mathbb{M}$, the stable manifold of a minimum is the minimum itself. The stable manifold of a saddle is an open curve, which is the union of two integral lines and the saddle itself. The stable manifold of a maximum is an open disk, which is the union of a circle of integral lines and the maximum itself. All three cases are illustrated in Figure VI.6. Note that the dimension of each stable manifold is the index of the critical point that defines it.

Each stable manifold is the injective image of an open ball. However, the closure of a stable manifold is not necessarily homeomorphic to a closed ball, as indicated by the examples in Figure VI.6. Nevertheless, the closure of each stable manifold is the union of (open) stable manifolds. The collection of stable manifolds thus satisfies the two conditions of an open complex: its cells partition $\mathbb{M}$ and the boundary of every cell is a union of other cells. By symmetry, everything we said about stable manifolds is also true for unstable manifolds. The dimension of the unstable manifold of a critical point is the co-dimension of the stable manifold.

Figure VI.6: From left to right, the stable manifold of a minimum, a saddle, and a maximum of a two-dimensional Morse function.
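The idea of following the gradient from a regular point to the destination of its integral line can be sketched numerically. The following Python fragment is only an illustration under simplifying assumptions: the function lives on the plane rather than on a general manifold, the gradient is estimated by central differences, and the integral line is approximated by forward Euler steps. The names `gradient`, `integral_line`, and `hill` are hypothetical.

```python
import math


def gradient(f, p, h=1e-6):
    """Central-difference estimate of the gradient of f at the point p."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def integral_line(f, start, step=1e-2, tol=1e-6, max_steps=100_000):
    """Follow the gradient of f upward from start with forward Euler steps.
    Stops when the gradient nearly vanishes, i.e. close to the destination."""
    path = [list(start)]
    for _ in range(max_steps):
        g = gradient(f, path[-1])
        if math.sqrt(sum(c * c for c in g)) < tol:
            break
        path.append([x + step * c for x, c in zip(path[-1], g)])
    return path


def hill(p):
    # A function with a single critical point, a maximum at the origin.
    return -(p[0] ** 2 + 2 * p[1] ** 2)


line = integral_line(hill, (1.0, 1.0))
print(len(line), line[-1])
```

Every starting point other than the origin yields a path that ends near the origin, which matches the description of the stable manifold of a maximum as an open disk of integral lines sharing the same destination.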

Morse-Smale functions. We may refine the complexes of stable and unstable manifolds by forming unions of integral lines that agree on both limiting critical points. This amounts to overlaying the two complexes. In doing so, it is convenient to assume that the stable and unstable manifolds intersect in a generic manner. To explain what this means, we consider a point $x$ common to a stable manifold $S$ and an unstable manifold $U$. The intersection is transversal at $x$ if the tangent spaces of $S$ and $U$ at $x$ span the tangent space of $\mathbb{M}$ at $x$. Equivalently, the dimension of the intersection of the two tangent spaces is $\dim S + \dim U - d$, where $d$ is the dimension of $\mathbb{M}$. A Morse-Smale function is a Morse function whose stable and unstable manifolds intersect only transversally.

For example, the height function of the upright torus in Figure VI.1 is Morse but not Morse-Smale because the stable 1-manifold of the upper saddle, $r$, meets the unstable 1-manifold of the lower saddle, $q$, along entire one-dimensional integral lines. In the case of the upright torus, it suffices to tilt it ever so slightly sideways in order to get transversality. Morse-Smale functions are again dense in the set of maps from $\mathbb{M}$ to $\mathbb{R}$. Assuming a Morse-Smale function, we define the Morse-Smale complex as the collection of connected components of intersections of stable and unstable manifolds. We can see in Figure VI.7 that it is indeed necessary to take components. The two bold 2-cells share the same origin and destination.

Figure VI.7: Solid stable and dashed unstable 1-manifolds with overlaid dotted iso-lines of a rectangular portion of a Morse-Smale function. Minima, saddles, and maxima are marked.

Shape of Morse-Smale cells. Note that all 2-cells in Figure VI.7 have four sides.

QUADRANGLE LEMMA. Every 2-cell of a two-dimensional Morse-Smale complex is a quadrangle.

PROOF. The vertices of a 2-cell alternate between saddles and other critical points, and the non-saddles alternate between minima and maxima. Any such cyclic sequence has length $4k$, for $k \geq 1$. We take two copies of a $4k$-gon and glue them together along the shared boundary. The result is a topological 2-sphere with $k$ minima and $k$ maxima. Saddles become regular points, minima remain minima, and maxima remain maxima. The Euler characteristic of the 2-sphere is $2 = k + k$, which implies $k = 1$.

In other words, all two-dimensional Morse-Smale cells are quadrangles, provided we count an arc twice if it bounds the cell on both sides. The 3-cells of a Morse-Smale complex may have the structure of a cube, but they can also assume more general shapes with arbitrarily many saddles alternating between index-1 and index-2 separating the minimum from the maximum. A few examples of 3-cells are shown in Figure VI.8. The common features of all 3-cells are that they have one minimum and one maximum, and all 2-cells in the boundary are quadrangles.

Figure VI.8: Three 3-cells of a three-dimensional Morse-Smale complex. From left to right they have one, two, and three index-1 saddles and the same number of index-2 saddles.

Piecewise linear height functions. Height functions over manifolds occur in many practical problems, but they are never smooth in the mathematical sense of the word. An example is a surface of a molecule model and the electrostatic potential on this surface, as shown in Figure VI.9. The surface would typically be given as a triangulating simplicial complex $K$, and the function would be specified by its values at the vertices. Using linear interpolation, we can extend these values to a continuous function over the entire surface. We need some definitions to explain the linear interpolation. Each point $x$ of a triangle $abc$ is a convex combination of the three vertices, $x = \alpha a + \beta b + \gamma c$ with $\alpha + \beta + \gamma = 1$ and $\alpha, \beta, \gamma \geq 0$.

Lower stars. H ARER AND A. This interpretation is consistent with the result that regular minimum saddle maximum Figure VI. Define the star of a vertex as the collection of simplices that contain . Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Instead. [1] T. 245–256. J. minima. and maxima. It still shares many characteristics with Morse functions. §    ¢      ¨ § ¢    and . .10 illustrates the definitions by showing the lower stars of vertices that behave like regular points. and the lower star as the subset for which is the highest vertex. which implies that the linearly interpolated agrees with the value specified at . With this assumption. Z OMORODIAN . E DELSBRUNNER . the alternating sum of critical points is equal to the Euler characteristic of . and adding the lower star of a critical point is similar to attaching a cell in the smooth case. BANCHOFF . Furthermore. for points along the edge we have . saddle. The three parameters are unique and referred to as the barycentric coordinates of . Indexing the vertices accordingly. Adding the lower star of a regular point does not change the homotopy type of . [2] H..9: Portion of a triangulated surface of a molecule. and we cannot remove them just by perturbing the height values.  Another similarity between smooth and piecewise linear height functions arises when we sweep in the direcfor tion of increasing height. which implies that is continuous.10: The star of every vertex in the triangulation of a 2-manifold is an open disk. F. the lower stars partition the complex . and . we define as the the union of the first lower stars and note that is a simplicial complex. The values computed for within the two triangles that share thus agree. . maximum.2 Morse-Smale Complexes 95 times between lower and higher values of as a -fold saddle. The transversality condition for stable and unstable manifolds has its origin in dynamical system and is named after Steve Smale [4]. 
The value at is now defined as the analogous combination of values at the vertices. saddles. The shaded portions are lower stars. Figure VI. Geom. ¥ ¥    ¥     ¥    ¦ ¥   ¤¤  R ¤ ¡ $ R      ¨ ¤ § §    ¦ ¥ ¡  ¦    $  ¡    R   § ¢ ¢ ¤ ¢ ¢   ¡R        §  ¤ §      ¨ ¤ ¢  ¢ B  ¡ 0   $   B $ 0  B ¡ ¨ ¡ ¢  ¡ ¡ ¤ ¨ ¡       0       B     #B ¤         $ 0 $  0        ¡R ¡¢  ¡ ¢    ¡R    . J. Differential Geometry 1 (1967). The idea of writing a triangulated manifold as the disjoint union of lower stars goes back to Banchoff [1]. Critical points and curvature for embedded polyhedra. is a filtration and a discrete version of the evolution of during the sweep. we sort the vertices in the order of increasing height. It follows immediately that is the number of minima and maxima minus the number of saddles counted with multiplicity. More complicated lower stars are possible. to appear. Discrete Comput. The Morse-Smale complex has been introduced recently in [2] along with algorithms for piecewise linear height functions over 2manifolds. Bibliographic notes. Assuming all . The gradient and related concepts from vector calculus are intuitively described in the booklet by Schey [3]. It is convenient to assume pairwise different height values at all vertices so that each simplex belongs to exactly one lower star. minimum. The alternating sum of simplices in the lower stars of a regular point. The height function is continuous but not smooth. . and -fold saddle are . The sequence of complexes ¤ Figure VI.VI. we may consider a vertex whose circle of neighbors alternates ¦ ¡ ¦  ¤   £ ¤ ¦  ¤  ¡ ¢   ¤  ¤     Note that the barycentric coordinates of the vertex of are and .

[3] H. M. SCHEY. Div, Grad, Curl and All That. An Informal Text on Vector Calculus. Second edition, Norton, New York, 1992.
[4] S. SMALE. The Mathematics of Time. Essays on Dynamical Systems, Economic Processes, and Related Topics. Springer-Verlag, New York, 1980.

VI.3 Construction and Simplification

[Explain the sweep construction for two-dimensional Morse-Smale complexes using the simulation of differentiability.] [The most important part of the algorithm is maybe the handle slide, which is the only restructuring operation necessary to go between different complexes.] [That operation has been used in early work on Morse theory, maybe the first time by Smale(?).] [Build a hierarchy through prioritized cancellation.] [We can describe the cancellation as a combinatorial restructuring operation and we only need this one to go up the hierarchy.] [Again, there should be reference to the early mathematics literature on the topic of cancellation.]

Bibliographic notes.

[1] C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Visualization of scalar topology for structural enhancement. In "Proc. 9th Ann. IEEE Conf. Visualization, 1998", 18–23.
[2] H. EDELSBRUNNER, J. HARER AND A. ZOMORODIAN. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.
[3] H. EDELSBRUNNER, J. HARER, V. NATARAJAN AND V. PASCUCCI. Hierarchy of Morse-Smale complexes for piecewise linear 3-manifolds. Manuscript, Dept. Comput. Sci., Duke Univ., Durham, North Carolina, 2001.
[4] M. VAN KREVELD, R. VAN OOSTRUM, C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Contour trees and small seed sets for iso-surface traversal. In "Proc. 13th Ann. Sympos. Comput. Geom., 1997", 212–220.

VI.4 Simultaneous Critical Points

[Explain the work with John on the topic and mention papers by Hassler Whitney and books in Catastrophe Theory.]

Bibliographic notes.

[1] V. I. ARNOL'D. Catastrophe Theory. Third edition, Springer-Verlag, Berlin, Germany, 1992.
[2] H. EDELSBRUNNER AND J. HARER. Jacobian submanifolds of multiple Morse functions. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[3] T. POSTON AND I. STEWART. Catastrophe Theory and Its Applications. Dover, Mineola, New York, 1978.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.


Chapter VII

Match and Fit

As a general theme in biology, questions are almost always about populations and rarely about individuals. This is particularly true on the molecular level. The molecules that participate in the mechanism of life tend to be large and composed of small molecules. Minor variations in the type or arrangement of the components are frequently inessential and do not alter the role of a molecule within the larger organization. But then again, there are seemingly small variations that do have significant consequences. The underlying question is one of definition: when do we call two molecules the same or of the same type, and how do we quantify and assess that notion of sameness. The similarity question is at the core of human understanding, which crucially relies on classification to simplify and create order. There are various approaches to the question applied to proteins, including the comparison of amino acid sequences, space curves modeling backbones, and shapes formed by space-filling diagrams.

Instead of asking how similar two shapes are, we may also ask the related question of how well two shapes fit side by side. The complementarity question is a similarity question between one shape and (a portion of) the complement of another shape. It really makes sense only for space-filling diagrams and does not seem to apply to information expressed in terms of sequences and space curves. The complementarity question, on the other hand, is at the root of natural and other re-production processes and it takes part in protein interaction, which forms the basis of functioning life.

As always in this book, we focus on mathematical and algorithmic methods that shed light on the broader biological issues. In Section VII.1, we explore rigid motions in three-dimensional Euclidean space and introduce quaternions as a tool to specify and compute with rotations. In Section VII.2, we study the problem of finding the best rigid motion for matching one point set with another. The measure of choice is the root mean square distance between the two sets. In Section VII.3, we look at the related problems of sampling a rigid motion and of covering the space of such motions with small neighborhoods. In Section VII.4, we apply the methods to questions of similarity and complementarity. In particular, we look at the problem of identifying matching subsequences with minimum root mean square distance and at score functions that assess the shape complementarity of two space-filling diagrams.

VII.1 Rigid Motions
VII.2 Optimum Motion
VII.3 Sampling and Covering
VII.4 Alignment
Exercises

VII.1 Rigid Motions

A motion in three-dimensional Euclidean space can be decomposed into a rotation and a translation. In this section we consider different ways to mathematically represent rotations, and we focus on quaternions, which provide a particularly elegant mathematical framework.

Rotation and translation. A rigid motion in $\mathbb{R}^3$ is an orientation-preserving isometry of three-dimensional Euclidean space. More formally, it is an orientation-preserving map $\mu: \mathbb{R}^3 \to \mathbb{R}^3$ such that $\|\mu(x) - \mu(y)\| = \|x - y\|$ for every pair $x, y \in \mathbb{R}^3$. As illustrated in Figure VII.1, a rotation is a rigid motion that preserves the origin, and a translation is a rigid motion that preserves difference vectors. Every rigid motion can be written as the composition of a rotation and a translation. Using matrix notation, we can write $\mu(x) = Rx + t$, where $R$ is an orthonormal 3-by-3 matrix with unit determinant and $t$ is a 3-vector. The rotation matrix moves the unit coordinate vectors to the vectors $r_1$, $r_2$, and $r_3$ that make up the columns of $R$. In general, the composition of any two rotations is another rotation. In other words, the rotations form the so-called special orthogonal group of 3-by-3 matrices, abbreviated as $SO(3)$. Note, however, that this group is not abelian because the multiplication of matrices and therefore the composition of rotations is not commutative.

Figure VII.1: The translation of the boldface original coordinate system preserves the directions of the axes while the rotation preserves their anchor point.

A rotation about a coordinate axis has a comparatively simple rotation matrix. For example, rotating by $\varphi$ about the $x_1$-axis gives

$$R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{pmatrix}.$$

The angle of rotation about a coordinate axis is referred to as an Euler angle. Leonhard Euler proved that any rotation can be obtained by a sequence of three rotations about coordinate axes. It is important to specify the Euler angles in a fixed sequence as other sequences of the same angles usually specify different rotations. It is mostly true that two different triplets of angles specify different rotations, but there are exceptions. Consider for example a rotation by $\varphi$ about the $x_3$-axis, followed by a rotation by $\pi/2$ about the $x_2$-axis, followed by a rotation by $\psi$ about the $x_1$-axis, and note that we get the same composite rotation if we switch $\varphi$ and $\psi$. Indeed, the map $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1 \to SO(3)$ is not injective. This suggests that the Cartesian product of three circles is not an appropriate model and we will indeed see shortly that it is not homeomorphic to the space of rotations.

Quaternions. As an alternative to orthonormal 3-by-3 matrices, we may use quaternions to represent rotations. Quaternions can be viewed as a generalization of complex numbers:

$$q = q_0 + q_1 I + q_2 J + q_3 K,$$

where $q_0$, $q_1$, $q_2$, and $q_3$ are real numbers and I, J and K are three different imaginary units. Sometimes it is more convenient to think of a quaternion as a vector in $\mathbb{R}^4$. In preparation of an operation that multiplies two quaternions, we specify how to multiply the imaginary units:

$$I I = J J = K K = -1, \quad I J = K, \quad J K = I, \quad K I = J, \quad J I = -K, \quad K J = -I, \quad I K = -J.$$

Note that reversing two different imaginary units changes the sign of the result. If $p = p_0 + p_1 I + p_2 J + p_3 K$ is another quaternion then the product of $p$ and $q$ is

$$pq = (p_0 q_0 - p_1 q_1 - p_2 q_2 - p_3 q_3) + (p_0 q_1 + p_1 q_0 + p_2 q_3 - p_3 q_2) I + (p_0 q_2 - p_1 q_3 + p_2 q_0 + p_3 q_1) J + (p_0 q_3 + p_1 q_2 - p_2 q_1 + p_3 q_0) K.$$

The product $qp$ has a similar form but six of the terms have their signs changed.
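The multiplication rules for the imaginary units determine the product of two arbitrary quaternions. The following minimal sketch spells this out, assuming quaternions are stored as 4-tuples (real part first); the function name `qmul` is chosen here for illustration and does not come from the text.

```python
def qmul(p, q):
    """Product of two quaternions stored as (real, I, J, K) tuples."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0*q0 - p1*q1 - p2*q2 - p3*q3,   # real part
            p0*q1 + p1*q0 + p2*q3 - p3*q2,   # coefficient of I
            p0*q2 - p1*q3 + p2*q0 + p3*q1,   # coefficient of J
            p0*q3 + p1*q2 - p2*q1 + p3*q0)   # coefficient of K

# The three imaginary units as quaternions.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

# Reversing two different units changes the sign of the result.
print(qmul(I, J), qmul(J, I))
```

Comparing `qmul(p, q)` with `qmul(q, p)` for generic arguments shows that the real part and the three terms involving a real coefficient agree, while the six cross terms change sign.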

We can express the product of two quaternions in terms of an orthogonal 4-by-4 matrix and a vector. This can be done by expanding either the first or the second quaternion to a matrix:

$$qp = Qp, \quad Q = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & -q_3 & q_2 \\ q_2 & q_3 & q_0 & -q_1 \\ q_3 & -q_2 & q_1 & q_0 \end{pmatrix}; \qquad pq = \bar{Q}p, \quad \bar{Q} = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & q_3 & -q_2 \\ q_2 & -q_3 & q_0 & q_1 \\ q_3 & q_2 & -q_1 & q_0 \end{pmatrix}.$$

Notice that $\bar{Q}$ differs from $Q$ by having the lower right 3-by-3 submatrix transposed. Take a moment to verify that the matrices $Q$ and $\bar{Q}$ are indeed orthogonal. Since the matrices are orthogonal, the products with their transposes are diagonal: $Q Q^T = \bar{Q} \bar{Q}^T = (q \cdot q)\, \mathbb{I}$, where $\mathbb{I}$ is the 4-by-4 identity matrix.

As usual, the scalar product is a real number, $p \cdot q = p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3$, and we can use the scalar product to define the length of a vector: $\|q\|^2 = q \cdot q$. Similar to complex numbers, the conjugate of a quaternion is obtained by negating the imaginary parts: $q^* = q_0 - q_1 I - q_2 J - q_3 K$. Observe that the matrices associated with $q^*$ are the transposes of those associated with $q$. Furthermore, the imaginary parts vanish when we multiply a quaternion with its conjugate: $q q^* = q \cdot q$. This implies in particular that every non-zero quaternion has an inverse, namely $q^{-1} = q^* / (q \cdot q)$. In the special case when $q$ has unit length, we get $q^{-1} = q^*$.

The scalar product is preserved if we multiply with a unit quaternion $q$. This is true from either side and we show it for multiplication from the left:

$$(qp) \cdot (qr) = (Qp) \cdot (Qr) = p^T Q^T Q r = (q \cdot q)(p \cdot r) = p \cdot r.$$

This implies in particular that multiplying with $q$ also preserves length: $\|qp\| = \|p\|$. Hence, multiplication with a unit quaternion neither changes the angle nor the length.

Sometimes, we think of a quaternion as composed of a scalar and a vector, $q = q_0 + \mathbf{q}$. The rules for computing $r = pq$ can be rewritten as $r_0 = p_0 q_0 - \mathbf{p} \cdot \mathbf{q}$ and $\mathbf{r} = p_0 \mathbf{q} + q_0 \mathbf{p} + \mathbf{p} \times \mathbf{q}$. When $p$ and $q$ are purely imaginary then these results simplify to $r_0 = -\mathbf{p} \cdot \mathbf{q}$ and $\mathbf{r} = \mathbf{p} \times \mathbf{q}$.

Representing rotations. We use purely imaginary quaternions to represent vectors in $\mathbb{R}^3$ and compound multiplication with unit quaternions to represent rotations. While the product of two quaternions is another quaternion, we cannot use simple multiplications to represent rotations because the product of a unit quaternion and a purely imaginary quaternion is not in general purely imaginary. Instead, we use the composite product $r' = q r q^*$, always assuming $q \cdot q = 1$. To do this, we write

$$q r q^* = (Qr) q^* = \bar{Q}^T (Qr) = (\bar{Q}^T Q) r.$$

We expand the product of the two matrices in Table VII.1 and see that $r'$ is purely imaginary, as required. It follows that the lower right 3-by-3 submatrix of $\bar{Q}^T Q$ is also orthonormal, since both $Q$ and $\bar{Q}$ are orthonormal. This 3-by-3 matrix is the familiar rotation matrix that takes $r$ to $r'$. The expansion of $\bar{Q}^T Q$ given in Table VII.1 provides an explicit method for computing the orthonormal rotation matrix from the unit quaternion.

$$\bar{Q}^T Q = \begin{pmatrix} q \cdot q & 0 & 0 & 0 \\ 0 & q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 0 & 2(q_2 q_1 + q_0 q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2 q_3 - q_0 q_1) \\ 0 & 2(q_3 q_1 - q_0 q_2) & 2(q_3 q_2 + q_0 q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2 \end{pmatrix}$$

Table VII.1: Product of matrices in the representation of a rotation by composite multiplication with unit quaternions.

The justification for $q r q^*$ to represent a rotation is not yet complete. We start with a few properties. First, let $r$ and $s$ be purely imaginary, so that $rs = -\mathbf{r} \cdot \mathbf{s} + \mathbf{r} \times \mathbf{s}$. If we now apply the composite product with a unit quaternion $q$, we get $(q r q^*)(q s q^*) = q (rs) q^*$ because $q^* q = 1$. The real part gives $\mathbf{r}' \cdot \mathbf{s}' = \mathbf{r} \cdot \mathbf{s}$, so the composite product preserves scalar products and therefore lengths and angles. Another possibility is that it represents a reflection, which also preserves scalar products. However, a reflection reverses the orientation of a sequence of three vectors, and we can check that composite multiplication does not. The imaginary part of $q (rs) q^*$ gives $\mathbf{r}' \times \mathbf{s}' = q (\mathbf{r} \times \mathbf{s}) q^*$, which shows that the composite product preserves cross-products.

Axis and angle. We show that the rotation by an angle $\theta$ about the axis defined by the unit vector $u$ can be represented by the unit quaternion

$$q = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,(u_1 I + u_2 J + u_3 K).$$

As illustrated in Figure VII.2, an observer who looks against the direction of the axis sees the vector rotate in a counterclockwise order. To prove the claimed correspondence, we write the vector $r'$ obtained from $r$ by rotating by $\theta$ about the axis defined by $u$ using the formula of Rodrigues,

$$r' = \cos\theta\; r + \sin\theta\, (u \times r) + (1 - \cos\theta)(u \cdot r)\, u,$$

which can be seen from Figure VII.2. We show that this can also be written in the form $r' = q r q^*$, where $q = q_0 + \mathbf{q}$. Tedious but straightforward calculations show

$$q r q^* = (q_0^2 - \mathbf{q} \cdot \mathbf{q})\, r + 2 q_0 (\mathbf{q} \times r) + 2 (\mathbf{q} \cdot r)\, \mathbf{q}.$$

If we substitute $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2}\, u$ and use the identities $2 \sin\frac{\theta}{2} \cos\frac{\theta}{2} = \sin\theta$ and $\cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = \cos\theta$ then we obtain the formula of Rodrigues. The above relationships provide a convenient conversion between unit quaternions and axis-angle pairs. In the reverse direction, the imaginary part of $q$ represents the direction of the rotation axis, and the real part determines the angle of the rotation.

Figure VII.2: The rotation of the vector $r$ by an angle of $\theta$ about the line spanned by $u$. The three dotted vectors correspond to the terms in the formula of Rodrigues.

Note that $-q$ represents the same rotation as $q$ and that non-antipodal pairs of unit quaternions represent different rotations. Thus, the unit sphere $\mathbb{S}^3$ in $\mathbb{R}^4$ is a double cover of the space of rotations in $\mathbb{R}^3$. The space obtained by identifying antipodal points of $\mathbb{S}^3$ is usually referred to as the real projective three-dimensional space, or $\mathbb{P}^3$ for short. It is a good model of the set of rotations in $\mathbb{R}^3$, although we usually prefer $\mathbb{S}^3$ because it is easier to imagine. Figure VII.3 illustrates the correspondence with a picture in one lower dimension.

Figure VII.3: The north- and south-poles correspond to the identity, and points on the equator correspond to rotations by $\pi$. The dashed great-circle through the two poles represents the set of rotations about a fixed axis.

Composing rotations. The composition of two rotations represented by the unit quaternions $q$ and $p$ is

$$p (q r q^*) p^* = (pq)\, r\, (pq)^*,$$

which uses $(pq)^* = q^* p^*$. In other words, composition of rotations corresponds to multiplication of quaternions, and from the product it is easy to again get the axis and the angle. A more direct geometric description of the composition of two rotations uses the fact that every rotation can be written as the composition of two reflections, as illustrated in Figure VII.4. The two planes defining the reflections are not unique, they just need to pass through the axis of rotation and enclose half the angle of rotation. To compose two rotations, we write each as the composition of two reflections, making sure that the second plane of the first rotation is also the first plane of the second rotation. The middle two reflections cancel and we are left with two reflections. The axis of the corresponding rotation is the line common to the two planes, and the angle of rotation is twice the angle enclosed by the planes.
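The correspondence between unit quaternions and axis-angle pairs can be checked numerically. The sketch below is an illustration under the assumption that quaternions are 4-tuples with the real part first and vectors are 3-tuples; the helper names (`from_axis_angle`, `rotate`, `rodrigues`) are invented for this example. It compares the composite product with the formula of Rodrigues and composes two rotations by multiplying their quaternions.

```python
import math

def qmul(p, q):
    """Product of two quaternions stored as (real, I, J, K) tuples."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0*q0 - p1*q1 - p2*q2 - p3*q3,
            p0*q1 + p1*q0 + p2*q3 - p3*q2,
            p0*q2 - p1*q3 + p2*q0 + p3*q1,
            p0*q3 + p1*q2 - p2*q1 + p3*q0)

def conjugate(q):
    """Negate the imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])

def from_axis_angle(u, theta):
    """Unit quaternion for the rotation by theta about the unit vector u."""
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), s * u[0], s * u[1], s * u[2])

def rotate(q, r):
    """Composite product q r q*, with r treated as a purely imaginary quaternion."""
    return qmul(qmul(q, (0.0, r[0], r[1], r[2])), conjugate(q))[1:]

def rodrigues(u, theta, r):
    """Formula of Rodrigues for the same rotation, without quaternions."""
    c, s = math.cos(theta), math.sin(theta)
    dot = sum(u[k] * r[k] for k in range(3))
    cross = (u[1]*r[2] - u[2]*r[1], u[2]*r[0] - u[0]*r[2], u[0]*r[1] - u[1]*r[0])
    return tuple(c * r[k] + s * cross[k] + (1 - c) * dot * u[k] for k in range(3))

# A quarter turn about the third axis moves the first unit vector to the second.
quarter = from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
print(rotate(quarter, (1.0, 0.0, 0.0)))
```

Rotating first with `q` and then with `p` gives the same vector as rotating once with `qmul(p, q)`, and `q` and its negative give the same rotation, which is the double cover described above.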

Figure VII.4: We see three rotations defined by the axis-angle pairs $(u, \varphi)$, $(v, \psi)$, and $(w, \rho)$. Each rotation is the composition of two reflections illustrated by the great-circles at which their planes meet the sphere.

Bibliographic notes. It is commonly acknowledged that quaternions have been discovered by Hamilton in 1844 [1]. It is less well known that a few years earlier, Rodrigues studied the composition of rotations in space and gave a purely geometric explanation that is equivalent to Hamilton's algebra [5]. Even earlier, Gauss recorded his discovery of quaternions in his unpublished notebook in 1819. The exposition of quaternions and their connection to rotations chosen for this section follows [2]. We recommend the primer by Kuipers [3] for background on rotations and the text by Needham [4] for background on the more general context provided by complex analysis.

[1] W. R. HAMILTON. On a new species of imaginary quantities connected with the theory of quaternions. Proc. Irish Acad. 2 (1844), 424–434.
[2] B. K. P. HORN. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[3] J. B. KUIPERS. Quaternions and Rotation Sequences. Princeton Univ. Press, New Jersey, 1999.
[4] T. NEEDHAM. Visual Complex Analysis. Clarendon Press, Oxford, England, 1997.
[5] O. RODRIGUES. Des lois géométriques qui régissent les déplacements d'un système solide dans l'espace, et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. J. Math. Pures Appl. 5 (1840), 380–440.

VII.2 Optimum Motion

In this section, we study an optimization problem that arises when one attempts to match two molecular structures or to fit two structures snug next to each other. After formulating the optimization problem, we solve it using quaternions representing rotations in three-dimensional space.

Problem specification. Suppose we are given two finite collections of points in $\mathbb{R}^3$ and a bijection between them. While entertaining the possibility that the two collections are structurally the same or at least similar, we are interested in moving one collection so it best matches the other. We need some notation to make this precise. Let $A = \{a_1, a_2, \ldots, a_n\}$ and $B = \{b_1, b_2, \ldots, b_n\}$ be the two collections and assume that $a_i$ corresponds to $b_i$, for each $i$. We use the root mean square or RMS distance to assess how similar the two collections are. This measure is the square root of the average square distance:

$$\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \|a_i - b_i\|^2}.$$

Given a rigid motion $\mu$, we may apply it to the first collection and recompute the root mean square distance. We are interested in finding the rigid motion that minimizes the root mean square distance between $\mu(A)$ and $B$. Note that minimizing the root mean square distance is equivalent to minimizing the sum of square distances. As mentioned earlier, every rigid motion can be decomposed into a rotation followed by a translation. The space of rigid motions is therefore six-dimensional, and it might seem that computing the particular rigid motion that minimizes the distance would be hopeless or at least difficult. Quite the opposite is true, and the main reason for this is the convenience provided by quadratic functions. We consider rotations and translations separately.

Optimum translation. Recall that the centroid of a collection of points is the average of the points. More formally, the centroids of $A$ and $B$ are $\bar{a} = \frac{1}{n} \sum_i a_i$ and $\bar{b} = \frac{1}{n} \sum_i b_i$. We begin by showing that the best translation moves $\bar{a}$ to $\bar{b}$. In other words, the translation that minimizes the root mean square distance between $A + t$ and $B$ is defined by $t = \bar{b} - \bar{a}$. A crucial insight used in proving this fact is that the centroid is the only point for which the sum of the vectors to the points in the collection vanishes:

$$\sum_{i=1}^{n} (a_i - \bar{a}) = 0.$$

This implies that $\bar{a}$ minimizes the sum of square distances from the $a_i$. Indeed, $F(x) = \sum_i \|a_i - x\|^2$ is a quadratic function with a unique minimum. That minimum is characterized by a vanishing gradient:

$$\nabla F(x) = -2 \sum_{i=1}^{n} (a_i - x) = 0.$$

We are now ready to prove that the best translation is the one that moves $\bar{a}$ to $\bar{b}$. Let us move every point $b_i$ to the origin of $\mathbb{R}^3$ and move the translated copy of $a_i$ with it to $a_i + t - b_i$. This operation is illustrated in Figure VII.5. Then the sum of square distances between the corresponding points, $\sum_i \|a_i + t - b_i\|^2$, is also the sum of square distances of the points $a_i + t - b_i$ from the origin. The translation minimizes the sum iff the origin is the centroid of the points $a_i + t - b_i$:

$$\sum_{i=1}^{n} (a_i + t - b_i) = n (\bar{a} + t - \bar{b}) = 0.$$

In other words, the latter sum vanishes iff $t = \bar{b} - \bar{a}$. This implies that the best translation moves $\bar{a}$ to $\bar{b}$, as claimed.

Figure VII.5: After moving the shaded points to the origin, the (solid) difference vectors all radiate out from the origin.

Optimum rotation. Since every rigid motion can be written as a rotation followed by a translation, the motion can be optimal only if it translates the centroid of the rotated copy of $A$ to the centroid of $B$. Note that rotating and taking the centroid commute. In other words, the centroid of $R(A)$ is $R(\bar{a})$. We may therefore simplify our problem by translating $A$ and independently translating $B$ such that both centroids lie at the origin.
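The claim about the best translation is easy to test numerically. The following is a small sketch, assuming points are 3-tuples and the two lists are in corresponding order; the function names are illustrative and not taken from the text. It computes the translation between the centroids and lets us compare the sum of square distances against nearby translations.

```python
def centroid(points):
    """Average of a list of 3-tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def sum_of_squares(A, B, t):
    """Sum of square distances between a_i + t and b_i."""
    return sum(sum((a[k] + t[k] - b[k]) ** 2 for k in range(3))
               for a, b in zip(A, B))

def best_translation(A, B):
    """Translation that moves the centroid of A to the centroid of B."""
    ca, cb = centroid(A), centroid(B)
    return tuple(cb[k] - ca[k] for k in range(3))

# Hypothetical sample data with corresponding points in matching positions.
A = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
B = [(4.0, 1.0, 0.5), (3.5, 2.5, 0.0), (3.0, 1.0, 3.5), (4.5, 2.0, 1.0)]
t = best_translation(A, B)
print(t, sum_of_squares(A, B, t))
```

Because the sum of square distances is quadratic in the translation, moving away from `t` by a vector `d` increases it by exactly the number of points times the squared length of `d`.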

The sum of the square distances after applying a rotation $R$ is

$$\sum_{i=1}^{n} \|R(a_i) - b_i\|^2 = \sum_{i=1}^{n} \|R(a_i)\|^2 - 2 \sum_{i=1}^{n} R(a_i) \cdot b_i + \sum_{i=1}^{n} \|b_i\|^2.$$

The sums of the $\|R(a_i)\|^2$ and the $\|b_i\|^2$ are not affected by the rotation, so minimizing the sum of square distances is equivalent to maximizing the sum of the $R(a_i) \cdot b_i$. Using quaternions, we can express the rotation of a point as $R(a_i) = q a_i q^*$, where $q$ is a unit quaternion and $a_i$ is the pure imaginary quaternion that corresponds to the point. Since multiplication with a unit quaternion preserves scalar products, we have $(q a_i q^*) \cdot b_i = (q a_i) \cdot (b_i q)$. The sum that we have to maximize can now be rewritten as

$$\sum_{i=1}^{n} (q a_i q^*) \cdot b_i = \sum_{i=1}^{n} (q a_i) \cdot (b_i q).$$

Recall from the previous section that $q a_i = \bar{A}_i q$ and $b_i q = B_i q$, where $\bar{A}_i$ and $B_i$ are the 4-by-4 matrices that correspond to $a_i$ and $b_i$. The two matrices are skew symmetric as well as orthogonal. Hence

$$\sum_{i=1}^{n} (\bar{A}_i q) \cdot (B_i q) = q^T \left( \sum_{i=1}^{n} \bar{A}_i^T B_i \right) q = q^T N q,$$

where $N = \sum_i \bar{A}_i^T B_i$. Take a moment to verify that each matrix in this sum is symmetric. Since the sum of symmetric matrices is again symmetric, $N$ is symmetric.

Eigenvalues and -vectors. We can interpret $q^T N q$ geometrically as a quadratic function over four-dimensional Euclidean space. Our goal is to find a point $q \in \mathbb{S}^3$ for which the quadratic function gives a maximum. Short of being able to draw the graph of this function in $\mathbb{R}^5$, we illustrate the idea in Figure VII.6, which drops two of the dimensions. We can compute such a $q$ with a modest amount of linear algebra. Recall that the eigenvalues of a square matrix $N$ are the complex numbers $\lambda$ for which the determinant of $N - \lambda \mathbb{I}$ vanishes. We have four eigenvalues, and because $N$ is symmetric, the eigenvalues are all real. It is convenient to order them as $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$. The corresponding eigenvectors are the unit vectors $e_i$ such that $N e_i = \lambda_i e_i$. The eigenvectors are pairwise orthogonal and therefore span $\mathbb{R}^4$. We can thus write any quaternion as a linear combination of the eigenvectors, $q = \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3 + \alpha_4 e_4$, and because we are only interested in unit quaternions, we may assume $\alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \alpha_4^2 = 1$. With this, we have

$$q^T N q = \alpha_1^2 \lambda_1 + \alpha_2^2 \lambda_2 + \alpha_3^2 \lambda_3 + \alpha_4^2 \lambda_4.$$

By the assumed ordering of the eigenvalues, we have $q^T N q \leq \lambda_1$, and this maximum is attained for $q = e_1$. In other words, the optimum rotation is defined by the unit eigenvector that corresponds to the largest eigenvalue.

Figure VII.6: The plane represents $\mathbb{R}^4$, the partially dotted circle represents $\mathbb{S}^3$, the surface represents the graph of the quadratic function over $\mathbb{R}^4$, the dashed lines represent the zero-set and the boldface curve represents the graph of the restriction of that function to $\mathbb{S}^3$.

Without bijection. If there is no bijection specified between the two sets then the problem of finding the best rigid motion seems significantly more difficult. Assuming $A$ and $B$ contain $n$ points each, we could of course try all bijections, but that would take a long time. A more effective algorithm alternates between improving the root mean square distance by changing the bijection and by changing the motion.
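The motion step for a fixed bijection, as derived above, can be sketched in a few lines. This is a minimal illustration, not a tuned implementation: it assumes points are 3-tuples in corresponding order, builds the symmetric 4-by-4 matrix as the sum of products of the matrices associated with the centered points, and, as a stand-in for a proper eigenvalue routine, finds the eigenvector of the largest eigenvalue by power iteration on a shifted matrix. All function names and the iteration count are choices made for this example.

```python
import math

def qmul(p, q):
    """Product of two quaternions stored as (real, I, J, K) tuples."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0*q0 - p1*q1 - p2*q2 - p3*q3, p0*q1 + p1*q0 + p2*q3 - p3*q2,
            p0*q2 - p1*q3 + p2*q0 + p3*q1, p0*q3 + p1*q2 - p2*q1 + p3*q0)

def rotate(q, v):
    """Composite product q v q* for a unit quaternion q and a 3-vector v."""
    conj = (q[0], -q[1], -q[2], -q[3])
    return qmul(qmul(q, (0.0,) + tuple(v)), conj)[1:]

def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def left_matrix(r):
    """4-by-4 matrix M with r q = M q."""
    r0, r1, r2, r3 = r
    return [[r0, -r1, -r2, -r3], [r1, r0, -r3, r2],
            [r2, r3, r0, -r1], [r3, -r2, r1, r0]]

def right_matrix(r):
    """4-by-4 matrix M with q r = M q."""
    r0, r1, r2, r3 = r
    return [[r0, -r1, -r2, -r3], [r1, r0, r3, -r2],
            [r2, -r3, r0, r1], [r3, r2, -r1, r0]]

def optimal_motion(A, B, iterations=2000):
    """Return (q, t) such that rotate(q, a_i) + t best matches b_i."""
    ca, cb = centroid(A), centroid(B)
    N = [[0.0] * 4 for _ in range(4)]
    for a, b in zip(A, B):
        Abar = right_matrix((0.0,) + tuple(a[k] - ca[k] for k in range(3)))
        Bmat = left_matrix((0.0,) + tuple(b[k] - cb[k] for k in range(3)))
        for i in range(4):
            for j in range(4):
                N[i][j] += sum(Abar[k][i] * Bmat[k][j] for k in range(4))
    # Shifting by a bound on the spectral radius makes all eigenvalues
    # non-negative, so power iteration converges to the largest one.
    shift = max(sum(abs(x) for x in row) for row in N) or 1.0
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        q = [sum(N[i][j] * q[j] for j in range(4)) + shift * q[i] for i in range(4)]
        norm = math.sqrt(sum(x * x for x in q))
        q = [x / norm for x in q]
    q = tuple(q)
    ra = rotate(q, ca)
    return q, tuple(cb[k] - ra[k] for k in range(3))
```

Power iteration is used here only to keep the sketch free of dependencies; it can converge slowly when the two largest eigenvalues are close, and a library routine for symmetric eigenproblems would be the more robust choice in practice.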

which determines for each the point closest to . In the algorithm. Bibliographic notes. J. ROTATE . In com- § ¡       # © ¢ ¡        $©  §    © ©  ¡    # © © loop       identity.108 root mean square distance by changing the bijection and by changing the motion. Massachusetts. The problem of finding the rotation that minimizes the root mean square distance between two point sets with given bijection in has been studied in various fields. [2] O. H EBERT. Intell. recognition. IEEE Trans. A 34 (1978). endif forever. Acta Crystallogr. Introduction to Linear Algebra. 5 (1986). A discussion of the solution for the best rotation to relate two sets of vectors. S TRANG . Sect. We use three subroutines to describe the iterative algorithm. the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1]. and locating of 3-D objects. M ATCH returns the permutation that minimizes the root mean square distance between and . For background on linear algebra and how to compute the eigenvalues and eigenvectors of a symmetric matrix. Sometimes this change is motivated by the purpose of the computation. [5] G. we refer to Strang [5]. [1] P. Finally. In this section. D. we replace M ATCH by A SSOCIATE and do the remaining operations as before. A 4 (1987). Int. 629–642. A method for registration of 3-D shapes. So we may again assume that both centroids are at the origin and restrict ourselves to rotations. K ABSCH . Patt. B ESL AND N. ROTATE returns the rotation that minimizes the mean square distance under this permutation. J. then else exit ©        $©  # #       ©   £     © . Amer. Closed-form solution of absolute orientation using unit quaternions. Wellesley. Note however that we neither have a polynomial bound on the number of iterations nor a guarantee that the algorithm finds the globally optimal solution. RMSD returns the root mean square distance. This implies that no permutation is tried twice. 827–828. 
Given a permutation. WellesleyCambridge Press. if RMSD . J. A popular version of the above algorithm uses injections from to instead of bijections. it follows that the algorithm halts. After each iteration. M C K AY. K. Note that independent of the bijection. given a permutation and a rotation. 27–52. The algorithm that attempts to minimize the root mean square distance between two point sets without specified bijection has also been described in several fields. Since there are only finitely many permutations. [3] B. at other times by the fact that finding the best bijection is not entirely straightforward. PAMI14 (1992). The representation. H ORN . except that is replaced by the multi-set of points in that are closest to some point in . 1993. 239–256. the best translation always moves the centroid of to the centroid of . Opt. we may use a subroutine A SSOCIATE . [4] W. Given a rotation. Mach. D. P. M ATCH . Soc. FAUGERAS AND M. Anal. including x-ray crystallography [4] and computer vision [2]. the root mean square distance decreases. VII M ATCH AND F IT puter vision. For a given rotation. Robotics Res. we follow the exposition of the solution given by Horn [3].

VII.3 Sampling and Covering

In this section, we study two questions on rigid motions, namely how to sample uniformly at random and how to cover the space of motions most economically. We treat translations and rotations separately and spend most of our time on the more complicated case of rotations.

The size of a sphere. We prepare the discussion of sampling rotations by measuring the unit 2-sphere and the unit 3-sphere. For $\mathbb{S}^2$ embedded in $\mathbb{R}^3$, we sweep a plane normal to the $x_1$-axis and compute the area by integrating infinitesimal slices. The perimeter of the circle in which the plane cuts the sphere is $2 \pi r$, with the square radius equal to $r^2 = 1 - x_1^2$. The slice between the planes at $x_1$ and $x_1 + dx_1$ is a band of width $dx_1 / r$, so the area of a slice is $2 \pi r \cdot dx_1 / r = 2 \pi \, dx_1$. The total area of the 2-sphere is therefore

    $\mathrm{Area}(\mathbb{S}^2) = \int_{-1}^{1} 2 \pi \, dx_1 = 4 \pi$.

But note that the derivation shows more, namely that the area of the slice between two parallel planes at a constant distance is the same for all such planes, as long as both intersect the sphere. This fact has been known already to Archimedes and is often expressed by saying that the axial projection from the sphere to an enclosing cylinder preserves area. This projection is illustrated in Figure VII.7.

Figure VII.7: Illustration of Archimedes' theorem implying that the sphere and the enclosing truncated cylinder have the same area.

We use the same method to compute the volume of $\mathbb{S}^3$ embedded in $\mathbb{R}^4$. Sweeping a three-dimensional hyperplane normal to the $x_1$-axis, we get the volume by integrating the infinitesimal slices. Each slice is a 2-sphere of radius $r$ and thickness $dx_1 / r$, as before, so

    $\mathrm{Vol}(\mathbb{S}^3) = \int_{-1}^{1} 4 \pi r^2 \cdot \frac{dx_1}{r} = 4 \pi \int_{-1}^{1} \sqrt{1 - x_1^2} \, dx_1 = 4 \pi \int_{-\pi/2}^{\pi/2} \cos^2 t \, dt = 2 \pi^2$,

which we get by substituting $x_1 = \sin t$. The total volume of the 3-sphere is therefore $2 \pi^2$. Note also that Archimedes' theorem does not extend to the 3-sphere, at least not in the straightforward manner from sections between parallel planes to sections between parallel hyperplanes.

Uniform sampling. Archimedes' theorem can be used to pick a point uniformly at random on $\mathbb{S}^2$.

    Step 1. Pick $x_1$ uniformly at random in $[-1, 1]$.
    Step 2. Pick $\varphi$ uniformly at random in $[0, 2\pi)$. Define $r = \sqrt{1 - x_1^2}$. Return $u = (x_1, r \cos \varphi, r \sin \varphi)$.

The method may be viewed as picking a point on the enclosing cylinder and projecting it back to the 2-sphere. We now extend this method to $\mathbb{S}^3$ and thus to an algorithm for picking a rotation uniformly at random. Think of $u$ as the axis of rotation, so we just need to pick the angle of rotation about this axis. It would not be correct to pick an angle uniformly at random since this would favor small dislocations. In other words, the quaternions near the identity would be more likely than those far away from the identity. Indeed, the angle of rotation about the axis is twice the angular distance from the identity on $\mathbb{S}^3$. Hence, we need to pick the angular distance $\theta$ from $[0, \pi]$, not uniformly but from a density that favors angles near the middle of the interval. To pick the angle correctly, we return to what we learned from the above volume computation. The points at angular distance $\theta$ from the identity form a 2-sphere of radius $\sin \theta$ and area $4 \pi \sin^2 \theta$. Specifically, the density is $\frac{2}{\pi} \sin^2 \theta$, normalized to have unit total integral. The corresponding distribution function is

    $F(\theta) = \frac{2}{\pi} \int_0^{\theta} \sin^2 t \, dt = \frac{\theta - \sin \theta \cos \theta}{\pi}$,

which monotonically increases and reaches 1 at $\theta = \pi$. To pick an angle, we pick a number $x$ uniformly at random in $[0, 1]$ and we compute its preimage under the distribution function: $\theta = F^{-1}(x)$. Writing $u$ for the point picked in Steps 1 and 2, we get a random rotation by using $(\cos \theta, \sin \theta \cdot u)$ as a unit quaternion. Alternatively, we append Steps 1 and 2 above with

    Step 3. Pick $\psi$ uniformly at random in $[0, 2\pi)$. Let $\vartheta$ be the angle whose cosine is the number picked in Step 1. Return $\vartheta$ together with the angles picked in Steps 2 and 3,

and we get a random rotation by using the three returned angles as Euler angles.
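The sampling procedure can be sketched as follows. This is an illustration under stated assumptions: the angular distance from the identity is drawn from the density $\frac{2}{\pi} \sin^2 \theta$ on $[0, \pi]$, the distribution function is inverted numerically by bisection (the text does not say how the preimage is computed), and all function names are hypothetical.

```python
import math
import random

def random_point_on_sphere(rng):
    """Steps 1 and 2: uniform point on the unit 2-sphere via the enclosing cylinder."""
    x1 = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - x1 * x1)
    return (x1, r * math.cos(phi), r * math.sin(phi))

def angle_distribution(theta):
    """Distribution function of the density (2/pi) sin^2(theta) on [0, pi]."""
    return (theta - math.sin(theta) * math.cos(theta)) / math.pi

def inverse_angle_distribution(x, tol=1e-12):
    """Preimage of x in [0, 1] under the distribution function, by bisection."""
    lo, hi = 0.0, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if angle_distribution(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def random_rotation(rng):
    """Random unit quaternion (w, x, y, z): axis from the 2-sphere, angle from the density."""
    u = random_point_on_sphere(rng)
    theta = inverse_angle_distribution(rng.random())
    s = math.sin(theta)
    return (math.cos(theta), s * u[0], s * u[1], s * u[2])
```

A call such as `random_rotation(random.Random(0))` returns a unit quaternion; the rotation angle about the axis is twice the sampled angular distance.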
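Covering the spaces of translations and rotations. We turn our attention to selecting a collection of rigid motions such that every possible motion has a selected motion nearby. It is convenient to measure the distance between translations and between rotations using the Euclidean metric. We will later analyze how these notions of distance relate to the effect of the motion on the root mean square distance between two sets in $\mathbb{R}^3$. The idea of guaranteeing that every possible motion has a nearby selected motion can be expressed by covering the space of motions with neighborhoods. Consider first translations, which we represent by 3-vectors or points in $\mathbb{R}^3$. Let $\varepsilon > 0$ and let $\mathcal{B}$ be a collection of closed balls, all of radius $\varepsilon$. We call $\mathcal{B}$ a covering if the union of its balls is $\mathbb{R}^3$, and we call $\varepsilon$ the covering radius. We need infinitely many balls because $\mathbb{R}^3$ has infinite volume, but we are usually only interested in bounded portions of space. If we use the centers of the covering balls as selected translations, we are guaranteed that every translation $t$ has a selected translation at a distance at most $\varepsilon$ from $t$.

We study three lattices of points in some detail. The cube lattice consists of all integer points, the FCC or face-centered cube lattice adds all centers of cube faces, and the BCC or body-centered cube lattice adds all cube centers to the cube lattice. Figure VII.8 shows the portion of each lattice inside a cube of unit side-length and Table VII.2 lists some of their pertinent properties. By counting fractions, we note that the FCC lattice has four times and the BCC lattice has twice as many points as the cube lattice. The packing radius is the largest radius we can assign to the points to get non-overlapping balls, and the volume is the fraction of the space covered by the packed balls. The covering radius is the smallest radius we can assign and still have the balls cover $\mathbb{R}^3$, and the volume is the total volume of the balls divided by the volume of the space they inhabit. The points with maximum distance to the lattice points are the cube centers, the edge centers and the midpoints between the face and the edge centers. We see that the FCC lattice leads to an effective packing while the BCC lattice leads to an effective covering. Indeed, both are known to be the respective best packing and covering lattices.

Figure VII.8: From left to right: the cube, the FCC and the BCC lattices.

                            CUBE     FCC      BCC
    points per cube         1        4        2
    packing radius          0.500    0.353    0.433
      volume (fraction)     0.523    0.740    0.680
    covering radius         0.866    0.500    0.559
      volume (fraction)     2.720    2.094    1.463

Table VII.2: Numerical assessment of how well the cube, the FCC and the BCC lattices pack and cover.

As an exercise we may estimate the number of balls we need to cover the unit 3-sphere. Recall that its volume is $2 \pi^2$. Assuming $\varepsilon$ is very small, the volume of a ball with radius $\varepsilon$ in $\mathbb{S}^3$ is about $\frac{4}{3} \pi \varepsilon^3$. Alternatively, we can use a straightforward volume argument to show that we need at least $2 \pi^2 / (\frac{4}{3} \pi \varepsilon^3) = 3 \pi / (2 \varepsilon^3)$ balls to cover the 3-sphere. If we believe that we cannot cover more economically than the BCC lattice in $\mathbb{R}^3$, then we need about 1.463 times that many balls.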
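The entries of Table VII.2 and the volume bound for the 3-sphere can be checked with a short computation. The sketch below assumes the packing and covering radii that follow from the lattice descriptions in the text for a cube of unit side-length; the function and variable names are hypothetical.

```python
import math

def ball_volume(radius):
    """Volume of a three-dimensional ball."""
    return 4.0 / 3.0 * math.pi * radius ** 3

# name: (points per unit cube, packing radius, covering radius)
LATTICES = {
    "CUBE": (1, 0.5, math.sqrt(3) / 2),              # farthest points: cube centers
    "FCC": (4, math.sqrt(2) / 4, 0.5),               # farthest points: cube and edge centers
    "BCC": (2, math.sqrt(3) / 4, math.sqrt(5) / 4),  # midpoints between face and edge centers
}

def lattice_table():
    """Return, per lattice, the packing and covering radii with their volume fractions."""
    rows = {}
    for name, (points, pack_r, cover_r) in LATTICES.items():
        rows[name] = {
            "points": points,
            "packing radius": pack_r,
            "packing volume": points * ball_volume(pack_r),
            "covering radius": cover_r,
            "covering volume": points * ball_volume(cover_r),
        }
    return rows

def balls_to_cover_3sphere(eps, density=1.0):
    """Volume estimate: vol(S^3) = 2 pi^2 over the ball volume, scaled by a covering density."""
    return density * 2.0 * math.pi ** 2 / ball_volume(eps)
```

For example, `balls_to_cover_3sphere(0.1)` evaluates the plain volume bound, and passing `density=lattice_table()["BCC"]["covering volume"]` gives the estimate based on the BCC covering.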
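Sensitivity to small translations. Next, we address how translations affect the root mean square distance between two point sets. As in Section VII.2, let $A$ and $B$ be two collections of points in $\mathbb{R}^3$, with a bijection that maps $a_i$ to $b_i$, for each $i$. Recall that the root mean square distance between $A$ and $B$ is the square root of the average square distance between corresponding points. To simplify the analysis, we assume that the centroids of the two collections are both at the origin: $\sum_i a_i = \sum_i b_i = 0$. This implies that the vectors $a_i - b_i$ add up to 0, implying that the sum of scalar products with any vector $t$ vanishes: $\sum_i \langle a_i - b_i, t \rangle = 0$. After translating $B$ along $t$, the root mean square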

distance is

    $\mathrm{RMSD}(t) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \|a_i - b_i - t\|^2} = \sqrt{\|t\|^2 + \mathrm{RMSD}(0)^2}$,

for $n$ pairs of corresponding points $a_i$ and $b_i$. The mixed terms vanish because the centroids are at the origin. Figure VII.9 illustrates this result by comparing the graphs obtained for equal and for non-equal corresponding points.

Figure VII.9: The hyperboloid approaches the graph of the norm function at plus and minus infinity.

To measure how fast the root mean square distance changes with varying translation vector, we compute the gradient: $\nabla \mathrm{RMSD}(t) = t / \mathrm{RMSD}(t)$. We have $\mathrm{RMSD}(0) = 0$ if and only if $a_i = b_i$ for all $i$. In this case, the gradient is defined everywhere except at $t = 0$ and its length is 1. In general, the length of the gradient is $\|t\| / \mathrm{RMSD}(t)$. The length is 1 if and only if $a_i = b_i$ for all $i$. Since the length of the gradient never exceeds 1, the difference between the root mean square distances for two translations is bounded from above by the norm of the difference vector:

    $|\mathrm{RMSD}(t) - \mathrm{RMSD}(t')| \leq \|t - t'\|$.

In words, the root mean square as a function over the three-dimensional space of translations satisfies a Lipschitz condition with constant 1.
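The dependence on the translation vector can be checked numerically. The sketch below assumes two small point sets with centroids at the origin, chosen here only for illustration, and a hypothetical function name; it evaluates the root mean square distance after translating the second set.

```python
import math

def rmsd_after_translation(A, B, t):
    """Root mean square distance between the a_i and the translated points b_i + t."""
    total = 0.0
    for a, b in zip(A, B):
        total += sum((ai - bi - ti) ** 2 for ai, bi, ti in zip(a, b, t))
    return math.sqrt(total / len(A))

# Two example collections, both with centroid at the origin.
A = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (0, -2, 0)]
B = [(0, 1, 0), (0, -1, 0), (1, 0, 1), (-1, 0, -1)]

t = (0.3, -0.2, 0.5)
s = (1.0, 1.0, 1.0)
r0 = rmsd_after_translation(A, B, (0.0, 0.0, 0.0))
rt = rmsd_after_translation(A, B, t)
rs = rmsd_after_translation(A, B, s)

# Square distance splits into the square length of t plus the untranslated part.
identity_gap = abs(rt ** 2 - (sum(c * c for c in t) + r0 ** 2))
# Lipschitz condition with constant 1.
lipschitz_ok = abs(rt - rs) <= math.dist(t, s) + 1e-12
```

Here `identity_gap` is zero up to rounding and `lipschitz_ok` is true, in agreement with the hyperboloid picture.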
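Sensitivity to small rotations. We repeat the analysis for rotations. Let $q$ be a unit quaternion. As for translations, we assume that the centroids are at the origin. Call the root mean square distance from the centroid the radius of gyration. Since we assume that both centroids are at the origin, the radii of gyration of $A$ and $B$ are

    $R_A = \sqrt{\frac{1}{n} \sum_i \|a_i\|^2}$ and $R_B = \sqrt{\frac{1}{n} \sum_i \|b_i\|^2}$.

With $N$ denoting the symmetric matrix defined in the previous section, the sum of scalar products between the $a_i$ and the rotated $b_i$ is $q^T N q$, and the root mean square distance between $A$ and the rotated copy of $B$ is

    $\sqrt{R_A^2 + R_B^2 - \frac{2}{n} q^T N q}$,

where the largest value of the quadratic form over the unit quaternions is the largest of the eigenvalues of the matrix $N$ defined in the previous section. For the purpose of computing the gradient and its length, we consider this expression as a function over $\mathbb{R}^4$, and we compute its gradient and the length of the gradient. Going back to the definition of $N$, we can express its eigenvalues in terms of the corresponding points, and we use them to simplify the expressions for the gradient and its length.

It is geometrically obvious that the total distance increases the fastest when each point moves in a direction straight away from its corresponding point. This is possible in the limit and characterized by the velocity vector of each point being parallel to the vector that connects it to its corresponding point. The effect of the rotation represented by $q$ is best viewed in the direction opposite to the rotation axis. In this view, the length of the gradient is bounded in terms of the radius of gyration of the collection projected into the plane normal to the rotation axis, and it is maximized if the points move straight away from their corresponding points for all $i$, which includes the possibility that corresponding points coincide. Since the length of the gradient never exceeds a multiple of the radius of gyration, the difference between the root mean square distance for two rotations is no more than that multiple of the norm of the difference vector between the two unit quaternions. We see that the rotations satisfy a Lipschitz condition that is similar to that for translations, except that the constant now depends on the collection of points, in particular on their radii of gyration.

Bibliographic notes. The problem of sampling motions has been studied in various fields, including statistics,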

[2] J. Lattices and Groups. C ONWAY AND N. [4] G. Math. Stat. Various methods for picking a rotation uniformly at random have been published but not all are correct. 645–646. B ERMAN AND K. auf der Kugel und im Raum. ´ [3] L. S LOANE . The problem is challenging even in the relatively simple case of the 2-sphere. Second edition. it is important to notice that first picking a rotation axis and second a rotation angle favors quaternions close to the identity if we pick the angle uniformly at random in . Comput. many of the main questions in this area are still open. New York. For example. New York.112 crystallography and molecular modeling. Lagerungen in der Ebene. In particular. it is not known whether or not the BCC lattice is the most ecowith congruent balls. Very little nomical covering of is known about optimal packings and coverings in nonEuclidean spaces. Sphere Packings. 3]. and for most numbers of points (or caps) only approximate solutions are known [1]. Ann. Springer-Verlag. 31 (1977). Springer-Verlag. 43 (1972). 1006–1008. M ARSAGLIA . © ¢   VII M ATCH AND F IT  ¢ £¡ . Optimizing the arrangement of points on the unit sphere. Math. F EJES T OTH . [1] J. 1988. H ANES . 1972. Surprisingly. Packing and covering problems have been studied within mathematics and have generated a large body of literature [2. Choosing a point from the surface of a sphere. A popular method that is correct and different from the one described in this section is due to Marsaglia [4] and is reproduced in the exercise section of this chapter.

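VII.4 Alignment

In this section, we briefly discuss the two problems of match and fit for protein structures. We begin by studying how to match proteins and develop an algorithm that measures the similarity between two chains of atoms. Thereafter, we consider the related problem of docking a protein with its substrate.

Sequence alignment. Consider first the combinatorial (as opposed to geometric) version of the sequence alignment problem. We model a protein as a string over the alphabet of twenty amino acids: $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$. An alignment maps the $a_i$ to the $b_j$ in sequence, but it permits spaces on both sides. As illustrated in Table VII.3, we represent an alignment by a matrix consisting of two rows and $(m + n + k)/2$ columns, where $k$ is the total number of spaces. Columns with two spaces are disallowed. An insertion is a column with a space at the top and a deletion is a column with a space at the bottom. A match is a column of two equal non-space characters, and a mismatch is a column with two different non-space characters.

[Table VII.3: An alignment of two example strings, shown as a matrix with two rows, together with its numbers of matches and spaces.]

Longest common subsequence. For the moment, we restrict ourselves to alignments without mismatches. The common subsequence between two strings consists of all matches, and its length is the number of matches. Letting $\ell$ be the length of the longest common subsequence, $m + n - 2\ell$ is the minimum number of insertions and deletions needed to transform $A$ to $B$. We compute $\ell$ by dynamic programming. Let $\ell(i, j)$ be the length of the longest common subsequence of $a_1 \ldots a_i$ and $b_1 \ldots b_j$, and define $\ell(i, 0) = \ell(0, j) = 0$ for all $i$ and $j$. Then

    $\ell(i, j) = \ell(i-1, j-1) + 1$                         if $a_i = b_j$,
    $\ell(i, j) = \max \{ \ell(i-1, j), \ell(i, j-1) \}$      if $a_i \neq b_j$.

To verify the recurrence relation note that every alignment ends with an insertion, a deletion or a match. In each case, removing the last column leaves an optimal alignment of shorter strings. In the third case, we need to show that the length of the common subsequence cannot increase if we do not use the match between $a_i$ and $b_j$. Indeed, without using that match we end with an insertion or a deletion, and we may move the last match to the end without decreasing the length. We turn the recurrence relation into an algorithm:

    integer LCS:
        for $i = 1$ to $m$ do
            for $j = 1$ to $n$ do
                if $a_i = b_j$ then $\ell[i, j] = \ell[i-1, j-1] + 1$
                else $\ell[i, j] = \max \{ \ell[i-1, j], \ell[i, j-1] \}$
                endif
            endfor
        endfor;
        return $\ell[m, n]$.

This algorithm is a typical example of the dynamic programming paradigm, which constructs an optimal solution from pre-computed optimal solutions to sub-problems. To store the solutions, the algorithm uses an array of $(m+1)(n+1)$ entries. Each entry takes constant time, which implies that the total running time is proportional to $mn$. Using a second array of the same size, we may keep track of the decisions made by the algorithm, and with this extra information, we can reconstruct the longest common subsequence itself, and not just compute its length.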
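The recurrence translates directly into code. The following sketch fills the table row by row and then walks back from the last entry to recover one longest common subsequence; instead of the second array of decisions mentioned above, it re-derives each decision from the table itself, which is a design choice made here for brevity. The function names and the example strings are hypothetical.

```python
def lcs_table(a, b):
    """Fill the table L with L[i][j] = length of an LCS of a[:i] and b[:j]."""
    m, n = len(a), len(b)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L

def lcs(a, b):
    """Recover one longest common subsequence by walking back through the table."""
    L = lcs_table(a, b)
    i, j = len(a), len(b)
    out = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])       # match: diagonal step
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1                     # deletion: drop a character of a
        else:
            j -= 1                     # insertion: drop a character of b
    return "".join(reversed(out))

def indel_distance(a, b):
    """Minimum number of insertions and deletions that transform a into b."""
    return len(a) + len(b) - 2 * lcs_table(a, b)[len(a)][len(b)]
```

For the strings `"AGGTAB"` and `"GXTXAYB"`, the common subsequence has four characters, so five insertions and deletions are needed.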
The general alignment problem permits mismatches and assesses the score by rewarding each match and penalizing each mismatch. Assuming $\sigma(x, y)$ gives the score for having $x$ and $y$ in a single column, where either of the two may be a space, and writing $S(i, j)$ for the best score of aligning $a_1 \ldots a_i$ with $b_1 \ldots b_j$, we get

    $S(i, j) = \max \{ S(i-1, j-1) + \sigma(a_i, b_j), \; S(i-1, j) + \sigma(a_i, \text{space}), \; S(i, j-1) + \sigma(\text{space}, b_j) \}$.

We can think of every alignment as a directed path in the so-called edit graph of the two strings, which we illustrate in Figure VII.10.

Figure VII.10: The edit graph for the strings in the above example and the path that corresponds to the given alignment. The edges are marked as deletion, insertion, match and mismatch.

The path starts at the source in the upper left corner, takes vertical, horizontal and diagonal edges

and ends at the sink in the lower right corner.

A gap in the alignment is a sequence of contiguous insertions or of contiguous deletions. It is common to penalize a gap separately for its existence and an additional amount that depends on its length. This may be done by penalizing an insertion or deletion an amount $\alpha$ when it starts a gap and an amount $\beta$ when it continues a gap. This gives rise to the following recurrence relations:

    $S(i, j) = \max \{ S(i-1, j-1) + \sigma(a_i, b_j), \; I(i, j), \; D(i, j) \}$,
    $I(i, j) = \max \{ S(i, j-1) - \alpha, \; I(i, j-1) - \beta \}$,
    $D(i, j) = \max \{ S(i-1, j) - \alpha, \; D(i-1, j) - \beta \}$,

where $I(i, j)$ is the score of the best alignment that ends with an insertion and $D(i, j)$ is the score of the best alignment that ends with a deletion. The score of the best alignment between $A$ and $B$ is then $S(m, n)$. Using three arrays, we can again compute the best alignment with dynamic programming in time proportional to $mn$.
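The three-array scheme can be sketched as follows. This is an illustration of the standard recurrences for gap penalties that distinguish the start from the continuation of a gap; the parameter names, the default scores and the boundary conditions are assumptions made here, not values taken from the text.

```python
def affine_gap_score(a, b, match=1.0, mismatch=-1.0, gap_open=2.0, gap_extend=0.5):
    """Best alignment score with separate penalties for starting and continuing a gap."""
    NEG = float("-inf")
    m, n = len(a), len(b)
    S = [[NEG] * (n + 1) for _ in range(m + 1)]   # best score overall
    I = [[NEG] * (n + 1) for _ in range(m + 1)]   # best score ending with an insertion
    D = [[NEG] * (n + 1) for _ in range(m + 1)]   # best score ending with a deletion
    S[0][0] = 0.0
    for j in range(1, n + 1):                     # leading gap of insertions
        I[0][j] = -gap_open - (j - 1) * gap_extend
        S[0][j] = I[0][j]
    for i in range(1, m + 1):                     # leading gap of deletions
        D[i][0] = -gap_open - (i - 1) * gap_extend
        S[i][0] = D[i][0]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            I[i][j] = max(S[i][j - 1] - gap_open, I[i][j - 1] - gap_extend)
            D[i][j] = max(S[i - 1][j] - gap_open, D[i - 1][j] - gap_extend)
            column = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + column, I[i][j], D[i][j])
    return S[m][n]
```

With the default scores, aligning `"ACGT"` with `"AT"` is best done with two matches and one gap of length two, for a score of 1 - 2 - 0.5 + 1 = -0.5.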
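Chains of atoms. We can use the same algorithmic ideas to compute alignments between two sequences of atoms. Let the $a_i$ and the $b_j$ be the centers of the $\alpha$-carbon atoms along the backbones of two proteins. For now, we assume a fixed embedding in $\mathbb{R}^3$ and consider the alignment problem without applying any rigid motion. Using the root mean square distance between two sub-chains is problematic for two reasons. First, it does not lend itself to the dynamic programming algorithm and, second, it prefers shorter over longer sub-chains. Instead, we need a score function that balances the contributions of length and distance. One such function is obtained by combining square distances with gap penalties as follows. Letting $C_1$ and $C_2$ be positive constants, we reward a match between $a_i$ and $b_j$ by adding

    $\frac{C_1}{C_2 + \|a_i - b_j\|^2}$        (VII.1)

to the score, and we penalize for gaps as before. The dynamic programming algorithm can still be used to identify the best in a collection of exponentially many alignments. It does this in time proportional to $mn$.

Next, we permit a rigid motion be applied to one of the chains. For each alignment between the two chains, we get a function that maps a rigid motion to the score between one chain and the moved copy of the other. Consider the function defined as the motion-wise maximum of all these functions. This construction is illustrated in Figure VII.11. Let $\mu^*$ be the motion that maximizes this function, and the best alignment is the one for which this maximum is attained. Instead of computing the best motion for each alignment, we compute the best alignment for each of a dense sample of motions. We thus aim at computing an approximately best alignment. The idea of the algorithm is to sample the space of motions dense enough to guarantee an alignment with a score at least the optimum minus $\varepsilon$, for some $\varepsilon > 0$. This strategy makes sense in practice since in any case the locations of atoms are only known up to some precision. We cannot expect to reach the optimum itself, but we may decrease $\varepsilon$ and thus get arbitrarily close to the optimum. Improving the approximation by decreasing $\varepsilon$ comes with a cost, namely higher running time because we evaluate for more rigid motions. We quantify the dependence by analyzing the running time depending on $\varepsilon$.

Figure VII.11: The horizontal axis represents the six-dimensional space of rigid motions. The upper envelope of the graphs is the motion-wise maximum of the score functions.

Running time. The other parameters entering the analysis are the lengths of the chains, $m$ and $n$, the radii of the smallest spheres enclosing the two point sets and the radii of gyration of the two sets. We further simplify the discussion by assuming $m = n$. Proteins tend to have globular shapes packing their atoms around their centroids. We may therefore assume that the radii of the smallest enclosing spheres are both roughly equal to a common value and the radii of gyration are both roughly equal to a common value. To decide how dense we have to cover the space of rigid motions, we determine the sensitivity of the score function to small motions. We need some notation to formalize this idea. Ignoring penalties for gaps, we get the score of an alignment under a motion $\mu$ as

    $\sum_{i=1}^{\ell} \frac{C_1}{C_2 + \|a_i - \mu(b_i)\|^2}$,

where $\ell$ is the length of the alignment and the points are re-indexed so that the alignment maps $a_i$ to $b_i$, for $1 \leq i \leq \ell$. The norm of the gradient of a single term in this sum is bounded by a constant, and hence the norm of the gradient of the sum is bounded by $\ell$ times that constant. We first consider translations.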

We cover the space of translations with balls of a radius that is small enough for the sensitivity bound. By assumption on the shape of the protein, the volume of translations we need to cover is proportional to the volume of the smallest enclosing sphere, and we get a number of balls that is proportional to this volume divided by the volume of a single ball. It follows that having a translation that is not quite the optimum contributes at most a fixed fraction of $\varepsilon$ to the error. The sensitivity of the score to small rotations depends on the radii of gyration. By covering the space of rotations with balls of correspondingly smaller radius, we get again a contribution of at most a fixed fraction of $\varepsilon$ to the error. The volume of the space of rotations is constant. In each case, we need a constant times the same number of balls. We cover the space of rigid motions by cross-products of these balls and thus get a constant times the square of that number of rigid motions. Multiplying this with the running time of the dynamic programming algorithm gives a total running time that is polynomial in the length of the chains and in $1 / \varepsilon$. This is of course not practical and we need faster alternatives. Improvements of the running time are possible, some of which will be mentioned at the end of this section.
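Protein docking. In protein docking, the basic question is how well a protein and its substrate fit to each other. The substrate could be another protein or a small ligand. This question makes sense if we use space-filling representations of the protein and the ligand, but not if we represent them combinatorially or as chains of points in space. We interpret this question as asking how similar the substrate is to a portion of the complement of the protein. This idea is illustrated in Figure VII.12. For protein-protein interactions, the region of local complementarity is frequently fairly large. The geometric fit between the two proteins thus becomes a significant factor in making the interaction possible or, more accurately, in not making that interaction impossible.

Figure VII.12: The shaded local complement of the left shape is similar to the shaded portion of the right shape.

Protein re-docking. Instead of protein docking, we consider the simpler re-docking problem. Here we are given the complexed form of a protein and its substrate and we attempt to reconstruct that form while suppressing any knowledge of the solution. We need some notation to lay out the rules for this problem. Let $A$ and $B$ represent the protein and the substrate in complexed form, and let $A'$ be the protein after applying a random rigid motion. The input to the reconstruction algorithm consists of $A'$ and $B$, and not knowing the solution means we can not use any information on $A$ and on the random motion. The goal is to find a rigid motion $\mu$ such that $\mu(A')$ and $B$ fit well. After $\mu$ is computed, we can test how well we did by comparing $\mu(A')$ with $A$, which can be done directly or by computing root mean square distances between the computed and the complexed positions.

We cannot use the root mean square distance to guide our reconstruction of the complexed form and thus need a score function that assesses how well a motion does in generating a good fit. There are many possibilities, and one is the approximation of the van der Waals potential by counting the pairs of spheres at small distance from each other. As mentioned in Chapter I, the van der Waals force is weakly attractive within small distances of maybe up to four Angstrom, and it is strongly repulsive for colliding van der Waals spheres. We think of the $a_i$ and $b_j$ as the centers and write $r_i$ and $s_j$ for the van der Waals radii of the spheres in $A$ and $B$. The collections of colliding and of close pairs consist of the pairs of spheres that overlap and of the pairs of spheres whose distance from each other is less than $\lambda$, where $\lambda$ is a small positive constant. Given a rigid motion $\mu$, we compute the score by comparing all pairs of spheres in time proportional to $mn$. Experiments show that this score function is a good indicator of good fit, but one weakness is its sensitivity to collisions. Actual proteins are flexible and can avoid minor collisions by small deformations. We may account for this fact by allowing a few collisions in the definition of the score, but to get a good approximation of the reality, we will need to build knowledge about flexibility into the score function. We thus define the score as the number of close pairs if the number of colliding pairs does not exceed a small threshold, and as 0 otherwise.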
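A score of this kind can be sketched in a few lines. The sketch rests on assumptions that go beyond what the text specifies: a pair of spheres is counted as colliding if the spheres overlap, as close if the gap between their surfaces is at most `margin`, and the score is the number of close pairs unless more than `allowed_collisions` pairs collide, in which case it is 0. The function and parameter names are hypothetical.

```python
import math

def docking_score(A, B, radii_a, radii_b, margin=1.0, allowed_collisions=0):
    """Count close pairs of spheres; return 0 if too many pairs collide.

    A, B are lists of sphere centers, radii_a and radii_b the matching radii.
    """
    close = 0
    collisions = 0
    for a, ra in zip(A, radii_a):
        for b, rb in zip(B, radii_b):
            gap = math.dist(a, b) - ra - rb    # distance between the two sphere surfaces
            if gap < 0.0:
                collisions += 1                # the spheres overlap
            elif gap <= margin:
                close += 1                     # near contact, no overlap
    return close if collisions <= allowed_collisions else 0
```

Comparing all pairs takes time proportional to the product of the two numbers of spheres, as stated in the text; a spatial grid would be the usual way to speed this up.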
The general algorithm for re-docking is similar to the one for geometric alignment: we explore the space of rigid motions and evaluate the score function at the centers of the balls used to cover the space.

Analysis. By choosing the balls in the cover small enough, we can guarantee that the root mean square distances between the computed and the complexed positions are less than some

threshold $\varepsilon$. Note that this does not necessarily imply that the score is large. Indeed, it could be zero because motions with high score value tend to be right next to motions that generate collisions. In other words, whether or not the algorithm recognizes a motion as close to the solution depends on the shape of the score function in this neighborhood. We can design cases in which the score function has arbitrarily narrow high spikes and our algorithm has little chance to ever recover the complexed form. There is, however, experimental evidence that such configurations do either not exist or are rare for actual proteins.

Let us return to the question how to cover the space of motions to guarantee a root mean square distance of at most $\varepsilon$. As before, we simplify the analysis by setting $m = n$ and assuming that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to a common radius $R$. According to the sensitivity analysis in the previous section, we may cover the space of translations with balls of radius proportional to $\varepsilon$ and the space of rotations with balls of radius proportional to $\varepsilon / R$, where $R$ is the radius of gyration of either $A$ or $B$. For the translations, we need to cover a volume of about $R^3$ requiring about $(R / \varepsilon)^3$ balls. For the rotations, we need to cover a constant volume also requiring about $(R / \varepsilon)^3$ balls. The total number of rigid motions to be explored is thus proportional to $(R / \varepsilon)^6$, and multiplying this with quadratic running time for evaluating the score function, we get a total running time proportional to $n^2 (R / \varepsilon)^6$. Since $n$ is typically in the thousands, even this is not practical and we need faster alternatives. An improvement is possible if we compute the score for all translations composed with a single rotation in one sweep. For constant $\varepsilon$, this improves the running time noticeably.

Bibliographic notes. The structural alignment problem refers to comparing the backbones modeled as curves or chains of spheres in three-dimensional space. Its importance within structural molecular biology derives from the observation that evolution preserves structure better than amino acid sequences. Among other things, research on this problem has led to the creation of structural databases [6, 8]. There are two main computational approaches to structural alignment: one represents a chain by its matrix of internal distances [5] and the other uses rigid motions to align the chains embedded in space [9]. In this section, we have followed the second approach and presented the work of Kolodny and Linial [7], who explore rigid motions in the outer loop and optimal alignments using dynamic programming [3] in the inner loop of their algorithm. The particular score function given in Equation (VII.1), with specific values for the constants $C_1$ and $C_2$, was suggested in [9]. It should be mentioned that the presented algorithm is significantly slower than the currently most commonly used DALI software [5], but it is the only algorithm that guarantees a good approximation of the optimal alignment in polynomial time.

The goal of protein docking is the prediction of whether, where and how proteins interact with each other and with other molecules. We refer to [4] for a recent survey of the extensive literature on computational approaches to protein docking. In many cases, the surface area of the interface during the interaction is substantial, and in these cases the geometric fit is an important factor. However, there are cases with smaller interaction area in which forces unrelated to geometric shape outweigh the importance of shape [2]. The material on docking in this section is based on the work described in [1].

[1] S. Bespamyatnikh, V. Choi, H. Edelsbrunner and J. Rudolph. Protein docking by exhaustive search. Manuscript, Duke Univ., Durham, North Carolina, 2003.
[2] A. H. Elcock, D. Sept and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys. Chem. B 105 (2001), 1504–1518.
[3] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge Univ. Press, England, 1997.
[4] I. Halperin, B. Ma, H. Wolfson and R. Nussinov. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47 (2002), 409–443.
[5] L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233 (1993), 123–138.
[6] L. Holm and C. Sander. The FSSP database of structurally aligned protein fold families. Nucleic Acid Res. 22 (1994), 3600–3609.
[7] R. Kolodny and N. Linial. Approximate protein structural alignment in polynomial time. Manuscript, Stanford Univ., Stanford, California, 2002.
[8] A. G. Murzin, S. E. Brenner, T. Hubbard and C. Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247 (1995), 536–540.
[9] S. Subbiah, D. V. Laurents and M. Levitt. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Current Biol. 3 (1993), 141–148.

Exercises

1. Reflections. The reflection through a plane maps every point $x$ to the point $x'$ such that the plane crosses the line segment $x x'$ orthogonally at its midpoint. The central reflection maps every point $x$ to its antipodal point $-x$.
   (i) Show that every rigid motion is the composition of two plane reflections.
   (ii) How many plane reflections do you need to represent the central reflection?

2. Sum of square distances. Consider a collection of points $a_1, a_2, \ldots, a_n$ in $\mathbb{R}^3$ and let $c$ be its centroid.
   (i) Prove that for every point $x$ in space, the root mean square distance to the $a_i$ is the root of the square distance to the centroid plus a constant: $\sqrt{\frac{1}{n} \sum_i \|x - a_i\|^2} = \sqrt{\|x - c\|^2 + \mathrm{const}}$. What exactly is the constant?
   (ii) Extend the construction to a collection of planes in $\mathbb{R}^3$. In other words, prove that there are three planes for which a similar formula gives the sum of square distances to the planes.
   (iii) Further extend the construction to a collection of lines in $\mathbb{R}^3$.

3. Square distance from planes. The square distance from a point $x$ to a point $p$ is also the sum of square distances of $x$ from the three planes parallel to the coordinate planes that pass through $p$.
   (i) Show that the above claim holds for any three planes that pass through $p$ and pairwise enclose a right angle.
   (ii) Are there triplets of planes enclosing non-right angles for which $\|x - p\|^2$ is equal to the sum of square distances from $x$ to the three planes?

4. Sizes of spheres. The $d$-dimensional unit sphere consists of all points at unit distance from the origin of the $(d+1)$-dimensional Euclidean space: $\mathbb{S}^d = \{ x \in \mathbb{R}^{d+1} \mid \|x\| = 1 \}$. We know that the perimeter of $\mathbb{S}^1$ is $2 \pi$, the area of $\mathbb{S}^2$ is $4 \pi$ and the volume of $\mathbb{S}^3$ is $2 \pi^2$. What is the $d$-dimensional volume of $\mathbb{S}^d$?

5. Biased probability. Suppose Function UNIFORM picks a real number uniformly at random in $[0, 1]$.
   (i) Show that the minimum of two numbers picked by Function UNIFORM is distributed according to the triangle density function $f(x) = 2 (1 - x)$.
   (ii) How are the minimum, the median and the maximum of three numbers picked by Function UNIFORM distributed?

6. Sampling the 3-sphere. Prove that the following method picks a point uniformly at random on $\mathbb{S}^3$:
   (i) Pick numbers $x_1, x_2, x_3, x_4$ uniformly at random in $[-1, 1]$.
   (ii) If $x_1^2 + x_2^2 \geq 1$ or $x_3^2 + x_4^2 \geq 1$ then repeat Step (i), else let $s = \sqrt{(1 - x_1^2 - x_2^2) / (x_3^2 + x_4^2)}$ and return $(x_1, x_2, s x_3, s x_4)$.

7. Random rotation. Let us mark a point $x$ on the unit 2-sphere. For a rotation $R$, let $R(x)$ be the image of $x$ under that rotation. Any density function over the space of rotations implies a density function over the 2-sphere. Prove that the uniform density of quaternions over $\mathbb{S}^3$ implies the uniform density of points over the 2-sphere.

8. Number of alignments. Recall that an alignment between two chains of $m$ and $n$ $\alpha$-carbon atoms that uses $k$ spaces can be represented by a matrix with two rows and $(m + n + k)/2$ columns. Assuming $m \leq n$, we define $d = n - m$ and note that we need $d$ insertions just to make up for the difference in length. The remaining spaces are distributed over equally many insertions and deletions, so we define $e = (k - d)/2$.
   (i) Show that $e$ being an integer with $0 \leq e \leq m$ is a necessary and sufficient condition for the number of spaces in any alignment of the two chains.
   (ii) What is the number of different alignments with a fixed number of spaces?
   (iii) What is the total number of different alignments?


Chapter VIII

Deformation

VIII.1 Molecular Dynamics
VIII.2 Spheres in Motion
VIII.3 Rigidity
VIII.4 Shape Space
Exercises

VIII.1 Molecular Dynamics

Newton’s second law. [$F = ma$.]

Numerical integration. [Taylor expansion, different numerical methods (Euler, Verlet, leap-frog, Beeman, predictor-corrector).]

Kinetic data structures. [Close neighbor lists, Delaunay triangulation or dual complex (forward pointer to Section VIII.2 and IX).]

Hydrophobic surface area. [Weighted area and derivative (forward pointer to Chapter IX).]
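The integration schemes listed under Numerical integration can be illustrated with a minimal sketch. It assumes a force that depends only on position and uses velocity Verlet, one common variant of the Verlet method. The function names (`velocity_verlet`, `spring_force`, `energy`) and the harmonic test force are illustrative assumptions and are not taken from the text.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate mass * x'' = force(x) with the velocity Verlet scheme.

    x, v  : lists of floats (positions and velocities, one entry per coordinate)
    force : function mapping a position list to a list of force components
    Returns the final positions and velocities as lists.
    """
    x, v = list(x), list(v)
    f = force(x)
    for _ in range(steps):
        # half-step velocity update using the force at the old position
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        # full-step position update
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # force at the new position, then the second half-step for the velocity
        f = force(x)
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
    return x, v


def spring_force(x, k=1.0):
    """Hypothetical test force: a harmonic spring pulling towards the origin."""
    return [-k * xi for xi in x]


def energy(x, v, mass=1.0, k=1.0):
    """Total energy of the harmonic oscillator used in the test."""
    kinetic = 0.5 * mass * sum(vi * vi for vi in v)
    potential = 0.5 * k * sum(xi * xi for xi in x)
    return kinetic + potential
```

For the harmonic oscillator the scheme keeps the total energy close to its initial value over many steps, which is the main reason Verlet-type methods are popular in molecular dynamics.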

VIII.2 Spheres in Motion

[Explain the slack in the Pie Volume Formula (with a forward pointer to Chapter IX).] [Define cross-sections of the complex of independent simplices and prove that each cross-section gives a different pie formula but the same measurement.] [This topic relates to the possibility of drawing non-straight Voronoi like decompositions [2].] [Predict collisions of spheres. Linear motion in instead of .] [Dynamic Delaunay triangulations [3].]

Bibliographic notes.

[1] J. Basch, L. J. Guibas and L. Zhang. Proximity problems on moving points. In “Proc. 13th Ann. Sympos. Comput. Geom., 1997”, 344–351.
[2] H. Edelsbrunner and E. A. Ramos. Inclusion-exclusion complexes for pseudodisk collections. Discrete Comput. Geom. 17 (1997), 287–306.
[3] M. A. Facello. Geometric techniques for molecular shape analysis. Ph. D. thesis, Report UIUCDCS-R-96-1967, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.

VIII.3 Rigidity

[Discuss the pebble algorithm that analyzes the rigidity of a graph in three dimensions.]

Bibliographic notes.

VIII.4 Shape Space

The main functionality of the Morfi software is that it can smoothly morph from one skin curve to another. In other words, it deforms the skin of one set of circles to the skin of another. The details of this deformation will be explained in Section VIII.4, where we discuss notions of similarity between two molecular skins. In this section, we merely illustrate the deformation and mention some of its features in passing. Figure VIII.1 shows the deformation of a skin curve defined by four circles into one defined by three circles. For each snapshot, we show the skin curve together with the dual complex. Recall that the homotopy types of the body and the dual complex are always the same, which implies that they change their type the same way and at the same time. For the complex we observe two types of changes caused by adding an edge or a triangle. The corresponding changes in the body are caused by creating a handle or filling a hole. There is a third type of change not seen in Figure VIII.1, which in the complex is caused by adding a vertex and in the body by creating a component. We note that any two contiguous bodies, except the last three in the sequence, differ by at least one change in homotopy type.

We note that these deformations are similar to but also different from the image morphs studied in computer graphics [3]. The goal there is photo realism, and possibly the most difficult problem towards achieving it is the construction of a one-to-one correspondence between features of the initial and the final images. The Morfi software creates a few-to-few correspondence through geometric considerations rather than working towards a one-to-one correspondence, which often does not exist. The Morfi software has been used in [2] to explain two-dimensional skin geometry and to illustrate its use in deforming two-dimensional shapes into each other.

Similar to two dimensions, we can deform skin surfaces into each other by continuously changing the defining spheres. A canonical such method is explained in [1]. That method can be used to mix $k$ skin surfaces and thus create a shape space that encompasses $k$-variate deformations. [Explain the mixing of two or more shapes as a generalization of 1-parametrized deformation.] The problems of (1) finding a good basis and (2) finding the best approximation within the spanned space are both difficult. They are similar to fundamental questions on function representation, which are probably discussed in the approximation theory literature.

Bibliographic notes.

[1] H.-L. Cheng, H. Edelsbrunner and P. Fu. Shape space from deformation. Comput. Geom. Theory Appl. 19 (2001), 191–204.
[2] S.-W. Cheng, H. Edelsbrunner, P. Fu and K. P. Lam. Design and analysis of planar shape deformation. Comput. Geom. Theory Appl. 19 (2001), 205–218.
[3] G. Wolberg. Recent advances in image morphing. In “Proc. Comput. Graphics Internat., 1996”, 64–71.

Figure VIII.1: Ten snapshots of a deformation with skin and dual complex displayed. The skin in the fifth snapshot is the same as in the figures above.

Figure VIII.2: From left to right and top to bottom: the shapes at successive times of the deformation. The sequence is defined by a set of seven spheres forming a question mark at the initial time and a set of eight spheres forming a human-like figure at the final time.

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n - 4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.

Chapter IX

Measures

There are various reasons why biologists want to measure the size of molecules. Volume is important in the calculation of free energy and in estimates of populations given a bound on the available space. Surface area is a resource consumed by molecular interactions and is probably even more relevant to research in structural biology than volume. This chapter will study three aspects of size: volume, surface area, and arc length for space-filling diagrams. Our general approach to measuring the size begins with indicator functions for convex polyhedra in $\mathbb{R}^d$. From these we will derive short inclusion-exclusion formulas for size measurements.

IX.1 Indicator functions
IX.2 Volume and surface area
IX.3 Void formulas
IX.4 Measuring Software
Exercises

IX.1 Indicator Functions

The Euler relation for convex polyhedra is a special case of the Euler-Poincaré theorem for complexes. There are elementary proofs for this special case, and this section presents one that is inductive.

Convex polyhedra. We study polyhedra in $d$-dimensional space, keeping in mind that $d = 4$ is the most important dimension since polyhedra in $\mathbb{R}^4$ relate to molecules in $\mathbb{R}^3$, as we will see later. A convex polyhedron is the intersection of finitely many closed half-spaces. It is either bounded or unbounded, and both cases are illustrated in Figure IX.1. In the first case, the polyhedron is the convex hull of finitely many points, and in the second, it extends to infinity.

Figure IX.1: A bounded convex polyhedron to the left and an unbounded one to the right.

Let $P$ be a convex polyhedron in $\mathbb{R}^d$ and assume it has non-empty interior. A hyperplane supports $P$ if it intersects the boundary but not the interior. A face of $P$ is the intersection with a supporting hyperplane. The boundary is decomposed into faces of various dimensions, which are usually prefixed for clarity. For example, $P$ is a $d$-face of itself and the facets are the $(d-1)$-faces. Let $f_i$ be the number of $i$-faces. The Euler characteristic of $P$ is the alternating sum of the face numbers. In the bounded case, the boundary is a $(d-1)$-dimensional topological sphere whose only non-zero Betti numbers are $\beta_0 = 1$ and $\beta_{d-1} = 1$. In the unbounded case, the boundary is an open $(d-1)$-dimensional topological ball whose only non-zero Betti number is $\beta_0 = 1$. Assuming general position, the dual of the boundary complex is a simplicial complex and the Euler-Poincaré Theorem stated in Section IV.3 implies the Euler relation for convex polyhedra:

\[ \sum_{i=0}^{d} (-1)^i f_i = \begin{cases} 1 & \text{if } P \text{ is bounded,} \\ 0 & \text{if } P \text{ is unbounded.} \end{cases} \]

Below we will construct indicator functions of $P$ from Euler characteristics of subcomplexes of the boundary complex. The Euler relation will follow from elementary proofs of properties of these indicator functions.

Inclusion-exclusion. Let $H$ be the finite collection of half-spaces such that $P = \bigcap H$. For a subset $A \subseteq H$ and a point $x$ we define

\[ \iota_A(x) = \begin{cases} 1 & \text{if } x \notin \bigcup A, \\ 0 & \text{otherwise.} \end{cases} \]

In words, $\iota_A(x) = 1$ iff $x$ lies outside every half-space in $A$. Note that $x$ is outside $P$ iff $\iota_A(x) = 1$ for at least one non-empty subset $A \subseteq H$. Namely if $x \notin P$ then it sees a facet from the outside and we have $\iota_A(x) = 1$ for the singleton set $A$ containing the half-space whose bounding hyperplane contains that facet. We form an alternating sum of the $\iota_A$ that leads to an indicator function for the convex polyhedron. The straightforward way of doing this is called the principle of inclusion-exclusion. Specifically, we define

\[ \Phi(x) = \sum_{A \subseteq H} (-1)^{\mathrm{card}\, A} \, \iota_A(x). \qquad \text{(IX.1)} \]

The sum ranges over all subsets of $H$, including the empty set for which $\iota_\emptyset(x) = 1$ for all points $x$. We show that the non-zero terms cancel unless there is only one non-zero contribution to the sum, which comes from the empty set. To see this define $H_x \subseteq H$ as the collection of half-spaces that do not contain $x$ and $k = \mathrm{card}\, H_x$. Note that $\iota_A(x) = 1$ iff $A \subseteq H_x$. Hence $\Phi(x) = \sum_{A \subseteq H_x} (-1)^{\mathrm{card}\, A}$, which is the alternating sum of subsets of $H_x$. This sum is

\[ \sum_{j=0}^{k} (-1)^j \binom{k}{j} = \begin{cases} 1 & \text{if } k = 0, \\ 0 & \text{if } k \geq 1. \end{cases} \]

For $x \in P$ we get $k = 0$ and $\Phi$ is an indicator function for $P$.

Truncation. Most of the terms in the exponentially long formula (IX.1) are redundant and can be removed. Particularly, we only keep the terms that correspond to faces of $P$. Each face is the intersection of the polyhedron with a subset of the hyperplanes bounding half-spaces in $H$.
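The inclusion-exclusion formula (IX.1) can be checked numerically on a small example. The sketch below assumes a half-space is given as a pair `(a, c)` standing for all points $y$ with $a \cdot y \leq c$, and it evaluates the alternating sum over all subsets of the half-spaces, where a subset contributes only if the point lies outside every one of its half-spaces. The names `iota`, `inclusion_exclusion` and `SQUARE` are illustrative choices, not notation from the text.

```python
from itertools import combinations


def iota(point, subset):
    """Return 1 if the point lies outside every half-space in subset, else 0.

    A half-space is a pair (a, c) and stands for {y : a . y <= c}.
    For the empty subset the condition holds vacuously, so the value is 1.
    """
    for a, c in subset:
        if sum(ai * xi for ai, xi in zip(a, point)) <= c:
            return 0
    return 1


def inclusion_exclusion(point, halfspaces):
    """Alternating sum of iota over all subsets of the given half-spaces."""
    total = 0
    for size in range(len(halfspaces) + 1):
        for subset in combinations(halfspaces, size):
            total += (-1) ** size * iota(point, subset)
    return total


# Unit square as the intersection of four half-planes (illustrative data):
# x <= 1, -x <= 0, y <= 1, -y <= 0.
SQUARE = [((1, 0), 1), ((-1, 0), 0), ((0, 1), 1), ((0, -1), 0)]
```

Evaluating `inclusion_exclusion(p, SQUARE)` for a point `p` inside the square leaves only the term of the empty subset, while for a point outside the terms cancel in pairs. The number of terms grows exponentially with the number of half-spaces, which is what the truncation to faces is meant to avoid.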

and the ones contained in . The Pie Theorem A implies the Euler relation for unbounded polyhedra. The convex polyhedron is obtained by removing the constraint .  The introduced systems partition . . the faces on the silhouette are not visible. We have for all .1) to the system is (IX. Define sets of half-spaces and . The second term vanishes because all sets in contain . The faces of are defined by sets in . ¡¡ © ¦ )   ¡ §¡ ¦  ¦  §¡  ¢¢  ¦ ¡¡ ¦  &¡ © and &  ¢¢   ¡ " ¦ ¦ )¡ )   "¢ ¢¢   ¡ "  &  " ¢¢ E£ ¡  ¦ ¡¡ ¦ ¦ ¦ ¦ ¡ £ £ © © © ©  ¡¡  ¡ ¡ ¡ ¦ ) ¡ ¡ ) ¡ ¡ ) ¡ ¡     ¡ ¡¡ E£ ¡ ¦   ¦ ¦   ¡  ¦ ¡ ) ¦       ¡¡ ¡ ¡ ¡ ¡ ¦ ¦   © ¡ ¦ ¡ ¦   " ) ¡    © " ¦         © ¦ ©  $ "  $   ¢  " ¢   ¢¡¡   $ F ¢ ¥¤   © © © ¡ ¦ © £ ¡ ¢ ¡      ¡ ¦ ¡     ¡ ¡)    ¦ ¥  ¡    ) ¦        ¢  ¦ ©    ¢  ¡         ¦ ©   ¦   "        ¡ ¦ ¦       ) ¦      ¦     ¡¡ ¡   ¦   ¡¦ $  ) ¨  ) ¥                  ones crossing the hyperplane shared by and . as shown in Figure IX. By assumption. The corresponding systems are and . which is a half-space that contains . and therefore . as in Figure IX. Assume . Notice that according to this definition. where . the ones contained in . We argue that all three terms on the right side of the equation for vanish. The induction hypothesis thus applies. in which case and . The restriction of the inclusion-exclusion formula (IX. Both and have one less half-space not containing than does. . it is still an indicator function of . The corresponding systems form the partition . We use induction over the cardinality of the set . and define as the closed complement of . . and . The third term vanishes because iff .2: The half-spaces and share the hyperplane and are complementary to each other. Therefore because the values cancel pairwise. The union of and is . and rewrite the formula in the Pie Theorem  ¡   ¡¡ ¦   $  £    © ¡  $  ¦   ¤ ¥  ¤ ¦ ¥ ©    ¤ ¥  ¦    ¦ ¡    ¡¡      ©  $ ¢¦  ¡      ¡¡ ¤ ¦ ¢¦ ¦ ¡  ¦¡ ¦   ¡    ¡   §¦ ¦ £ ¡     P ROOF. 
It is convenient to assume general position. To see this. . Let be the system of subsets that define non-empty faces. . which we consider an imFor proper face but still a face of . By assumption of general    _ g g Unbounded convex polyhedra.2. which is again defined as the collection of half-spaces that do not contain . we fix a point outside all half-spaces in . ¤   ¤     ¢ ¦  ¤ ¢ ¦   ¤¤ ¢ ¦  ¤ ¥§¦ ¥ ¨§¦    ¡¦  ¤ ¦ ¤  ¤¤ ¢ ¦  ¤ ¢¡¦   ¤ ¥  ¦ ¡  ¦ ¢ ¦¦  ¤¤ ¢ ¦¦  ¤¤ ¡¦¦   ¥    ¦ ¦  ¢  ¦   ¤  ¦ ¤ ¡ ¦ We claim that even though is much shorter than .3.IX. where  ¡ ¡¡ ¡ ¡     ¦   ¦  ¦ ¦ . . This claim is sufficiently important to warrant a complete proof. as required. the A in terms of face numbers ¢  Figure IX.2) © 129 ¦ Note that . . which implies that iff and therefore . ¡ Figure IX. let . and the faces of are defined by sets in . We distinguish £  ¢   ¡ ¡¡ ¦ ¡¡ ¡¡ ¦ £  ¦     £  ) £ ¦   ¡ ¡ ¦ )  ¡     if if   and hence . where ¡ P IE T HEOREM A. which in this context means that there are no two subsets of that define the same face.1 Indicator Functions we get . We can therefore write their values as sums of values of the subsystems.  ¦   P ’’ P y ¦ ¡ three types of faces of . Consider visible from if sees all facets around from outside . The basis of the induction is covered by . Then iff is visible from .3: The point lies in the intersection of the complements of the half-spaces. For sets there is an intuitive interpretation of .

130 with cardinality position. As illustrated in Figure IX. We construct a convex polyhedron that contains and approximates in the sense that . On the right side it is counted times. We first weaken the theorem by restricting the points to lie within a convex body . this time for a bounded convex polyhedron . and the same edges and vertex intersect the interior of . that of is solid.5: The boundary of is dotted. We need a slightly stronger version of the Pie Theorem A to prove the Euler relation for bounded convex polyhedra. is an indicator function for . ¦     Restricting body. this par- ¤ £    £ ¡ ¦  £   ¥    £ )¥ ¡    ¡¦ ¤ ¦ §£ ¦  ¥   ¤ )  Y z  . Let be the number of -faces in the silhouette. and then strengthen it by further reducing the set system. ¢       P IE T HEOREM B. For we have   ¥  ¡     ¡   ¥  ¡  ¡    ¦  ¡ ¦   ¥    ¡   ¥  ¡  ¡  ¦  By choice of . Let be the number of -faces of that have nonempty intersection with the interior of . and the silhouette is indicated by the two hollow vertices. By the choice of . is the number of sets . ¦ Z y Figure IX. Observe that this sum counts the -face the same number of times on both sides. We choose a line not parallel to any face of and points and sufficiently far in opposite directions on the line. The system contains exactly all sets for which . .4: Three edges and one vertex of intersects the interior of . Define and let be the corresponding sum of values. Hence if . Hence for all points and therefore also for all points .5. ¢   ¦ ¦     ¦   ¢    ¥  ¤      ¤ ¡ ¡ ¤¤ ¡         ¥ ¦ ¦  © ¡ $ "¢ ) ¡ ¡ ¤ $¡ ) h © ¢¡¡ ¨ ©       © ¦      $       ¡¦ ¡   ¡    ¦ ¡   ¢¦     ¥    ¦        ¤   ¤    £  £   ¡     ¡¦     ¤ ¤ ¤ ¦   ¨ ¤   ¤   ¤ ¢ Bounded convex polyhedra. Define   ¥ ¢     ¥ ¦      £   £     )   £ ¤¦ ¦ ¥ ¡ £ £¢¦ £ ¢¦ ¡ ¤     if if   titions into the set of half-spaces that do not contain and the set of half-spaces that do not contain . 
Each proper face of either belongs to or to or to the silhouette as seen in a view parallel to the chosen line. ¤   £  ¢  ¦ ¡ £    £¡   ¦ ) ¡  ¡  ¤ ¤   ¡ ¡     if if        ¢   ¥  ¡ ¦          ¥  ¡ ¦  and use the Pie Theorem A to get ¦ ¦   ¥     ¡ ¦ ¡ ¥ ¦ ¢  ¦   ¦   ¥  Figure IX. as in Figure IX. We get  ¡      ¢ ¦     ¥  ¡   ¦ PA      ¦   ¥  ¦  ¤ A    ¤   ¢    ¦    ¦        P     ¢  ¥  ¡ ¦   ¥     ¦ ¡ P ROOF. We return to the computation of the Euler characteristic. We can now argue inductively that the Euler characteristic of is . we have for all and therefore . using the respective other convex polyhedron as the restricting convex body . We show that for points . This implies the Euler relation for unbounded convex polyhedra. which establishes the induction basis. is a closed interval with . by the Pie Theorem B. The projection of the silhouette onto a hyperplane normal to the line is a bounded convex polyhedron of dimension .4. and define symmetrically. IX M EASURES For . every point is contained in all half-spaces of . same as on the left side. Furthermore.
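As a sanity check of the Euler relation in both cases, one can count the faces of two simple polyhedra whose face numbers are known in closed form. The sketch assumes the convention that the polyhedron counts as its own $d$-face, so the alternating sum of face numbers should be 1 for the bounded cube and 0 for the unbounded orthant. The function names are illustrative.

```python
from math import comb


def cube_face_numbers(d):
    """Face numbers f_0, ..., f_d of the d-dimensional cube.

    An i-face is obtained by choosing i free coordinates and fixing
    each of the remaining d - i coordinates at 0 or at 1.
    """
    return [comb(d, i) * 2 ** (d - i) for i in range(d + 1)]


def orthant_face_numbers(d):
    """Face numbers f_0, ..., f_d of the non-negative orthant in d dimensions.

    An i-face is obtained by choosing i free coordinates and fixing
    the remaining d - i coordinates at 0.
    """
    return [comb(d, i) for i in range(d + 1)]


def alternating_sum(face_numbers):
    """Sum of (-1)^i * f_i over all dimensions i."""
    return sum((-1) ** i * f for i, f in enumerate(face_numbers))
```

For example, `cube_face_numbers(3)` lists 8 vertices, 12 edges, 6 facets and the cube itself.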

by induction hypothesis. Adding the alternating sums of the , and implies , as required.

Bibliographic notes. Most of the material in this section is taken from [2], where the inclusion-exclusion approach to measuring the union of balls is laid out. As demonstrated, this principle also yields the Euler relation for convex polyhedra. The discovery of that relation for convex polyhedra in three dimensions is usually attributed to Leonhard Euler [3, 4], although there is evidence that René Descartes knew about it a century earlier. There are many proofs of that relation, and the historically first one for the general $d$-dimensional case goes back to the work of Ludwig Schläfli [7] in the middle of the nineteenth century. He implicitly assumes that the boundary complex of every convex polyhedron is shellable, which was not established until 1972 by Bruggesser and Mani [1], who thus filled the gap left in Schläfli’s proof. We note that all authors of papers referenced in this section are Swiss, except for one who has a Swiss grandmother. Indeed, finding elementary proofs of the Euler relation for convex polyhedra seems to be a favorite topic for Swiss mathematicians [5, 6].

[1] H. Bruggesser and P. Mani. Shellable decompositions of cells and spheres. Math. Scand. 29 (1972), 197–205.
[2] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[3] L. Euler. Elementa doctrinae solidorum. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 109–140.
[4] L. Euler. Demonstratio nonnullarum insignium proprietatum, quibus solida hedris planis inclusa sunt praedita. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 140–160.
[5] H. Hadwiger. Eulers Charakteristik und kombinatorische Geometrie. J. Reine Angew. Math. 194 (1955), 101–110.
[6] W. Nef. Zur Einführung der Eulerschen Charakteristik. Monatsh. Math. 92 (1981), 41–46.
[7] L. Schläfli. Theorie der vielfachen Kontinuität. Written 1850–52 and published in Denkschrift der Schweizerischen naturforschenden Gesellschaft 38 (1901), 1–237.

©   B     ¦ B ¢ ¥   ¢ ¡ ¢   ¦ B  ¥¡ ¢ ¡ ¥ ¢   ¡§ ¡         ¢ ¡ ¢   ¦ ¡ ©B ¥ ¡ $  ¡   1¥ ¡ ¦ ¡ 0 ¦    ¡   ¤ ¤¤ Figure IX. The half-space lies on the side of its to . Recall that is the unit sphere centered at the origin . and total arc length of a space-filling diagram. Let be a set of three half-spaces whose bounding planes pass through 0. Instead of computing the volume of directly. Let be the collection of half-spaces that contain the north-pole. the indicator function of a geometric set is 1 inside and 0 outside the set. ¢ ¡   which implies that the area of the spherical triangle is . Let be the system of subsets of that appears in the statement of the Pie Theorem B in the last section. Let be the unit 3-sphere with center at the origin and identify with the hyperplane . The stereographic projection maps a point to the point collinear with and . and the intersection with the ball bounded by is a pyramid whose base is a spherical triangle. subtracting three half-balls. We can therefore compute its volume by integration. and be the dihedral angles between the planes. We transform the question into one about half-spaces in . Consider for example a bounded convex body and a convex polyhedron . we compute the volume of   ¥ ¢     ¦ The area of the spherical triangle is three times the volume divided by the radius of the sphere. For measuring molecules. The half-spaces intersect in an unbounded triangular cone.7: Stereographic projection from hyperplane that does not contain the north-pole.132 IX M EASURES IX. Then is the stereographic projection of the portion of that is not contained in the interior of . By definition. We now turn to the problem of measuring the union of a finite set of balls in . so does contain . This is illustrated in Figure IX. . the angles of the spherical triangle. Call the north-pole of . the above formula gives a proof of the area formula for spherical triangles. which is the intersection of the 3-sphere with a half-space . 
in which the volume is a sum of terms each involving four or fewer half-spaces. The volume of the pyramid can now be computed by taking the ball. Stereographic projection. the sets contain or fewer half-spaces each. we get a cap of . we are mostly interested in the case . Let . Union of balls.2 Volume and Surface Area In this section. or equivalently. The volume of the intersection of the two convex bodies is N Figure IX. Volume by integration. If applied to all points of a ball in .6. as shown in Figure IX. and subtracting the reflected pyramid. It follows that the volume is  ¢ £  ¢ ¤  where is the closed complement of the half-space . adding three sectors.1 to derive inclusion-exclusion formulas for the volume. area.7. The map is bijective and therefore has an inverse. ¢        ¢ ¥  ¢¦ ¡ £  £    ¡ ¡¢     ¢   ¦ ¡   ¦ §    ¥   ¢ ¢   In dimensions. That radius is one.    ¢ ¡      ¢   ¢   ¤ ©    © ©  § ¡ ¦   ¡    © ¥  © ¤ ¡  $ % 0  $ $ F ¢ ¥¤   ¤  ¤ £ £ ©  ¢ $ © ¢  ©    © F¢¤ ¥    © ©   ¢    F¢¤ 0 $  0  ¥    ¥         ¦ ¦   ¥    ¥  © ©       ©    © ¦ £ £ £ £        ¢ © © ¦      © © ¡  ¤ ¢¢ ¥          ¢ a© ¥ ¢ ¥ a©       ¢¢ ¦    . we use the indicator functions developed in Section IX. . Assuming general position.6: A pyramid cut out of a ball by three half-spaces.
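The pyramid computation described under Volume by integration (take the ball, subtract three half-balls, add three sectors, subtract the reflected pyramid) can be written out as a short sketch. It assumes the three half-spaces are given by inward normals $n_i$, meaning all points $x$ with $n_i \cdot x \geq 0$, that the normals are linearly independent, and that the ball has radius one. The function names are illustrative. Since the reflected pyramid has the same volume $V$ as the pyramid, the relation reads $V = \text{ball} - 3\,\text{half-balls} + \text{sectors} - V$ and is solved for $V$.

```python
import math


def dihedral_angle(n1, n2):
    """Interior dihedral angle of the wedge {x : n1 . x >= 0 and n2 . x >= 0}."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm1 = math.sqrt(sum(a * a for a in n1))
    norm2 = math.sqrt(sum(b * b for b in n2))
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    # the wedge angle is the supplement of the angle between the inward normals
    return math.pi - math.acos(cosine)


def pyramid_volume(n1, n2, n3):
    """Volume of the unit ball intersected with three half-spaces through the origin."""
    ball = 4.0 * math.pi / 3.0
    half_ball = ball / 2.0
    angles = [dihedral_angle(n1, n2), dihedral_angle(n1, n3), dihedral_angle(n2, n3)]
    # a sector is the ball intersected with two complementary half-spaces;
    # its wedge has the same dihedral angle as the two original half-spaces
    sectors = sum(ball * angle / (2.0 * math.pi) for angle in angles)
    # V = ball - 3 half-balls + sectors - V, solved for V
    return 0.5 * (ball - 3.0 * half_ball + sectors)


def spherical_triangle_area(n1, n2, n3):
    """Area of the base of the pyramid: three times the volume for radius one."""
    return 3.0 * pyramid_volume(n1, n2, n3)
```

For the positive octant, with the three coordinate directions as normals, the sketch returns one eighth of the ball volume and one eighth of the sphere area, in agreement with the angle formula for spherical triangles.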

We observe that the index system in the Pie Volume Formula is an abstraction of the dual complex of . plus the sum of two triple-wise intersection areas. we explain the connection in geometric pictures. which contains the north-pole in its interior. except that the summation is done over all circles that are intersections of two   § §¦¡ § £ £ F  ¢ ©       ¥    ©  ©       ! §   §¦¡ §  ¦ §¡ ¢ ¢ Start with and embedded in as suggested in Figure IX. P IE VOLUME F ORMULA .IX. the area of is the area of minus the alternating sum of the areas of cap intersections. But this is also the condition for the projection of to have non-empty intersection with the interior of . Instead of the system of half-spaces we now use a system of balls obtained by substituting for . we get a Pie Area Formula for the surface area of . the volume formula becomes an area formula.      ¢     ¦ ¤    ¢   ¢  § a© £ £ F    ©    ©     ¢ ¢ ©    ¢ ©  § ¡ ¢    a© £ £ F ¢ ¡©   ¢        © $ ££ F   $ F ¢ ¥¤   © © $ © ¢ ¡ ¦ ¡ ¢ £   ¢   §¢   ¥   ¢         ¥  ¥  © © ¤ ££     ¡     i¡ ¡    ¦   ¡ ¢   ©  © F   ¤  © ¦ © ¤ ¡    ¥     ¥  ¦ ¥   ¥   ©   ©    ¢ ¡       ¤ ¤ ¢   £¡ ¢ ¡   ¢ ¡   ©   ©    © © © © ©  £ £ £   ¤ ¢ ¤        ¦ ¦   ! ©   ¦ © ¢    ¥ "   ¢             ¢ © "     ¢   ¤ ¡ ¤¡ ¦ ¦  ¢ ¤ ©   ¤ ¢ ¡ ¤  . For each ball we get a half-space . For convenience. The proof of the formula is similar to the one for area. A more straightforward derivation of a formula for the ball union translates the inclusion-exclusion formula from to . and the intersection of the half-spaces is a convex polyhedron . Figure IX. . Since the caps are two-dimensional. we can get a Pie Length Formula that measures the total length of the circular arcs in the boundary of the union of balls. Use to project the boundary complex of to . We have arrived at a simple interpretation of the Pie Volume Formula: construct the dual ¤ ¥    ¥  £ ¡ ¦£  £ £ F ¤   ¥  ¥   ¤ ©   ¦       ¤ Dual complex. Area and length. By summing over all balls. 
namely for the system of balls and for a generic set in . This is the weighted Voronoi diagram of . For each set of caps in the system . The volume of the union of a finite set of balls is  Similarly.  The sets with one or no half-space are redundant because in these cases.    ¢   §   £ ¡ ¥ £ © ££ F   ¢   ¥  ¢ ¡     ©  ©        !  £ ¡ ¥ £ We could now get a formula for by scaling the volume by the distortion factor of . revisited. Instead of proving this algebraically. This is illustrated in Figure IX. a non-empty set of half-spaces is in iff the corresponding set of balls defines a simplex in the dual complex.8. where is the abstraction of the dual complex of . A subset belongs to iff its corresponding face of has non-empty intersection with the ball bounded by . Let be the 4-ball bounded by and the system of subsets of that appears in the Pie Theorem B.7. Letting be the sphere and the set of caps. we use the same notation. For a single sphere.2 Volume and Surface Area . The volume of the portion of outside the polyhedron is 133 complex of and do inclusion-exclusion with a term for every simplex in the dual complex. we use the Pie Volume Formula on the set of caps defined by intersecting balls. we have the corresponding set of balls together with the ball of in the system of . For we get and therefore a zero contribution to the area. Similar to volume. Hence. To prove this formula. we add the contributions of individual spheres.8: The area of the union is the sum of eight disk areas minus the sum of nine pairwise intersection areas. we get the Pie Area Formula given above.
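The correspondence between balls in three dimensions and half-spaces in four dimensions can be made concrete with a small sketch. It assumes the three-dimensional space is identified with the hyperplane $x_4 = 0$, the north-pole is $N = (0, 0, 0, 1)$, and the sphere is the unit 3-sphere centered at the origin. These coordinates and the function names are assumptions made for illustration. Under them, a point $x$ lies in the ball with center $z$ and radius $r$ iff its image $y$ under the inverse stereographic projection satisfies $a \cdot y \geq c$ with $a = (2z, w - 1)$, $c = w + 1$ and $w = \|z\|^2 - r^2$.

```python
def inverse_stereographic(x):
    """Map a point x of 3-space (hyperplane x4 = 0) to the unit 3-sphere.

    The image is the second point in which the line through x and the
    north-pole N = (0, 0, 0, 1) meets the sphere.
    """
    s = sum(xi * xi for xi in x)
    return tuple(2.0 * xi / (s + 1.0) for xi in x) + ((s - 1.0) / (s + 1.0),)


def ball_to_halfspace(center, radius):
    """Return (a, c) so that the cap of the ball is the sphere intersected
    with the half-space {y : a . y >= c}."""
    w = sum(z * z for z in center) - radius ** 2
    a = tuple(2.0 * z for z in center) + (w - 1.0,)
    c = w + 1.0
    return a, c


def in_halfspace(y, halfspace):
    """Test whether the point y lies in the half-space (a, c)."""
    a, c = halfspace
    return sum(ai * yi for ai, yi in zip(a, y)) >= c
```

The north-pole itself never satisfies the inequality, since $w - 1 < w + 1$, which matches the statement that the half-space lies on the side of its hyperplane that does not contain the north-pole.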

spheres forming a pair in the dual complex. For each such circle, we apply the (one-dimensional) Pie Volume Formula and thus get an expression whose terms correspond to the simplices in the star of the pair.

We might even go one step further and consider the number of vertices of the union of balls. The inclusion-exclusion formula suggests that this number is the alternating sum of vertex numbers of common intersections of balls. For two or fewer balls we have no vertices. For each triple in the dual complex we have a three-sided spindle with two vertices, and for each quadruple we have a rounded tetrahedron with four vertices. It follows that in the generic case, the number of vertices of the union is twice the number of triangles minus four times the number of tetrahedra in the dual complex.

Bibliographic notes. In 1992, Naiman and Wynn proved that the volume of a finite union of congruent balls can be expressed by an inclusion-exclusion formula whose terms correspond to the simplices in the Delaunay triangulation of the centers [4]. Edelsbrunner generalized the formula to allow for different size balls and strengthened it by using the dual complex as the index system [1]. The material in this section is taken from that paper. The proof of the volume formula uses the inverse of the stereographic projection to transform balls in $\mathbb{R}^3$ to half-spaces in $\mathbb{R}^4$. That projection is conformal (preserves angles) and has a number of other nice properties, many of which can be found in the book by Thurston [5]. Just as a union of balls in $\mathbb{R}^3$ corresponds to a convex polyhedron in $\mathbb{R}^4$, a union of intersections of balls corresponds to a union of intersections of half-spaces. The latter is Hadwiger’s notion of a not necessarily convex polyhedron [3]. Inclusion-exclusion formulas for such polyhedra can be found in [2].

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. Edelsbrunner. Algebraic decomposition of non-convex polyhedra. In “Proc. 36th Ann. IEEE Sympos. Found. Comput. Sci., 1995”, 248–257.
[3] H. Hadwiger. Vorlesungen über Inhalt, Oberfläche und Isoperimetrie. Springer, Berlin, 1957.
[4] D. Q. Naiman and H. P. Wynn. Inclusion-exclusion Bonferroni identities and inequalities for discrete tube-like problems via Euler characteristics. Ann. Statist. 20 (1992), 43–76.
[5] W. P. Thurston. Three-Dimensional Geometry and Topology, Volume 1. Edited by S. Levy, Princeton Univ. Press, New Jersey, 1997.

Figure IX. In we refer to the two  vertices . If we change the meaning from area to perimeter we get . We use similar conventions for triangles. or both points. ¡R R R R ¡ R  ¡   R  ¡ R  $  ¡  s ¡ R ¡ R  R  ¡ R   ¡   R  ¡ #R  $  ¡  s   ¥ 0     R ¨ ¢ © ¡ Angles of revolution. For example. we define the angle as the fraction of directions around along which we enter . and vertices. The left drawing suggests that the area of the triangle is .10 indicates that there are cases where the formulas are not as obvious as to the left. but the right drawing in Figure IX. and .9: The solid angle at a vertex.IX. and the zero-dimensional angle of a triangle. The new collection leads to formulas for voids. To simplify the notation.  ¢       ¢       IX. for the area of the intersection of the disks with centers and . we also define the angles of the improper faces of as and . The only zero-dimensional angles are therefore 0. we let denote an independent set of four balls and. .3 Void Formulas 135 . which are bounded components of the space outside the union. . and arc length of a union of balls in . the 0-sphere is a pair of point with possible subsets the empty set. Let . Similar to the two-dimensional case.10: Both triangles are spanned by the centers of three independent disks. edges. and we will see shortly that this convention makes perfect sense when we compute volume using angles. is the volume fraction of a sufficiently small ball centered at an interior point of that lies inside the tetrahedron. and the one-dimensional angle at an edge as a dihedral angle. surface area.3 Void Formulas  ¥ ¤ ¨ ¢ £¡ c b ¢ ¡  ¨     ¥ ¡ ¦ ¤   © ¦¦ ¤ ©               ¤ ¨  ¨ ¤ . A two-dimensional angle is the area of a piece of the unit 2-sphere and can assume any value between 0 and . The volume of an independent tetrahedron is ¥  ¥    ¥      ¥  Independent triangles and tetrahedra. a single point. the tetrahedron spanned by the four ball centers. This definition can be used in any dimension . 
and 1. there is a point inside every disk ery subset in the subset and outside every disk not in the subset. and so on. a b a Figure IX. and be the angles at the c We generalize the formulas for independent triangles to independent tetrahedra. Both formulas hold whenever the three disks are independent. A (one-dimensional) angle is by definition the length of a unit circle arc and can assume any value between 0 and . at the same time. Recall that a collection of three disks in is independent if for ev  § ¦ ¡ § ¤ ¥ ££ F  ¥  ! £ § ¡ ¥ £ ¤ ¥ £ £ F ¢       ¥       ¥      dimensional angle at a vertex as a solid angle.9 illustrates the definition. Equivalently.10. It is convenient to normalize so that in both cases the full angle is 1 and every angle is a fraction of the full angle. The zero-dimensional angle of a triangle is always . . the dihedral angle at an edge.     ¥ Figure IX. For convenience. This condition is equivalent to the three circles decomposing into eight regions in the way shown in Figure IX. we drop the distinction between abstract and geometric simplices.  The proof of the formula is somewhat technical and omitted.   § a© ¢   ¤ ¥   ££ ¥ F   ¥  ¨ 0    ¨    Consider for example a tetrahedron . I NDEPENDENT VOLUME F ORMULA . $ 0    ¡ ¥ ¥ This section derives another collection of inclusionexclusion formulas that express the volume. where we write for the area of the disk with center . For each face . . Specifically. we get sums that evaluate to zero if we replace volume by area or length.
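The normalized angles of a tetrahedron can be computed directly from its vertex coordinates. The sketch below is one possible way to do this and is not taken from the text: the solid angle at a vertex uses the Van Oosterom-Strackee arctangent formula, and the dihedral angle at an edge is measured between the projections of the two remaining vertices onto the plane normal to the edge. Both are returned as fractions of the full angle, so that the full angle is 1 in each case. The function names are illustrative.

```python
import math


def _sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _norm(u):
    return math.sqrt(_dot(u, u))


def solid_angle_fraction(apex, a, b, c):
    """Solid angle of the tetrahedron at apex, as a fraction of the full sphere."""
    u, v, w = _sub(a, apex), _sub(b, apex), _sub(c, apex)
    nu, nv, nw = _norm(u), _norm(v), _norm(w)
    numer = abs(_dot(u, _cross(v, w)))
    denom = nu * nv * nw + _dot(u, v) * nw + _dot(u, w) * nv + _dot(v, w) * nu
    return 2.0 * math.atan2(numer, denom) / (4.0 * math.pi)


def dihedral_fraction(p, q, r, s):
    """Dihedral angle at the edge pq of the tetrahedron pqrs, as a fraction of the full circle."""
    e = _sub(q, p)

    def normal_part(x):
        # component of x - p orthogonal to the edge direction e
        d = _sub(x, p)
        t = _dot(d, e) / _dot(e, e)
        return tuple(di - t * ei for di, ei in zip(d, e))

    a, b = normal_part(r), normal_part(s)
    cosine = max(-1.0, min(1.0, _dot(a, b) / (_norm(a) * _norm(b))))
    return math.acos(cosine) / (2.0 * math.pi)
```

A useful check is the classical relation for tetrahedra that the six normalized dihedral angles minus the four normalized solid angles sum to exactly 1. For the corner tetrahedron of the unit cube, the solid angle at the origin is one eighth and each dihedral angle along a coordinate axis is one quarter.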

Angle weights. We derive a new volume formula for a union of balls by combining the Pie Volume and the Independent Volume Formulas. We first make the Pie Volume Formula more complicated and then simplify by cancelling terms. We need some notation to continue. It is convenient to cover the portion of $\mathbb{R}^3$ outside the Delaunay triangulation with tetrahedra. This can be done by adding four points viewed as degenerate balls to the set $B$. Let $T(C)$ denote the set of tetrahedra in a simplicial complex $C$. Furthermore, for a subcomplex $K \subseteq D$, let $P(K)$ denote the collection of pairs $(\sigma, \tau)$ with $\sigma \in K$ and $\sigma \subseteq \tau \in T(D)$. We write $\varphi_{\sigma,\tau}$ for the angle of $\tau$ at its face $\sigma$. With this notation we can rewrite the Pie Volume Formula as

$$\mathrm{vol} \bigcup B = \sum_{(\sigma,\tau) \in P(K)} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma ,$$

where $K$ is the dual complex of $B$. For example for a tetrahedron $\sigma$, the only coface in $T(D)$ is $\sigma$ itself, the angle is 1, and the contributed term is $- \mathrm{vol} \bigcap \sigma$, as before. For triangles, edges, and vertices $\sigma$, we take the full angle around $\sigma$ and decompose it into the parts defined by the tetrahedra that contain $\sigma$ as a face; the contribution is split up into as many pieces as there are angles around $\sigma$. Whenever $\tau$ is a tetrahedron in $K$, we use the Independent Volume Formula to make a substitution. This results in the new volume formula.

ANGLE-WEIGHTED PIE VOLUME FORMULA. The volume of the union of a finite set of balls $B$ in $\mathbb{R}^3$ is

$$\mathrm{vol} \bigcup B = \sum_{\tau \in T(K)} \mathrm{vol}\, \tau \; + \sum_{(\sigma,\tau) \in P(K),\ \tau \notin K} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma ,$$

where $D$ is the Delaunay triangulation and $K$ the dual complex of $B$.

The new formula suggests we compute volume in two steps. First we compute the volume of the underlying space of $K$ itself, and second we add the volume of the fringe. Observe that not all pieces considered in the second sum are subsets of the fringe; some might reach into the interior of $|K|$. Nevertheless, the second sum is exactly the volume of the fringe. We get the same formulas for area and length, except that the first sum vanishes:

$$\mathrm{area} \bigcup B = \sum_{(\sigma,\tau) \in P(K),\ \tau \notin K} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{area} \bigcap \sigma , \qquad \mathrm{length} \bigcup B = \sum_{(\sigma,\tau) \in P(K),\ \tau \notin K} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{length} \bigcap \sigma .$$

Voids. As defined earlier, a void of a union of balls is a bounded component of the complement space. Figure IX.11 illustrates the fact that every void of $\bigcup B$ is contained in a void of $|K|$. The corresponding void in $|K|$ is triangulated by a subset $U$ of the Delaunay triangulation. Strictly speaking, $U$ is not a triangulation because it is not even a complex, missing the simplices that bound the void in $K$. From a point inside the void, the union of balls looks a lot like from a point outside all balls and voids. It is therefore not surprising that we can rewrite the Angle-weighted Pie Volume Formula to get an expression for the volume of a void of $\bigcup B$. The most straightforward translation of the angle-weighted formula suggests we compute the volume of the void by first computing the volume of the corresponding void in $|K|$ and then subtracting the volume of the fringe that reaches into that void. We get

VOID VOLUME FORMULA. The volume of a void $V$ with dual set $U$ is

$$\mathrm{vol}\, V = \sum_{\tau \in T(U)} \mathrm{vol}\, \tau \; - \sum_{(\sigma,\tau) \in P(K),\ \tau \in T(U)} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{vol} \bigcap \sigma ,$$

where $T(U)$ denotes the set of tetrahedra in $U$.

Figure IX.11: Each of the two voids in the union of disks is contained in a corresponding void of the dual complex.
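The normalized angles that serve as weights can be computed directly from vertex coordinates. The following sketch is an illustration rather than code from the Alpha Shapes software: the function names are hypothetical, the solid angle uses the standard arctangent formula for the solid angle of a triangle seen from a point, and the dihedral angle is measured between the two faces after projecting along the edge. As a sanity check it evaluates the alternating sum of all angles of a tetrahedron (solid angles, minus dihedral angles, plus 1/2 for each of the four triangles, minus 1 for the tetrahedron itself), which should vanish.

```python
import math
from itertools import combinations


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def solid_angle_fraction(p, a, b, c):
    """Solid angle at vertex p of tetrahedron p, a, b, c as a fraction of the full sphere."""
    u, v, w = sub(a, p), sub(b, p), sub(c, p)
    lu, lv, lw = norm(u), norm(v), norm(w)
    numer = abs(dot(u, cross(v, w)))
    denom = lu * lv * lw + dot(u, v) * lw + dot(u, w) * lv + dot(v, w) * lu
    return 2.0 * math.atan2(numer, denom) / (4.0 * math.pi)


def dihedral_fraction(p, q, r, s):
    """Dihedral angle at edge pq of tetrahedron p, q, r, s as a fraction of the full circle."""
    e = sub(q, p)

    def perpendicular_part(x):
        d = sub(x, p)
        t = dot(d, e) / dot(e, e)
        return tuple(di - t * ei for di, ei in zip(d, e))

    u, v = perpendicular_part(r), perpendicular_part(s)
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cosine))) / (2.0 * math.pi)


def alternating_angle_sum(tet):
    """Solid angles minus dihedral angles plus triangle angles minus 1."""
    solid = 0.0
    for i in range(4):
        others = [tet[j] for j in range(4) if j != i]
        solid += solid_angle_fraction(tet[i], *others)
    dihedral = 0.0
    for i, j in combinations(range(4), 2):
        k, l = [m for m in range(4) if m not in (i, j)]
        dihedral += dihedral_fraction(tet[i], tet[j], tet[k], tet[l])
    return solid - dihedral + 4 * 0.5 - 1.0


if __name__ == "__main__":
    corner = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print(alternating_angle_sum(corner))
```

For the corner tetrahedron used here the solid angle at the origin is one octant, that is 1/8 of the full angle, which gives a second quick check on the normalization.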

Similarly, we get formulas for the area and the total arc length of $V$ by substituting $T(U)$ for $T(D) - T(K)$ in the corresponding formulas of $\bigcup B$:

$$\mathrm{area}\, V = \sum_{(\sigma,\tau) \in P(K),\ \tau \in T(U)} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{area} \bigcap \sigma , \qquad \mathrm{length}\, V = \sum_{(\sigma,\tau) \in P(K),\ \tau \in T(U)} (-1)^{\dim \sigma} \, \varphi_{\sigma,\tau} \, \mathrm{length} \bigcap \sigma .$$

Proof of void volume formula. The main idea in the proof is to cover the void with small balls and measure the difference between the new and the old union. Let $C$ be the set of balls we add, and let $L$ be the dual complex of $B \cup C$. Define $U = L - K$ and note that the underlying space of $U$ is the void in $|K|$ that corresponds to the void $V$ in $\bigcup B$. We require that

(i) $C$ be finite,
(ii) $K$ be a subcomplex of $L$,
(iii) $\bigcup (B \cup C) = \bigcup B \cup V$.

Assuming these three conditions, we write the Angle-weighted Pie Volume Formula for each of the two unions, $\bigcup B$ and $\bigcup (B \cup C)$. The difference gives the Void Volume Formula.

Finally, we construct $C$ so that (i), (ii), and (iii) are satisfied. Assuming general position, there exists a positive $\varepsilon$ for which $B$ and $B_\varepsilon$ have the same dual complex, where $B_\varepsilon$ is obtained from $B$ by reducing every ball with radius $r$ to radius $\sqrt{r^2 - \varepsilon^2}$. $B$ and $B_\varepsilon$ have the same Voronoi diagrams and Delaunay triangulations by the way we changed the radii, and they have the same dual complexes by the choice of $\varepsilon$. Let $C$ be a finite set of balls of radii $\varepsilon$ with centers in the void that covers $V$. Let $Z$ be the set of centers and note that the dual complex of $B_\varepsilon \cup Z$ is just $K$ together with finitely many isolated vertices. This complex is contained in $L$ because $B \cup C$ is obtained from $B_\varepsilon \cup Z$ by growing every ball of radius $\rho$ to radius $\sqrt{\rho^2 + \varepsilon^2}$. Hence $K \subseteq L$, as required by (ii). By choice of $\varepsilon$, the balls in $C$ are contained in $\bigcup B \cup V$ and thus cannot contribute to the union of balls in any other way than covering $V$, as required by (iii).

Bibliographic notes. The material of this section is taken from [1], which also contains a proof of the $d$-dimensional version of the Independent Volume Formula. The Angle-weighted Pie Volume Formula is related to Gram's angle sum formula, which states that the alternating sum of angles in a bounded convex polyhedron always vanishes. In $\mathbb{R}^2$, this implies that the sum of angles at the vertices of a convex $n$-gon is $\frac{n}{2}$, for the edges, minus 1, for the $n$-gon. Expressed in radians, this is $(n-2)\pi$. In $\mathbb{R}^3$, the sum of angles at the vertices is no longer determined by the combinatorial structure of the polyhedron, but the sum of solid angles minus the sum of dihedral angles is. A treatment of Gram's angle sum formulas can be found in Grünbaum [3, chapter 14]. The implementation of the formulas is part of the Alpha Shapes software and their use in structural biology has been described in [2].

[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. Edelsbrunner, M. A. Facello, P. Fu and J. Liang. Measuring proteins and voids in proteins. In "Proc. 28th Ann. Hawaii Internat. Conf. System Sciences, 1995", vol. V: Biotechnology Computing, 256–264.
[3] B. Grünbaum. Convex Polytopes. Wiley, Interscience, London, England, 1967.

IX.4 Measuring Software

[Should we add a short discussion of Patrice's new software that also computes derivatives?]

Volbl stands for the volume of a union of balls. It is part of the Alpha Shapes software and can be used to compute the volume, surface area, and total arc length of a ball union and its voids.

Running volbl. The software uses the files generated by delcx and by mkalf that represent the Delaunay triangulation and its filtration, as explained in Sections II.3 and II.4. It is not necessary but a good idea to execute volbl in parallel with visualizing the alpha shapes of the same data, which we do by typing

    > alvis name &
    > volbl name

on the command line. The software will start with a dialogue narrowing down the options of what to compute.

As an example consider the measurements of voids in cdk2, which is an enzyme involved in the control of the growth process of a body cell. The voids shown in Figure IX.12 occur for the solvent accessible diagram defined for [...] Å. In other words, we look at the wire-frame of the dual complex defined by the balls with radii [...], where $r_i$ is the van der Waals radius of the $i$-th ball, which we get as [...] from alvis. After entering the index of the $\alpha$-complex, for which we pick the middle of the corresponding interval of $\alpha$-values, volbl outputs the measurements of all voids. The output for the largest void in this example is

    measurements of void, index 845:
      number of tetrahedra: 26
      tetra volume:         2.504511e+02
      void volume:          1.009809e+01
      surface area:         3.880316e+01
      arc length:           5.776804e+01
      number of corners:    34

The index of the void is a unique but fairly arbitrary integer assigned during the process of collecting the tetrahedra in the dual set. The measurements are in Å, Å$^2$, and Å$^3$, as appropriate. While the largest void is more than ten times as large as any of the others (in volume), it is still only of the order of one van der Waals ball. The corresponding void in the dual complex is more than twenty times as large, which confirms our intuition about the size difference between the two representations. Measuring voids takes about [...] seconds on the author's SGI Indigo II. While measuring the voids, the software calculates for each ball its contribution to the void area and outputs the result in a new file, name.contrib.

Figure IX.12: There are eight voids in the $\alpha$-complex of cdk2 for $\alpha$ = [...] Å. Some of the voids have (open) dual sets that seem connected in the image but are not because of missing triangles.

Before exploring any of the other options in volbl, we take a brief look at the algorithms used and the data structures these algorithms require.

Algorithms and data structures. To measure a union of balls using the Pie Volume, Area, and Length Formulas, we need a list of the simplices in the dual complex of $B$. This list is a prefix of the masterlist mentioned in Section II.4. We simplify the actual situation insignificantly by assuming that the simplices in $K$ are stored in an array $A[1..m]$. The following pseudo-code is then a direct implementation of the Pie Volume Formula of Section IX.2.

    V = 0;
    for i = 1 to m do
        sigma = A[i];
        V = V + (-1)^(dim sigma) * vol(intersection of the balls in sigma)
    endfor.

The implementation of the Area and Length Formulas is similarly straightforward. The Angle-weighted Pie and Void Volume Formulas use the masterlist and in addition require a representation of the voids. We use a partition of the Delaunay tetrahedra into the dual complex and the various voids, $U_0, U_1, \ldots, U_k$, where $U_0$ is the set of tetrahedra in the unbounded component of the complement of $|K|$.
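The Pie Volume pseudo-code amounts to a single pass over the simplices of the dual complex with alternating signs. The sketch below is a minimal Python illustration, not the volbl implementation. It assumes each simplex is given as a pair (dimension, volume of the common intersection of its balls), and it supplies the intersection volume only for the simplest case of two balls, using the standard lens formula; `pie_volume` and `lens_volume` are names chosen for this example.

```python
import math


def ball_volume(r):
    return 4.0 / 3.0 * math.pi * r ** 3


def lens_volume(r1, r2, d):
    """Volume common to two balls with radii r1 and r2 whose centers are d apart.
    Assumption: the spheres properly intersect, |r1 - r2| < d < r1 + r2."""
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2.0 * d * (r1 + r2) - 3.0 * (r1 - r2) ** 2)
            / (12.0 * d))


def pie_volume(simplices):
    """Alternating sum over the dual complex.
    simplices: iterable of (dimension, volume of the common intersection)."""
    total = 0.0
    for dim, vol in simplices:
        total += (-1) ** dim * vol
    return total


if __name__ == "__main__":
    # Two unit balls with centers one unit apart: the dual complex consists of
    # two vertices and the edge connecting them.
    dual_complex = [
        (0, ball_volume(1.0)),
        (0, ball_volume(1.0)),
        (1, lens_volume(1.0, 1.0, 1.0)),
    ]
    print(pie_volume(dual_complex))
```

For this two-ball example the lens has volume $5\pi/12$, so the union has volume $8\pi/3 - 5\pi/12 = 9\pi/4$. Triangles and tetrahedra would need the volumes of triple and quadruple intersections, which are omitted here.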

We have $k$ voids, each represented by a linear list of tetrahedra. We compute the lists by maintaining a union-find data structure while scanning the masterlist from back to front.

    for i = n downto m+1 do
        sigma = A[i];
        case sigma is a tetrahedron: ADD(sigma);
        case sigma is a triangle: let tau be the first and upsilon the second
            Delaunay tetrahedron that has sigma as a face;
            UNION(FIND(tau), FIND(upsilon))
    endfor.

The only trouble with this algorithm is that tetrahedra in the unbounded component may be scattered in more than one list. We fix this problem by adding a dummy tetrahedron to the system and setting the second tetrahedron to the dummy whenever the triangle lies on the boundary of the Delaunay triangulation. The following pseudo-code is a direct implementation of the Void Volume Formula of Section IX.3.

    V = 0;
    forall tetrahedra tau in the list of the void do
        V = V + vol(tau);
        forall faces sigma of tau do
            if sigma in K then
                V = V - (-1)^(dim sigma) * angle(sigma, tau) * vol(intersection of the balls in sigma)
            endif
        endfor
    endfor.

The implementation of the Void Area and Length Formulas is similarly straightforward.

Options. The software computes the volume, surface area, total arc length, and also the number of vertices in the boundary, which we refer to as corners. It does this for the space-filling diagram, its voids, the outside fringe (defined as the portion of the unbounded component of the complement of $|K|$ that is covered by the balls), and the envelope (defined as the space-filling diagram union all voids). Table IX.1 lists the main measurements made.

                              vol     area    lgth    crns
    space-filling diagram     Vsf     Asf     Lsf     Csf
    voids                     Vtv     Atv     Ltv     Ctv
    outside fringe            Vof     Aof     Lof     Cof
    envelope                  Ve      Ae      Le      Ce
    dual complex              Vsh
    dual sets of voids        Vtiv

Table IX.1: Cumulative measurements made by the Volbl software.

The software also checks a few linear relations that should vanish provided the computations are correct. For example, the sum of volumes of the space-filling diagram and its voids should be equal to the volume of the envelope, which in turn should be equal to the sum of volumes of the dual complex, the voids in the dual complex, and the outside fringe. The specific relations checked by the software are

    Vsf + Vtv - Vtiv - Vsh - Vof = 0.0
    Asf - Atv - Aof              = 0.0
    Lsf - Ltv - Lof              = 0.0
    Csf - Ctv - Cof              = 0

In the checking option, the software computes all terms in Table IX.1 and prints a summary of the results. As an example consider the van der Waals diagram of cdk2, whose dual complex is shown in Figure IX.13. The complex has [...] vertices and no voids. In the considered example, it reports that there are no voids and it prints the sizes of the space-filling diagram and the outside fringe as

    Vsf = 3.034036e+04    Vof = 2.962563e+04
    Asf = 3.915391e+04    Aof = 3.915391e+04
    Lsf = 1.100959e+04    Lof = 1.100959e+04
    Csf = 6388            Cof = 6388

Note that the volume of the space-filling diagram is insignificantly higher than that of the outside fringe. The difference is the volume of the dual complex, which is apparently rather small. The surface area, length, and number of corners are of course the same for both.

Figure IX.13: The dual complex of the van der Waals diagram of cdk2.
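The collection of void lists with a union-find structure, described under Algorithms and data structures, can be sketched as follows. This is a simplified illustration and not the volbl code. It assumes the input has already been reduced to the tetrahedra outside the dual complex and to the triangles outside the dual complex, each given with its two incident tetrahedra (or `None` in place of the second one if the triangle lies on the boundary of the Delaunay triangulation). Adding all tetrahedra first and merging across triangles afterwards gives the same components as the interleaved back-to-front scan, because in that scan every tetrahedron is added before the triangles it shares with its neighbors are reached. The function name and the input encoding are hypothetical.

```python
def collect_voids(tetrahedra, triangles):
    """Group the tetrahedra outside the dual complex into components.

    tetrahedra: ids of the Delaunay tetrahedra that are not in the dual complex.
    triangles:  pairs (t1, t2) of tetrahedra sharing a triangle that is not in
                the dual complex; t2 is None for a triangle on the boundary of
                the Delaunay triangulation.
    Returns (unbounded, voids): the tetrahedra connected to the outside, and a
    sorted list of the remaining components, one list per void.
    """
    outside = "dummy"              # dummy tetrahedron standing for the outside
    parent = {outside: outside}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_x] = root_y

    for t in tetrahedra:           # ADD
        parent[t] = t
    for t1, t2 in triangles:       # UNION(FIND, FIND)
        union(t1, outside if t2 is None else t2)

    groups = {}
    for t in tetrahedra:
        groups.setdefault(find(t), []).append(t)
    unbounded = sorted(groups.pop(find(outside), []))
    voids = sorted(sorted(group) for group in groups.values())
    return unbounded, voids


if __name__ == "__main__":
    # Toy input: tetrahedra 0 and 1 share an open triangle, tetrahedron 2 is
    # open to the outside, and tetrahedron 3 is sealed off on all sides.
    print(collect_voids([0, 1, 2, 3], [(0, 1), (2, None)]))
```

The dummy tetrahedron plays the role described in the text: every boundary triangle merges its single incident tetrahedron with the dummy, so the unbounded component ends up in one list.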

Another form of output is the description of the total measurement as a sum of contributions over individual atoms. This makes sense for volume and area but is done only for the latter. Depending on the type of area measurement, the software outputs a file name.contrib that contains the contribution of each individual atom, and it does this for the space-filling diagram, the voids, and the outside fringe. In the checking option, the software compares for each atom the area contribution to the space-filling diagram with the sum of contributions to the voids and the outside fringe. It also checks whether the contributions really add up to the total area.

Area formula. All analytic formulas needed to measure the common intersection of up to four balls are straightforward, except possibly the area of the intersection of up to three caps. The cap $C_j$ on a sphere $S_i$ consists of the portion of $S_i$ inside the sphere $S_j$. Equivalently, the cap contains all points of $S_i$ whose power distance from the $i$-th ball is no less than that from the $j$-th ball. Let $r_i$ be the radius of $S_i$ and $\rho_j$ the radius of the circle bounding $C_j$. We define the width $w_j$ of $C_j$ equal to the distance between the two planes that cut $C_j$ from $S_i$, as illustrated in Figure IX.14. The area of the cap is then $w_j / (2 r_i)$ times the area of the sphere, which is $4 \pi r_i^2$.

Figure IX.14: To the left, the shaded cap has radius $\rho_j$ and width $w_j$. To the right, the shaded bigon has angles $\varphi$ and arc lengths $\lambda_j$ and $\lambda_k$.

Consider now the intersection of two caps, $C_j$ and $C_k$. Since all simplices in $K$ are independent, we may assume that the intersection is a bigon, as shown in Figure IX.14. We let $\varphi$ be the angle at the two vertices and $\lambda_j$ and $\lambda_k$ the lengths of the two arcs, all measured as fractions of a full circle. A formula for the area follows from the Gauss-Bonnet theorem in differential geometry, but we prefer to derive it with elementary means. We approximate the bigon by a spherical $n$-gon, whose edges are by definition great-circle arcs. The area of that $n$-gon is $\frac{1}{2} \sum \varphi_v - \frac{n-2}{4}$ times the area of the sphere, where the sum adds all angles in the $n$-gon. This is because a triangulation produces $n - 2$ spherical triangles each contributing one half times the sum of the three angles minus one quarter to the area.

To construct the $n$-gon, we approximate each of the two circles by a regular spherical $m$-gon. The points are placed slightly outside the circles so that the areas of the $m$-gons are exactly the areas of the caps. Let $\psi_j$ and $\psi_k$ be the angles in the two $m$-gons. To compute $\psi_j$ we recall that the area of the cap is $w_j / (2 r_i)$ times the area of the sphere. By construction, the area of the approximating $m$-gon is the same, namely $\frac{m}{2} \psi_j - \frac{m-2}{4}$ times the area of the sphere. Hence

$$\psi_j = \frac{1}{2} - \frac{1}{m} \left( 1 - \frac{w_j}{r_i} \right),$$

and symmetrically $\psi_k = \frac{1}{2} - \frac{1}{m} \left( 1 - \frac{w_k}{r_i} \right)$. Assuming that $\lambda_j$ and $\lambda_k$ are rational, we can find infinitely many integers $m$ so that the two $m$-gons share two vertices near the vertices of the bigon. The angles at the two shared vertices approach $\varphi$ as $m$ goes to infinity. We then have $n = (\lambda_j + \lambda_k) m$, and the $n$-gon has $\lambda_j m - 1$ vertices with angle $\psi_j$ and $\lambda_k m - 1$ vertices with angle $\psi_k$. We plug the values for $\psi_j$ and $\psi_k$ into the formula for the area of the $n$-gon and get

$$4 \pi r_i^2 \left[ \varphi - \frac{\lambda_j}{2} \left( 1 - \frac{w_j}{r_i} \right) - \frac{\lambda_k}{2} \left( 1 - \frac{w_k}{r_i} \right) \right]$$

for the area of the bigon, after eliminating the terms that vanish when $m$ goes to infinity. Similarly, for the intersection of three caps with angles $\varphi_1$, $\varphi_2$, $\varphi_3$ and arc lengths $\lambda_j$, $\lambda_k$, $\lambda_l$, we get

$$4 \pi r_i^2 \left[ \frac{\varphi_1 + \varphi_2 + \varphi_3}{2} - \frac{1}{4} - \frac{\lambda_j}{2} \left( 1 - \frac{w_j}{r_i} \right) - \frac{\lambda_k}{2} \left( 1 - \frac{w_k}{r_i} \right) - \frac{\lambda_l}{2} \left( 1 - \frac{w_l}{r_i} \right) \right].$$

Note that the formulas give the precise area of the intersection of two or three caps since the approximating spherical $n$-gon is only a tool in the proof and not used in the formula.

Bibliographic notes. The structural biology literature distinguishes between numerical and analytical approaches to measuring molecules. For the latter approach, we would decompose the molecule into simple pieces and give a formula for the size of each piece. An example is Connolly's work [1] on computing the area of a molecular surface. The idea of using inclusion-exclusion for size computations goes back to Kratky [4], who shows that there is a short inclusion-exclusion formula for the area of the intersection of a finite set of disks in the plane. His proof is existential and superseded by explicit formulas that can be derived by the same methods as described in Sections IX.1 and IX.2. Scheraga and coauthors [5] implement an inclusion-exclusion formula for a union of balls based on Kratky's work, but the lack of an explicit expression occasionally leads to miscalculations [2]. A detailed documentation of the Volbl software is given in [3].

[1] M. L. Connolly. Analytical molecular surface calculation. J. Appl. Cryst. 16 (1983), 548–558.
[2] L. R. Dodd and D. N. Theodorou. Analytic treatment of the volume and surface area of molecules formed by an arbitrary collection of unequal spheres intersected by planes. Molecular Physics 72 (1991), 1313–1345.
[3] H. Edelsbrunner and P. Fu. Measuring space filling diagrams and voids. Rept. UIUC-BI-MB-94-01, Beckman Inst., Univ. Illinois, Urbana, Illinois, 1994.
[4] K. W. Kratky. The area of intersection of $n$ equal circular disks. J. Phys. A: Math. Gen. 11 (1978), 1017–1024.
[5] G. Perrot, B. Cheng, K. D. Gibson, J. Vila, K. A. Palmer, A. Nayeem, B. Maigret and H. A. Scheraga. MSEED: a program for rapid determination of accessible surface areas and their derivatives. J. Comput. Chem. 13 (1992), 1–11.
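The cap and bigon areas discussed under the Area formula heading of Section IX.4 are easy to evaluate numerically. The sketch below is an illustration under stated assumptions, not the Volbl code: it takes the cap area as the fraction $w / (2r)$ of the sphere area, and the bigon area in the Gauss-Bonnet form with normalized quantities, namely the vertex angle minus one half of each arc length times $(1 - w/r)$, all times the sphere area. The function names and the encoding of arcs as (length, width) pairs are made up for this example.

```python
import math


def sphere_area(r):
    return 4.0 * math.pi * r * r


def cap_area(r, w):
    """Area of a cap of width w on a sphere of radius r (0 <= w <= 2r)."""
    return w / (2.0 * r) * sphere_area(r)


def bigon_area(r, phi, arcs):
    """Area of the intersection of two caps on a sphere of radius r.

    phi:  angle at each of the two bigon vertices, as a fraction of a full circle.
    arcs: two pairs (length, width); length is the bounding arc as a fraction of
          its circle, width is the width of the cap bounded by that circle.
    """
    fraction = phi - sum(0.5 * length * (1.0 - w / r) for length, w in arcs)
    return fraction * sphere_area(r)


if __name__ == "__main__":
    # A cap of angular radius 60 degrees on the unit sphere has width 1/2.
    # Cutting it with a hemisphere whose boundary passes through the cap center
    # leaves half of the cap: the vertex angles are right angles (1/4), the arc
    # on the small circle is half of that circle, and the arc on the great
    # circle spans 120 degrees (1/3 of the circle).
    half_cap = bigon_area(1.0, 0.25, [(0.5, 0.5), (1.0 / 3.0, 1.0)])
    print(half_cap, 0.5 * cap_area(1.0, 0.5))
```

A second easy check is a lune between two hemispheres: both widths equal the radius, the arc terms vanish, and the area is the vertex angle times the area of the sphere.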

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most [...] edges of $K$ and that this upper bound is tight for every $n$.

Chapter X

Derivatives

The derivative of surface area under deformation is an important term in the simulation of molecular and atomic motion. In the case of a van der Waals or solvent accessible diagram, it is related to the length of the circular arcs in the boundary.

X.1 Implicit Solvent Model
X.2 Weighted Area Derivative
X.3 Weighted Volume Derivative
X.4 Derivative Software
Exercises

144 X D ERIVATIVES X.] .1 Implicit Solvent Model [Give a general introduction and work out the relationship with area and volume derivatives.

X.2 Weighted Area Derivative

[Talk about the unweighted and the weighted area derivatives.]

[Explain the results and discuss the continuity issue of the functions.]

[1] R. Bryant, H. Edelsbrunner, P. Koehl and M. Levitt. The area derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2002.

X.3 Weighted Volume Derivative

[Talk about the unweighted and the weighted volume derivatives.]

[Explain the results and discuss the continuity issue of the functions.]

[1] H. Edelsbrunner and P. Koehl. The weighted volume derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2003.

X.4 Derivative Software

[Discuss Patrice's ProShape software.]

Exercises

The credit assignment reflects a subjective assessment of difficulty. Every question can be answered using the material presented in this chapter.

1. Section of triangulation (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most [...] edges of $K$ and that this upper bound is tight for every $n$.

