Goedele Roos - Theory Meets Experiment: A Combined Quantum Chemical-Experimental Study of the Reaction Mechanism of pI258 Arsenate Reductase

Theory meets experiment: a combined quantum chemical-experimental study of the reaction mechanism of pI258 arsenate reductase
Vrije Universiteit Brussel Faculteit Wetenschappen Onderzoeksgroep Algemene chemie Laboratorium voor Ultrastructuur VIB Departement Moleculaire en Cellulaire Interacties
[Cover figure. Labels: Cys89, α-helix, redox helix, Cys82, Cys10, P-loop; equation $H\psi = E\psi$]
Supervisors:
Prof. Dr. Paul Geerlings, Prof. Dr. Lode Wyns, Dr. ir. Joris Messens
Goedele Roos
May 2007
Thesis submitted to obtain the degree of Doctor of Science
Thank you
Thank you for breaking my heart
Thank you for breaking me apart
Now I've a strong, strong heart
Thank you for breaking my heart
(Sinéad O'Connor)
I'm sincerely grateful to
Prof. Dr. Paul Geerlings, Prof. Dr. Lode Wyns and Dr. ir. Joris Messens, for supervising this work.
Prof. Dr. ir. Remy Loris, for teaching me the fundamentals of crystallography.
Prof. Dr. Frank De Proft, Dr. ir. Stefan Loverix, Dr. ir. Lieven Buts and Abel Garcia-Pino, for fascinating scientific discussions.
Mr. Jan Moens and Ms. Lies Broeckaert, the students I supervise.
Ms. Elke Brosens, ir. Khadija Wahni and Ms. Karolien Van Belle, for outstanding experimental work.
Mr. Wim Cossement, for magic help with IT problems.
Ms. Maria Vanderveken, Ms. Nadine Desmaels and Ms. Martine Vandeperre, for administrative help.
Mr. Bruno Janssens, for technical support.
Ms. Diane Sorgeloos, for pleasant collaboration during student lab classes.
All people from ALGC and ULTR, for useful discussions and collegiality.
The VUB/ULB computer centre and the FWO, for computation time and financial support.
Ms. Evelyne Namenwirth and Dr. Wim Vandendooren, for hearing me.
Bonneke & Bompa, for not giving up on me.
Julianna and Christa, my Dear Friends.
...and my Dearest Friend, for his never-ending TLC.
Thank you Goedele

Abstract
Although studied for decades, enzymatic catalysis remains one of the most intriguing biochemical phenomena. A remarkable example of a catalytic mechanism has been documented for pI258 arsenate reductase (ArsC). This enzyme combines a unique disulfide cascade mechanism involving Cys10, Cys82 and Cys89 with the functional unfolding of a flexible redox helix. This study was started to unravel the subtle but important details of how pI258 ArsC reduces arsenate to arsenite. Unravelling them means digging into the heart of the enzymatic reaction mechanism. The limitations of today's biochemical approaches led us to use quantum chemical tools. Central to quantum chemistry is the time-independent Schrödinger equation $H\psi = E\psi$, with $\psi$ the wave function and $E$ the energy. For many-electron systems, several approximation methods are available to obtain $\psi$ and $E$ from this equation and, from $\psi$, a variety of molecular properties (for example the electron density function and atomic charges). Throughout this work, Density Functional Theory (DFT) is used as a tool to calculate these properties. Among other things, these molecular properties have taught us that subtle changes in the structural environment of ArsC determine the probability that a cysteine residue functions as a nucleophile. More specifically, this thesis focuses on the onset of the nucleophilic attack of Cys10 on arsenate, leading to a covalent Cys10-arseno intermediate, and of Cys82 on this adduct. Additionally, the nucleophilic attack by Cys89 on Cys82, which leaves ArsC in its oxidized form at the end of a single catalytic cycle, is studied. The central questions are: "Is arsenate bound as a mono- or as a di-anion in the Michaelis complex?", "Is the covalent Cys10-arseno intermediate mono- or di-anionic?" and "How does ArsC activate the leaving groups (water and arsenite) and the nucleophiles (Cys10, Cys82 and Cys89) in the reactant state?"
Further, thioredoxin regenerates arsenate reductase in its reduced form for a subsequent catalytic cycle. We answer the question "What makes thioredoxin a reducing agent?" Here, the focus is on the intriguing role of the highly conserved proline in the active site of this ubiquitous redox enzyme. The strength of the presented work lies in its multidisciplinary approach. All essential intermediates in the reaction mechanism of ArsC are available, providing a unique data set of high-resolution X-ray structures. These structures offer the opportunity to perform theoretical studies via model systems, giving insight into problems that are difficult to access experimentally. Moreover, where applicable, our theoretical results are discussed in the light of experimental data. As such, the interplay between theory and experiment has permitted us to gain full insight into the enzymatic reaction mechanism of pI258 ArsC.
Summary (Samenvatting)
Of all biochemical phenomena, enzymatic catalysis is perhaps the most intriguing. With this in mind, we have unravelled the remarkable reaction mechanism of pI258 arsenate reductase (ArsC). ArsC catalyses the reduction of arsenate to arsenite using a disulfide cascade involving Cys10, Cys82 and Cys89, in combination with the functional unfolding of a flexible redox helix. The limitations of currently known biochemical methods led us to study the reaction mechanism of ArsC by quantum chemical means. The time-independent Schrödinger equation $H\psi = E\psi$, with $E$ the energy and $\psi$ the wave function, is central to quantum chemistry. For many-electron systems, several approximation methods are available to derive $E$ and $\psi$ from this equation and to calculate a large number of molecular properties (electron density, atomic charge) from $\psi$. In this work, Density Functional Theory (DFT) was used as a means to determine these properties. For ArsC, these properties allowed us, among other things, to determine whether a cysteine residue can function as a nucleophile. This thesis places the emphasis on the onset of the nucleophilic attack of Cys10 on arsenate, forming a covalent Cys10-arseno adduct, and of Cys82 on this adduct. Furthermore, the nucleophilic attack of Cys89 on Cys82, which leaves ArsC in its oxidized form at the end of one catalytic cycle, is studied. The central questions are: "Is arsenate bound as a mono- or as a di-anion in the Michaelis complex?", "Is the Cys10-arseno adduct mono- or di-anionic?" and "How does ArsC activate the leaving groups (water and arsenite) and the nucleophiles (Cys10, Cys82 and Cys89)?"
Thioredoxin regenerates the reduced form of ArsC, so that a subsequent catalytic cycle can start. The question "What makes thioredoxin a reducing agent?" will be answered. To this end, the function of the highly conserved proline residue in the active site of thioredoxin will be examined closely. The strength of the presented work lies in its cross-disciplinary approach. For ArsC we have a unique data set in the form of high-resolution structures of all essential intermediates. These X-ray structures offer the opportunity to carry out theoretical studies via model systems that give insight into problems that are difficult to access experimentally. Moreover, where possible, all theoretically calculated results are tested against experimental data. The interplay between theory and experiment thus makes it possible to fully unravel the reaction mechanism of pI258 ArsC.
Contents
Thank you
Abstract
Summary (Samenvatting)
Contents
Chapter I
An introduction
1. Toxicity and defence mechanisms against arsenic compounds
2. Enzyme mechanisms: interplay between theory and experiment
3. Outline
References
Chapter II
Biochemical and structural characteristics of pI258 arsenate reductase from Staphylococcus aureus
1. ArsC has a PTPase I fold
2. Kinetics and active site flexibility
3. Catalytic mechanism
4. Overall objectives in relation to the in-depth study of the reaction mechanism of pI258 ArsC
References
Chapter III
Theoretical background
1. Fundamentals
2. Hartree-Fock theory
  2.1 The Slater determinant
  2.2 The variational method
  2.3 Closed shell systems
  2.4 Open shell systems
  2.5 Solution of the Hartree-Fock equations
3. Density Functional Theory
  3.1 Introduction
  3.2 The Hohenberg-Kohn theorems
  3.3 The Kohn-Sham method
  3.4 The exchange-correlation energy functionals
    3.4.1 Introduction
    3.4.2 Hybrid methods
  3.5 The chemical potential
  3.6 Chemical potential derivatives
    3.6.1 Hardness and softness
    3.6.2 Fukui function
    3.6.3 Electrophilicity
    3.6.4 Nucleofugality
  3.7 Hard and soft acids and bases (HSAB) principle
4. Basis sets
  4.1 Slater and Gaussian type orbitals
  4.2 Minimal basis sets
  4.3 Split valence basis set
  4.4 Polarization functions, diffuse functions
5. Molecular quantities
  5.1 The electron density function
  5.2 The atomic electron population
    5.2.1 Orbital-based population analysis methods: the natural population analysis method
    5.2.2 Electrostatic potential derived charges
6. Solvent effects
  6.1 The PCM model
  6.2 The SCI-PCM model
References
Chapter IV
A computational and conceptual DFT study on the Michaelis complex of pI258 arsenate reductase: structural aspects and activation of the electrophile and nucleophile
1. Introduction
2. Model systems and Computational details
  2.1 Optimization of the Michaelis complex
  2.2 Interactions with the electrophile
  2.3 Interactions with the nucleophile
  2.4 DFT Reactivity analysis
3. Results and Discussion
  3.1 Theoretically optimized Michaelis complex
    3.1.1 Calculated model
    3.1.2 Enzyme-substrate interactions
    3.1.3 Arg16 guanidinium group
    3.1.4 Asn13Ala structure
  3.2 Reactivity analysis by means of the HSAB principle
  3.3 Activation of the electrophile
  3.4 Activation of the nucleophile
4. Conclusion
References
Chapter V (Intermezzo)
Gas phase stability of tetrahedral multiply charged anions: a conceptual and computational DFT study
1. Introduction
2. Theoretical background
3. Computational details
4. Results and Discussion
  4.1 Electronically unstable MCAs
  4.2 Calculation of the RCB
  4.3 Correlation between the RCB and S
  4.4 Stabilization of MCAs
5. Conclusion
References
Chapter VI (Intermezzo)
Origin of the pKa perturbation of N-terminal cysteine in α- and 3₁₀-helices: a computational DFT study
1. Introduction
2. Model systems and Computational details
3. Results and Discussion
  3.1 Gas phase study
    3.1.1 Hydrogen bonds formed with the N-terminal cysteine
    3.1.2 Helical length
    3.1.3 Proton affinity and pKa
  3.2 Study in aqueous solution
4. Conclusion
References
Chapter VII
The activation of electrophile, nucleophile and leaving group during the reaction catalysed by pI258 arsenate reductase
1. Introduction
2. Model systems and Computational details
  2.1 Electrophile
  2.2 Nucleophile
  2.3 Leaving group activation
3. Results and Discussion
  3.1 Protonation state of the covalent arseno-enzyme adduct
    3.1.1 Structural considerations
    3.1.2 Thermodynamic considerations
    3.1.3 Reactivity analysis
    3.1.4 Optimized product structure and structural comparison with the ArsC-arsenate complex
  3.2 Leaving group activation
    3.2.1 Nucleofugality
    3.2.2 Strength of the scissile S-As bond
  3.3 Activation of Cys82
4. Conclusion
References
Chapter VIII
Interplay between ion binding and catalysis in the thioredoxin-coupled arsenate reductase family
1. Introduction
2. Model systems and Computational details
3. Results
  3.1 The kinetics of Sa_ArsC and Bs_ArsC
  3.2 The kinetic parameters of Sa_ArsC H62Q
  3.3 Structure of Sa_ArsC H62Q
  3.4 The cation-binding site in Bs_ArsC
  3.5 The binding of potassium
  3.6 Lysine 33 on the surface of Bs_ArsC
  3.7 Negative charges on the surface of Trx-coupled arsenate reductases
  3.8 The link between the P-loop and the cation-binding site
4. Discussion
5. Conclusion
References
Chapter IX
The conserved active site proline determines the reducing power of S. aureus thioredoxin
1. Introduction
2. Material and Methods
  2.1 Expression and purification
  2.2 Crystallization
  2.3 Crystal structure determination
  2.4 Determination of the pKa of the nucleophilic cysteine
  2.5 Fluorescence spectroscopy
  2.6 Analysis of urea-induced unfolding data
3. Results
  3.1 The X-ray structures of Sa_Trx
  3.2 The pKa of the cysteines
  3.3 The redox potential of the proline and tryptophan mutants
  3.4 Disulfide reducing activity of Sa_Trx mutants
  3.5 Thermal stability
  3.6 Urea-induced unfolding
4. Discussion
5. Conclusion
References
In conclusion
Protonation state of the enzyme-bound substrate and the Cys10-arseno adduct
Leaving group activation and activation of the electrophile
How redox cysteines are activated
  Cysteine 10
  Cysteine 82
  Cysteine 89
The reducing power of thioredoxin
The story continues
Appendices
Abbreviation list
Publication list
CHAPTER I An Introduction
Something unknown is doing we don't know what.

(Sir Arthur Eddington)
Chapter I: Introduction
1. Toxicity and defence mechanisms against arsenic compounds

Arsenic, a group V element of the periodic table, is rarely encountered as the free element (a metalloid in oxidation state 0). Its dominant forms in living organisms are the oxyanions arsenate (As(V)) and arsenite (As(III)). Both are natural inorganic toxicants, but they act through different mechanisms. Arsenate mimics orthophosphate and consequently uncouples oxidative phosphorylation. Arsenite reacts covalently with vicinal sulfhydryls, thereby inhibiting essential enzymes. By binding to the sulfhydryl groups of proteins and dithiols, arsenite disrupts the intracellular oxidation-reduction homeostasis [1]. The most widespread mechanism of arsenate resistance combines the reduction of arsenate to arsenite by arsenate reductases (ArsC) with transport systems that extrude arsenite from the cell [2,3]. ArsC is unusual among well-studied enzyme classes in that it does not form a single family of evolutionarily related sequences [4]. The Staphylococcus aureus plasmid pI258 ArsC and the Bacillus subtilis ArsC, encoded in the skin element of the bacterial chromosome, need thioredoxin (Trx), Trx reductase and NADPH to start a second catalytic cycle [5,6,7]. In contrast, the Escherichia coli plasmid R773 enzyme requires reduced glutathione (GSH) and glutaredoxin (Grx) during the catalytic cycle [8,9]. A third class of cytoplasmic arsenate reductase, which uses a thiol cysteine cascade for redox chemistry, was recognized in the yeast Saccharomyces cerevisiae and called ACR2p [10]. The Grx-coupled enzyme ACR2p was the first eukaryotic arsenate reductase to be identified [11]. In mammalian species, including humans, arsenate is first reduced chemically (by GSH) and/or enzymatically in both blood and liver, and is subsequently methylated by S-adenosyl-L-methionine arsenite methyltransferase from the hepatocytes before being excreted as a methylated arsenical [12]. No enzyme is known that is specifically dedicated to the reduction of arsenate to arsenite.
Nevertheless, apart from their acknowledged functions, purine nucleoside phosphorylase, glyceraldehyde-3-phosphate dehydrogenase and arsenic methylase are known to catalyse the reduction of arsenate as well [13]. In addition to arsenate reduction as part of intracellular arsenic detoxification, a fifth, entirely different arsenate reductase is thought to play a role in growth. Particular micro-organisms found throughout the bacterial domain respire arsenic oxyanions (for instance Chrysiogenes arsenatis) [14]. This anaerobic respiration uses arsenate as the terminal electron acceptor, in place of the oxygen used under aerobic conditions.
2. Enzyme mechanisms: interplay between theory and experiment

In this work, we unravel the catalytic mechanism of S. aureus pI258 arsenate reductase (ArsC). Traditionally, enzymatic catalysis is studied by X-ray crystallography, kinetic assays, site-directed mutagenesis and isotope effects [15,16]. Computer modelling, on the other hand, provides a powerful arsenal for studying the subtle details of complicated enzymatic processes [17]. Theoretical studies can offer a solution for problems that are difficult to access experimentally and are especially important in the interpretation of experimental observations. In the past, interplay between theory and experiment was necessary to understand ligand binding in serine proteases (for instance oligopeptidase B [18] and subtilisin [19]) and in ribonuclease T1 [20]. Aromatic stacking as an alternative to general acid catalysis in nucleoside hydrolase was recognized by combining experimental and theoretical methods [21]. It was computational chemistry that led to the discovery of the His-Asp ion pair in α-chymotrypsin [22]. In the case of ArsC, the experimental determination of isotope effects to assess the protonation state of the enzyme-bound substrate is impossible because of the instability of arsenate esters [23]. This is exactly the point where theoretical studies come into the picture.

The central equation of quantum chemistry is Schrödinger's time-independent equation $H\Psi = E\Psi$, of which the exact solution is only accessible for one-electron systems. For many-electron systems, several methods are available to solve this equation approximately. In recent years, an approach different from the traditional wave function based methods has gained importance in quantum chemistry. Based on the Hohenberg and Kohn theorems [24], Density Functional Theory (DFT) [25] considers the electron density function $\rho(\mathbf{r})$ as the carrier of all information on the system it describes. Using $\rho(\mathbf{r})$ as the fundamental property leads to a better quality/cost ratio when evaluating molecular properties.
In addition, conceptual DFT [26] offers many concepts to describe the reactivity of reaction partners. Global reactivity descriptors of this type [26] are the electronegativity and the global hardness and softness. Their local counterparts are the local hardness and softness and the Fukui function [26]. For a long time, a quantitative treatment was hampered by the lack of methods for quantifying hardness and softness. A breakthrough came with Parr and Pearson's seminal work [27], which identified the chemical hardness as the difference between the ionization energy and the electron affinity of a species. This made both the experimental determination and the quantum chemical evaluation of these properties possible. Within the same context of conceptual DFT, Parr, Lee and Chattaraj [28] presented evidence for the hard and soft acids and bases principle; a more detailed treatment was presented later by Gázquez and Méndez [29]. Starting from model systems based on X-ray structures, the application of conceptual DFT based reactivity descriptors has permitted us to study the subtle but important details used by pI258 ArsC to reduce arsenate to arsenite. Moreover, when applicable, our theoretically obtained results are discussed in the light of experimental data. As such, our work fits in a multidisciplinary approach, combining theoretical and experimental studies to gain full insight into the enzymatic reaction mechanism of pI258 ArsC.
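The finite-difference form of the global descriptors mentioned above can be sketched as follows. This is only an illustration: the hardness is taken as I - A, as stated in the text (Parr and Pearson's original definition carries an extra factor 1/2), the softness is taken as its inverse, and the input values are hypothetical numbers chosen for readability, not data from this work.

```python
def global_descriptors(ionization_energy, electron_affinity):
    """Finite-difference conceptual-DFT descriptors.

    Inputs and outputs share one energy unit (e.g. eV).
    Convention assumed here: hardness = I - A (no factor 1/2).
    """
    electronegativity = 0.5 * (ionization_energy + electron_affinity)  # chi = (I + A) / 2
    hardness = ionization_energy - electron_affinity                   # eta = I - A
    softness = 1.0 / hardness                                          # S = 1 / eta
    return electronegativity, hardness, softness


# Hypothetical species with I = 10.0 eV and A = 2.0 eV (illustrative values only).
chi, eta, softness = global_descriptors(10.0, 2.0)
print(f"chi = {chi:.2f} eV, eta = {eta:.2f} eV, S = {softness:.3f} 1/eV")
```

A large gap between I and A thus corresponds to a hard (weakly polarizable) species, a small gap to a soft one.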
3. Outline
After this general introduction, Chapter II gives an overview of the biochemical and structural characteristics of pI258 arsenate reductase from Staphylococcus aureus. In Chapter III, the fundamentals of quantum chemistry are discussed, with special attention to Density Functional Theory. In Chapter IV, we focus on the phosphatase-like nucleophilic displacement reaction carried out by a nucleophilic cysteine on arsenate, leading to a covalent enzyme-arseno adduct. In Chapter VII, the nucleophilic attack on this enzyme-arseno adduct and the looping-out of an α-helix are studied. After one catalytic cycle, ArsC is in its oxidized form and needs to be regenerated to its reduced form by thioredoxin. In Chapter IX, the reducing power of thioredoxin is discussed. In Chapter VIII, the special feature of potassium binding in pI258 ArsC is treated. Chapters V and VI are intermezzi, covering more fundamental work on the metastable nature of multiply charged anions and on the origin of the pKa perturbation at the N-terminus of helices, respectively.

While I carried out the quantum chemical parts of this thesis, the experiments reported in Chapters VII and VIII were designed and coordinated by Joris Messens and carried out by Lieven Buts (ITC measurements), Karolien Van Belle (kinetic studies), Elke Brosens (construction of ArsC mutants), Remy Loris (crystallographic data collection and solving the ArsC structures) and Joris Messens (solving the ArsC structures). I performed parts of the experimental work of Chapter IX (pKa measurements, chemical unfolding, solving the thioredoxin structures) under the supervision of Joris Messens, who designed and coordinated the experiments, and Remy Loris, who taught me how to solve the thioredoxin crystal structures.
For the rest of the experimental work, credit goes to Abel Garcia-Pino (DSC measurements), Elke Brosens (construction of the thioredoxin mutants and seeding experiments), Karolien Van Belle (kinetic studies and redox potential measurements), Remy Loris (crystallographic data collection) and Guy Vandenbussche (mass spectrometry). In Chapters VIII and IX, the results and discussion sections are separated to improve readability.
References
1. Healy, S. M., Wildfang, E., Zakharyan, R. A., Aposhian, H. V., Biol. Trace Elem. Res. 1999, 68, 249.
2. Rensing, C., Ghosh, M., Rosen, B. P., J. Bacteriol. 1999, 181, 5891.
3. Dey, S., Rosen, B. P., J. Bacteriol. 1995, 177, 385.
4. Messens, J., Martins, J. C., Van Belle, K., Brosens, E., Desmyter, A., De Gieter, M., Wieruszeski, J.-M., Willem, R., Wyns, L., Zegers, I., Proc. Natl. Acad. Sci. USA 2002, 99, 8506.
5. Ji, G., Silver, S., Proc. Natl. Acad. Sci. USA 1992, 89, 9474.
6. Ji, G., Garber, E. A., Armes, L. G., Chen, C. M., Fuchs, J. A., Silver, S., Biochemistry 1994, 33, 7294.
7. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
8. Gladysheva, T. B., Oden, K. L., Rosen, B. P., Biochemistry 1994, 33, 7288.
9. Shi, J., Vlamis-Gardikas, A., Aslund, F., Holmgren, A., Rosen, B. P., J. Biol. Chem. 1999, 274, 36039.
10. Mukhopadhyay, R., Rosen, B. P., FEMS Microbiol. Lett. 1998, 168, 127.
11. Mukhopadhyay, R., Shi, J., Rosen, B. P., J. Biol. Chem. 2000, 275, 21149.
12. Challenger, F., Chem. Rev. 1945, 36, 315.
13. Messens, J., Silver, S., J. Mol. Biol. 2006, 362, 1.
14. Stolz, J. F., Oremland, R. S., FEMS Microbiol. Rev. 1999, 23, 615.
15. Herschlag, D., Jencks, W. P., J. Am. Chem. Soc. 1989, 111, 7587.
16. Hengge, A. C., Cleland, W. W., J. Am. Chem. Soc. 1990, 112, 7421.
17. Náray-Szabó, G., Theochem. 2000, 500, 157.
18. Rawlings, N. D., Polgár, L., Barrett, A. J., Biochem. J. 1991, 279, 907.
19. Baeten, A., Maes, D., Geerlings, P., J. Theoret. Biol. 1998, 195, 2711.
20. Mignon, P., Steyaert, J., Loris, R., Geerlings, P., Loverix, S., J. Biol. Chem. 2002, 39, 36770.
21. Versées, W., Loverix, S., Vandemeulebroeke, A., Geerlings, P., Steyaert, J., J. Mol. Biol. 2004, 338, 1.
22. Johannin, G., Kellersohn, N., Biochem. Biophys. Res. Commun. 1972, 49, 321.
23. Lagunas, R., Pestana, D., Diez-Masa, J. C., Biochemistry 1984, 5, 955.
24. Hohenberg, P., Kohn, W., Phys. Rev. 1964, B136, 864.
25. Parr, R. G., Yang, W., Density-Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
26. Geerlings, P., De Proft, F., Langenaeker, W., Chem. Rev. 2003, 103, 1793.
27. Pearson, R. G., Parr, R. G., J. Am. Chem. Soc. 1983, 105, 7512.
28. Chattaraj, P. K., Lee, H., Parr, R. G., J. Am. Chem. Soc. 1991, 113, 1855.
29. Gázquez, J. L., Méndez, F. J., J. Phys. Chem. 1994, 98, 4591.
Chapter II Biochemical and structural characteristics of pI258 arsenate reductase from Staphylococcus aureus
Most of the fundamental ideas of science are essentially simple, and may, as a rule, be expressed in a language comprehensible to everyone.
(Einstein)
Chapter II: pI258 ArsC
Arsenate reductase (ArsC) from Staphylococcus aureus plasmid pI258, a 14.8 kDa monomeric protein, plays a role in bacterial heavy metal resistance and catalyzes the reduction of arsenate to arsenite. ArsC is part of the ars operon, which codes for ArsR and ArsB in addition to ArsC [1]. ArsR is a regulatory repressor protein that controls transcription in response to arsenite [2]. ArsB is a proton-driven transport system that extrudes arsenite [3].
1. ArsC has a PTPase I fold

Despite its low sequence identity (< 20 %) with low molecular weight phosphatase (LMW PTPase), S. aureus ArsC has the characteristic PTPase I fold: a four-stranded parallel β-sheet and three major α-helices (Fig. 2.1A) [4]. Arsenate reduction is the third function associated with a PTPase I fold, after tyrosine dephosphorylation [5] and cellobiose phosphorylation [6]. The catalytic site of LMW PTPase is conserved in ArsC. In LMW PTPase, this site is composed of the oxyanion binding loop, called the P-loop, which includes the nucleophilic Cys13 and the conserved Asn16 and Arg19 (numbering in LMW PTPase of Saccharomyces cerevisiae [7]). In ArsC, the equivalent residues are Cys10, Asn13 and Arg16 (Fig. 2.1B) [4]. The Tyr or Ser residues lining the binding site in PTPase are conserved in ArsC (Ser17) [4].
Figure 2.1: The structure of pI258 ArsC. A. Overall structure of the reduced form of arsenate reductase. The P-loop is shown in red and the catalytically important α-helices are shown in yellow. B. Oxyanion binding P-loop including the conserved residues Cys10, Asn13, Arg16 and Ser17. The figure was generated with PyMol (Delano Scientific LLC 2005) from the PDB coordinates of 1LJL.
During arsenate reduction and dephosphorylation, a water molecule or a tyrosine, respectively, is split off from the substrate. The overall geometry adopted by the P-loop of ArsC in the first reaction step differs from that of PTPases, possibly because the smaller substrate permits a different orientation of the leaving group [8]. In ArsC, the leaving water is much more buried in the active site. It is cradled by two guanidinium nitrogen atoms of Arg16 and lies close (2.6 Å) to a water molecule that in turn is hydrogen bonded to Asp105 (Fig. 2.2) [8]. In LMW PTPase, strong hydrogen bonds are formed between the NH groups of the guanidinium group of the ArsC Arg16 homologue (Arg19 in PTPase of S. cerevisiae [7]) and two non-protonated oxygen atoms of the phosphotyrosine substrate [9]. The equivalent of Asp105 in LMW PTPase (Asp132 in PTPase of S. cerevisiae [7]) protonates the leaving oxygen in the dephosphorylation reaction [9]. In ArsC, mutating Asp105 to alanine (KM = 3.8 mM, kcat = 58.5 min⁻¹) [8] increases the KM by a factor of 55 and decreases the kcat about fourfold compared to wild type ArsC (KM = 68 µM, kcat = 215 min⁻¹) [10]. The corresponding Asp/Ala mutation in LMW PTPase, however, decreases the kcat by a factor of more than 1,000, while hardly affecting KM [11,12]. Therefore, in ArsC, Asp105 might have a somewhat different function, stabilizing the transition state via a bound (protonated) water molecule [8].
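The fold changes quoted above for the Asp105Ala mutant follow directly from the reported kinetic constants. The short sketch below redoes that arithmetic; the dictionary layout and variable names are our own illustrative choices, and only the four numbers are taken from the text.

```python
# Kinetic constants as quoted in the text (KM converted to mol/L).
wild_type = {"KM_molar": 68e-6, "kcat_per_min": 215.0}
asp105ala = {"KM_molar": 3.8e-3, "kcat_per_min": 58.5}

# Factor by which the mutation raises KM and lowers kcat.
km_fold_increase = asp105ala["KM_molar"] / wild_type["KM_molar"]
kcat_fold_decrease = wild_type["kcat_per_min"] / asp105ala["kcat_per_min"]

# Combined loss in catalytic efficiency kcat/KM.
efficiency_fold_decrease = km_fold_increase * kcat_fold_decrease

print(f"KM increases {km_fold_increase:.1f}-fold")
print(f"kcat decreases {kcat_fold_decrease:.2f}-fold")
print(f"kcat/KM decreases {efficiency_fold_decrease:.0f}-fold")
```

Most of the loss in catalytic efficiency thus stems from the KM term, in contrast to the LMW PTPase mutant, where kcat is the parameter that is affected.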
Figure 2.2: Orientation of the leaving group in the P-loop of ArsC (A) and LMW PTPase (B). The figure was generated with PyMol (Delano Scientific LLC 2005) from the PDB coordinates of 1LJU (product structure of pI258 ArsC) and 1D1P (LMW PTPase of S. cerevisiae in complex with 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid, represented as methanesulfonic acid, as substrate analogue).
2. Kinetics and active site flexibility

pI258 arsenate reductase is the only arsenate reductase for which the kinetics are characterized by a very unusual biphasic Michaelis-Menten profile [13]. Application of the Selwyn test of enzyme inactivation to progress curves [14] shows that the substrate arsenate and the tetrahedral oxyanions phosphate, sulphate and perchlorate essentially eliminate this behaviour at millimolar concentrations and increase the kcat of Sa_ArsC by a factor of approximately 5 [10]. Likewise, for the complete resonance assignment by NMR, binding of sulphate to residues located in the P-loop was shown to be necessary to arrest the dynamic character of the active site [8,15]. As concluded from NMR data [8], tetrahedral oxyanions compete with arsenate for the same active site and are able to structure the active site in a more active conformation. The two-step kinetic curve of ArsC can therefore be explained by the stabilizing effect of oxyanions through interactions with the Cys10-X5-Arg16 active site. At low substrate concentrations, only a fraction of ArsC is present in its enzymatically active state, which results in an apparently low kcat value. Between pH 6.5 and 8.5, kcat increases linearly with increasing pH. The KM stays in the low 50-80 µM range up to pH 8.0 and increases steeply above pH 8.0 towards a maximum of 0.44 mM at pH 8.5. As such, the pH of maximum activity is set at 8.0, yielding the highest rate with a low KM (KM = 68 µM, kcat = 215 min⁻¹, kcat/KM = 5.2 × 10⁴ M⁻¹ s⁻¹) [10]. The pronounced changes in the ¹H-¹⁵N HSQC spectrum of ArsC above pH 8.0 suggest that its kinetic behaviour may be affected by a dramatic conformational change [10]. Apart from its arsenate reductase activity, ArsC also has some phosphatase activity [4]. This phosphatase activity is very low (kcat = 0.53 min⁻¹; KM = 146 mM; kcat/KM = 0.06 M⁻¹ s⁻¹) [4] compared to that of acknowledged LMW PTPases, where it ranges from 1.4 to 100 M⁻¹ s⁻¹ [16]. ArsC catalyzing two independent reactions could be an example of moonlighting [17], that is, ArsC could switch between different functions in different circumstances.
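The catalytic efficiencies quoted in this section combine a kcat given per minute with a KM given in µM or mM, so a unit conversion is needed to arrive at M⁻¹ s⁻¹. A minimal sketch of that conversion, using a helper function whose name is our own choice, is:

```python
def catalytic_efficiency(kcat_per_min, km_molar):
    """Return kcat/KM in M^-1 s^-1 from kcat in min^-1 and KM in mol/L."""
    kcat_per_s = kcat_per_min / 60.0
    return kcat_per_s / km_molar


# Values quoted in the text for pI258 ArsC at pH 8.0.
reductase_efficiency = catalytic_efficiency(215.0, 68e-6)     # arsenate reduction
phosphatase_efficiency = catalytic_efficiency(0.53, 146e-3)   # phosphatase side activity

print(f"arsenate reductase: {reductase_efficiency:.2e} M^-1 s^-1")
print(f"phosphatase:        {phosphatase_efficiency:.3f} M^-1 s^-1")
print(f"ratio:              {reductase_efficiency / phosphatase_efficiency:.1e}")
```

Within rounding, this reproduces the reported values and shows that the reductase activity exceeds the phosphatase activity by almost six orders of magnitude.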
3. Catalytic mechanism
Cys10, Cys82 and Cys89 are identified as the essential cysteinyl residues necessary for reductase activity (Fig. 2.1A) [18]. The first step in the multistep catalytic mechanism of pI258 ArsC is a phosphatase-like nucleophilic displacement reaction carried out by Cys10 on arsenate, by which a covalent Cys10-arseno adduct is formed [4] (Fig. 2.3, Step 1). In the second step, another nucleophile, Cys82, attacks the covalent Cys10-arseno adduct, leading to the release of arsenite and the formation of a Cys10-Cys82 disulfide bridge [4] (Fig. 2.3, Step 2). After the second reaction step, when the Cys10-Cys82 disulfide intermediate has been formed, the conformation of the redox helix has changed into a transitional conformation between α-helix and loop [8]. The subsequent third reaction step (Fig. 2.3, Step 3) consists of the nucleophilic attack of Cys89 on the Cys10-Cys82 disulfide, resulting in the formation of the Cys82-Cys89 disulfide and the regeneration of Cys10 [4,8]. The thioredoxin-coupled ArsC family is unique in that all three nucleophilic thiolates act intramolecularly through a reversible, disulfide bond mediated conformational switch [8]. At the end of the reduction cycle, a short α-helix bearing a cysteine at each end is looped out and presents a disulfide bridge at the surface of the enzyme to thioredoxin (Trx) [8]. Trx regenerates the reduced form of arsenate reductase for a subsequent catalytic cycle [4]. Thioredoxin appears to be selective for oxidized ArsC (Cys82-Cys89 disulfide bridge formed) with a looped-out redox helix [19]. After recognition of oxidized ArsC, with the formation of a non-covalent complex, a transient mixed disulfide between the two molecules is formed. With oxidized ArsC as substrate, Trx displays kcat = 114 min⁻¹, KM = 33 µM and kcat/KM = 5.8 × 10⁴ M⁻¹ s⁻¹ [19].

In the absence of a reducing environment, the Cys82-Cys89 and the Cys10-Cys15 disulfide bridges are formed, the latter blocking the active site P-loop. While fully reduced ArsC can be recovered by exposing this doubly oxidized ArsC to Trx, the P-loop disulfide bridge itself is inaccessible to Trx [19]. To reduce this buried Cys10-Cys15 disulfide bridge, Trx first reduces the surface-exposed Cys82-Cys89 disulfide bridge to release Cys82. Cys82 then attacks Cys10 in the buried disulfide bridge. Cys15 is released and a Cys10-Cys82 intermediate is formed. In the next step, Cys89 attacks Cys82 of the Cys10-Cys82 disulfide, forming a Cys82-Cys89 disulfide. Finally, to reduce ArsC completely, a second thioredoxin molecule reduces the Cys82-Cys89 disulfide bridge on the looped-out redox helix [19].

Figure 2.3: Scheme of the reaction mechanism of pI258 ArsC. 1. The reaction starts with the nucleophilic attack of Cys10 on arsenate, leading to a covalent intermediate. 2. Arsenite is released after the nucleophilic attack of the thiol of Cys82. A Cys10-Cys82 intermediate is formed and the redox helix partially unfolds. 3. At the end of the reduction cycle, Cys89 attacks Cys82, forming a Cys82-Cys89 disulfide. The redox helix is looped out and presents the disulfide bridge at the surface of the enzyme to thioredoxin. 4. Thioredoxin (Trx) regenerates the reduced form of arsenate reductase for a subsequent catalytic cycle. The figure was generated with PyMol (Delano Scientific LLC 2005).
4. Overall objectives in relation to the in depth study of the reaction mechanism of pI258 ArsC
Enzymatic catalysis already starts with the binding of the substrate. We present a theoretically optimized ArsC-arsenate complex, which is not attainable experimentally, together with an in-depth description of the enzyme-substrate interactions. Knowledge of the correct protonation state of the enzyme-bound substrate and of the covalent Cys10-arseno adduct (the product of the first reaction step) is crucial for understanding enzymatic catalysis. In view of the acid dissociation constants of arsenate (pKa = 2.2, 6.97, 11.53), a monoanionic as well as a dianionic substrate and adduct are likely to exist at the pH of maximum activity (pH = 8.0) [10]. Theoretical studies based on conceptual Density Functional Theory are used to gain insight into the protonation state of the enzyme-bound substrate (Chapter IV) and the Cys10-arseno adduct (Chapter VII) in ArsC.
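As a rough orientation, the populations of the arsenate protonation states in free aqueous solution at pH 8.0 can be estimated from the three pKa values quoted above. The sketch below uses the standard equilibrium expressions for a polyprotic acid; the function name is our own, and the result applies to free arsenate in water only, since the enzyme environment may shift these equilibria.

```python
def speciation(ph, pkas):
    """Equilibrium fractions of a polyprotic acid at a given pH.

    Returns fractions ordered from the fully protonated form to the
    fully deprotonated form (n + 1 values for n pKa's).
    """
    h = 10.0 ** (-ph)
    n = len(pkas)
    terms = []
    cumulative_ka = 1.0
    for k in range(n + 1):
        if k > 0:
            cumulative_ka *= 10.0 ** (-pkas[k - 1])  # K1 * K2 * ... * Kk
        terms.append(cumulative_ka * h ** (n - k))
    total = sum(terms)
    return [term / total for term in terms]


# Arsenate (H3AsO4) with the pKa values quoted in the text, at pH 8.0.
fractions = speciation(8.0, [2.2, 6.97, 11.53])
neutral, monoanion, dianion, trianion = fractions

for label, value in zip(("H3AsO4", "H2AsO4-", "HAsO4 2-", "AsO4 3-"), fractions):
    print(f"{label:9s} {100.0 * value:6.2f} %")
```

In solution the dianion dominates, but the monoanion still accounts for several percent, which is why both forms have to be considered for the enzyme-bound species.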
The experimental determination of the pKa of the catalytically important thiol groups (Cys10, Cys82 and Cys89) is not straightforward, because these redox active cysteine residues are all involved in the successive steps of the reaction mechanism [4] (Fig. 2.3). At pH 8.0, free cysteine (pKa = 8.3) is largely present in the thiol form, which is a far poorer nucleophile than the thiolate form [20]. However, the acid/base properties of functional groups may be perturbed in a protein environment as compared to aqueous solution [21]. In addition to the nature of the nucleophile, the pre-organized environment of an enzyme might alter the leaving group as compared to the cases where these entities are isolated in the gas phase or in solution [21]. Analysis of the interactions in the ArsC-substrate complex (Chapter IV) and in the ArsC-arseno covalent adduct (Chapter VII) provides insight into the structural features of ArsC related to its capability to activate both the leaving groups (water and arsenite) and the nucleophiles (Cys10, Cys82 and Cys89) in the reactant state. For its activity, pI258 ArsC benefits from the binding of tetrahedral oxyanions in the P-loop active site and from the binding of potassium in a specific cation-binding site. Further, in the P-loop the peptide bond between Gly12 and Asn13 can adopt two distinct conformations. These special features of potassium binding and peptide-bond flipping, together with the tetrahedral-anion-dependent catalysis in pI258 ArsC, are studied in Chapter VIII. Thioredoxin (Trx) regenerates the reduced form of ArsC for a subsequent catalytic cycle. What makes Trx a reducing agent? This question is answered in Chapter IX. The presence of a highly conserved proline in the WCGPC active site motif of Trx drew our attention. The role of this proline residue in the reducing force of Trx is unravelled.
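The statement that free cysteine is largely present as the thiol at pH 8.0 can be checked with the Henderson-Hasselbalch relation. The sketch below also shows how a lowered pKa would change the picture; the perturbed value of 6.3 is a purely hypothetical number chosen for illustration and is not a measured pKa of any ArsC cysteine.

```python
def thiolate_fraction(ph, pka):
    """Fraction of a thiol present as thiolate at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


free_cysteine = thiolate_fraction(8.0, 8.3)   # pKa of free cysteine quoted in the text
perturbed_cys = thiolate_fraction(8.0, 6.3)   # hypothetical pKa lowered by two units

print(f"free cysteine, pKa 8.3:    {100.0 * free_cysteine:.1f} % thiolate")
print(f"hypothetical Cys, pKa 6.3: {100.0 * perturbed_cys:.1f} % thiolate")
```

Only about one third of free cysteine is deprotonated at pH 8.0, whereas a pKa shift of two units would make the thiolate the dominant form.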
References
1. Ji, G., Silver, S., Proc. Natl. Acad. Sci. USA 1992, 89, 9474.
2. Silver, S., Plasmid 1992, 27, 1.
3. Bröer, S., Ji, G., Bröer, A., Silver, S., J. Bacteriol. 1993, 175, 3840.
4. Zegers, I., Martins, J. C., Willem, R., Wyns, L., Messens, J., Nature Struct. Biol. 2001, 8, 843.
5. Denu, J. M., Dixon, J. E., Curr. Opin. Chem. Biol. 1998, 2, 633.
6. Ab, E., Schuurman-Wolters, G., Reizer, J., Saier, M. H., Dijkstra, K., Scheek, R. M., Robillard, G. T., Protein Sci. 1997, 6, 304.
7. Wang, S., Tabernero, L., Zhang, M., Harms, E., Van Etten, R., Stauffacher, C. V., Biochemistry 2000, 39, 1903.
8. Messens, J., Martins, J. C., Van Belle, K., Brosens, E., Desmyter, A., De Gieter, M., Wieruszeski, J.-M., Willem, R., Wyns, L., Zegers, I., Proc. Natl. Acad. Sci. USA 2002, 99, 8506.
9. Zhang, Z.-Y., Critical Reviews in Biochemistry and Molecular Biology 1998, 33, 1.
10. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
11. Wu, L., Zhang, Z.-Y., Biochemistry 1996, 35, 5426.
12. Taddei, N., Chiarugi, P., Cirri, P., Fiaschi, T., Stefani, M., Camici, G., Raugei, G., Ramponi, G., FEBS Lett. 1994, 350, 328.
13. Ji, G., Garber, E. A., Armes, L. G., Chen, C. M., Fuchs, J. A., Silver, S., Biochemistry 1994, 33, 7294.
14. Selwyn, M. J., Biochim. Biophys. Acta 1965, 105, 193.
15. Jacobs, D. M., Messens, J., Wechselberger, R. W., Brosens, E., Willem, R., Wyns, L., Martins, J. C., J. Biomol. NMR 2001, 20, 95.
16. Ramponi, G., Stefani, M., Biochim. Biophys. Acta 1997, 1341, 137.
17. Stolz, J. F., Oremland, R. S., FEMS Microbiol. Rev. 1999, 23, 615.
18. Messens, J., Hayburn, G., Desmyter, A., Laus, G., Wyns, L., Biochemistry 1999, 38, 16857.
19. Messens, J., Van Molle, I., Vanhaesebrouck, P., Limbourg, M., Van Belle, K., Wahni, K., Martins, J. C., Loris, R., Wyns, L., J. Mol. Biol. 2004, 339, 527.
20. Dantzman, C. L., Kiessling, L. L., J. Am. Chem. Soc. 1997, 118, 11715.
21. Fersht, A., Enzyme Structure and Mechanism, W. H. Freeman and Company, New York, 1984.
Chapter III Theoretical background
Any one who is not shocked by quantum mechanics has not fully understood it.
(Niels Bohr)
Chapter III: Theoretical background
1. Fundamentals
Quantum chemistry is based on an approximate solution of Schrödinger's time-independent equation [1,2], from which all electronic properties of atoms and molecules can be derived:

$$H\Psi = E\Psi \tag{3.1}$$

in which $H$ is the Hamilton operator for a system of electrons and nuclei, $\Psi$ is the wave function and $E$ is the energy. For a system consisting of $M$ nuclei and $N$ electrons, the non-relativistic Hamiltonian, written in atomic units (a.u.), is given by:
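$$H = -\sum_{i=1}^{N} \frac{1}{2}\nabla_i^2 - \sum_{A=1}^{M} \frac{1}{2M_A}\nabla_A^2 - \sum_{i=1}^{N}\sum_{A=1}^{M} \frac{Z_A}{r_{iA}} + \sum_{i=1}^{N}\sum_{j<i} \frac{1}{r_{ij}} + \sum_{A=1}^{M}\sum_{B<A} \frac{Z_A Z_B}{R_{AB}} \tag{3.2}$$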
in which $\nabla_i^2$ and $\nabla_A^2$ are the Laplacian operators involving differentiation with respect to the coordinates of electron $i$ and nucleus $A$, respectively. The first and second terms are the operators for the kinetic energy of the electrons and of the nuclei, respectively (with $M_A$ and $Z_A$ being the mass and atomic number of nucleus $A$); the third term represents the Coulomb attraction between electrons and nuclei, and the last two terms represent the repulsion between electrons and between nuclei, respectively. The non-relativistic wave equation is generally solved within the Born-Oppenheimer approximation, in which the electron-electron and electron-nucleus interactions are considered at constant nuclear positions [3,4]. Because the nuclei are much heavier than the electrons, the electrons move much faster than the nuclei. We can therefore regard the nuclei as fixed while the electrons move in the field of these fixed nuclei. The wave function can then be approximated by the product of an electronic and a nuclear wave function. Accordingly, the Schrödinger equation for the whole system can be reduced to the electronic Schrödinger equation:
$$H_{elec}\Psi_{elec} = E_{elec}\Psi_{elec} \tag{3.3}$$
Herein $\Psi_{elec}$ depends explicitly on the electron coordinates and contains the nuclear coordinates as parameters, since the electron distribution depends only on the positions of the nuclei. $H_{elec}$ describes the motion of the electrons in the field of the fixed nuclei and is given by:
$$H_{elec} = -\sum_{i=1}^{N} \frac{1}{2}\nabla_i^2 - \sum_{i=1}^{N}\sum_{A=1}^{M} \frac{Z_A}{r_{iA}} + \sum_{i=1}^{N}\sum_{j<i} \frac{1}{r_{ij}} \tag{3.4}$$
In comparison to eq. (3.2) the nuclear kinetic and nuclear repulsion terms are absent. The total energy equals the sum of the electronic energy and the constant nuclear repulsion term:
$$E_{tot} = E_{elec} + E_{nucl} = E_{elec} + \sum_{A=1}^{M}\sum_{B<A} \frac{Z_A Z_B}{R_{AB}} \tag{3.5}$$
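The nuclear repulsion term in eq. 3.5 is a simple pair sum and can be evaluated directly once the nuclear charges and positions are known. The sketch below does this in atomic units; the function name is our own, and the H2 geometry (two protons 1.4 bohr apart) is an illustrative input, not a structure from this work.

```python
import math
from itertools import combinations


def nuclear_repulsion(charges, positions):
    """Nuclear repulsion energy (hartree): sum over pairs of Z_A * Z_B / R_AB.

    charges   -- atomic numbers Z_A
    positions -- Cartesian coordinates in bohr, one (x, y, z) tuple per nucleus
    """
    energy = 0.0
    for (z_a, r_a), (z_b, r_b) in combinations(zip(charges, positions), 2):
        energy += z_a * z_b / math.dist(r_a, r_b)
    return energy


# Illustrative example: H2 with an internuclear distance of 1.4 bohr.
h2_repulsion = nuclear_repulsion([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)])
print(f"E_nucl(H2) = {h2_repulsion:.6f} hartree")
```

Because the nuclei are fixed, this term is a constant for a given geometry and is simply added to the electronic energy.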
The exact solution of the electronic Schrödinger equation is only accessible for one-electron systems. For many-electron systems, several methods are available to solve eq. 3.3 approximately. They can be divided into two categories. The semi-empirical methods use simplified Hamiltonians and a set of parameters taken from experimental data. In contrast, ab initio methods use the correct molecular Hamiltonian and no experimental data other than the values of the fundamental physical constants. Ab initio methods can again be subdivided into two categories. The wave function based methods [2,5,6] use the wave function $\Psi$, which depends on three spatial coordinates and one spin coordinate for each of the $N$ electrons (i.e. $4N$ variables), as the basic source of information on the system. In Density Functional Theory (DFT), the electron density, obtained by integrating $|\Psi|^2$ over $4N-3$ variables, is used [7,8]. The most common wave function based method is the Hartree-Fock (HF) method, which uses one-electron functions (spin orbitals) to construct the many-electron wave function. A single-determinant wave function is used in order to respect the Pauli principle. The electron-electron repulsion is taken into account in an average way: a given electron is considered to interact with the averaged field of all other electrons. Because of this incomplete treatment of electron correlation, the best single-determinant wave function that can be obtained is not the exact solution of Schrödinger's equation. Several methods of correlated calculations begin with a HF calculation and then correct for the instantaneous electron-electron repulsion. Among them, Møller-Plesset [9] (MP) perturbation theory determines the correlation energy as a sum of second-, third-, fourth-, ... order contributions. Other methods, such as Coupled-Cluster (CC) [10] theory, include a larger portion of the correlation energy. Their computational cost still makes them prohibitive for relatively large systems such as those encountered in biochemistry. Therefore, these methods will not be discussed further. An overview of these post-Hartree-Fock methods can be found in ref. 1.

Density Functional Theory [7] is based on the electron density, which depends on only three spatial coordinates, independently of the number of electrons. DFT methods therefore significantly reduce the computational cost; on the other hand, the explicit form of the Hamiltonian written in terms of the electron density is unknown.
2. Hartree-Fock theory
2.1 The Slater determinant
The wave function must satisfy the anti-symmetry principle (the Pauli exclusion principle), which states that a wave function must change sign when the spatial and spin components of any two electrons are exchanged. In the Hartree-Fock scheme, the simplest possible anti-symmetric wave function (i. e. a single determinant) is used to describe the ground state of an N-electron system. This single determinant wave function is the Slater determinant:
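$$\Psi_0 \approx \Psi_{SD} = \frac{1}{\sqrt{N!}} \begin{vmatrix} \chi_1(1) & \chi_2(1) & \cdots & \chi_N(1) \\ \chi_1(2) & \chi_2(2) & \cdots & \chi_N(2) \\ \vdots & \vdots & & \vdots \\ \chi_1(N) & \chi_2(N) & \cdots & \chi_N(N) \end{vmatrix} \tag{3.6}$$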
with $1/\sqrt{N!}$ the normalization factor and $\chi_i(j)$ the molecular spin orbitals, which depend on the three spatial coordinates and the one spin coordinate of electron $j$:
$$\chi_i(j) = \chi_i(\mathbf{x}_j) = \chi_i(\mathbf{r}_j, \omega_j) \tag{3.7}$$
To a very good approximation, the Hamiltonian in eq. 3.4 does not involve the spin variables, but is only a function of the spatial coordinates. Consequently, the molecular spin orbitals can be written as the product of a spatial orbital $\psi_i(\mathbf{r}_j)$ and a spin function $\alpha(\omega_j)$ or $\beta(\omega_j)$, corresponding to a spin-up or a spin-down situation:
$$\chi_i(j) = \psi_i(\mathbf{r}_j) \begin{cases} \alpha(\omega_j) \\ \beta(\omega_j) \end{cases} \tag{3.8}$$
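The anti-symmetry built into a Slater determinant can be verified numerically on a toy system. The sketch below evaluates the determinant of eq. 3.6 by its permutation expansion for three electrons; the one-dimensional spatial functions, the string labels for the spin coordinate and all function names are illustrative assumptions, not orbitals used in this work.

```python
import math
from itertools import permutations


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def slater_determinant(spin_orbitals, electrons):
    """Value of (N!)^(-1/2) * det[chi_i(x_j)] via the Leibniz expansion."""
    n = len(electrons)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(permutation_sign(perm))
        for electron_index, orbital_index in enumerate(perm):
            term *= spin_orbitals[orbital_index](electrons[electron_index])
        total += term
    return total / math.sqrt(math.factorial(n))


def make_spin_orbital(spatial, spin_label):
    """Spin orbital = spatial part times a spin function that is 1 or 0."""
    return lambda x: spatial(x[0]) if x[1] == spin_label else 0.0


# Toy one-dimensional spatial orbitals (illustrative only).
psi_1 = lambda r: math.exp(-r * r)
psi_2 = lambda r: r * math.exp(-r * r)

orbitals = [
    make_spin_orbital(psi_1, "alpha"),
    make_spin_orbital(psi_1, "beta"),
    make_spin_orbital(psi_2, "alpha"),
]

# Each electron is a (position, spin) pair.
x1, x2, x3 = (0.3, "alpha"), (-0.5, "beta"), (1.1, "alpha")

psi = slater_determinant(orbitals, [x1, x2, x3])
psi_exchanged = slater_determinant(orbitals, [x2, x1, x3])   # electrons 1 and 2 swapped
psi_coincident = slater_determinant(orbitals, [x1, x1, x3])  # two electrons at the same x

print(psi, psi_exchanged, psi_coincident)
```

Exchanging two electrons changes the sign of the wave function, and placing two electrons at the same space-spin point makes it vanish, which is the Pauli principle in determinant form.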
A molecular orbital is defined as a one-electron wave function characterizing an electron in a molecular system. It can be expanded in a set of $K$ basis functions $\{\phi_\mu\}$, the atomic orbitals, with expansion coefficients $c_{\mu i}$:

$$\psi_i(j) = \sum_{\mu=1}^{K} c_{\mu i}\,\phi_\mu(j) \tag{3.9}$$
2.2 The variational method

Hartree-Fock theory is based on a variational procedure [11]. If $\Phi$ is any anti-symmetric, normalized function of the electronic coordinates, then the energy associated with this function is:
E = * H elec d
(3.10)
in which the integration is over all the coordinates. If is the exact wave function of the ground state, E will be the exact energy of the ground state (E0). However, if is any normalized anti-symmetric wave function different from the exact ground state wave function, the associated energy E is larger than the exact ground state energy (E > E0). This calls for a variational method in which the parameters of the wave function should be varied until the energy associated to the wave function is minimal. The variational method can be applied to determine the optimum orbitals of a single determinant wave function. The coefficients ci in eq. 3.9 must be adjusted to minimize the energy, implying:
E =0 ci
(3.11)
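As an illustration of the variational principle, the following minimal sketch minimizes the energy of the hydrogen atom for a single normalized Gaussian trial function $e^{-\alpha r^2}$. The example system, the closed-form energy expression $E(\alpha) = 3\alpha/2 - 2\sqrt{2\alpha/\pi}$ (atomic units) and the helper names are assumptions chosen for illustration; they are not part of the calculations reported in this work.

```python
import math


def gaussian_trial_energy(alpha: float) -> float:
    """Energy (hartree) of the H atom for the trial function exp(-alpha r^2).

    For a normalized 1s Gaussian: <T> = 3*alpha/2, <V> = -2*sqrt(2*alpha/pi).
    """
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)


def golden_section_minimize(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Vary the single parameter alpha until the energy is minimal (cf. eq. 3.11).
alpha_opt = golden_section_minimize(gaussian_trial_energy, 0.01, 2.0)
e_trial = gaussian_trial_energy(alpha_opt)
e_exact = -0.5  # exact ground state energy of the hydrogen atom (hartree)

print(f"optimal alpha = {alpha_opt:.6f}")
print(f"E(trial) = {e_trial:.6f} hartree, E(exact) = {e_exact:.6f} hartree")
```

The optimized trial energy stays above the exact value, as required by $E > E_0$ for any trial function that differs from the exact ground state wave function.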
2.3 Closed shell systems

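Studies of closed shell systems (no unpaired electrons) can be performed in a restricted Hartree-Fock (RHF) calculation12 in which each orbital holds two electrons, one spin up, the other spin down. Assuming identical molecular orbitals for $\alpha$ and $\beta$ electrons, the variational condition (eq. 3.11) leads to a set of algebraic equations for $c_{\mu i}$, the so-called Hartree-Fock Roothaan-Hall equations12,13:
$$\sum_{\nu=1}^{K} \left(F_{\mu\nu} - \varepsilon_i S_{\mu\nu}\right) c_{\nu i} = 0 \qquad \mu = 1, 2, \ldots, K \tag{3.12}$$
fulfilling the normalization condition:
$$\sum_{\mu=1}^{K}\sum_{\nu=1}^{K} c_{\mu i}^{*}\,S_{\mu\nu}\,c_{\nu i} = 1 \tag{3.13}$$
In these equations, $\varepsilon_i$ represents the one-electron energy of the molecular orbital $\psi_i$; $S_{\mu\nu}$ are the elements of the overlap matrix:
$$S_{\mu\nu} = \int \phi_\mu^{*}(1)\,\phi_\nu(1)\,d\mathbf{r}_1 \tag{3.14}$$
and $F_{\mu\nu}$ the elements of the Fock matrix:
$$F_{\mu\nu} = H_{\mu\nu}^{core} + \sum_{\lambda=1}^{K}\sum_{\sigma=1}^{K} P_{\lambda\sigma}\left[(\mu\nu|\lambda\sigma) - \frac{1}{2}(\mu\lambda|\nu\sigma)\right] \tag{3.15}$$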
The matrix elements $H_{\mu\nu}^{core}$ are associated with the mono-electronic Hamiltonian describing the kinetic energy of the electrons and the electron-nuclei attraction (see eq. 3.2). They can be written as:
$$H_{\mu\nu}^{core} = \int \phi_\mu^{*}(1)\,\hat{H}^{core}(1)\,\phi_\nu(1)\,d\mathbf{r}_1 \tag{3.16}$$
with
$$\hat{H}^{core}(1) = -\frac{1}{2}\nabla_1^2 - \sum_{A=1}^{M}\frac{Z_A}{R_{1A}} \tag{3.17}$$
$Z_A$ is the atomic number of atom A and $R_{1A}$ is the distance from electron 1 to atom A.
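The quantities $(\mu\nu|\lambda\sigma)$ in the Fock matrix are the two-electron repulsion integrals:
$$(\mu\nu|\lambda\sigma) = \iint \phi_\mu^{*}(\mathbf{r}_1)\,\phi_\nu(\mathbf{r}_1)\,\frac{1}{r_{12}}\,\phi_\lambda^{*}(\mathbf{r}_2)\,\phi_\sigma(\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2 \tag{3.18}$$
These integrals are multiplied by the elements of the one-electron density matrix $P_{\lambda\sigma}$:
$$P_{\lambda\sigma} = 2\sum_{i=1}^{occ} c_{\lambda i}^{*}\,c_{\sigma i} \tag{3.19}$$
in which the summation is over all occupied molecular orbitals. The factor of two indicates that each orbital is occupied by two electrons. One finally gets the total electronic energy:
$$E^{el} = \frac{1}{2}\sum_{\mu=1}^{K}\sum_{\nu=1}^{K} P_{\mu\nu}\left(H_{\mu\nu}^{core} + F_{\mu\nu}\right) \tag{3.20}$$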
2.4 Open shell systems

In the case of systems with an odd number of electrons, the electrons cannot be assigned in pairs to molecular orbitals. The molecular orbital theory commonly used for open shell systems is the spin-unrestricted Hartree-Fock (UHF) theory14. In this case different molecular orbitals are considered for the $\alpha$ and $\beta$ electrons, i.e. two sets of molecular orbitals are defined with two sets of coefficients:
$$\psi_i^{\alpha} = \sum_{\mu=1}^{K} c_{\mu i}^{\alpha}\,\phi_\mu \tag{3.21}$$
$$\psi_i^{\beta} = \sum_{\mu=1}^{K} c_{\mu i}^{\beta}\,\phi_\mu \tag{3.22}$$

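The coefficients $c_{\mu i}^{\alpha}$ and $c_{\mu i}^{\beta}$ are varied separately, leading to the Pople-Nesbet equations:
$$\sum_{\nu=1}^{K} \left(F_{\mu\nu}^{\alpha} - \varepsilon_i^{\alpha} S_{\mu\nu}\right) c_{\nu i}^{\alpha} = 0 \qquad \mu = 1, \ldots, K \tag{3.23}$$
$$\sum_{\nu=1}^{K} \left(F_{\mu\nu}^{\beta} - \varepsilon_i^{\beta} S_{\mu\nu}\right) c_{\nu i}^{\beta} = 0 \qquad \mu = 1, \ldots, K \tag{3.24}$$
In the open-shell case, the Fock matrices are defined as:
$$F_{\mu\nu}^{\alpha} = H_{\mu\nu}^{core} + \sum_{\lambda=1}^{K}\sum_{\sigma=1}^{K}\left[\left(P_{\lambda\sigma}^{\alpha} + P_{\lambda\sigma}^{\beta}\right)(\mu\nu|\lambda\sigma) - P_{\lambda\sigma}^{\alpha}(\mu\lambda|\nu\sigma)\right] \tag{3.25}$$
and
$$F_{\mu\nu}^{\beta} = H_{\mu\nu}^{core} + \sum_{\lambda=1}^{K}\sum_{\sigma=1}^{K}\left[\left(P_{\lambda\sigma}^{\alpha} + P_{\lambda\sigma}^{\beta}\right)(\mu\nu|\lambda\sigma) - P_{\lambda\sigma}^{\beta}(\mu\lambda|\nu\sigma)\right] \tag{3.26}$$
with the expressions for the density matrices:
$$P_{\mu\nu}^{\alpha} = \sum_{i=1}^{\alpha,occ} c_{\mu i}^{\alpha *}\,c_{\nu i}^{\alpha} \tag{3.27}$$
and
$$P_{\mu\nu}^{\beta} = \sum_{i=1}^{\beta,occ} c_{\mu i}^{\beta *}\,c_{\nu i}^{\beta} \tag{3.28}$$
Finally, the electronic energy becomes:
$$E^{el} = \frac{1}{2}\sum_{\mu=1}^{K}\sum_{\nu=1}^{K}\left[\left(P_{\mu\nu}^{\alpha} + P_{\mu\nu}^{\beta}\right)H_{\mu\nu}^{core} + P_{\mu\nu}^{\alpha}F_{\mu\nu}^{\alpha} + P_{\mu\nu}^{\beta}F_{\mu\nu}^{\beta}\right] \tag{3.29}$$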
2.5 Solution of the Hartree-Fock equations

The Hartree-Fock Roothaan-Hall or Pople-Nesbet equations determine the molecular orbital coefficients, together with the molecular orbital energies. However, the Fock matrix itself depends on the molecular orbital coefficients. As such, the solution necessarily involves an iterative process. Since the molecular orbitals are derived from their own effective potential, the technique is called self-consistent-field (SCF) theory. Equation (3.12) can be written as a matrix equation:
$$[F][c_i] = \varepsilon_i\,[S][c_i] \tag{3.30}$$
with [F] the Fock matrix, $[c_i]$ the column matrix containing the coefficients $c_{\mu i}$ of orbital i, and [S] the overlap matrix. Before the equations can be solved, they have to be transformed into a set of standard pseudo-eigenvalue equations. After the orthogonalisation of the orbitals, the overlap matrix reduces to the unit matrix and we obtain:
$$[F'][c'_i] = \varepsilon_i\,[c'_i] \tag{3.31}$$
with [F'] and $[c'_i]$ the Fock matrix and the coefficients expressed in the orthogonalised basis.
The SCF procedure starts with an initial guess for the molecular orbital coefficients ci (associated with the density matrix P0). From this guess, the Fock matrix is calculated and diagonalized, giving a new set of molecular orbital coefficients associated with the density matrix P. From this, the Fock matrix is again constructed, repeating the above procedure. This is continued until the set of coefficients used to construct the Fock matrix is equal to those resulting from the diagonalization.
$$P_0 \;\rightarrow\; C_k \;\rightarrow\; P \tag{3.32}$$
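The iterative procedure can be sketched in a few lines of code. The example below is a minimal closed-shell SCF loop for a two-basis-function model; the numerical integrals are assumed model values of the magnitude found for H2 in a minimal basis, and the function name `scf_rhf`, the zero initial density and the symmetric orthogonalization are illustrative choices rather than the settings of the program package used in this work.

```python
import itertools

import numpy as np


def scf_rhf(h_core, S, eri, n_elec, max_iter=50, tol=1e-10):
    """Closed-shell SCF loop; eri[m, n, l, s] holds the integral (mn|ls)."""
    n_occ = n_elec // 2
    # Symmetric orthogonalization X = S^(-1/2), so that X.T @ S @ X is the unit matrix.
    s_val, s_vec = np.linalg.eigh(S)
    X = s_vec @ np.diag(s_val ** -0.5) @ s_vec.T
    P = np.zeros_like(S)  # initial guess P0 = 0, i.e. the first Fock matrix is H_core
    for iteration in range(1, max_iter + 1):
        coulomb = np.einsum("ls,mnls->mn", P, eri)
        exchange = np.einsum("ls,mlns->mn", P, eri)
        F = h_core + coulomb - 0.5 * exchange        # Fock matrix, eq. 3.15
        eps, C_prime = np.linalg.eigh(X.T @ F @ X)   # orthogonalized problem, eq. 3.31
        C = X @ C_prime                              # back to the original basis
        C_occ = C[:, :n_occ]
        P_new = 2.0 * C_occ @ C_occ.T                # density matrix, eq. 3.19
        converged = np.max(np.abs(P_new - P)) < tol
        P = P_new
        if converged:
            break
    else:
        raise RuntimeError("SCF did not converge")
    energy = 0.5 * np.sum(P * (h_core + F))          # electronic energy, eq. 3.20
    return energy, P, eps, iteration


# Assumed model integrals (hartree) for two equivalent s-type basis functions.
S = np.array([[1.0, 0.6593], [0.6593, 1.0]])
h_core = np.array([[-1.1204, -0.9584], [-0.9584, -1.1204]])
a, b, c, d = 0.7746, 0.5697, 0.4441, 0.2970  # (11|11), (11|22), (21|11), (21|21)
eri = np.zeros((2, 2, 2, 2))
for m, n, l, s in itertools.product(range(2), repeat=4):
    if m == n and l == s:
        eri[m, n, l, s] = a if m == l else b
    elif m == n or l == s:
        eri[m, n, l, s] = c
    else:
        eri[m, n, l, s] = d

energy, P, eps, n_iter = scf_rhf(h_core, S, eri, n_elec=2)
print(f"E_el = {energy:.4f} hartree after {n_iter} iterations")
print("number of electrons, trace(PS) =", np.trace(P @ S))
```

For this symmetric model the occupied orbital is fixed by symmetry, so the density matrix is reproduced after the second iteration; less symmetric systems need more cycles.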
3. Density Functional Theory

3.1 Introduction
The electronic wave function of an N-electron molecule depends on 3N spatial coordinates and on N spin coordinates. This has prompted the search for functions that can be used to calculate energies and molecular properties involving fewer variables. Density Functional Theory (DFT), based on the Hohenberg-Kohn theorems, uses the electron density $\rho(\mathbf{r})$ as the basic function containing the physically significant information15-18. While the complexity of the wave function increases with the number of electrons, the electron density depends on only three spatial coordinates, independent of the number of electrons. In addition, several concepts which are known in chemistry, such as electronegativity and hardness, find a theoretical foundation in this theory.
3.2 The Hohenberg-Kohn theorems

For an electronic system, the ground state energy and wave function are determined by the minimization of the energy as the expectation value of the Hamiltonian (eq. 3.11). For an N-electron system, however, the external potential $\nu(\mathbf{r})$ completely determines this Hamiltonian. As such, N and $\nu(\mathbf{r})$ determine all properties of the ground state. Since $\rho(\mathbf{r})$ determines N ($\int \rho(\mathbf{r})\,d\mathbf{r} = N$), the first Hohenberg-Kohn theorem legitimizes the use of the electron density $\rho(\mathbf{r})$ as basic variable. It states: "The external potential $\nu(\mathbf{r})$ is determined, within a trivial additive constant, by the electron density $\rho(\mathbf{r})$"19.
$$\rho_0 \Rightarrow \{N, Z_A, R_A\} \Rightarrow \hat{H} \Rightarrow \Psi_0 \Rightarrow E_0 \tag{3.33}$$
The external potential $\nu(\mathbf{r})$ is the potential due to the nuclei of the molecular system. It is the classical nucleus-electron attraction and can be written at a position $\mathbf{r}$ for a system of M nuclei as:
$$\nu(\mathbf{r}) = -\sum_{A=1}^{M}\frac{Z_A}{|\mathbf{r} - \mathbf{R}_A|} \tag{3.34}$$
with $\mathbf{R}_A$ and $Z_A$ the position and charge of nucleus A respectively. The second Hohenberg-Kohn theorem provides the energy variational principle. It states: "For a trial density $\rho'(\mathbf{r})$ with $\rho'(\mathbf{r}) \geq 0$ for all $\mathbf{r}$ and $\int \rho'(\mathbf{r})\,d\mathbf{r} = N$, $E_0 \leq E[\rho'(\mathbf{r})]$", with $E_0$ the exact ground state energy.
The energy functional may be divided into several contributions. Since $\rho(\mathbf{r})$ determines all properties of the ground state, these should all be functionals of $\rho(\mathbf{r})$:
$$E[\rho(\mathbf{r})] = T[\rho(\mathbf{r})] + V_{ne}[\rho(\mathbf{r})] + V_{ee}[\rho(\mathbf{r})] \tag{3.35}$$
in which $T[\rho(\mathbf{r})]$ expresses the kinetic energy of the electrons, $V_{ne}[\rho(\mathbf{r})]$ the nucleus-electron attraction energy and $V_{ee}[\rho(\mathbf{r})]$ the electron-electron repulsion energy.
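The expression for $E[\rho(\mathbf{r})]$ can be rewritten as:
$$E[\rho(\mathbf{r})] = F_{HK}[\rho(\mathbf{r})] + \int \rho(\mathbf{r})\,\nu(\mathbf{r})\,d\mathbf{r} \tag{3.36}$$
with
$$F_{HK}[\rho(\mathbf{r})] = T[\rho(\mathbf{r})] + V_{ee}[\rho(\mathbf{r})] \tag{3.37}$$
and
$$V_{ne}[\rho(\mathbf{r})] = \int \rho(\mathbf{r})\,\nu(\mathbf{r})\,d\mathbf{r} \tag{3.38}$$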
The exact form of $T[\rho(\mathbf{r})]$ for a system of interacting particles is still unknown, but in the Kohn-Sham approach it is approximated by the kinetic energy of a non-interacting system. In analogy with Hartree-Fock theory, the $V_{ee}[\rho(\mathbf{r})]$ term may be divided into a Coulomb part $J[\rho(\mathbf{r})]$ and an exchange part $K[\rho(\mathbf{r})]$, implicitly including the correlation energy:
$$V_{ee}[\rho(\mathbf{r})] = J[\rho(\mathbf{r})] + K[\rho(\mathbf{r})] \tag{3.39}$$
with:
$$J[\rho(\mathbf{r})] = \frac{1}{2}\iint \frac{\rho(\mathbf{r}_1)\,\rho(\mathbf{r}_2)}{r_{12}}\,d\mathbf{r}_1\,d\mathbf{r}_2 \tag{3.40}$$
and $K[\rho(\mathbf{r})]$ an unknown non-classical term. The nuclear-nuclear repulsion, being constant in the Born-Oppenheimer approximation, is omitted. The search for the ground state electron density $\rho(\mathbf{r})$ starts with the minimization condition:
$$\frac{\delta E[\rho(\mathbf{r})]}{\delta \rho(\mathbf{r})} = 0 \tag{3.41}$$
together with the constraint that the electron density should integrate to the total number of electrons N present in the atomic or molecular system:
$$N = N[\rho(\mathbf{r})] = \int \rho(\mathbf{r})\,d\mathbf{r} \tag{3.42}$$
This leads to the following Euler equation:
$$\frac{\delta}{\delta \rho(\mathbf{r})}\left(E[\rho(\mathbf{r})] - \mu \int \rho(\mathbf{r})\,d\mathbf{r}\right) = 0 \tag{3.43}$$
in which $\mu$ is a Lagrange multiplier. Combination of eq. 3.36 and eq. 3.43 leads to:
$$\frac{\delta}{\delta \rho(\mathbf{r})}\left(F_{HK}[\rho(\mathbf{r})] + \int \rho(\mathbf{r})\,\nu(\mathbf{r})\,d\mathbf{r}\right) = \mu \tag{3.44}$$
and, finally, to:
$$\nu(\mathbf{r}) + \frac{\delta F_{HK}[\rho(\mathbf{r})]}{\delta \rho(\mathbf{r})} = \mu \tag{3.45}$$
This equation can be considered as the density functional analogue of the Schrödinger equation. It can be used to determine the ground state electron density $\rho(\mathbf{r})$.
3.3 The Kohn-Sham method

Earlier attempts to deduce functionals for the kinetic and exchange energies considered a non-interacting uniform electron gas, such as the Thomas-Fermi-Dirac (TFD) model dating from the 1920s20. However, in this model the approximate forms for $T[\rho(\mathbf{r})]$ and $V_{ee}[\rho(\mathbf{r})]$ do not hold very well for atomic and molecular systems. In the model of a non-interacting uniform electron gas, the TFD theory does not predict bonding; molecules simply do not exist in this approach. The foundation for the use of DFT methods was the introduction of orbitals by Kohn and Sham21. In terms of these orbitals, the electron density becomes:
$$\rho(\mathbf{r}) = \sum_i \int |\psi_i(\mathbf{r}, \sigma)|^2\,d\sigma \tag{3.46}$$
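The unknown kinetic energy functional $T[\rho(\mathbf{r})]$ can consequently be written as:
$$T[\rho(\mathbf{r})] = \sum_i n_i \left\langle \psi_i \left| -\frac{1}{2}\nabla^2 \right| \psi_i \right\rangle \tag{3.47}$$
with $\psi_i$ and $n_i$ the natural spin orbitals and occupation numbers respectively. Introducing these formulas in the Euler equation (eq. 3.45) yields the Kohn-Sham orbital equations:
$$\left[-\frac{1}{2}\nabla^2 + \nu_{eff}(\mathbf{r})\right]\psi_i = \varepsilon_i\,\psi_i \tag{3.48}$$
with:
$$\nu_{eff}(\mathbf{r}) = \nu(\mathbf{r}) + \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}' + \nu_{xc}(\mathbf{r}) \tag{3.49}$$
in which the first term is the external potential, the second term the potential due to electron-electron repulsion, and $\nu_{xc}(\mathbf{r})$ the exchange-correlation potential, given by:
$$\nu_{xc}(\mathbf{r}) = \frac{\delta E_{xc}[\rho(\mathbf{r})]}{\delta \rho(\mathbf{r})} \tag{3.50}$$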
with $E_{xc}[\rho(\mathbf{r})]$ the unknown exchange-correlation energy functional. The resulting orbitals $\psi_i$ are the Kohn-Sham orbitals and are used to construct the electron density. As can be seen from equations 3.48 and 3.49, the Kohn-Sham equations are nonlinear and have to be solved iteratively. Computationally, solving the Kohn-Sham equations is not much more demanding than solving the Hartree-Fock equations. The Kohn-Sham theory, exact in principle, differs from the Hartree-Fock theory in its capacity to fully incorporate the exchange-correlation effect of the electrons. Note however that at this stage the exchange-correlation part remains unknown.
3.4 The exchange-correlation energy functionals

3.4.1 Introduction
An explicit form of $E_{xc}[\rho(\mathbf{r})]$ is needed to solve the Kohn-Sham equations. The search for an accurate $E_{xc}[\rho(\mathbf{r})]$ has encountered tremendous difficulties and continues to be a great challenge in Density Functional Theory. The difference between DFT methods is the choice of the functional form of the exchange-correlation energy. $E_{xc}[\rho(\mathbf{r})]$ is generally separated into the exchange $E_x$ and the correlation $E_c$ parts:
$$E_{xc}[\rho(\mathbf{r})] = E_x[\rho(\mathbf{r})] + E_c[\rho(\mathbf{r})] \tag{3.51}$$
The correlation between electrons of parallel spin is different from that between electrons of opposite spin. The exchange energy is given by the sum of contributions of the $\alpha$ and $\beta$ spin densities, as exchange involves only electrons of the same spin:
$$E_x[\rho(\mathbf{r})] = E_x^{\alpha}[\rho_\alpha(\mathbf{r})] + E_x^{\beta}[\rho_\beta(\mathbf{r})] \tag{3.52}$$
$$E_c[\rho(\mathbf{r})] = E_c^{\alpha\alpha}[\rho_\alpha(\mathbf{r})] + E_c^{\beta\beta}[\rho_\beta(\mathbf{r})] + E_c^{\alpha\beta}[\rho_\alpha(\mathbf{r}), \rho_\beta(\mathbf{r})] \tag{3.53}$$
The total density is the sum of the $\alpha$ and $\beta$ contributions. The exchange-correlation functional can also be written as follows:
$$E_{xc}[\rho(\mathbf{r})] = \left(T[\rho(\mathbf{r})] - T_S[\rho(\mathbf{r})]\right) + \left(E_{ee}[\rho(\mathbf{r})] - J[\rho(\mathbf{r})]\right) \tag{3.54}$$
Herein, the first term is the contribution of the kinetic energy to the correlation energy, obtained as the difference between the exact kinetic energy of the interacting system, $T[\rho(\mathbf{r})]$, and the kinetic energy of the non-interacting system, $T_S[\rho(\mathbf{r})]$, as calculated in the Kohn-Sham approximation. The second term contains both a correlation and an exchange contribution to the exchange-correlation energy.
3.4.2 Hybrid methods
In this work, we use the Becke 3-Parameter (exchange), Lee, Yang and Parr (correlation) (B3LYP)22,23 functional as exchange-correlation functional. This is a hybrid functional in which the exchange-correlation energy combines an exact (Hartree-Fock) exchange term with exchange and correlation terms based on a local density approach (LDA) and corrected for the gradient of the density.
$$E_{XC}^{B3LYP} = (1 - a_0)\,E_X^{LSDA} + a_0\,E_X^{HF} + a_X\,\Delta E_X^{B88} + a_C\,E_C^{LYP} + (1 - a_C)\,E_C^{VWN} \tag{3.55}$$
In eq. 3.55, $E_X^{LSDA}$ is the exchange energy obtained from the local spin density approximation, $E_C^{VWN}$ is the standard local correlation functional obtained by Vosko, Wilk and Nusair, $E_C^{LYP}$ is the gradient corrected functional for the correlation energy obtained by Lee, Yang and Parr, $\Delta E_X^{B88}$ is a correction on the LSDA exchange energy and $E_X^{HF}$ is the exact Hartree-Fock exchange energy. $a_0$, $a_X$ and $a_C$ are empirical coefficients obtained by a least-squares fit to experimental data. The B3LYP method is known to give good results for several physical observables and is less demanding than post-Hartree-Fock methods. Also, it is a widely used method allowing for direct comparison with other work.
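The linear combination in eq. 3.55 can be written out as a short function. The sketch below assumes the commonly quoted coefficient values $a_0 = 0.20$, $a_X = 0.72$ and $a_C = 0.81$, which are not stated in this text, and uses purely hypothetical component energies; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XCComponents:
    """Component energies (hartree) entering eq. 3.55."""
    ex_lsda: float   # LSDA exchange energy
    ex_hf: float     # exact Hartree-Fock exchange energy
    dex_b88: float   # Becke 88 gradient correction to the LSDA exchange
    ec_lyp: float    # LYP correlation energy
    ec_vwn: float    # VWN local correlation energy


def b3lyp_xc_energy(parts: XCComponents, a0: float = 0.20,
                    ax: float = 0.72, ac: float = 0.81) -> float:
    """Mix the components as in eq. 3.55 (default coefficients are assumed)."""
    return ((1.0 - a0) * parts.ex_lsda
            + a0 * parts.ex_hf
            + ax * parts.dex_b88
            + ac * parts.ec_lyp
            + (1.0 - ac) * parts.ec_vwn)


# Hypothetical component energies, chosen only to exercise the formula.
example = XCComponents(ex_lsda=-10.0, ex_hf=-11.0, dex_b88=-1.0,
                       ec_lyp=-0.5, ec_vwn=-0.7)
e_xc = b3lyp_xc_energy(example)
print(f"E_xc(B3LYP) = {e_xc:.3f} hartree")
```

Setting all three coefficients to zero recovers the pure local combination $E_X^{LSDA} + E_C^{VWN}$.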
3.5 The chemical potential

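The physical significance of the Lagrange multiplier $\mu$ from the Euler equation (eq. 3.45) can be clarified by considering the total differential of the energy $E[N, \nu(\mathbf{r})]$ for the change from one ground state to another:
$$dE = \left(\frac{\partial E}{\partial N}\right)_{\nu(\mathbf{r})} dN + \int \left(\frac{\delta E}{\delta \nu(\mathbf{r})}\right)_{N} d\nu(\mathbf{r})\,d\mathbf{r} \tag{3.56}$$
This expression must be the same as the total differential of E using $\rho(\mathbf{r})$ and $\nu(\mathbf{r})$ as the basic variables:
$$dE = \int \left(\frac{\delta E}{\delta \rho(\mathbf{r})}\right)_{\nu(\mathbf{r})} d\rho(\mathbf{r})\,d\mathbf{r} + \int \left(\frac{\delta E}{\delta \nu(\mathbf{r})}\right)_{\rho(\mathbf{r})} d\nu(\mathbf{r})\,d\mathbf{r} \tag{3.57}$$
The ground state $\rho(\mathbf{r})$ must satisfy the Euler equation (eq. 3.45), as such:
$$\left(\frac{\delta E}{\delta \rho(\mathbf{r})}\right)_{\nu(\mathbf{r})} = \mu = \text{constant} \tag{3.58}$$
and:
$$\int d\rho(\mathbf{r})\,d\mathbf{r} = dN \tag{3.59}$$
Inserting eq. 3.58 and eq. 3.59 in eq. 3.57 gives:
$$dE = \mu\,dN + \int \left(\frac{\delta E}{\delta \nu(\mathbf{r})}\right)_{\rho(\mathbf{r})} d\nu(\mathbf{r})\,d\mathbf{r} \tag{3.60}$$
Comparing eq. 3.56 and eq. 3.60, one obtains:
$$\mu = \left(\frac{\partial E}{\partial N}\right)_{\nu(\mathbf{r})} \tag{3.61}$$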
In analogy with the chemical potential in thermodynamics (replace E by G and N by n at constant p and T), the Lagrange multiplier $\mu$ is called the electronic chemical potential. The electronic chemical potential measures the escaping tendency of an electron from the electronic cloud (cf. the chemical potential in thermodynamics, measuring the energy change when infinitesimal amounts of a given substance are added to or withdrawn from the system under certain conditions). With the interpretation of the Lagrange multiplier in the Euler equation as the chemical potential, conceptual DFT was founded. Assuming a quadratic relationship between the energy E and the number of electrons N (Fig. 3.1), the finite-difference approximation to $\mu$ for a system can be written as:
$$\mu \approx -\frac{IE + EA}{2} \tag{3.62}$$
in which IE and EA indicate the ionization energy and electron affinity respectively.
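Figure 3.1: E versus N plot for a typical chemical species. [Plot of E against N through the points $N_0 - 1$, $N_0$ and $N_0 + 1$; the segment between $N_0 - 1$ and $N_0$ has slope $-IE$ and the segment between $N_0$ and $N_0 + 1$ has slope $-EA$.]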
The expression for the chemical potential is the negative of the expression proposed by Mulliken for the electronegativity:
$$\chi_M = \frac{IE + EA}{2} \tag{3.63}$$
3.6 Chemical potential derivatives

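After the introduction of the chemical potential, other reactivity descriptors were identified. They quantify the response of the energy of a system to a perturbation in the number of electrons and/or the external potential. Figure 3.2 gives an overview of all derivatives $\partial^n E / \partial N^m\,\delta\nu(\mathbf{r})^{m'}$ up to the second order ($n \leq 2$), together with the identification or definition of the corresponding response function.

Figure 3.2: Energy derivatives and response functions in the canonical ensemble, $\partial^n E / \partial N^m\,\delta\nu(\mathbf{r})^{m'}$ ($n \leq 2$). Starting from $E[N, \nu(\mathbf{r})]$, the scheme contains the first order derivatives
$$\left(\frac{\partial E}{\partial N}\right)_{\nu(\mathbf{r})} = \mu = -\chi \qquad \left(\frac{\delta E}{\delta \nu(\mathbf{r})}\right)_{N} = \rho(\mathbf{r})$$
and the second order derivatives
$$\left(\frac{\partial^2 E}{\partial N^2}\right)_{\nu(\mathbf{r})} = \left(\frac{\partial \mu}{\partial N}\right)_{\nu(\mathbf{r})} = \eta$$
$$\frac{\partial^2 E}{\partial N\,\delta \nu(\mathbf{r})} = \left(\frac{\delta \mu}{\delta \nu(\mathbf{r})}\right)_{N} = \left(\frac{\partial \rho(\mathbf{r})}{\partial N}\right)_{\nu(\mathbf{r})} = f(\mathbf{r})$$
$$\left(\frac{\delta^2 E}{\delta \nu(\mathbf{r})\,\delta \nu(\mathbf{r}')}\right)_{N} = \left(\frac{\delta \rho(\mathbf{r})}{\delta \nu(\mathbf{r}')}\right)_{N} = \chi(\mathbf{r}, \mathbf{r}')$$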
The response of the chemical potential to an external perturbation can be expressed as the total derivative of $\mu[N, \nu(\mathbf{r})]$:
$$d\mu = \left(\frac{\partial \mu}{\partial N}\right)_{\nu(\mathbf{r})} dN + \int \left(\frac{\delta \mu}{\delta \nu(\mathbf{r})}\right)_{N} d\nu(\mathbf{r})\,d\mathbf{r} \tag{3.64}$$
This is the basic equation for the definition of the reactivity descriptors used in the interpretation of the reactivity of different reaction partners. In the next paragraphs we discuss the reactivity indices important in this work.
3.6.1 Hardness and softness
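The first term of eq. 3.64 is the curvature of the plot in figure 3.1 and defines the global hardness $\eta$:
$$\eta = \frac{1}{2}\left(\frac{\partial \mu}{\partial N}\right)_{\nu(\mathbf{r})} = \frac{1}{2}\left(\frac{\partial^2 E}{\partial N^2}\right)_{\nu(\mathbf{r})} \tag{3.65}$$
The global softness S24 is obtained from the reciprocal of $\eta$:
$$S = \frac{1}{2\eta} \tag{3.66}$$
The finite difference approximation for $\eta$ and S is given by:
$$\eta \approx \frac{IE - EA}{2} \tag{3.67}$$
and:
$$S \approx \frac{1}{IE - EA} \tag{3.68}$$
The expression for the global hardness (eq. 3.67) equals half of the reaction energy for a disproportionation reaction:
$$M + M \rightarrow M^{+} + M^{-} \tag{3.69}$$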
Consequently, the global hardness is the resistance of the chemical potential to changes in the number of electrons of the system. The finite-difference approximation for the global hardness (eq. 3.67) is approximately equal to half the band gap, which is the energy difference between the lowest unoccupied molecular orbital (LUMO) and the highest occupied molecular orbital (HOMO) in the frontier molecular orbital theory. When the gap is large (high $\eta$), the stability of the system is high and the reactivity is low, and vice versa. Looking at the definition of the global softness S as the inverse of the global hardness, a local counterpart of this quantity can be introduced24:
$$s(\mathbf{r}) = \left(\frac{\partial \rho(\mathbf{r})}{\partial \mu}\right)_{\nu(\mathbf{r})} = \left(\frac{\partial \rho(\mathbf{r})}{\partial N}\right)_{\nu(\mathbf{r})} \left(\frac{\partial N}{\partial \mu}\right)_{\nu(\mathbf{r})} \tag{3.70}$$
The local softness $s(\mathbf{r})$ gives the distribution of the global softness over the system. $s(\mathbf{r})$ integrates to the global softness S:
$$S = \int s(\mathbf{r})\,d\mathbf{r} \tag{3.71}$$
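The finite-difference expressions of eqs. 3.62, 3.63, 3.67 and 3.68 only require the energies of the $N_0 - 1$, $N_0$ and $N_0 + 1$ electron systems. The sketch below evaluates them for assumed input energies (in eV, relative to the neutral system, chosen to correspond to IE = 17.42 eV and EA = 3.40 eV); the function and field names are illustrative.

```python
from typing import NamedTuple


class GlobalDescriptors(NamedTuple):
    ie: float        # vertical ionization energy
    ea: float        # vertical electron affinity
    mu: float        # electronic chemical potential, eq. 3.62
    chi: float       # Mulliken electronegativity, eq. 3.63
    eta: float       # global hardness, eq. 3.67
    softness: float  # global softness, eq. 3.68


def global_descriptors(e_cation: float, e_neutral: float,
                       e_anion: float) -> GlobalDescriptors:
    """Finite-difference descriptors from E(N0 - 1), E(N0) and E(N0 + 1)."""
    ie = e_cation - e_neutral
    ea = e_neutral - e_anion
    mu = -(ie + ea) / 2.0
    eta = (ie - ea) / 2.0
    return GlobalDescriptors(ie=ie, ea=ea, mu=mu, chi=-mu,
                             eta=eta, softness=1.0 / (ie - ea))


# Assumed energies in eV, all evaluated at the geometry of the N0 system.
result = global_descriptors(e_cation=17.42, e_neutral=0.0, e_anion=-3.40)
print(result)
```

All three energies have to be evaluated at the same geometry, so that the external potential stays constant as required by the partial derivatives.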

3.6.2 Fukui function
The second term of eq. 3.64, the derivative of the chemical potential with respect to the external potential $\nu(\mathbf{r})$, yields a local quantity (i.e. varying from point to point) $f(\mathbf{r})$, the Fukui function25:
$$f(\mathbf{r}) = \left(\frac{\delta \mu}{\delta \nu(\mathbf{r})}\right)_{N} = \left(\frac{\partial \rho(\mathbf{r})}{\partial N}\right)_{\nu(\mathbf{r})} \tag{3.72}$$
The Fukui function can be viewed as the sensitivity of a system's chemical potential to an external potential perturbation at a particular point $\mathbf{r}$. Alternatively, $f(\mathbf{r})$ can be seen as the change of the electron density $\rho(\mathbf{r})$ at each point $\mathbf{r}$ when the total number of electrons N is changed at a constant external potential. As the derivative of $\rho(\mathbf{r})$ with respect to the number of electrons N is expected to be discontinuous, the use of different reactivity descriptors was proposed for electrophilic and nucleophilic attacks. For a system of $N_0$ electrons, the left derivative is used when N decreases from $N_0$ to $N_0 - \delta$ and measures the reactivity towards an electrophilic attack:
$$f^{-}(\mathbf{r}) = \left(\frac{\partial \rho(\mathbf{r})}{\partial N}\right)^{-}_{\nu(\mathbf{r})} \tag{3.73}$$
The right derivative is used when N increases from $N_0$ to $N_0 + \delta$ and measures the reactivity towards a nucleophilic attack:
$$f^{+}(\mathbf{r}) = \left(\frac{\partial \rho(\mathbf{r})}{\partial N}\right)^{+}_{\nu(\mathbf{r})} \tag{3.74}$$
In a finite difference approximation and for a system of $N_0$ electrons, these functions become:
$$f^{-}(\mathbf{r}) \approx \rho_{N_0}(\mathbf{r}) - \rho_{N_0 - 1}(\mathbf{r}) \tag{3.75}$$
$$f^{+}(\mathbf{r}) \approx \rho_{N_0 + 1}(\mathbf{r}) - \rho_{N_0}(\mathbf{r}) \tag{3.76}$$
When integrating the Fukui function over atomic regions one finds the condensed Fukui functions for the electrophilic ($f_A^{-}$) and the nucleophilic ($f_A^{+}$) attack on atom A:
$$f_A^{-} = q_A(N_0) - q_A(N_0 - 1) \tag{3.77}$$
$$f_A^{+} = q_A(N_0 + 1) - q_A(N_0) \tag{3.78}$$
$q_A(N_0)$, $q_A(N_0 + 1)$ and $q_A(N_0 - 1)$ are the atomic populations for atom A in the neutral molecule ($N_0$ electrons) and the corresponding anion ($N_0 + 1$) or cation ($N_0 - 1$), all evaluated at the geometry of the neutral molecule or, more generally, at the geometry of the $N_0$ electron system (cf. the demand for constant external potential in eq. 3.64). Combining eq. 3.64, eq. 3.70 and eq. 3.72, the local softness can be written as:
$$s(\mathbf{r}) = f(\mathbf{r})\,S \tag{3.79}$$
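The condensed quantities of eqs. 3.77 to 3.79 follow directly from three population analyses. The sketch below uses hypothetical atomic electron populations for a three-atom fragment (labels A1 to A3) and an assumed global softness; none of these numbers are results of this work.

```python
def condensed_fukui(pop_neutral: dict, pop_cation: dict, pop_anion: dict):
    """Condensed Fukui functions from atomic electron populations.

    f_minus follows eq. 3.77 (electrophilic attack),
    f_plus follows eq. 3.78 (nucleophilic attack).
    """
    f_minus = {a: pop_neutral[a] - pop_cation[a] for a in pop_neutral}
    f_plus = {a: pop_anion[a] - pop_neutral[a] for a in pop_neutral}
    return f_minus, f_plus


def local_softness(fukui: dict, global_softness: float) -> dict:
    """Condensed local softness s_A = S * f_A, cf. eq. 3.79."""
    return {a: global_softness * f for a, f in fukui.items()}


# Hypothetical populations for the N0, N0 - 1 and N0 + 1 electron systems,
# all assumed to be evaluated at the geometry of the N0 electron system.
pop_neutral = {"A1": 16.2, "A2": 6.3, "A3": 0.5}   # N0 = 23 electrons
pop_cation = {"A1": 15.5, "A2": 6.1, "A3": 0.4}    # N0 - 1 electrons
pop_anion = {"A1": 16.6, "A2": 6.8, "A3": 0.6}     # N0 + 1 electrons

f_minus, f_plus = condensed_fukui(pop_neutral, pop_cation, pop_anion)
s_plus = local_softness(f_plus, global_softness=0.1)  # assumed S in 1/eV

for atom in pop_neutral:
    print(atom, round(f_minus[atom], 3), round(f_plus[atom], 3),
          round(s_plus[atom], 4))
```

Because the populations of each system sum to its number of electrons, both sets of condensed Fukui functions sum to one over the atoms.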
As a direct consequence of eq. 3.77 and eq. 3.78 two types of local (condensed) softness are defined:
$$s^{-}(\mathbf{r}) = S\,f^{-}(\mathbf{r}) \tag{3.80}$$
$$s^{+}(\mathbf{r}) = S\,f^{+}(\mathbf{r}) \tag{3.81}$$
3.6.3 Electrophilicity
A quantitative measure of the electrophilicity of a species provides another useful tool for the rationalization of chemical reactivity. Starting from the question to what extent electron transfer contributes to the lowering of the total binding energy by a maximal influx of electrons, Parr et al.26 provide validation for the qualitative suggestion made by Maynard et al.27 for the electrophilic power of a ligand. Based on a second order model for the change of the electronic energy $\Delta E$ as a function of the change in the number of electrons $\Delta N$, at constant external potential $\nu(\mathbf{r})$, namely:
$$\Delta E = \mu\,\Delta N + \frac{\eta}{2}\,\Delta N^2 \tag{3.82}$$
with $\mu$ the electronic chemical potential and $\eta$ the chemical hardness, the electrophilicity index may be obtained by minimizing $\Delta E$ with respect to $\Delta N$ ($\partial \Delta E / \partial \Delta N = 0$). The maximum electron transfer equals:
$$\Delta N_{max} = -\frac{\mu}{\eta} \tag{3.83}$$
and the associated stabilization energy:
$$\Delta E = -\frac{\mu^2}{2\eta} \tag{3.84}$$
The quantity $\omega = \mu^2 / 2\eta$ is identified as the electrophilicity. In a finite-difference approximation, using a quadratic model for the E versus N plot, $\omega$ can be written as:
$$\omega \approx \frac{(IE + EA)^2}{8\,(IE - EA)} \tag{3.85}$$
Eq. 3.85 indicates that $\omega$ depends on the electron affinity. However, EA quantifies the ability to accept exactly one electron, while $\omega$ is related to the maximum electron flow. $\omega$ depends on the hardness and the chemical potential, both global properties, making $\omega$ also a global quantity. A local counterpart can be identified based on the additivity of the global softness:
$$\omega = \frac{\mu^2}{2}\,S = \frac{\mu^2}{2}\sum_k s_k^{+} \tag{3.86}$$
The local electrophilicity is then given by:
$$\omega_k^{+} = \omega\,f_k^{+} \tag{3.87}$$
For recent extensions to the spin polarized case see ref. 28.
The electrophilicity measures the reactivity towards an electrophilic attack. The nucleophilicity can be considered as the analogous reactivity descriptor for a nucleophilic reaction. However, a suitable expression for the nucleophilicity has not been identified yet.
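The relation between the quadratic model (eq. 3.82), the maximum electron transfer (eq. 3.83) and the finite-difference electrophilicity (eq. 3.85) can be checked numerically. The sketch below works directly with IE and EA and assumes, as implied by eqs. 3.82 to 3.85, that the curvature of the quadratic model equals IE - EA; the input values and the condensed Fukui functions are hypothetical.

```python
def electrophilicity(ie: float, ea: float) -> float:
    """Finite-difference electrophilicity, eq. 3.85."""
    return (ie + ea) ** 2 / (8.0 * (ie - ea))


def quadratic_energy_change(ie: float, ea: float, dn: float) -> float:
    """Second order model of eq. 3.82 with mu = -(IE + EA)/2 and curvature IE - EA."""
    mu = -(ie + ea) / 2.0
    curvature = ie - ea  # assumption: second derivative of the quadratic E(N) model
    return mu * dn + 0.5 * curvature * dn ** 2


def max_electron_transfer(ie: float, ea: float) -> float:
    """Electron uptake that minimizes the quadratic model, eq. 3.83."""
    mu = -(ie + ea) / 2.0
    return -mu / (ie - ea)


def local_electrophilicity(omega: float, f_plus: dict) -> dict:
    """Condensed local electrophilicity, eq. 3.87."""
    return {site: omega * f for site, f in f_plus.items()}


ie, ea = 10.0, 2.0  # hypothetical values in eV
omega = electrophilicity(ie, ea)
dn_max = max_electron_transfer(ie, ea)
stabilization = quadratic_energy_change(ie, ea, dn_max)
omega_local = local_electrophilicity(omega, {"A1": 0.4, "A2": 0.5, "A3": 0.1})

print(f"omega = {omega:.3f} eV, dN_max = {dn_max:.3f}, dE(dN_max) = {stabilization:.3f} eV")
print(omega_local)
```

The energy lowering at $\Delta N_{max}$ equals $-\omega$, and the local values add up to the global electrophilicity because the condensed Fukui functions sum to one.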
3.6.4 Nucleofugality
The nucleofugality29 represents the ability of a group of atoms (the nucleofuge) to act as a leaving group. It is related to the molecular fragment's ability to accept an electron, since a nucleofuge takes an electron with it upon dissociation. Prior to its expulsion, the nucleofuge is covalently linked to the electrophilic part of the molecule. Consider this part as a perfect electron donor, which transfers its electron without a barrier to an acceptor. The nucleofuge will accept an amount of charge ($\Delta q_{ideal}$) upon contact with this perfect electron donor. Therefore, the charge on the nucleofuge equals $q + \Delta q_{ideal}$ when covalently bound to the electrophilic part of the molecule. Upon dissociation, a nucleofuge must take an entire electron with it, and thus changes its charge from $q + \Delta q_{ideal}$ to $q - 1$, leading to a destabilization energy $\Delta E_{nucleofuge}$ defined as the difference in energy between the product ($q - 1$) and the reactant ($q + \Delta q_{ideal}$):
$$\Delta E_{nucleofuge} = E(q - 1) - E(q + \Delta q_{ideal}) \tag{3.88}$$
This equation can be rewritten as:
$$\Delta E_{nucleofuge} = -EA + \omega = \frac{(\mu + \eta)^2}{2\eta} \tag{3.89}$$
indicating that the destabilization energy $\Delta E_{nucleofuge}$ is related to the electron affinity. In a finite-difference approximation eq. 3.89 can be expressed in terms of the vertical IE and EA of the nucleofuge:
$$\Delta E_{nucleofuge} = \frac{(IE - 3EA)^2}{8\,(IE - EA)} \tag{3.90}$$
The nucleofugality is inversely related to the destabilization energy Enucleofuge (eq. 3.89) - which is a kind of activation energy to overcome when a molecule is forced to accept an entire electron29:
E nucleofuge
(3.91)
with = 1.841 eV-1 ( has been chosen so that the nucleofugality of the hydride anion is equal to 1, for further details, see ref. 29).
The ability of a nucleofuge to act as a good or bad leaving group depends on the position of the minimum of the E versus N curve. If $\Delta q_{ideal} < -1$ the nucleofuge is a perfect leaving group. If $\Delta q_{ideal} > -1$ energy is needed to split off the nucleofuge. A descriptor related to the nucleofugality is the electrofugality29. This is the energy needed to withdraw an electron from a molecular fragment as compared to the case of a perfect electron donor:
$$\Delta E_{electrofuge} = E(q + 1) - E(q + \Delta q_{ideal}) \tag{3.92}$$
or in terms of IE and EA:
$$\Delta E_{electrofuge} = \frac{(3IE - EA)^2}{8\,(IE - EA)} \tag{3.93}$$
The nucleofugality and the electrofugality can be used to assess the thermodynamic stability of the nucleofuge and the electrofuge. The nucleofugality indicates the relative stability of an electron acceptor $N^{q-1}$ compared to the acceptor fragment $N^{q + \Delta q_{ideal}}$ in the presence of a perfect electron donor. Analogously, the electrofugality indicates the relative stability of an electron donor $E^{q+1}$ compared to the donor fragment $E^{q + \Delta q_{ideal}}$. The electrophilicity, electrofugality and nucleofugality form a complete set of reactivity indices. They quantify the relative energy of, respectively, the reference system ($N_0$ electrons), the corresponding cation ($N_0 - 1$ electrons) and anion ($N_0 + 1$ electrons) in contact with a perfect electron donor. Figure 3.3 gives an overview.
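Figure 3.3: E versus N plot indicating the relation between the ionisation energy (IE), electron affinity (EA), electrophilicity ($\omega$), nucleofugality ($\Delta E_{nucleofuge}$) and electrofugality ($\Delta E_{electrofuge}$). [Plot of E against N through the points $N_0 - 1$, $N_0$ and $N_0 + 1$, with the labels: $\Delta E_{electrofuge} = E(N_0 - 1) - E(N_0 + \Delta N_{ideal})$; $IE = E(N_0 - 1) - E(N_0)$; $\Delta E_{nucleofuge} = E(N_0 + 1) - E(N_0 + \Delta N_{ideal})$; $EA = E(N_0) - E(N_0 + 1)$; $\omega = E(N_0) - E(N_0 + \Delta N_{ideal})$.]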
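The finite-difference expressions for the three indices (eqs. 3.85, 3.90 and 3.93) are linked through IE and EA, as the figure indicates. The short sketch below evaluates them for hypothetical IE and EA values and checks the relations $\Delta E_{nucleofuge} = \omega - EA$ (eq. 3.89) and $\Delta E_{electrofuge} = \omega + IE$; the function names are illustrative.

```python
def electrophilicity(ie: float, ea: float) -> float:
    """Finite-difference electrophilicity, eq. 3.85."""
    return (ie + ea) ** 2 / (8.0 * (ie - ea))


def nucleofuge_energy(ie: float, ea: float) -> float:
    """Destabilization energy of the nucleofuge, eq. 3.90."""
    return (ie - 3.0 * ea) ** 2 / (8.0 * (ie - ea))


def electrofuge_energy(ie: float, ea: float) -> float:
    """Destabilization energy of the electrofuge, eq. 3.93."""
    return (3.0 * ie - ea) ** 2 / (8.0 * (ie - ea))


ie, ea = 10.0, 2.0  # hypothetical vertical IE and EA in eV
omega = electrophilicity(ie, ea)
de_nucleofuge = nucleofuge_energy(ie, ea)
de_electrofuge = electrofuge_energy(ie, ea)

print(f"omega            = {omega:.3f} eV")
print(f"dE(nucleofuge)   = {de_nucleofuge:.3f} eV  (omega - EA = {omega - ea:.3f})")
print(f"dE(electrofuge)  = {de_electrofuge:.3f} eV  (omega + IE = {omega + ie:.3f})")
```

Both destabilization energies are measured from the minimum of the quadratic E versus N curve, which is why they differ from EA and IE by exactly $\omega$.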
3.7 Hard and soft acids and bases (HSAB) principle

The concepts of hardness and softness of a system were already introduced by Pearson in the 1960s30. They were at that time used in the explanation of acid-base reactions in their most general form (Lewis acid/Lewis base). Pearson stated that: "Hard acids prefer to react with hard bases whereas soft acids prefer to interact with soft bases". This is known as the hard and soft acids and bases (HSAB) principle. According to this principle and in analogy with earlier work by Gázquez31 and our group32 and its generalization by Ponti33, the (preferred) reactivity between the reaction partners can be based on the difference in local softness $\Delta s(\mathbf{r})$ of the interacting parts (atoms, functional groups, ...) of these reaction partners:
$$\Delta s(\mathbf{r}) = |s^{+}(\mathbf{r}) - s^{-}(\mathbf{r})| \tag{3.94}$$
which should be minimal for optimal interaction, a criterion used throughout this work.
To quantitatively predict reaction rates, one should locate the transition states and compute activation energies, which is a difficult task. The HSAB principle offers the advantage that the characteristics of a reaction (mainly kinetic aspects) are described in terms of the properties of the reagents in the ground state, without explicit numerical calculation of characteristics along the reaction path. The use of the HSAB principle is indeed based on a perturbational ansatz as formulated by Parr7. Assuming that reaction paths of similar reactions (e.g. differing only in the substitution pattern of one reagent) will not cross (Klopman's rule34), the relative energies at the beginning of the reaction can be expected to predict a sequence of activation energies. As such, application of the HSAB principle allows the deduction of relative activation energies from information on the reactant properties only. This principle offers the possibility to interpret and to predict the results of reaction path calculations, going along with Parr's dictum: "To compute is not to understand"35.
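The softness matching criterion of eq. 3.94 amounts to a search for the pair of sites with the smallest difference in local softness. The sketch below shows this search for hypothetical condensed local softness values of two electrophilic sites (E1, E2) and two nucleophilic sites (Nu1, Nu2); the labels and numbers are placeholders, not results of this work.

```python
def best_hsab_pair(s_plus: dict, s_minus: dict):
    """Return (electrophilic site, nucleophilic site, delta_s) with minimal |s+ - s-|."""
    candidates = (
        (acceptor, donor, abs(sp - sm))
        for acceptor, sp in s_plus.items()
        for donor, sm in s_minus.items()
    )
    return min(candidates, key=lambda item: item[2])


# Hypothetical condensed local softness values (arbitrary but consistent units).
s_plus = {"E1": 1.20, "E2": 0.60}     # sites prone to nucleophilic attack
s_minus = {"Nu1": 1.05, "Nu2": 0.35}  # sites prone to electrophilic attack

acceptor, donor, delta_s = best_hsab_pair(s_plus, s_minus)
print(f"preferred interaction: {donor} -> {acceptor} (delta s = {delta_s:.2f})")
```

Only ground state properties of the separate reagents enter this comparison, in line with the perturbational character of the HSAB approach.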
4. Basis sets
4.1 Slater and Gaussian type orbitals
As discussed in previous sections, the molecular orbitals in the Hartree-Fock treatment are developed as a linear combination of nuclear-centred basis functions. There are two types of basis functions used for electronic structure calculations. The first type is the Slater Type Orbital (STO)36:
$$\phi_{nlm}^{ST}(r, \theta, \varphi; \zeta) = N_{nor}\,r^{n-1}\,e^{-\zeta r}\,Y_{lm}(\theta, \varphi) \tag{3.95}$$
in which $N_{nor}$ is a normalisation constant, n, l and m are the quantum numbers, $\zeta$ the Slater orbital exponent, r, $\theta$ and $\varphi$ the spherical coordinates of the electron relative to the nucleus, and the functions $Y_{lm}(\theta, \varphi)$ are the spherical harmonics. The exponential dependence on the distance between the nucleus and the electron mirrors the exact orbital for the hydrogen atom. However, the three- and four-centre two-electron integrals cannot be evaluated analytically37. The second type of orbital, which is the one mostly used, is the Gaussian Type Orbital (GTO)38:
$$\phi^{GT}(x, y, z; \alpha) = N_{nor}\,x^{p}\,y^{q}\,z^{s}\,e^{-\alpha r^2} \tag{3.96}$$
with $N_{nor}$ a normalisation constant, p, q and s non-negative integers (p + q + s = l) and $\alpha$ the Gaussian orbital exponent. As can be seen from eq. 3.95 and eq. 3.96, these two types of basis functions show a different radial behaviour. The GTO behaviour near the nucleus and at long distances is however incorrect: the GTO falls off too rapidly far from the nucleus and the $r^2$ dependence in the exponential makes the GTO a poor representation of the proper behaviour near the nucleus. The STO shows the correct radial behaviour, but the two-electron integral evaluation using these basis functions becomes an extremely difficult task for polyatomic molecules. GTOs on the other hand are easier to evaluate, as e.g. the product of two GTOs results in one GTO. In practice, a compromise is sought between the computational efficiency of the GTOs and the correct form of the STOs. Therefore, a number of Gaussians (called uncontracted or primitive functions) is contracted in a linear combination to fit an STO39:
$$\phi^{CGT} = \sum_{j=1}^{K} C_j\,\phi_j^{GT}(x, y, z; \alpha_j) \tag{3.97}$$
with K the degree of contraction. The expansion coefficients and orbital exponents of the primitive Gaussians in eq. 3.97 can now be optimized so that $\phi^{CGT}$ approximates a Slater type function, leading to an STO-KG basis set.
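The quality of such a contraction can be inspected numerically. The sketch below compares a 1s Slater function with $\zeta = 1$ to a contraction of three primitive Gaussians, using the exponents and coefficients commonly tabulated for STO-3G (quoted here as assumed values, not taken from this text), and integrates on a radial grid with a hand-written trapezoidal rule.

```python
import numpy as np

ZETA = 1.0
# Commonly tabulated STO-3G parameters for a 1s function with zeta = 1 (assumed values).
EXPONENTS = np.array([0.109818, 0.405771, 2.22766])
COEFFICIENTS = np.array([0.444635, 0.535328, 0.154329])


def slater_1s(r, zeta=ZETA):
    """Normalized 1s Slater function, eq. 3.95 with n = 1, l = m = 0."""
    return np.sqrt(zeta ** 3 / np.pi) * np.exp(-zeta * r)


def gaussian_1s(r, alpha):
    """Normalized s-type primitive Gaussian, eq. 3.96 with p = q = s = 0."""
    return (2.0 * alpha / np.pi) ** 0.75 * np.exp(-alpha * r ** 2)


def contracted_1s(r):
    """Contracted Gaussian of eq. 3.97 with K = 3."""
    return sum(c * gaussian_1s(r, a) for c, a in zip(COEFFICIENTS, EXPONENTS))


def radial_integral(values, r):
    """Trapezoidal integration of a spherically symmetric function over all space."""
    integrand = 4.0 * np.pi * r ** 2 * values
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))


r = np.linspace(0.0, 40.0, 400001)  # radial grid in bohr, spacing 1e-4
sto = slater_1s(r)
cgto = contracted_1s(r)

norm_sto = radial_integral(sto ** 2, r)
norm_cgto = radial_integral(cgto ** 2, r)
overlap = radial_integral(sto * cgto, r)

print(f"<STO|STO> = {norm_sto:.5f}, <CGTO|CGTO> = {norm_cgto:.5f}, <STO|CGTO> = {overlap:.5f}")
print(f"value at the nucleus: STO {sto[0]:.4f}, CGTO {cgto[0]:.4f}")
```

The overlap is close to one, while the values at the nucleus and at large r show the two shortcomings of Gaussians mentioned above: no cusp at the nucleus and a too rapid decay far from it.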
4.2 Minimal basis sets

The use of minimal basis sets (i.e. basis sets containing just enough basis functions to accommodate all the electrons of the atom) constitutes the simplest level of ab initio molecular orbital theory40-42. The essential idea of the minimal basis set is selecting one basis function for every atomic orbital, including all sub shells. For hydrogen, the minimal basis set is just one 1s orbital. For carbon, the minimal basis set consists of a 1s orbital, a 2s orbital and a set of three 2p orbitals. The most common minimal basis set is the STO-nG type, in which n primitive GTOs are combined to fit an STO. The next steps up in basis set size are Double Zeta (DZ) and Triple Zeta (TZ), which have two and three times the number of functions of the minimal basis set, followed by Quadruple Zeta (QZ) and Quintuple Zeta (5Z).
4.3 Split valence basis set

As seen in the previous paragraph, minimal basis sets can be built from a combination of n primitive GTOs. In the same way, Pople et al.42-47 designed the split valence basis sets. They are denoted as k-nlmG basis sets, in which each core orbital is represented by a single contraction of k primitive GTOs, while each valence orbital is split into three regions, represented by n, l and m primitive GTOs respectively42-47.
4.4 Polarization functions, diffuse functions

A better representation of the chemical bonding can be obtained by adding polarization functions. For example, the chemical bond involving a hydrogen atom is poorly described by using only its s-orbital, because of the spherical symmetry of the s-functions. The electron distribution along the bond should be clearly different from that in any other direction. Consequently, adding a contribution of a p-orbital to the s-orbital will improve the description of the bond. For the same reason d-type functions are needed for the first row atoms. The presence of polarization functions is denoted by the symbol '*', which means that polarization functions are only added to the heavy atoms (d-functions polarizing the p-functions). '**' denotes that polarization functions are also added to the hydrogen atoms (p-functions polarizing the s-functions). If methods including electron correlation are used, polarization functions are needed. Indeed, to describe the situation in which two electrons are on opposite sides of the nucleus, one needs functions with the same magnitude of exponents, but with different angular momenta. In a similar way, one can add diffuse functions, i.e. functions with small exponents. The primary argument is to extend the valence region of an atom, which is very convenient when considering systems such as anions or excited states. Diffuse functions are denoted by + or ++. The first + indicates that one set of diffuse s- and p-functions is added to heavy atoms, the second + indicates that a diffuse s-function is added to hydrogen atoms.
5. Molecular quantities
5.1 The electron density function
The electron density function ρ(r) is defined as the probability of finding an electron in the volume dr around r. In terms of the wave function Ψ, this function can be expressed as:

$$\rho(\mathbf{r}) = N \int \cdots \int \Psi^{*}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)\, \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)\, d\sigma_1\, d\mathbf{x}_2 \cdots d\mathbf{x}_N \tag{3.98}$$

in which the integration is performed over all the variables of the wave function, except for the spatial coordinates of one electron. This function must integrate to the total number of electrons N within the molecule:

$$\int \rho(\mathbf{r})\, d\mathbf{r} = N \tag{3.99}$$

Within the Hartree-Fock approximation (RHF case), eq. 3.99 yields:

$$\rho(\mathbf{r}) = \sum_{\mu=1} \sum_{\nu=1} P_{\mu\nu}\, \phi_{\mu}(\mathbf{r})\, \phi_{\nu}^{*}(\mathbf{r}) \tag{3.100}$$

in which the summation runs over the indices μ and ν of all basis functions.
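The normalization condition (3.99) can be checked for the basis set expansion (3.100): integrating term by term replaces each product of basis functions by an overlap integral S, so that N equals the sum of P times S over all pairs of basis functions. The sketch below illustrates this for H2 in a minimal basis. It assumes real basis functions and the usual closed-shell density matrix (twice the sum over occupied orbitals of products of MO coefficients); the overlap value 0.66 is an arbitrary illustrative number, not a computed one.

```python
import math


def rhf_density_matrix(c, n_occupied):
    """P[mu][nu] = 2 * sum_a c[mu][a] * c[nu][a] over the occupied MOs.

    c[mu][a] is the (real) coefficient of basis function mu in MO a.
    """
    n = len(c)
    return [[2.0 * sum(c[mu][a] * c[nu][a] for a in range(n_occupied))
             for nu in range(n)] for mu in range(n)]


def electron_count(p, s):
    """Integral of rho(r) over all space: N = sum over mu, nu of P * S."""
    n = len(p)
    return sum(p[mu][nu] * s[mu][nu] for mu in range(n) for nu in range(n))


def h2_example(overlap=0.66):
    """Electron count of H2 in a minimal basis of two 1s functions.

    The overlap integral is an assumed illustrative value.
    """
    bonding = 1.0 / math.sqrt(2.0 * (1.0 + overlap))      # normalized sigma_g
    antibonding = 1.0 / math.sqrt(2.0 * (1.0 - overlap))  # normalized sigma_u
    c = [[bonding, antibonding],
         [bonding, -antibonding]]
    s = [[1.0, overlap],
         [overlap, 1.0]]
    p = rhf_density_matrix(c, n_occupied=1)  # only sigma_g is doubly occupied
    return electron_count(p, s)


if __name__ == "__main__":
    print(h2_example())
```

Whatever value is chosen for the overlap, the normalized bonding orbital guarantees that the density integrates to two electrons.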
5.2 The atomic electron population

Chemists often like to ascribe portions of the total electronic charge to specific atoms in molecules. There are several ways of achieving this. In the following paragraphs the population analysis methods important for this study will be discussed.
5.2.1 Orbital-based population analysis methods: the natural population analysis method
The natural population analysis method (NPA), developed by Reed, Weinstock and Weinhold48, attempts to define atomic orbitals based on the molecular wave function. As a result, atomic orbitals are obtained that depend on the chemical environment of the atom. This approach is based on the first order reduced density matrix γ(x1, x1'), defined as:

$$\gamma(\mathbf{x}_1, \mathbf{x}'_1) = N \int \cdots \int \Psi^{*}(\mathbf{x}'_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)\, \Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N)\, d\mathbf{x}_2 \cdots d\mathbf{x}_N \tag{3.101}$$

The orbitals obtained by diagonalizing the reduced density matrix are called the natural orbitals, and the corresponding eigenvalues (the diagonal elements of the diagonalized matrix) are the occupation numbers. These natural orbitals are orthonormal molecular orbitals having maximum occupancy. The natural atomic orbitals are, by analogy, the atomic orbitals having maximum occupancy and are obtained as eigenfunctions of the atomic subblocks of the density matrix. Reed, Weinstock and Weinhold defined these subblocks and obtained eigenfunctions that are orthonormal, not only within the subblock, but also with respect to all the other eigenfunctions, leading to the NPA charges.
5.2.2 Electrostatic potential derived charges
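An alternative method for obtaining atomic charges is to fit the electrostatic potential to a series of point charges q_k centered on the atomic nuclei at positions R_k. This monopole expansion V_M is given by:

$$V_M(\mathbf{r}) = \sum_{k} \frac{q_k}{|\mathbf{r} - \mathbf{R}_k|} \tag{3.102}$$

The best least squares fit is obtained by minimizing y over the m points r_i at which the potential is evaluated:

$$y(q_1, q_2, \ldots, q_k) = \sum_{i=1}^{m} \left( V(\mathbf{r}_i) - V_M(\mathbf{r}_i) \right)^2 \tag{3.103}$$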
with the constraint that the total molecular charge should be preserved. The electrostatic potential derived charges used in this work were obtained by the so-called ChelpG method, designed by Breneman and Wiberg49.
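The constrained least squares fit of eqs. 3.102 and 3.103 can be sketched as follows. This is only an illustration of the fitting step, assuming the charge constraint is imposed with a Lagrange multiplier; it does not reproduce the point selection of the ChelpG method, and the geometry, grid and reference charges in the demonstration are hypothetical values in arbitrary units.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def monopole_potential(charges, centers, point):
    """V_M(r) of eq. 3.102 for point charges located at the given centers."""
    return sum(q / math.dist(point, center)
               for q, center in zip(charges, centers))


def fit_esp_charges(centers, grid, potential, total_charge):
    """Minimize y of eq. 3.103 subject to sum(q) = total_charge."""
    k = len(centers)
    inv_r = [[1.0 / math.dist(p, center) for center in centers] for p in grid]
    a = [[0.0] * (k + 1) for _ in range(k + 1)]
    b = [0.0] * (k + 1)
    for j in range(k):
        for l in range(k):
            a[j][l] = sum(row[j] * row[l] for row in inv_r)
        a[j][k] = a[k][j] = 1.0        # Lagrange multiplier row and column
        b[j] = sum(v * row[j] for v, row in zip(potential, inv_r))
    b[k] = total_charge
    return solve_linear(a, b)[:k]      # drop the multiplier


def demo():
    """Recover hypothetical reference charges from their own potential."""
    centers = [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)]
    reference = [-0.8, 0.4, 0.4]
    axis = (-3.0, -1.5, 0.0, 1.5, 3.0)
    grid = [(x, y, z) for x in axis for y in axis for z in axis
            if all(math.dist((x, y, z), c) > 1.2 for c in centers)]
    potential = [monopole_potential(reference, centers, p) for p in grid]
    return reference, fit_esp_charges(centers, grid, potential, sum(reference))


if __name__ == "__main__":
    print(demo())
```

Because the target potential in the demonstration is generated by point charges, the fit returns the reference charges; with a quantum chemically computed potential the fitted charges would only approximate it.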
6. Solvent effects
Solvent effects play an important role in determining equilibrium constants, selectivity and conformational behaviour. The large majority of methods describing chemical processes in solution are based on continuum models involving a bulk dielectric constant for the solvent and a cavity surrounding the solute molecule50. The shape and size of the cavity are defined differently in the various versions of the continuum models. The optimal size and shape of the cavity have been the subject of debate, and several definitions have been proposed. The cavity should exclude the solvent and should contain within its boundaries the largest possible part of the solute charge distribution ρM. Obviously, these requirements conflict with the description of the whole system given by any quantum chemical level. The electronic charge distribution of an isolated molecule, in fact, persists to infinity. In a condensed medium the conditions on ρM at large distances are less well-defined, but in any case there will be an overlap with the charge distribution of the medium, which is not explicitly described in continuum models but exists in real systems. It is universally accepted that the cavity shape should reproduce the molecular shape as well as possible. Cavities not respecting this condition may lead to deformations in the charge distribution after solvent polarization, giving large unrealistic effects on the results, especially on properties. Here, once again, there is a trade-off between computational demands and the desire for better accuracy. Computations are far simpler and faster when simple shapes are used, such as spheres and ellipsoids, but molecules are often far from having a spherical or ellipsoidal shape. In the following paragraphs, the widely used polarizable continuum model and an improvement of this model are discussed.
6.1 The PCM model

In the polarizable continuum model (PCM)51-53, the solute cavity is defined by a set of overlapping atom-centred spheres having the Van der Waals radius multiplied by a constant, since the first hydration shell has dielectric properties different from those of bulk solvent. The surrounding solvent is represented by an infinite, unstructured, polarizable dielectric medium outside the boundaries of the cavity. The general basis of the PCM model can be expressed in the following way: the Hamiltonian of the system is partitioned into two parts, regarding solute (M) and solvent (S), supplemented by a coupling term:

$$H_{tot} = H_M + H_S + H_{MS} \tag{3.104}$$

In practice, H_S is discarded and attention is focused on the Hamiltonian regarding M, with inclusion of the M-S coupling as an effective interaction operator V_int:

$$H_M = H_M^{0} + V_{int} \tag{3.105}$$

V_int results from the charge distribution ρM in the cavity, which polarizes the continuum, which in turn polarizes the solute charge distribution. It is the sum of the electrostatic potential V_M generated by the charge distribution ρM in the cavity and the reaction potential V_σ generated by the polarization of the dielectric medium.

$$V_{int} = V_{\sigma} + V_M \tag{3.106}$$

To obtain the reaction potential V_σ, the cavity surface is approximated in terms of a set of finite elements (called tesserae) small enough to consider the apparent surface charge σ(r) almost constant within each tessera. With σ(r) completely defined point-by-point, it is possible to define a set of point charges q_k on each of these tesserae in terms of the local value of σ(r) multiplied by the corresponding area A_k.

$$V_{\sigma}(\mathbf{r}) \approx \sum_{k} \frac{\sigma(\mathbf{r}_k)\, A_k}{|\mathbf{r} - \mathbf{r}_k|} = \sum_{k} \frac{q_k}{|\mathbf{r} - \mathbf{r}_k|} \tag{3.107}$$

q_k depends on the dielectric constant that characterizes the solvent and on the electric potential V_int describing the solute-solvent interactions. Since neither V_int nor q_k is known initially, the apparent surface charges are found by an iterative process. The converged charges are used to find the potential energy of electrostatic interaction.
6.2 The SCI-PCM model

The self-consistent isodensity polarizable continuum model (SCI-PCM)54-56 adopts the PCM scheme to account for the solvation effects, but differs from the PCM model in the definition of the cavity. In the SCI-PCM model, the cavity is defined based on an isosurface of the total electron density, giving a more accurate description of the cavity than the PCM model does. The density isosurface level is typically taken in the range 0.0004 to 0.001 a.u.
References
1. Jensen, F., Introduction to Computational Chemistry, John Wiley & Sons, Chichester, 1999.
2. Levine, I. N., Quantum Chemistry, 4th ed., Prentice Hall, New Jersey, 1991.
3. Born, M., Oppenheimer, J. R., Ann. Physik 1927, 84, 457.
4. Born, M., Huang, K. C., Dynamical Theory of Crystal Lattices, Chapter IV and Appendices VII and VIII, Oxford, 1954.
5. Szabo, A., Ostlund, N. S., Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory, 1st rev. ed., McGraw-Hill, New York, 1989.
6. Hehre, W. J., Radom, L., Schleyer, P. v. R., Pople, J. A., Ab initio molecular orbital theory, Wiley, New York, 1986.
7. Parr, R. G., Yang, W., Density Functional Theory of Atoms and Molecules, Oxford University Press, New York, 1989.
8. Koch, W., Holthausen, M. C., A Chemist's Guide to Density Functional Theory, 2nd ed., Wiley-VCH, Weinheim, 2001.
9. Møller, C., Plesset, M. S., Phys. Rev. 1934, 46, 618.
10. Bartlett, R. J., J. Phys. Chem. 1989, 93, 1697.
11. Gelfand, I. M., Fomin, S., Calculus of Variations, New Jersey, 1963.
12. a. Roothaan, C. C. J., Sachs, L. M., Weiss, A. W., Revs. Mod. Phys. 1960, 32, 186-194. b. Hall, G. G., Proc. Roy. Soc. (London) 1951, A205, 541.
13. Roothaan, C. C., Revs. Mod. Phys. 1951, 23, 69.
14. Pople, J. A., Nesbet, R. K., J. Chem. Phys. 1954, 22, 571.
15. Parr, R. G., Ann. Rev. Phys. Chem. 1983, 34, 631.
16. Dreizler, R. M., Gross, E. K. U., Density Functional Theory, New York, 1990.
17. Parr, R. G., Yang, W. T., Ann. Rev. Phys. Chem. 1995, 46, 701.
18. Kohn, W., Becke, A. D., Parr, R. G., J. Phys. Chem. 1996, 100, 12974.
19. Hohenberg, P., Kohn, W., Phys. Rev. B 1964, 136, 864.
20. Lieb, E. H., Rev. Mod. Phys. 1981, 53, 603.
21. Kohn, W., Sham, L. J., Phys. Rev. A 1965, 140, 1133.
22. Lee, C., Yang, W., Parr, R. G., Phys. Rev. B 1988, 37, 2.
23. Becke, A. D., J. Chem. Phys. 1993, 98, 5648.
24. Yang, W., Parr, R. G., Proc. Natl. Acad. Sci. 1985, 82, 6723.
25. Parr, R. G., Yang, W., J. Am. Chem. Soc. 1984, 106, 4049.
26. Parr, R. G., Szentpaly, L. V., Liu, S., J. Am. Chem. Soc. 1999, 121, 1922.
27. Maynard, A. T., Huang, M., Rice, W. G., Covell, D. G., Proc. Natl. Acad. Sci. USA 1998, 95, 11578.
28. a. Perez, P., Andres, J., Safont, V. S., Tapia, O., Contreras, R., J. Phys. Chem. A 2002, 106, 5353. b. Olah, J., De Proft, F., Veszpremi, T., Geerlings, P., J. Phys. Chem. A 2004, 108, 490.
29. a. Ayers, P. W., Anderson, J. S. M., Rodriguez, J. I., Jawed, Z., Phys. Chem. Chem. Phys. 2005, 7, 1918. b. Ayers, P. W., Anderson, J. S. M., Bartolotti, J. L., Int. J. Quant. Chem. 2005, 101, 520.
30. Pearson, R. G., Chemical Hardness, Wiley-VCH, Weinheim, Germany, 1997.
31. Gázquez, J. L., J. Phys. Chem. A 1997, 101, 4657.
32. a. Nguyen, L. T., Le, T. N., De Proft, F., Chandra, A. K., Langenaeker, W., Nguyen, M. T., Geerlings, P., J. Am. Chem. Soc. 1999, 121, 5992. b. Geerlings, P., De Proft, F., Int. J. Quant. Chem. 2000, 80, 227. c. Nguyen, L. T., De Proft, F., Nguyen, M. T., Geerlings, P., J. Org. Chem. 2001, 66, 4316. d. Nguyen, L. T., De Proft, F., Nguyen, M. T., Geerlings, P., J. Chem. Soc. Perkin Transactions 2001, 2, 898. e. De Proft, F., Geerlings, P., in Recent Advances in Density Functional Methods III, Barone, V., Bencini, A., Fantucci, P., eds., World Scientific Publishing Co, New Jersey, 2002.
33. Ponti, A., J. Phys. Chem. A 2000, 104, 8843.
34. Klopman, G., in Chemical Reactivity and Reaction Paths, Klopman, G., ed., J. Wiley, New York, 1974.
35. Parr, R. G., Density Functional Theory in Chemistry, in Density Functional Methods in Physics, Dreizler, R. M., da Providencia, J., eds., Plenum, 1985.
36. Slater, J. C., Phys. Rev. 1930, 36, 57.
37. Huzinaga, S., J. Chem. Phys. 1965, 42, 1293.
38. Boys, S. F., Proc. Roy. Soc. (London) 1950, A200, 542.
39. Hehre, W. J., Stewart, R. F., Pople, J. A., J. Chem. Phys. 1969, 51, 2657.
40. Hehre, W. J., Ditchfield, R., Stewart, R. F., Pople, J. A., J. Chem. Phys. 1970, 52, 2769.
41. Pietro, W. J., Levi, B. A., Hehre, W. J., J. Am. Chem. Soc. 1980, 102, 2225.
42. Binkley, J. S., Pople, J. A., Hehre, W. J., J. Am. Chem. Soc. 1980, 102, 939.
43. Gordon, M. S., Binkley, J. S., Pople, J. A., Pietro, W. J., Hehre, W. J., J. Am. Chem. Soc. 1982, 104, 2797.
44. Dobbs, K. D., Hehre, W. J., J. Comp. Chem. 1986, 7, 359.
45. Hehre, W. J., Ditchfield, R., Pople, J. A., J. Chem. Phys. 1972, 56, 2257.
46. Binkley, J. S., Pople, J. A., J. Chem. Phys. 1977, 66, 879.
47. Krishnan, R., Frisch, M. J., Pople, J. A., J. Chem. Phys. 1980, 72, 4244.
48. Reed, A. E., Curtiss, L. A., Weinhold, F., Chem. Rev. 1988, 88, 899.
49. Breneman, C. M., Wiberg, K. B., J. Comp. Chem. 1990, 11, 361.
50. Tomasi, J., Mennucci, B., Cammi, R., Chem. Rev. 2005, 105, 2999.
51. Miertus, S., Scrocco, E., Tomasi, J., Chem. Phys. 1981, 55, 117-129.
52. Cossi, M., Barone, V., Cammi, R., Tomasi, J., Chem. Phys. Lett. 1996, 255, 327-335.
53. Amovilli, C., Barone, V., Cammi, R., Cances, E., Cossi, M., Mennucci, B., Pomelli, C. S., Tomasi, J., Advances in Quantum Chemistry, Vol. 32: Quantum Systems in Chemistry and Physics, Pt II 1999, 32, 227.
54. Wiberg, K. B., Keith, T. A., Frisch, M. J., Murcko, M., J. Phys. Chem. 1995, 99, 9072.
55. Foresman, J. B., Keith, T. A., Wiberg, K. B., Snoonian, J., Frisch, M. J., J. Phys. Chem. 1996, 100, 16096.
56. Foresman, J. B., Frisch, A. E., Exploring Chemistry with Electronic Structure Methods, 2nd ed., Gaussian, Inc., Pittsburgh, 1996.
CHAPTER IV A computational and conceptual DFT study on the Michaelis complex of pI258 arsenate reductase: structural aspects and activation of the electrophile and nucleophile
The human mind treats a new idea the way the body treats a strange protein; it rejects it.
(Peter Medawar)
Chapter IV: Michaelis complex of pI258 ArsC

The first step in the reduction of arsenate to arsenite catalyzed by the enzyme arsenate reductase (ArsC) from S. aureus plasmid pI258 involves the nucleophilic attack of a cysteine thiolate (Cys10) on the arsenic atom, leading to a covalent sulfur-arseno intermediate. We present a quantum chemical study on the onset of the nucleophilic displacement reaction. To optimize the reactant state geometry, a density functional study was performed on Cys10, on di-anionic arsenate and on the catalytic site sequence motif: X-X-Asn13-X-X-Arg16-Ser17. Both the hydrogen bond from Arg16 to the leaving hydroxyl group of arsenate and the hydrogen bonds from various backbone amide nitrogens of the catalytic site to the other oxygen atoms of arsenate are responsible for the increased electrophilicity of the central arsenic atom. Arg16 in particular is identified as a residue that destabilizes the ground state of the complex. Further, the binding of di-anionic arsenate to the enzyme induces negative charge transfer from the substrate to ArsC, which renders arsenic more receptive to nucleophilic attack. On the other hand, an α-helical macrodipole and a K+-Cys10 interaction network via Asn13 and Ser17 activate the nucleophile and stabilize the thiolate form of Cys10 by lowering its pKa to 6.0. By dissecting these interactions and performing a reactivity analysis, the experimentally measured steady-state kinetic data and the function of crucial interactions observed in the X-ray structures of ArsC are illuminated.
1. Introduction
In this chapter we focus on the first step of the catalytic mechanism of pI258 arsenate reductase (ArsC), consisting of a phosphatase-like (PTPase) nucleophilic displacement reaction carried out by Cys10 on arsenate1 (Fig. 2.3, Chapter II). Knowledge of the correct protonation state of the enzyme-bound substrate is crucial for understanding the first reaction step of the ArsC mechanism. At the pH of maximum activity (pH = 8.0)2 of pI258 ArsC, a mono-anionic as well as a di-anionic bound substrate are likely to exist (pKa2 H3AsO4 = 6.97). In the past, both experimental studies (kinetic isotope investigations and pH rate profiles) and computational investigations (pKa calculations and mechanistic studies with calculation of reaction energies) have been conducted for PTPases3-9, leading to proposals of either a mono-anionic4,5,7 or a di-anionic3,6,8,9 substrate. The proposed thiolate attack on a di-anionic substrate has met with resistance because of the large electrostatic repulsion between the di-anionic substrate and the negatively charged nucleophile in the active site4. Recently, however, the di-anionic substrate received strong support from a Density Functional Theory (DFT) study on the mechanism of PTPase10. For ArsC, no mechanistic studies have been performed yet. In this chapter, a di-anionic protonation state of the enzyme-bound arsenate in ArsC is proposed and the counterintuitive idea of a reaction mechanism in which the nucleophilic attack is carried out by a thiolate of charge 1- on a substrate of charge 2- is documented. In pI258 ArsC, a hydrogen bond network11 involving Sγ of Cys10, Ser17 and Asn13, terminated by the electrostatic interaction with a potassium ion (called the C10-K+ interaction network from here on) (Fig. 4.1), is observed in all the X-ray structures of pI258 ArsC present in the PDB1,11. ArsC also possesses an α-helix (extending from amino acid 16 to 29)1 of which the N-terminal side faces the nucleophile. Since the experimental determination of the pKa of the Cys10 thiol group is not straightforward (see Chapter II), high level quantum chemical calculations are used to reveal the effect of the C10-K+ interaction network and of the macro-dipole arising from the α-helix on the pKa of the Cys10 functional group.
Figure 4.1: C10-K+ interaction network. WT' model. The Ser17 mutant (-Ser17), the Asn13 mutant (-Asn13), the potassium mutant (-K+) and the double mutant (-Asn13/-K+) are constructed from this model. In all of the model systems, the hydrogen atoms were optimized at the B3LYP/6-31+G* level. The coordinates of the heavy atoms are taken from the PDB structure 1JF8 (ref. 1) (see also 2.3: Interactions with the nucleophile).
All essential intermediates in the reaction mechanism of ArsC have been visualized with X-ray crystallography supplemented by NMR12, with the exception of a Michaelis complex. We will focus on the onset of the nucleophilic displacement reaction by Cys10 in ArsC and we will present a theoretically optimized ArsC-arsenate complex with a concomitant in-depth description of the enzyme-substrate interactions, using both computational13 and conceptual14 DFT. These interactions and a reactivity analysis provide insight into the structural features of ArsC related to its capability to activate both the electrophile (arsenate) and the nucleophile (Cys10) in the reactant state. The experimentally measured steady-state kinetic data are elucidated and crucial interactions in the X-ray structures of ArsC are explained by examining the properties of some critical residues in the ground state.
2. Model systems and Computational details

2.1 Optimization of the Michaelis complex
pI258 ArsC is a relatively small enzyme (131 amino acids)1, but it still contains too many electrons for high level quantum chemical calculations. Therefore, the enzyme needs to be described by an adequate model, combining accuracy with computational tractability. The model system of choice was constructed starting from the X-ray structure (resolution 1.4 Å) of the Cys15Ala mutant of ArsC complexed with arsenite (product of the first reaction step, PDB 1LJU)12. Our model included the complete conserved catalytic sequence motif, Cys10-X-X-Asn13-X-X-Arg16-Ser17, since the backbone amides of this substrate binding loop form hydrogen bonds with the oxygen atoms of the substrate. Amino acids 10 and 17 were terminated respectively with NH2 and CONH2. The side chains of residues 11, 12, 14 and 15 were terminated on a C, since they are positioned at the periphery of the substrate binding loop where no interaction with the substrate occurs. The three well-positioned water molecules present in the active site of the PDB structure 1LJU were incorporated. Di-anionic arsenate was taken as substrate. Since the hydrogen atom positions cannot be discerned from a 1.4 Å resolution X-ray structure, they were placed with the SPARTAN15a package and subsequently minimized by means of the Merck force field (MMFF)15b. The resulting model is called wild type (WT) throughout (Fig. 4.2). The Arg16Ala and the Asn13Ala mutants were built in silico16, starting from the coordinates of the WT model. The geometries of the WT, the Asn13Ala and the Arg16Ala structures were optimized using a QM/QM ONIOM17-20 multilayer model. The ONIOM method has proven to be a powerful tool for the theoretical treatment of the structure of large molecular systems19. The strength of this method lies in the fact that highly accurate calculations on large systems are made possible by partitioning the systems into different layers (in general 2 or 3), each of them treated at a different level of theory. 
By this approach, electron correlation effects can be included in quantum chemical calculations of large systems (i.e. more than 100 atoms). The wild type, Asn13Ala and Arg16Ala model systems of ArsC were partitioned into two layers (Fig. 4.2). The most relevant parts, being the nucleophile and the substrate, form the inner layer and were treated at a high level of theory (B3LYP/6-31+G**), while the remaining part of the system, the ligand binding loop, constitutes the outer layer and was described by a computationally less demanding method (HF/6-31G).
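For reference, the two-layer partitioning described above corresponds to the usual ONIOM energy extrapolation. The labels "model" (inner layer) and "real" (complete model system) are the customary ONIOM terms rather than names used in this chapter; "high" and "low" stand for B3LYP/6-31+G** and HF/6-31G respectively:

$$E^{\mathrm{ONIOM}} = E_{\mathrm{high}}(\mathrm{model}) + E_{\mathrm{low}}(\mathrm{real}) - E_{\mathrm{low}}(\mathrm{model})$$

The low level energy of the inner layer is subtracted so that this region is not counted twice: it is described at the high level only, while the interaction with the outer layer enters at the low level.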
Figure 4.2: Reduction of the X-ray structure of ArsC (PDB 1LJU)12 to the wild type (WT) model. Partitioning of the WT model system of ArsC into 2 layers: high level represented in Ball & Stick; low level in Tube. A similar division is made for the Asn13Ala and the Arg16Ala mutants.
2.2 Interactions with the electrophile

For the comparison of the charge distribution between enzyme-bound and free arsenate in the gas phase, the natural population analysis (NPA)21, calculated at the B3LYP/6-31+G** level, is used. This choice is founded on the many successful applications of this population analysis in the study of molecular properties22. In contrast, Mulliken charges are not advisable because of their strong basis set dependence23, and electrostatic potential derived charges (e.g. ChelpG) are not recommended either, since the ChelpG charge-deriving scheme24 is suspected to give a poor description of ligands embedded in large systems25, our point of interest.
2.3 Interactions with the nucleophile

In order to study the effect of the C10-K+ interaction network on the nucleophilic Cys10, a model system was constructed from the X-ray structure of ArsC Cys10Ser/Cys15Ala (PDB 1JF8)1. This model included Cys10, Ser17, Asn13, potassium and the potassium binding pocket11 consisting of Asp65, Glu21, Thr63, Ser36, Asn13 and two water molecules. Based on the hydrogen bond interactions present and the interactions with the potassium ion, Cys10 is modeled as CH3S-, Ser17 as CH3OH, Asn13 as NH2-CO-CH3, Glu21, Thr63 (backbone) and Asp65 as CH3-COO-, and Ser36 as HOCH2-CH2-COH. This model is called WT' (Fig. 4.1). Starting from the WT' model, the Ser17 mutant (-Ser17), the Asn13 mutant (-Asn13), the potassium mutant (-K+) and the double mutant (-Asn13/-K+) were created. In the considered models, the hydrogen atoms were placed manually with MMFF15b implemented in the SPARTAN15a package and then optimized at the B3LYP/6-31+G* level, while the coordinates of the heavy atoms (carbon, nitrogen and oxygen) were taken from the X-ray structure (PDB 1JF8). To calculate the hydrogen bond strength between Cys10 and Ser17, after the optimization of the hydrogen atoms, the model was simplified to a two-component system consisting of Cys10 and Ser17. The basis set superposition error (BSSE) was taken into account by the counterpoise correction (CP) proposed by Boys and Bernardi26. Since DFT provides reliable hydrogen bond strengths16a,27,28, our calculations were performed in a DFT context, at the B3LYP/6-31+G** level. When studying the effect of the α-helix (extending from amino acid 16 to 29), the dipole of the helix was taken into account by representing the atoms of the helix as point charges. Before calculating these charges, the hydrogen atoms were placed by minimization with MMFF15b implemented in the SPARTAN package15a. No quantum chemical optimization was carried out after this step. 
Since the point charges are required to describe the electrostatic effects of the helix, the electrostatic potential derived ChelpG24 population analysis executed at the B3LYP/6-31G** level was chosen29. It should be noted that the 14 amino acids of the helix form too large a system for a ChelpG population analysis, so that a reduction of the helix to a model is required. To verify the notion that mainly the backbone of the helix contributes to the macro-dipole, we compared the effect of the Mulliken charges of the α-helix on the proton affinity of Cys10 with the effect of the Mulliken charges of a so-called Ala-helix (obtained by terminating every amino acid different from Gly at Cβ). This comparison gives only a negligible difference of 1 kcal/mol on a proton affinity of 400 kcal/mol, permitting us to use the Ala-helix safely to study the effect of the dipole of the α-helix on Cys10.
To explore and quantify the effect of the C10-K+ interaction network and the α-helix on the basicity of Cys10, we calculated the proton affinity of Cys10 in the presence and absence of the components of the C10-K+ interaction network and the α-helix. Proton affinities were calculated at the B3LYP/6-31+G** level by subtracting the energies of the optimized (B3LYP/6-31+G*) protonated and deprotonated forms. To translate changes in proton affinity into changes in the acid dissociation constant (pKa), we calculated proton affinities of a series of five thiolates (methanethiol, benzenemethanethiol, mercaptoethanol, cysteine and trifluoroethanethiol) and plotted these values against experimental pKa values (Fig. 4.3). The resulting linear relationship was used to estimate the pKa of Cys10 in the considered models from its calculated proton affinity.
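The calibration step described above amounts to an ordinary least squares line through (proton affinity, pKa) pairs, followed by evaluation of that line at the proton affinity calculated for Cys10. A minimal sketch is given below; the function names are hypothetical and the three calibration points are placeholder numbers chosen to lie exactly on a line, not the values calculated in this work.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pka_from_proton_affinity(proton_affinity, slope, intercept):
    """Read the pKa off the calibration line for a calculated proton affinity."""
    return slope * proton_affinity + intercept


# Placeholder calibration set (proton affinity in kcal/mol, pKa).
# These numbers are illustrative only and are NOT the data of Fig. 4.3.
PLACEHOLDER_POINTS = [(-360.0, 11.0), (-350.0, 9.0), (-340.0, 7.0)]

if __name__ == "__main__":
    pas = [pa for pa, _ in PLACEHOLDER_POINTS]
    pkas = [pka for _, pka in PLACEHOLDER_POINTS]
    slope, intercept = fit_line(pas, pkas)
    print(pka_from_proton_affinity(-345.0, slope, intercept))
```

Only differences along the fitted line are meaningful in this procedure: a change in the calculated proton affinity caused by, for example, removing Ser17 from the model translates into a pKa shift through the slope.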
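[Figure 4.3 plot: pKa (vertical axis, 6 to 11) versus proton affinity in kcal/mol (horizontal axis, -360 to -335) for thiols 1 to 5, with linear fit, R = 0.882; the pKa values 10.3 and 6.2 are marked.]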
Figure 4.3: Proton affinity-pKa correlation curve calculated in the gas phase for a series of five substituted thiolates. 1 = methanethiol; 2 = benzenemethanethiol; 3 = mercaptoethanol; 4 = cysteine; 5 = trifluoroethanethiol. The effect of the hydrogen bond network on the pKa of Cys10 is shown. The calculated proton affinities of Cys10 in the presence of different elements of the C10-K+ interaction network are inserted. 10.3: pKa of methanethiol; 10.0: pKa of methanethiol in the presence of a solitary hydrogen bond with Ser17; 6.2: pKa of methanethiol in the presence of the C10-K+ interaction network; 6.0: pKa of methanethiol in the presence of the C10-K+ interaction network + α-helix dipole effect.
2.4 DFT Reactivity analysis

Isolated structures of H3AsO4/H2AsO4-/HAsO42-/AsO43- and CH3S- were optimized in the gas phase and in solution (SCI-PCM model)30 with ε = 20.7 at the B3LYP/6-31+G** level. Thiolate and arsenate are soft (polarizable) species. As a consequence, a soft-soft mechanism underlies the reactivity of ArsC, which at the local level can be quantified by the difference in local softness (HSAB principle)31 of the interacting parts, $\Delta s = |s^{+}(\mathrm{As}) - s^{-}(\mathrm{S})|$ (eq. 3.94, Chapter III). An electrostatic model was used as an approximation for the influence of ArsC on the reactivity indices of arsenate and CH3S- (as model for Cys10). Arsenate was embedded in the enzymatic environment of the WT and Arg16Ala model systems, while the environments of WT' and -Asn13/-K+ were used to surround CH3S-. The enzymatic environment of wild type and mutant ArsC was represented by ChelpG point charges, calculated at the B3LYP/6-31G** level24,29. NPA charges21 calculated at B3LYP/6-31+G** were used to obtain the Fukui function. The global softness is calculated using $S = 1/(\varepsilon_{LUMO} - \varepsilon_{HOMO})$ (ref. 32) and the local electrophilicity using $\omega_A^{+} = \frac{(IE + EA)^2}{8(IE - EA)} f_A^{+}$ (eq. 3.85 and 3.89, Chapter III). All calculations were performed using the GAUSSIAN 03 package33.
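The reactivity indices used in this section can be combined as in the following sketch. It assumes the usual finite difference definitions of conceptual DFT: global softness S = 1/(IE - EA), global electrophilicity (IE + EA)²/(8(IE - EA)), and condensed Fukui functions taken as differences of atomic charges between the N, N+1 and N-1 electron systems. The function names are hypothetical and all input numbers in the usage example are arbitrary.

```python
def global_softness(ie, ea):
    """S = 1 / (IE - EA); IE - EA may be approximated by the HOMO-LUMO gap."""
    return 1.0 / (ie - ea)


def global_electrophilicity(ie, ea):
    """omega = mu^2 / (2 eta) with mu = -(IE + EA)/2 and eta = IE - EA."""
    return (ie + ea) ** 2 / (8.0 * (ie - ea))


def fukui_plus(q_n, q_n_plus_1):
    """Condensed f+ of an atom: charge in the N system minus charge in the N+1 system."""
    return q_n - q_n_plus_1


def fukui_minus(q_n_minus_1, q_n):
    """Condensed f- of an atom: charge in the N-1 system minus charge in the N system."""
    return q_n_minus_1 - q_n


def local_softness(softness, fukui):
    """s = S * f for the chosen atom and type of attack."""
    return softness * fukui


def local_electrophilicity(ie, ea, f_plus):
    """omega+ of an atom: global electrophilicity weighted by f+."""
    return global_electrophilicity(ie, ea) * f_plus


def softness_mismatch(s_plus_electrophile, s_minus_nucleophile):
    """Delta s = |s+(As) - s-(S)|; a smaller value indicates a better soft-soft match."""
    return abs(s_plus_electrophile - s_minus_nucleophile)


if __name__ == "__main__":
    # Arbitrary illustrative numbers (energies in eV, charges in e).
    s_as = local_softness(global_softness(10.0, 2.0), fukui_plus(1.5, 1.1))
    s_s = local_softness(global_softness(6.0, 1.0), fukui_minus(-0.2, -0.9))
    print(softness_mismatch(s_as, s_s))
```

In this scheme the charges of the atom of interest (As in arsenate, S in the thiolate) in the N, N+1 and N-1 electron systems would come from the NPA analysis, evaluated with and without the surrounding point charges.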
3. Results and Discussion

3.1 Theoretically optimized Michaelis complex
3.1.1 Calculated model
Starting from the X-ray structure of ArsC complexed with arsenite, i.e. the product of the first reaction step (PDB 1LJU)12 (Fig. 2.3, Chapter II), a model of the enzyme-substrate (Michaelis) complex of ArsC is optimized using a 2-layer QM/QM ONIOM17-20 scheme (B3LYP/6-31+G**//HF/6-31G) (Fig. 4.2). To check whether this structure is an acceptable Michaelis complex, the same methodology used to obtain the ArsC Michaelis complex was also applied to a product-like structure of a protein tyrosine phosphatase (PTPase) of the Yersinia bacteria in complex with NO3- (PDB 1YTN)34, which is analogous to AsO3-. This calculated PTPase Michaelis complex was compared with the experimental X-ray structure of a complex with a tetrahedral oxyanion (Michaelis complex-like structure) of the same Yersinia PTPase (PDB 1YTS)35. All the observed enzyme-substrate interactions in 1YTS were retrieved in the optimized Michaelis complex. In analogy with this result, the structure of the ArsC Michaelis complex obtained from 1LJU by using the QM/QM ONIOM scheme (B3LYP/6-31+G**//HF/6-31G) can be treated with confidence. The dihedral angles of the peptide bonds found in the optimized enzyme-substrate complex of ArsC deviate on average by seven degrees from planarity (with a maximum and a minimum deviation of eleven and two degrees respectively). Experimental statistical data report deviations from the exactly planar peptide bond of up to six degrees36a,b and even more when circular peptides are considered36c. As such, in the ligand binding pocket of ArsC, which has a circular geometry, the averaged deviation from peptide bond planarity can be considered acceptable and the activation barriers of peptide bond rotations are properly described by the proposed ONIOM scheme.
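The deviation from peptide bond planarity discussed above can be evaluated from atomic coordinates as in the following sketch. It assumes that the peptide dihedral is the usual angle defined by the atoms Cα(i), C(i), N(i+1) and Cα(i+1), and that the deviation is measured from the nearest planar arrangement (180 degrees for trans, 0 degrees for cis). The function names and the coordinates in the example are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (range -180 to 180) defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    length = math.sqrt(_dot(b2, b2))
    b2_unit = tuple(x / length for x in b2)
    m1 = _cross(n1, b2_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def planarity_deviation(omega):
    """Distance in degrees of a peptide dihedral from the nearest planar value."""
    magnitude = abs(omega)
    return min(magnitude, 180.0 - magnitude)


if __name__ == "__main__":
    # Hypothetical coordinates: a trans arrangement twisted by 7 degrees.
    twist = math.radians(7.0)
    ca_i, c_i, n_next = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    ca_next = (-math.cos(twist), math.sin(twist), 1.0)
    omega = dihedral(ca_i, c_i, n_next, ca_next)
    print(omega, planarity_deviation(omega))
```

Averaging `planarity_deviation` over all peptide bonds of an optimized structure gives the kind of mean value that is compared with the experimental statistics in the text.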
3.1.2 Enzyme-substrate interactions
During ligand binding, the desolvation energy has to be overcome and entropy is lost through the stabilization of the ligand-binding loop. To compensate for this energetically costly process, several favorable enzyme-substrate interactions are formed in the Michaelis complex, of which figure 4.4 and table 4.1 present an overview. All comparisons were performed with the X-ray structure of pI258 ArsC (PDB 1LJU)12, because this structure most closely resembles the Michaelis complex studied in this work.
Donor--Acceptor        l (Å)    a (°)
NH(Arg16)--OH(LG)      2.82     168
OH(LG)--OH2(1)         2.70     147
N16H--O(1)             2.86     153
N17H--O(1)             2.79     166
HOH(2)--O(2)           2.69     170
N14H--O(2)             2.83     153
N11H--O(3)             2.84     150
NH(Arg16)--O(3)        2.74     164
HOH(3)--O(3)           2.70     167
Table 4.1: Enzyme-substrate interactions in the Michaelis complex. HOH(x)--O(y) points to a hydrogen bond between HOH number x as proton donor and the substrate oxygen atom number y as proton acceptor. l gives the distance between donor and acceptor in ångström and a gives the donor-proton-acceptor angle in degrees. LG stands for leaving group and NxH for the backbone amide group of the amino acid with number x.
Figure 4.4: Stereoview of the optimized (2-layer ONIOM scheme: B3LYP/6-31+G**//HF/6-31G) Michaelis complex. Figure created by Messens, J.
To discern which anionic species of arsenate is most likely to be bound in the active site, we compared the interaction energies (calculated at B3LYP/6-31+G**) of mono- and di-anionic arsenate with ArsC. The binding of di-anionic arsenate turned out to be 82 kcal/mol more favourable than that of mono-anionic arsenate, despite the vicinity of the negatively charged Cys10. In the ArsC-di-anionic arsenate complex, all backbone amide hydrogen atoms in the catalytic loop are oriented toward the centre of the loop. With the exception of those of Gly12 and Asn13, they all form hydrogen bonds with the oxygen atoms of di-anionic arsenate. All free electron pairs of these oxygens are involved in hydrogen bonding. In the case of a mono-anionic arsenate, the extra hydrogen atom on one of the oxygens would experience steric hindrance, making a di-anionic form of arsenate more favorable. The nucleophilic Sγ of Cys10 interacts with the Gly12 and Asn13 backbone amide groups and with the hydroxyl group of Ser17 via hydrogen bonds (Fig. 4.4). In the crystal structure of the product of the first reaction step (PDB 1LJU)12, the distances between Sγ of Cys10 and the Gly12 amide, between Sγ of Cys10 and the Asn13 amide, and between Sγ of Cys10 and Oγ of Ser17 are 4.26 Å, 3.65 Å and 3.28 Å respectively. With the exception of the first interaction, these distances are in the same range as those in the in silico obtained Michaelis complex (Table 4.1).
3.1.3 Arg16 guanidinium group
In the optimized wild type model, the guanidinium group of Arg16 provides an extension of the substrate-binding pocket, via a spherical structure surrounding the substrate. In LMW PTPase9, a crucial Arg is observed at about the same structural position. Nevertheless, the orientation of the ligand in the Michaelis complex of ArsC is quite different from the position taken by the phosphotyrosine substrate in the active site of LMW PTPase. In LMW PTPase, strong hydrogen bonds are formed between NH/NH of the guanidinium group of the ArsC Arg16 homologue and two non-protonated oxygen atoms of the substrate9 (Fig. 2.2, Chapter II). In contrast, in the optimized structure of wild type ArsC only Arg16NH is involved in a hydrogen bond with a non-protonated oxygen atom, while Arg16NH donates a hydrogen bond to the leaving hydroxyl group (Fig. 2.2, Chapter II). This difference in substrate binding between ArsC and LMW PTPase might promote ArsC as an arsenate reductase instead of a PTPase. Comparison with the X-ray structure of the first reaction step product (1LJU)12, where Arg16N interacts with the leaving water molecule (2.96 Å), shows that our calculated model is in full accordance with experiment. With a mono-anionic arsenate as ligand, the interaction of the leaving group with Arg16N disappears. As such, the mono-anionic substrate-enzyme complex resembles the covalent adduct less closely, and the binding of a mono-anionic substrate is therefore less probable.
3.1.4 Asn13Ala structure
In the in silico optimized Asn13Ala mutant, the enzyme-substrate interactions of the wild type complex are conserved. The average hydrogen bond length is 2.76 Å and the average donor-H-acceptor angle is 165°, a deviation from the average wild type values of only 0.01 Å and 5° respectively. As such, all interactions of the optimized wild type ArsC-arsenate complex are maintained in this mutant. This observation is in agreement with the similar KM values for WT and Asn13Ala ArsC. The low kcat values (Table 4.2) suggest that KM is a genuine binding constant.
ArsC                  wild type     Arg16Lys       Asn13Ala      Ser17Ala
KM                    68 µM         No activity    68 µM         79 µM
kcat                  215 min-1                    29.2 min-1    38 min-1
kcat/KM (M-1 s-1)     5.2 × 10^4                   7 × 10^3      8 × 10^3
Table 4.2: Experimentally measured KM and kcat values for wild type and mutant ArsC. Catalysis of arsenate reduction by pI258 ArsC mutants2,12. Arsenate reductase activity was determined in a coupled enzyme assay with Trx, TR and NADPH under standard assay conditions. No activity means a kcat value lower than 0.08 s-1 in an initial velocity experiment in the presence of 20 mM arsenate.
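The catalytic efficiencies listed in Table 4.2 follow from the kcat and KM entries once the units are harmonized. The short Python sketch below redoes that arithmetic; it assumes KM is given in micromolar and kcat in min-1, and the helper name catalytic_efficiency is ours, introduced for illustration only.

```python
# Recompute kcat/KM (M^-1 s^-1) from the Table 4.2 entries.
# Assumption: KM is in micromolar and kcat in min^-1, as read from the table.

def catalytic_efficiency(kcat_per_min: float, km_micromolar: float) -> float:
    """Return kcat/KM in M^-1 s^-1."""
    kcat_per_s = kcat_per_min / 60.0      # min^-1 -> s^-1
    km_molar = km_micromolar * 1e-6       # micromolar -> molar
    return kcat_per_s / km_molar

# (kcat in min^-1, KM in micromolar); Arg16Lys is omitted because no activity was measured
table_4_2 = {
    "wild type": (215.0, 68.0),
    "Asn13Ala": (29.2, 68.0),
    "Ser17Ala": (38.0, 79.0),
}

efficiency = {
    name: catalytic_efficiency(kcat, km) for name, (kcat, km) in table_4_2.items()
}

for name, value in efficiency.items():
    print(f"{name:10s} kcat/KM = {value:.1e} M^-1 s^-1")
```

Within rounding, the recomputed values reproduce the tabulated efficiencies, and the ratio of wild type to mutant kcat gives the factors of about 7 (Asn13Ala) and 5 (Ser17Ala) discussed in section 3.4.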
3.2 Reactivity analysis by means of the HSAB principle

The application of the HSAB principle31 (eq. 3.94, Chapter III) estimates the interaction strength between two interacting partners by comparing their local softness in the reactant state. This qualitative description of the interaction strength enables us to assess the protonation state of the substrate on which the nucleophilic attack occurs preferentially and to identify the residues in ArsC responsible for nucleophilic activation in the first reaction step. Comparing the local softness of the arsenic atom of arsenate with that of the sulfur atom of thiolate gives the following reactivity sequence: H3AsO4 < H2AsO4- < HAsO42- < AsO43-, both in gas phase and in an implicit solvent model with a dielectric constant (ε) of 20.7 representing the enzymatic environment37 (Table 4.3).
Δs (a.u.)     CH3S- (gas phase)     CH3S- (ε = 20.7)
H3AsO4        4.824                 3.146
H2AsO4-       3.246                 2.486
HAsO42-       0.369                 1.299
AsO43-        0.014                 0.078
Table 4.3: Reactivity between arsenate and thiolate. Differences in local softness (Δs) between As and S calculated using Δs(r) = |s+(As) - s-(S)| in gas phase and solvent (ε = 20.7).
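The reactivity sequence can be read off Table 4.3 by sorting the species on their softness difference with the thiolate sulfur: the smaller the difference, the better the HSAB match. The Python sketch below does this for both columns; the dictionary layout and function names are illustrative choices, not part of the original calculations.

```python
# Softness differences (a.u.) between As and S, copied from Table 4.3.
# Each entry: (gas phase, implicit solvent with dielectric constant 20.7)
delta_s = {
    "H3AsO4": (4.824, 3.146),
    "H2AsO4-": (3.246, 2.486),
    "HAsO4^2-": (0.369, 1.299),
    "AsO4^3-": (0.014, 0.078),
}

def softness_difference(s_plus_as: float, s_minus_s: float) -> float:
    """Softness difference |s+(As) - s-(S)| as defined in the table caption."""
    return abs(s_plus_as - s_minus_s)

def reactivity_order(column: int) -> list[str]:
    """Species sorted from least to most reactive (largest to smallest difference)."""
    return sorted(delta_s, key=lambda species: delta_s[species][column], reverse=True)

gas_order = reactivity_order(0)
solvent_order = reactivity_order(1)

print("gas phase: ", " < ".join(gas_order))
print("eps = 20.7:", " < ".join(solvent_order))
```

Both columns give the same ordering, which is the sequence quoted in the text.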
If the softness of the central electrophilic atom As increases, the reactivity toward the soft sulfur atom increases too. This is a strong argument for the nucleophilic attack of a thiolate on a di-anionic rather than on a mono-anionic substrate during the first reaction step in the catalysis by ArsC. The reactivity sequence is consistent with the increase of the kcat values2 with increasing pH for the reductase reaction catalyzed by ArsC. However, the experimentally obtained kcat values for the reduction catalyzed by ArsC are macroscopic rate constants. Apart from the first reaction step considered here, kcat may therefore comprise other microscopic rate constants belonging to the successive reaction steps of the catalytic cycle. A direct comparison between these experimental data and our theoretical calculations is thus only meaningful under the hypothesis that the macroscopic kcat reflects the rate of the first reaction step of the enzymatic catalysis. Conversely, since the microscopic rate constant of the first reaction step is not known, application of the HSAB principle can provide more insight into the global kcat: agreement with the experimental data suggests that the nucleophilic attack could be the rate-limiting step of the enzyme-catalyzed reaction.
The reactivity between arsenate and Cys10 increases with decreasing difference in local softness of the interacting sulfur and arsenic atoms. The difference between the local softness of SCys10 and arsenic is minimized in the presence of both Arg16 and the C10-K+ interaction network. Both have a positive impact on the Cys10-arsenate reactivity (Table 4.4), with the influence of the C10-K+ interaction network being the most important. According to the HSAB principle, we can argue that, at the onset of the first reaction step, the stabilization of the nucleophile is of greater importance than the stabilization of the electrophile, which partially explains the decrease in kcat found for mutant ArsC (Table 4.2).
Model                              Δs (a.u.)
Wild type                          1.575
Electrophile effect: Arg16Ala      1.601
Nucleophile effect: -N13/-K+       2.269
Table 4.4: Reactivity between As and S. Reactivity in the enzymatic environment of wild type and mutant ArsC as measured by the difference in local softness (Δs) between As and S (Δs(r) = |s+(As) - s-(S)|).
3.3 Activation of the electrophile

The active site of ArsC embraces arsenate upon binding, without any significant conformational changes of the arsenate. Compared to free arsenate in gas phase, the mean deviations of the As-O bond lengths and O-As-O angles are 0.026 Å and 3° respectively. The largest effect of binding on the structure of arsenate is seen for the As--OH(LG) bond length, which decreases by 0.096 Å in comparison to free arsenate in gas phase. Since geometrical structures are a function of the electron distribution, this diminished bond length seems natural, given the observed charge transfer from arsenate to the enzyme (vide infra). In gas phase, the As-OH bond lengths (fully optimized geometries calculated at B3LYP/6-31+G**) for a diprotonated non-bound arsenate are also shorter (by 0.089 Å) than those for monoprotonated non-bound arsenate.
The As--OH(LG) bond length in the Arg16Ala mutant is 0.039 Å shorter than in the wild type. This is a significant value, almost four times as large as in the Asn13Ala mutant. A greater bond length signifies a weaker bond and, as a consequence, a facilitated dissociation of the leaving group. This is in line with the catalytic principle of ground state destabilization38. The elongation of the As--OH(LG) bond in the wild type Michaelis complex relative to the Arg16Ala mutant indicates the importance of Arg16 as ground state destabilizer.
The sensitivity of a species to nucleophilic attack can be quantified in a DFT context by the electrophilicity index (ω). Herein, the ionization energy (IE) and the electron affinity (EA) are set equal to zero when negative values are found. If IE and EA are both negative, ω is beyond its range of applicability and loses its meaning, which is the case for free arsenate in gas phase. In the wild type complex, bound arsenate has an ω value of 5 kcal/mol, whereas ω for arsenate in the Arg16Ala mutant is again meaningless, indicating that Arg16 has a positive influence on the electrophilicity of arsenate and, as a consequence, on the accessibility of arsenate to the nucleophilic attack. The critical role of Arg16 in lengthening the As--OH(LG) bond and in raising the electrophilicity of arsenate makes this residue highly important in the reactant state. As such, it was no surprise to observe a dramatic drop in activity for the Arg16Lys mutant (Table 4.2), in which the NH was removed.
One of the most important and intuitive arguments against a di-anionic substrate in the enzymatic mechanism is the Coulomb repulsion with the proximal nucleophilic thiolate6,7,8. Nevertheless, the nucleophilic attack on a di-anionic substrate has been proposed32 on the basis of predicted reactivities and calculated interaction energies between sulfur and arsenic. The difference in softness between arsenic and sulfur decreases as arsenate becomes more deprotonated (softer). This was observed in the gas phase as well as in an enzymatic environment, modeled with a dielectric constant of 20.7 (ref. 37; Table 4.3). The total calculated charge on arsenate in the wild type Michaelis complex (-1.71 unit charges) clearly demonstrates a charge transfer to the enzyme (Table 4.5). Through the numerous enzyme-substrate interactions in the reactant state, the originally di-anionic substrate thus carries less than two full negative charges once it is bound. This negative charge stabilization reduces the electrostatic repulsion in the enzyme and hence partially explains the binding of a di-anionic substrate.
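The treatment of negative IE and EA values described above can be made explicit in a few lines of Python. This is a minimal sketch, assuming the common finite-difference convention ω = μ²/(2η) with μ = -(IE + EA)/2 and η = IE - EA; the convention actually used is the one defined in Chapter III and may differ by a constant factor. The function name and the input numbers are hypothetical.

```python
from typing import Optional

def electrophilicity_index(ie: float, ea: float) -> Optional[float]:
    """Electrophilicity index omega = mu**2 / (2 * eta), in the units of ie and ea.

    Assumed convention: mu = -(IE + EA) / 2 and eta = IE - EA.
    A negative IE or EA is set to zero, as described in the text.
    If both are negative, omega loses its meaning and None is returned.
    """
    if ie < 0 and ea < 0:
        return None
    ie = max(ie, 0.0)
    ea = max(ea, 0.0)
    mu = -(ie + ea) / 2.0      # electronic chemical potential
    eta = ie - ea              # chemical hardness
    if eta <= 0:
        return None            # hardness must be positive for omega to be defined
    return mu ** 2 / (2.0 * eta)

# Hypothetical inputs, for illustration only (not values from this work)
print(electrophilicity_index(10.0, -3.0))   # EA clipped to zero before use
print(electrophilicity_index(-1.0, -2.0))   # both negative: omega is meaningless
```

Returning None for the meaningless case keeps such species, like free arsenate in gas phase, from being silently assigned a numerical value.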
Charge distribution (a.u.)   As       O(LG)     O         O         H        O         Charge transfer
WT-Complex                   2.5915   -1.1264   -1.2656   -1.2249   0.5427   -1.2227   0.2945
*HAsO42- (water)             2.4529   -1.1278   -1.2705   -1.2551   0.4710   -1.2705   0
Table 4.5: Charge distribution in enzyme-bound and water-solvated HAsO42-. NPA charge distribution (a.u.) in enzyme-bound and water-solvated di-anionic arsenate (HAsO42-), calculated at the B3LYP/6-31+G** level. * SCI-PCM33 solvent model used.
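The net charge on bound arsenate and the amount of charge transferred to the enzyme follow from summing the atomic NPA charges in the WT-Complex row of Table 4.5. A short Python check, with variable names chosen for illustration:

```python
# NPA charges (a.u.) of enzyme-bound HAsO4^2-, WT-Complex row of Table 4.5.
# Column order: As, O(LG), O, O, H, O
wt_complex = [2.5915, -1.1264, -1.2656, -1.2249, 0.5427, -1.2227]

formal_charge = -2.0                             # free di-anionic arsenate
bound_charge = sum(wt_complex)                   # net charge left on bound arsenate
charge_transfer = bound_charge - formal_charge   # negative charge passed to ArsC

print(f"net charge on bound arsenate: {bound_charge:.2f}")
print(f"charge transferred to ArsC:   {charge_transfer:.3f}")
```

The sum reproduces the -1.71 unit charges quoted in the text, and the difference with the formal charge of -2 matches the tabulated charge transfer to within rounding of the individual atomic charges.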
Consider the enzyme-substrate interactions N/O(δ-)-H(δ+)---O-HAsO32-, with N/O-H the backbone amide or water hydrogen bond donors and O-HAsO32- (arsenate) the hydrogen bond acceptor. A brief analysis of the charge transfer (Table 4.6) indicates that the spillover effect of the negative charge, as described for donor-acceptor interactions in general by Gutmann39, is also found in the ArsC-arsenate hydrogen bonds, apart from two exceptions: the N14H--O(2) and the N11H--O(3) interactions (Table 4.6). Upon arsenate binding, negative charge is transferred from arsenate to ArsC. As the hydrogen atoms of the backbone amides and the water molecules interact directly with the substrate arsenate, they are the first acceptors of the transmitted negative charge. The fractional positive charge (δ+) on these hydrogen atoms nevertheless increases, because the negative charge from arsenate is passed on via the hydrogen atoms to the nitrogen and oxygen atoms of the hydrogen bond donors. In this way, the fractional negative charge of the nitrogen and oxygen atoms increases (Table 4.6). In small molecules, it is well known39 that after this first spillover transmission, the negative charge is transferred further to the terminal atoms. In the enzyme, terminal atoms are at very large distance, so that the transferred charge is delocalized over the system. The charge transfer from arsenate to ArsC is accompanied by a charge rearrangement within arsenate in such a way that the central arsenic atom becomes more positive compared to unbound arsenate (0.149 unit charges, almost half of the observed charge transfer of 0.294 unit charges), while the charges on the oxygen atoms remain more or less the same. Along with this increase in positive charge, the central arsenic atom becomes more electrophilic, as quantitatively measured by the local electrophilicity (ω+). ω+ passes from meaningless (vide supra) for the arsenic atom of free arsenate in gas phase to 5 kcal/mol upon binding. As a consequence, arsenic becomes more receptive to the nucleophilic attack, and is activated by binding to ArsC.
                       Free enzyme            Enzyme-substrate complex
Donor--Acceptor        q N/O      q H         q N/O      q H
NH(Arg16)--OH(LG)      -0.616     0.449       -0.646     0.484
N16H--O(1)             -0.670     0.406       -0.683     0.460
N17H--O(1)             -0.670     0.438       -0.683     0.478
HOH(2)--O(2)           -1.033     0.473       -1.072     0.537
N14H--O(2)             -0.670     0.407       -0.668     0.464
N11H--O(3)             -0.691     0.414       -0.691     0.469
NH(Arg16)--O(3)        -0.813     0.438       -0.818     0.479
HOH(3)--O(3)           -1.015     0.482       -1.062     0.534
Scheme: N/O(δ-)-H(δ+) --- O-HAsO32-, with N/O-H the H-bond donor and O-HAsO32- the acceptor.
Table 4.6: Spillover effect of the charge in the enzyme-substrate hydrogen bonds in the ArsC Michaelis complex. The arrows indicate the direction of negative charge flow. NPA charges (q) are calculated (B3LYP/6-31+G**) on the hydrogen bond donor (N/O) and hydrogen (H) atoms of the enzyme-substrate complex (WT model) and of the unbound catalytic loop of ArsC. Coordinates of the free catalytic site are taken from the optimized geometry of the WT Michaelis complex.
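The spillover pattern in Table 4.6 can be checked mechanically: for each hydrogen bond, the donor atom should become more negative and the hydrogen more positive upon arsenate binding. The Python sketch below applies this criterion to the tabulated NPA charges; the function name and data layout are illustrative.

```python
# NPA charges from Table 4.6:
# (q donor free, q H free, q donor in complex, q H in complex)
h_bonds = {
    "NH(Arg16)--OH(LG)": (-0.616, 0.449, -0.646, 0.484),
    "N16H--O(1)": (-0.670, 0.406, -0.683, 0.460),
    "N17H--O(1)": (-0.670, 0.438, -0.683, 0.478),
    "HOH(2)--O(2)": (-1.033, 0.473, -1.072, 0.537),
    "N14H--O(2)": (-0.670, 0.407, -0.668, 0.464),
    "N11H--O(3)": (-0.691, 0.414, -0.691, 0.469),
    "NH(Arg16)--O(3)": (-0.813, 0.438, -0.818, 0.479),
    "HOH(3)--O(3)": (-1.015, 0.482, -1.062, 0.534),
}

def follows_spillover(q_d_free: float, q_h_free: float,
                      q_d_bound: float, q_h_bound: float) -> bool:
    """True if the donor atom gains negative charge while H gains positive charge."""
    return q_d_bound < q_d_free and q_h_bound > q_h_free

exceptions = [name for name, q in h_bonds.items() if not follows_spillover(*q)]
print("hydrogen bonds not following the spillover pattern:", exceptions)
```

The two interactions flagged by this criterion are the exceptions named in the text; in both, the hydrogen still becomes more positive but the donor nitrogen does not gain negative charge.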
3.4 Activation of the nucleophile

At the optimum pH for enzymatic catalysis by ArsC (pH = 8.0)2, a substantial amount of free cysteine (pKa = 8.3) is present in the thiolate form. In the enzyme-substrate complex, however, the presence of di-anionic arsenate in the vicinity of Cys10 is expected to increase the latter's basicity and to drive the thiol/thiolate equilibrium toward the thiol form, which is a weaker nucleophile than the thiolate form40. Even so, it can be anticipated that the enzymatic environment favors the deprotonated state. Apart from the backbone amide hydrogens of Gly12 and Asn13, Ser17 is the only residue that interacts directly with Cys10 (ref. 12). Since Ser17 cannot function as a general base, general base catalysis can be excluded. Rather, we strongly suggest nucleophilic catalysis with a stabilized thiolate form in the reactant state. Hydrogen bonds are known to have a pKa lowering effect on the acceptor molecule, especially when the donor molecule is positively charged41. However, the donor is not positively charged in the Cys10S--H-OSer17 hydrogen bond of pI258 ArsC and, as such, it is unlikely that this single interaction will sufficiently suppress the pKa of Cys10. As a consequence, one can think of a possible role for the C10-K+ interaction network (Fig. 4.1) in strengthening the Cys10S--H-OSer17 hydrogen bond. Under the influence of Asn13 and potassium, a clear displacement of the hydrogen from H-OSer17 towards SCys10 is observed (Table 4.7). Asn13 and potassium have the same impact on the Cys10S--H-OSer17 distance and their effect is additive (Table 4.7). The shortening of the Cys10S--H-OSer17 distance when inserting the elements of the network one by one is in agreement with the increase of the SCys10-H-OSer17 hydrogen bond strength. Here too, an additive effect of Asn13 and potassium is observed (Table 4.7). Under the influence of potassium, the Asn13-N-hydrogen moves toward OSer17 (Table 4.7).
                                WT        Asn13- mutant    K+- mutant    Asn13-/K+- mutant
H2--O (Å)                       1.960     n.a.             1.968         n.a.
H1--S (Å)                       1.690     1.700            1.700         1.710
Cys10--Ser17 HBS (kcal/mol)     -6.90     -6.30            -6.33         -5.68
Table 4.7: Hydrogen bond distances (Å) and Cys10-Ser17 hydrogen bond strength (HBS) (kcal/mol) in the C10-K+ interaction network. Hydrogen bond distances are measured from the hydrogen atom to the hydrogen bond acceptor atom. H1--S: from OH of Ser17 to S of Cys10; H2--O: from NH of Asn13 to O of Ser17 (see also Fig. 4.1).
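The additivity of the Asn13 and potassium effects can be verified directly from Table 4.7 by comparing the sum of the two single deletions with the double deletion. A minimal Python sketch, with model labels and the helper name change chosen for illustration:

```python
# Table 4.7 data: (H1--S distance in angstrom, Cys10--Ser17 HBS in kcal/mol)
models = {
    "WT": (1.690, -6.90),
    "Asn13-": (1.700, -6.30),
    "K+-": (1.700, -6.33),
    "Asn13-/K+-": (1.710, -5.68),
}

def change(model: str, index: int) -> float:
    """Change of a quantity relative to wild type when network elements are removed."""
    return models[model][index] - models["WT"][index]

for index, label in ((0, "H1--S distance (angstrom)"), (1, "HBS (kcal/mol)")):
    single_sum = change("Asn13-", index) + change("K+-", index)
    double = change("Asn13-/K+-", index)
    print(f"{label}: single deletions sum to {single_sum:+.2f}, "
          f"double deletion gives {double:+.2f}")
```

For the H1--S distance the two numbers coincide; for the hydrogen bond strength they agree to within 0.05 kcal/mol, which supports the additive picture described in the text.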
As mentioned in Chapter II, the experimental determination of the pKa is not straightforward, since three of the four cysteine residues present in pI258 ArsC are involved in the reaction mechanism. However, evidence for a shifted pKa can be obtained from high level theoretical calculations16c. Figure 4.3 gives a proton affinity-pKa calibration curve (R2 = 0.88) for a series of substituted thiolates. The calculated proton affinities of methanethiol (as model system for Cys10) in the presence of different elements of the C10-K+ network are placed on this curve. In the presence of the single SCys10--H-OSer17 hydrogen bond, a decrease of the pKa by 0.3 units is observed. When Asn13 and K+ complete the C10-K+ network, an additional decrease of the pKa by 3.4 and 0.4 units respectively is found, yielding a final pKa of 6.2 for methanethiol in the presence of the hydrogen bond network. As such, Cys10 can be predicted to be mainly deprotonated in the range of maximal catalytic activity of ArsC. The relative importance of Ser17 and Asn13 for the pKa of Cys10 correlates with the drop of the experimentally measured kcat values (Table 4.2) by a factor of about 7 and 5 for the Asn13 and the Ser17 mutation respectively. Previously, the increase of kcat/KM in the presence of K+ has been explained by its role in the thermal stabilization of ArsC11. Here, we showed the importance of K+ for the activation of the nucleophilic Cys10. ArsC is related to LMW PTPase1, but there, no K+ is present at this binding site. In bovine and human LMW PTPases42, the imidazole side chain of His72 occupies the space of the potassium binding site11. More specifically, the potassium coincides precisely with the (protonated) N2 of His72, which makes contacts within hydrogen bonding distance with the structurally and sequentially conserved Asn and Ser1,11.
Potassium ions are particularly suited for replacing protonated histidine residues, while maintaining similar interactions with the neighbouring residues. Both have a single positive charge and the ionic radius of potassium allows it to coordinate to oxygens with bond distances similar to those of hydrogen bonds. In LMW PTPases, the His-K+ substitution is conservative not only structurally but also functionally. The histidine imidazole has been shown to contribute significantly to enzyme stability43 as well as to stabilize the thiolate anion of the nucleophilic cysteine by lowering its pKa43, exactly the two functions observed for K+ in ArsC. The unique feature of potassium binding in ArsC is further explored in Chapter VIII. In the crystal structure of the oxidized ArsC C89L mutant12, the Cys10-Cys82 disulfide bridge is formed and a negatively charged chloride ion occupies the position of the Cys10 thiolate in the reduced form, indicating the importance of maintaining the C10-K+ interaction network - starting from a negative charge. The importance of the C10-K+ interaction network in the stabilization of the Cys10 thiolate explains the drop of the experimentally measured kcat values (Table 4.2) for mutant ArsC. An extra decrease of the pKa by 0.2 units is calculated for the macro-dipole moment exerted by the helix spanning residues 16 to 29, which brings the final Cys10 pKa in S. aureus pI258 ArsC to 6.0 (Fig. 4.4). For Escherichia coli R773 ArsC44, the experimental pKa value of the analogue of S. aureus pI258 ArsC Cys10 is 6.3 in the presence of a charged histidine residue44. Although the pKa values of the nucleophilic cysteines of both S. aureus and E. coli ArsC are of similar magnitude, the two ArsC families are not related. As a consequence, we can conclude that an analogous evolution has taken place. The S. aureus pI258 ArsC C10-K+ interaction network and the E. coli R773 ArsC histidine residue have developed separately to deal with the same problem of activation of the nucleophile.
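The pKa bookkeeping of this section, and its consequence for the protonation state of Cys10 at the pH optimum, can be summarized with the Henderson-Hasselbalch relation. In the Python sketch below, the decrements and the final pKa of 6.0 are taken from the text; the unperturbed reference value is only implied by those numbers and is an inference, and the function name is ours.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of the deprotonated (thiolate) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# pKa decrements quoted in the text for the methanethiol model of Cys10
shifts = {
    "Ser17 hydrogen bond": 0.3,
    "Asn13": 3.4,
    "K+": 0.4,
    "helix macro-dipole": 0.2,
}
final_pka = 6.0
# Implied unperturbed value (inferred from the quoted numbers, not stated in the text)
reference_pka = final_pka + sum(shifts.values())

ph_optimum = 8.0
free_cys = thiolate_fraction(8.3, ph_optimum)     # free cysteine, pKa = 8.3
cys10 = thiolate_fraction(final_pka, ph_optimum)  # Cys10 in the full network

print(f"implied reference pKa:            {reference_pka:.1f}")
print(f"thiolate fraction, free cysteine: {free_cys:.2f}")
print(f"thiolate fraction, Cys10:         {cys10:.2f}")
```

At pH 8.0 roughly one third of free cysteine is deprotonated, whereas a pKa of 6.0 corresponds to about 99% thiolate, in line with the prediction that Cys10 is mainly deprotonated in the range of maximal catalytic activity.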
4. Conclusion
Starting from the X-ray structure of the first reaction step product, the 2-layer QM/QM ONIOM method provides a reliable structure of the Michaelis complex of ArsC. Activation of the electrophile in the Michaelis complex of pI258 ArsC takes place by a charge transfer from arsenate to the enzyme during the formation of the enzyme-substrate complex. The central arsenic atom in particular becomes more positively charged, rendering the substrate more electrophilic and more susceptible to nucleophilic attack. The observed charge transfer strengthens the evidence for a di-anionic arsenate in the presence of a negatively charged sulfur on the nucleophilic Cys10. Moreover, the interaction with the N of Arg16 increases the length of the As-OH(LG) bond and will finally lead to the breaking of this bond in the reaction state. Stabilization of the nucleophilic thiolate form of Cys10 is accomplished by decreasing its pKa to 6.0, through both the macro-dipole of a nearby α-helix and the interaction network from SCys10 via HOSer17 and HNAsn13 to potassium. Altogether, the use of DFT and the application of the HSAB principle lead to conclusions that could not have been predicted from X-ray structures and steady-state kinetic data alone. This gives the theoretical model approach an extra dimension in the explanation of experimental data.
References
1. Zegers, I., Martins, J. C., Willem, R., Wyns, L., Messens, J., Nat. Struct. Biol. 2001, 8, 843.
2. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
3. Czyryca, P. G., Hengge, A. C., Biochim. et Biophys. Acta 2001, 1547, 245.
4. Kolmodin, K., Åqvist, J., FEBS Letters 2001, 498, 208.
5. Alhambra, C., Gao, J., J. Comp. Chem. 2000, 21, 1192.
6. Hansson, T., Nordlund, P., Åqvist, J., J. Mol. Biol. 1997, 265, 118.
7. Kolmodin, K., Nordlund, P., Åqvist, J., Proteins: Struct., Funct., Genet. 1999, 36, 370.
8. Kolmodin, K., Åqvist, J., Int. J. Quant. Chem. 1999, 73, 147.
9. Zhang, Z.-Y., Critical Reviews in Biochemistry and Molecular Biology 1998, 33, 1.
10. Asthagiri, D., Dillet, V., Liu, T., Noodleman, L., Van Etten, R. L., Bashford, D., J. Am. Chem. Soc. 2002, 124, 10225.
11. Lah, N., Lah, J., Zegers, I., Wyns, L., Messens, J., J. Biol. Chem. 2003, 278, 24673.
12. Messens, J., Martins, J. C., Van Belle, K., Brosens, E., Desmyter, A., De Gieter, M., Wieruszeski, J-M., Willem, R., Wyns, L., Zegers, I., Proc. Natl. Acad. Sci. USA 2002, 99, 8506.
13. Parr, R. G., Yang, W., Density-Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989.
14. a. Parr, R. G., Yang, W., Ann. Rev. Phys. Chem. 1995, 46, 701.
    b. Geerlings, P., De Proft, F., Langenaeker, W., Adv. Quant. Chem. 1999, 33, 303.
    c. Chermette, H., J. Comp. Chem. 1999, 20, 129.
    d. Geerlings, P., De Proft, F., Int. J. Quant. Chem. 2000, 80, 227.
    e. De Proft, F., Geerlings, P., Chem. Rev. 2001, 101, 1451.
    f. Geerlings, P., De Proft, F., Langenaeker, W., Chem. Rev. 2003, 103, 1793.
15. a. SPARTAN version 5.0, Wavefunction, Inc., 18401 Von Karman Ave., Ste. 370, Irvine, CA 92612, U.S.A.
    b. Halgren, T. A., J. Comp. Chem. 1996, 17, 490.
16. a. Mignon, P., Steyaert, J., Loris, R., Geerlings, P., Loverix, S., J. Biol. Chem. 2002, 277, 36770.
    b. Mignon, P., Loverix, S., Steyaert, J., Geerlings, P., Int. J. Quant. Chem. 2004, 99, 53.
    c. Versées, W., Loverix, S., Vandemeulebroeke, A., Geerlings, P., Steyaert, J., J. Mol. Biol. 2004, 338, 1.
17. Svensson, M., Humbel, S., Froese, R. D. J., Sieber, S., Morokuma, K., J. Phys. Chem. 1996, 100, 19357.
18. Dapprich, S., Komáromi, I., Byun, K. S., Morokuma, K., Frisch, M. J., J. Mol. Struct. (Theochem) 1999, 462, 1.
19. a. Torrent, M., Vreven, T., Musaev, G., Morokuma, K., Farkas, Ö., Schlegel, H. B., J. Am. Chem. Soc. 2002, 124, 192.
    b. Vreven, T., Morokuma, K., J. Phys. Chem. A 2002, 106, 6167.
20. Vreven, T., Morokuma, K., Farkas, Ö., Schlegel, H. B., Frisch, M. J., J. Comp. Chem. 2003, 24, 760.
21. Reed, A. E., Curtiss, L. A., Weinhold, F., Chem. Rev. 1988, 88, 899.
22. Bachrach, S. M., in Reviews in Computational Chemistry, Volume V (Lipkowitz, K. B. and Boyd, D. B., eds), VCH, New York, 1997.
23. Jensen, F., Introduction to Computational Chemistry, John Wiley and Sons, New York, 1999.
24. a. Breneman, C. M., Wiberg, K. B., J. Comp. Chem. 1990, 11, 361.
    b. Hayes, D. M., Kollman, P. A., J. Am. Chem. Soc. 1976, 98, 3335.
25. a. Bultinck, P., Langenaeker, W., Lahorte, W., De Proft, F., Geerlings, P., Waroquier, M., Tollenaere, J. P., J. Phys. Chem. A 2002, 106, 7887.
    b. Bultinck, P., Langenaeker, W., Lahorte, W., De Proft, F., Geerlings, P., Van Alsenoy, C., Tollenaere, J. P., J. Phys. Chem. A 2002, 106, 7895.
26. a. Boys, S. F., Bernardi, F., Mol. Phys. 1970, 19, 553.
    b. Van Duijneveldt, F. B., Van Duijneveldt-Van de Rijdt, J. G. C. M., Van Lenthe, J. H., Chem. Rev. 1994, 94, 1873.
27. Koch, C. W., Holthausen, M. C., A Chemist's Guide to Density Functional Theory, Second Edition, Wiley-VCH, Weinheim, Germany, 2001.
28. a. Smallwood, J. C., McAllister, M. A., J. Am. Chem. Soc. 1997, 119, 11277.
    b. Pan, Y., McAllister, M. A., J. Am. Chem. Soc. 1997, 119, 7561.
29. Sigfridsson, E., Ryde, U., J. Comp. Chem. 1998, 19, 377.
30. a. Tomasi, J., Persico, M., Chem. Rev. 1994, 94, 2027.
    b. Wiberg, K. B., Keith, T. A., Frisch, M. J., Murcko, M., J. Phys. Chem. 1995, 99, 9072.
    c. Foresman, J. B., Frisch, Æ., Exploring Chemistry with Electronic Structure Methods, 2nd ed., Gaussian, Inc., Pittsburgh, 1996.
31. Pearson, R. G., Chemical Hardness, Wiley-VCH, Weinheim, Germany, 1997.
32. Roos, G., Loverix, S., De Proft, F., Wyns, L., Geerlings, P., J. Phys. Chem. A 2003, 107, 6828.
33. Gaussian 03, Revision A.1, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A. Pople, Gaussian, Inc., Pittsburgh PA, 2003.
34. Fauman, E. B., Yuvaniyama, C., Schubert, H. L., Stuckey, J. A., Saper, M. A., J. Biol. Chem. 1996, 271, 18780.
35. Schubert, H. L., Fauman, E. B., Stuckey, J. A., Dixon, J. E., Saper, M. A., Protein Sci. 1995, 4, 1904.
36. a. Scarsdale, J. N., Van Alsenoy, C., Klimkowski, V. J., Schaefer, L., Momany, F. A., J. Am. Chem. Soc. 1983, 105, 3438.
    b. MacArthur, M. W., Thornton, J. M., J. Mol. Biol. 1996, 264, 1180.
    c. Ramachandran, G. N., Biopolymers 1968, 6, 1494.
37. a. Fitch, C. A., Karp, D. A., Lee, K. K., Stites, W. E., Lattman, E. E., Garcia-Moreno, E. B., Biophys. J. 2002, 82, 3289.
    b. Schutz, C., Warshel, A., Proteins: Struct., Funct. and Genet. 2001, 44, 400.
    c. Dillet, V., Van Etten, R., Bashford, D., J. Phys. Chem. B 2000, 104, 11321.
38. a. Bayliss, W. M., Sir, The Nature of Enzyme Action, 5th edition, Longmans, Green & Co., London, 1925.
    b. Haldane, J. B. S., Enzymes, Longmans, Green & Co., London, 1930.
39. a. Gutmann, V., The Donor-Acceptor Approach to Molecular Interactions, Plenum Press, New York and London, 1978.
    b. Geerlings, P., Tariel, N., Botrel, A., Lissillour, R., Mortier, W. J., J. Phys. Chem. 1984, 88, 5752.
40. Dantzman, C. L., Kiessling, L. L., J. Am. Chem. Soc. 1997, 118, 11715.
41. Jeffrey, G. A., An Introduction to Hydrogen Bonding, Oxford University Press, New York, 1997.
42. a. Logan, T. M., Zhou, M. M., Nettesheim, D. G., Meadows, R. P., Van Etten, R. L., Fesik, S. W., Biochemistry 1994, 33, 11087.
    b. Zhang, M., Van Etten, R. L., Stauffacher, C. V., Biochemistry 1994, 33, 11097.
    c. Zhang, M., Stauffacher, C. V., Lin, D., Van Etten, R. L., J. Biol. Chem. 1998, 273, 21714.
43. Thomas, C. L., McKinnon, E., Granger, B. L., Harms, E., Van Etten, R. L., Biochemistry 2002, 41, 15601.
44. Gladysheva, T., Liu, J., Rosen, B. P., J. Biol. Chem. 1996, 271, 33256.
45.
46.
Chapter V Intermezzo Gas phase stability of tetrahedral multiply charged anions: a conceptual and computational DFT study
How can you do both physics and poetry? In physics we try to explain in simple terms something that nobody knew before. In poetry it is the exact opposite.
(Dirac to Oppenheimer)
Chapter V: Repulsive Coulomb Barriers

Multiply charged anions (MCAs) are unstable relative to electron auto-ejection; however, the repulsive Coulomb barrier (RCB) provides electronic stability. In view of their importance in biological systems, the behaviour of isolated AsO43-, PO43-, SO42- and SeO42- in gas phase and in solution has been studied. To calculate the RCB values, the electrostatic and the point charge model - two methods currently used in literature - are applied, together with a recently introduced approach based on Conceptual Density Functional Theory (DFT). The relative stability of the above mentioned MCAs is compared. The trends of the RCB are analysed by including analogous compounds from the second and third row and by passing from di-anionic to tri-anionic systems. The effect of solvent is considered using the SCI-PCM solvent model, and the evolution of the RCB when passing to higher dielectric constants is evaluated. The RCB is related to properties of the system such as polarizability and softness. Both a numerical and a conceptual correlation between the RCB and the global softness is found.
1. Introduction
The previous chapter1,2 reports reactivity analyses of multiply charged anions in gas phase, condensed phase and enzymatic environment. In gas phase, the E versus N curve for di- and tri-anionic arsenate and phosphate clearly ascends1, as was previously reported for the prototypical multiply charged, gas phase unstable sulfate anion3. This unusual evolution of the E versus N curve is due to the instability of multiply charged anionic (MCA) compounds in gas phase with respect to electron emission, caused by strong Coulomb repulsion. Multiply charged anions (MCAs) are common in the condensed phase and play an important role in chemistry, material science and biochemistry. Many familiar inorganic MCAs that are unstable in the gas phase are, for example, present as stable entities in proteins. They are stabilized with respect to electron emission through the numerous interactions with the enzymatic environment. For example, our preceding work showed the stabilization of the negative charge of the substrate arsenate in complex with the enzyme arsenate reductase (ArsC)2. Upon binding, due to the several ArsC-arsenate hydrogen bonds, negative charge is transferred from arsenate to ArsC, by which the di-anionic arsenate passes into an intermediate form between mono- and di-anion. Phosphates, sulphates and selenates are other examples of inorganic compounds that are present as stable multiply charged species in proteins such as phosphatases, sulphate-binding protein and molybdenum enzyme4. In gas phase, however, one has to deal with the electronic instability of MCAs. The strong intramolecular Coulomb repulsion due to the excess of negative charge makes MCAs very fragile and sensitive to electron autodetachment5. Therefore, MCAs were rarely observed in gas phase until very recently, when increasingly sophisticated experimental tools based on photo-electron spectroscopy (PES) and electrospray ionisation became available5,6. The experimental observation of MCAs can be explained by the existence of a repulsive Coulomb barrier (RCB) hindering the emission of one of the excess electrons. The RCB has a destabilizing and a stabilizing influence. Starting from an N-electron reference system with charge n, the escaping electron experiences both a valence-range attractive potential, depending on the electrostatic and induced moments of the (n-1)-charged molecule that is left behind, and a long range Coulomb repulsive potential caused by the remaining (n-1)-charged anion. If in the valence region the attractive valence potentials are stronger than the repulsive Coulomb potentials, the lowest bound state of the n-charged anion lies below the lowest state of the (n-1)-charged anion, and the n-charged anion is electronically stable. On the other hand, when the Coulomb repulsive potential is strong enough to outweigh the attractive potentials, the n-charged anion is electronically unstable with respect to the (n-1)-charged anion. However, the existence of the RCB stabilizes the N-electron anion by requiring the departing electron to tunnel through this barrier to escape, so that a metastable anion of charge n results7 (Fig. 5.1). Because tunneling through the barrier can be a very unlikely process, relatively long but finite lifetimes can be observed for metastable species7,8,9.
Figure 5.1: Stable versus metastable systems. Schematic potential energy curves showing both the binding energy of an electron to a (n-1)-charged system as well as the repulsive Coulomb barrier (RCB). A. Original n charged N-electronic system is stable B. N-electronic system is metastable. R = Distance to the ejected electron
Since arsenate, phosphate, selenate and sulphate are of biological interest, these MCAs are often the target of theoretical studies of enzymatic processes1,10. Consequently, it is of fundamental chemical and physical significance to understand the behaviour of these MCAs in the gas phase.
Chapter V: Repulsive Coulomb Barriers
RCBs are calculated for tri-anionic arsenate (AsO43-) and phosphate (PO43-), and di-anionic sulphate (SO42-) and selenate (SeO42-) using two methods currently applied in the literature: the electrostatic11 and the point charge (PCM)8,9 model. A third approach to calculate the RCB, based on conceptual Density Functional Theory (DFT)12, is introduced here for the first time in order to interpret the RCB in terms of molecular properties (softness). Since the studied anions are highly polarizable, the relationship between the trends in the RCB values and the trends in the global softness (S)13 is explored. The stabilizing effect of solvent on MCAs is also investigated.
2. Theoretical background
Although the RCB is clearly dominated by the electrostatic interaction between the outgoing electron and the residual anion, it emerges from a non-local and energy-dependent potential8,9. An exact theory for the RCB can be derived in the framework of Green's function theory, in analogy to scattering potentials, to which the RCB is closely related. The Green's function (GF) obeys a Dyson equation, relating the GF for the total system to the free GFs of the unperturbed anion via its self-energy. The self-energy corresponds to the exact potential experienced by an electron when it is emitted from the electronically unstable MCA, and can thus be identified with the RCB. The exact self-energy is not straightforward to compute and, due to its non-local, energy-dependent and probably complex nature, not easy to depict8,9. Therefore, approximations have to be made. Simons et al.7 have shown that a simple Coulomb-energy model can be applied to roughly estimate the height of the RCB of compact stable and metastable MCAs. Beyond this simple Coulomb model, the RCB can be computed in the framework of local ab initio approaches, which are meaningful approximations for the true RCB when the system under investigation is spatially extended8. A straightforward and natural way to estimate the RCB is to compute the total energy of the (N-1)-system in the presence of a negative point charge, which represents the outgoing electron. If the negative point charge is placed at varying distances r from the mono-anion, one readily obtains a complete potential barrier profile. The RCB determined by the point charge model (PCM) potential, denoted as VPCM(r), is then obtained from the equation:
$$V_{\mathrm{PCM}}(r) = E_0(r) - E_0 \qquad (5.1)$$
with E0(r) the total energy of the (N-1)-system in the presence of the negative point charge at distance r, while E0 is the total energy of the (N-1)-system8,9. Note that when using atomic units, the charge of an electron (absolute value) equals 1, so energy and potential are numerically equal here and the two terms are used interchangeably throughout.
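As a minimal sketch of how eq. 5.1 turns a scan of total energies into a barrier, the snippet below subtracts the reference energy point by point and takes the maximum of the resulting profile. The function name `pcm_barrier` and all numerical values are hypothetical and serve only as illustration; they are not results of this work. The only fixed quantity is the hartree-to-eV conversion factor.

```python
HARTREE_TO_EV = 27.2114  # 1 hartree expressed in eV


def pcm_barrier(distances, energies_with_charge, energy_reference):
    """Return (r_max, rcb_ev): barrier position and height from eq. 5.1.

    distances            -- point-charge positions r along one direction
    energies_with_charge -- E0(r), (N-1)-system plus point charge at r (hartree)
    energy_reference     -- E0 of the isolated (N-1)-system (hartree)
    """
    if len(distances) != len(energies_with_charge):
        raise ValueError("need one energy per distance")
    # Eq. 5.1: V_PCM(r) = E0(r) - E0, evaluated point by point and converted to eV.
    profile = [(r, (e - energy_reference) * HARTREE_TO_EV)
               for r, e in zip(distances, energies_with_charge)]
    # The RCB along the scanned direction is the maximum of the profile.
    return max(profile, key=lambda point: point[1])


# Made-up scan, for illustration only (not data from this work).
r_scan = [1.0, 2.0, 3.0, 4.0, 6.0, 10.0]
e_scan = [-100.10, -99.95, -99.90, -99.93, -99.96, -99.98]
r_max, rcb = pcm_barrier(r_scan, e_scan, energy_reference=-100.00)
print(f"barrier of {rcb:.2f} eV at r = {r_max}")
```

Because the point charge can be moved along any direction, the same routine applied to different scans yields the direction-dependent profiles that distinguish the PCM model from a spherically averaged potential.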
The point charge model correctly describes the remaining (N-1)-anion at large distances between the point charge and the remaining (N-1)-anion, but has some weaknesses at short distances. By fixing the approaching electron at a certain position r, through which this electron becomes distinguishable from the other N-1 electrons of the system, the PCM model loses accuracy. When the electron approaches, the (N-1)-system becomes statically polarized. This can only happen when the electron approaches with high velocity. Because this is not necessarily the case, the static polarization of the (N-1)-system makes the PCM model less rigorous. (For an extensive discussion see refs. 8 and 9.)

The electrostatic model11 calculates the RCB as the potential energy of interaction between an electron and a charged sphere. This potential energy is largely governed by the polarization and the long-range Coulomb repulsion and is given by:

$$W = -\frac{14.4\,\alpha}{2R^4} + \frac{14.4\,Q}{R} \qquad (5.2)$$
in which α is the polarizability (in Å^3), Q (in a.u.) the remaining charge of the sphere after the electron has left, and R (in Å) the distance between the centre of mass of the charged sphere and the leaving electron. The conversion factor 14.4 is expressed in eV Å, so that W is obtained in eV. Within the framework of conceptual DFT12, our group proposed a methodology to calculate the interaction energy between a molecule and a single point charge, based on first-order perturbation theory applied to the electron density14. This method was originally used to calculate interaction energies in adsorption processes in zeolites14 and is applied in this work for the first time to calculate RCBs. The interaction energy E(R) of a molecule and a single point charge q is given by:
$$E(R) = qV(R) + q^2 \iint \frac{\chi(r,r')}{|r-R|\,|r'-R|}\,dr\,dr' = E_1(R) + E_2(R) \qquad (5.3)$$
where V(R) can be identified as the classical molecular electrostatic potential (MEP)15 at position R. In a point charge model and for molecules at large distances, the first term of eq. 5.3 reduces to the Coulomb term of eq. 5.2.
χ(r,r') stands for the linear response function (δρ(r)/δυ(r'))_N. According to the Berkowitz-Parr relation16, χ(r,r') is equal to:

$$\chi(r,r') = \frac{s(r)\,s(r')}{S} - s(r,r') \qquad (5.4)$$
E2(R) includes the response of the system's density to the change in potential due to the leaving electron, as seen from the exact expression for E2(R):

$$E_2(R) = \iint \chi(r,r')\,\delta\upsilon(r')\,\delta\upsilon(r)\,dr\,dr' \qquad (5.5)$$

in which δυ(r) stands for a variation in the external (i.e. due to the nuclei and, in the MCA case, the leaving electron) potential. Writing the external potential due to the leaving electron as:

$$\delta\upsilon(r) = \frac{1}{|r-R|} \qquad (5.6)$$

and s(r,r') as usual simplified to17:

$$s(r,r') \approx S\,f(r)\,\delta(r-r') \qquad (5.7)$$

where f(r) is the electronic Fukui function and S the global softness, E2(R) can be written as14:

$$E_2(R) \approx S\left[\left(\frac{\partial V_{el}(R)}{\partial N}\right)^{2} - \int \frac{f(r)}{|r-R|^{2}}\,dr\right] \qquad (5.8)$$
This term is a correction term arising from the change in electron density of the MCA when the excess electron is emitted. Besides global properties such as the global softness (S), it involves two local properties: the N-derivative of the electronic part of the electrostatic potential, ∂V_el(R)/∂N, and the electronic Fukui function f(r) = (∂ρ(r)/∂N)_υ, being the N-derivative of the electron density function, both evaluated at constant external potential.

For a wave function obeying the Hellmann-Feynman theorem, ∫ f(r)/|r-R|^2 dr is the nuclear Fukui function18 Φ(R), and can be written as ∂F(R)/∂N, where F(R) is the force due to the electronic Fukui function acting on a unit charge placed at R.
The response of the (N-1)-system's density to the departing electron is fully included in the PCM model7, while an additional approximation (eq. 5.7) has to be made on eq. 5.5, itself derived from first-order perturbation theory, to obtain the working equation (eq. 5.8). Therefore, the DFT based expression can be considered less exact than the PCM model, but it has the advantage of being expressed in terms of molecular reactivity descriptors, describing the evolution of a system when passing from N-1 to N electrons and vice versa. E2(R) explicitly demonstrates a dependence on S, which was seen to be proportional to α (refs. 17,19), yielding a correspondence with the R^-4 term in eq. 5.2.
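The electrostatic model of eq. 5.2 is simple enough to evaluate directly. The sketch below computes W(R) and locates its maximum in two ways: analytically, from dW/dR = 0, which gives R_max^3 = 2α/Q, and by a brute-force scan as a cross-check. The function names are hypothetical, the input values (α = 10 Å^3, Q = 1) are assumed purely for illustration, and Q is taken as the magnitude of the remaining negative charge so that the Coulomb term is repulsive.

```python
COULOMB_EV_ANGSTROM = 14.4  # conversion factor of eq. 5.2, in eV * Angstrom


def electrostatic_potential(distance, polarizability, charge):
    """Eq. 5.2: W(R) in eV, with R in Angstrom and alpha in Angstrom^3."""
    polarization = -COULOMB_EV_ANGSTROM * polarizability / (2.0 * distance**4)
    coulomb = COULOMB_EV_ANGSTROM * charge / distance
    return polarization + coulomb


def barrier_closed_form(polarizability, charge):
    """Maximum of eq. 5.2 from dW/dR = 0, i.e. R_max**3 = 2 * alpha / Q."""
    r_max = (2.0 * polarizability / charge) ** (1.0 / 3.0)
    return r_max, electrostatic_potential(r_max, polarizability, charge)


def barrier_on_grid(polarizability, charge, start=0.5, stop=20.0, steps=20000):
    """Brute-force scan of W(R), used only to cross-check the closed form."""
    grid = [start + i * (stop - start) / steps for i in range(steps + 1)]
    return max((electrostatic_potential(r, polarizability, charge), r) for r in grid)


# Assumed illustrative input: alpha = 10 Angstrom^3, Q = 1 (not values from this work).
r_star, w_star = barrier_closed_form(polarizability=10.0, charge=1.0)
w_grid, r_grid = barrier_on_grid(polarizability=10.0, charge=1.0)
print(f"closed form: {w_star:.3f} eV at {r_star:.3f} A")
print(f"grid scan:   {w_grid:.3f} eV at {r_grid:.3f} A")
```

At the maximum the polarization term removes exactly one quarter of the Coulomb term, so the barrier height equals 0.75 x 14.4 Q / R_max. A larger polarizability therefore pushes the barrier outward and lowers it.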
3. Computational details
Calculations were performed on AsO43-, PO43-, SO42- and SeO42-. The geometries of the considered n-charged anions were optimised using the B3LYP20 exchange-correlation functional with a 6-31+G** basis set21. All further calculations were carried out on these optimised geometries at the B3LYP/6-31+G** level. To assure the constant external potential needed in the calculation of the DFT descriptors, calculations on the (n-1)-charged systems were also performed on the optimised geometries of the n-charged systems. The polarizability α appearing in eq. 5.2 was obtained analytically and calculated as the arithmetic average of the diagonal elements of the polarizability tensor:

$$\alpha = \frac{\alpha_{xx} + \alpha_{yy} + \alpha_{zz}}{3} \qquad (5.9)$$
The global softness S is given by the finite difference approximation (cfr. eq. 3.68, Chapter III):

$$S = \frac{1}{IE - EA} \qquad (5.10)$$
where IE and EA are the ionization energy and the electron affinity, respectively. In eq. 5.10, the ionization energy (IE) and the electron affinity (EA) are set equal to zero when negative values are found. Consequently, for SO42- and SeO42-, S is approximated by the inverse of the ionisation energy, while for PO43- and AsO43- S is undefined (both IE and EA are negative). To overcome this problem, the equality of the ratio of the global softness of the (N-1)-system and the (N-2)-system (S_N-1 and S_N-2 respectively) with the ratio of their polarizabilities (α_N-1 and α_N-2) is used:

$$\frac{S_{N-1}}{S_{N-2}} \approx \frac{\alpha_{N-1}}{\alpha_{N-2}} \qquad (5.11)$$

This equality is based on the proposed proportionality between S and α17,19.
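The two-step evaluation of S described above can be sketched as follows. The function names and all numerical inputs are hypothetical and chosen only to show the three cases: a normal evaluation with a negative EA dropped, an undefined softness when both IE and EA are negative, and the polarizability scaling of eq. 5.11.

```python
def global_softness(ionization_energy, electron_affinity):
    """Eq. 5.10, S = 1/(IE - EA), with negative IE or EA set to zero.

    Returns None when the clamped difference is not positive, i.e. S is undefined.
    """
    ie = max(ionization_energy, 0.0)
    ea = max(electron_affinity, 0.0)
    gap = ie - ea
    return 1.0 / gap if gap > 0.0 else None


def softness_from_polarizabilities(softness_n2, alpha_n1, alpha_n2):
    """Eq. 5.11 rearranged: S(N-1) ~ S(N-2) * alpha(N-1) / alpha(N-2)."""
    return softness_n2 * alpha_n1 / alpha_n2


# Hypothetical inputs in atomic units, for illustration only.
print(global_softness(0.20, -0.05))    # EA < 0 is dropped, so S = 1/IE
print(global_softness(-0.10, -0.25))   # both negative: S is undefined (None)
print(softness_from_polarizabilities(5.0, alpha_n1=60.0, alpha_n2=40.0))
```

Returning `None` rather than a large number keeps the undefined case explicit, so that the fallback through eq. 5.11 has to be chosen deliberately.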

∂V_el(R)/∂N and ∂F(R)/∂N are calculated using the finite difference approach in which ΔN is set equal to 1. The electrostatic model yields a spherically averaged potential, while both the PCM model and the DFT based model provide direction-dependent potentials. As a consequence, when using the PCM model and the DFT based model, potential curves for electron emission can be obtained for all possible directions. For the tetrahedral anions considered here, the RCB was calculated in the positive and negative direction of one of the four equivalent threefold X-O axes (X = As, P, S, Se) and in the direction of one of the two equivalent twofold X-Y axes (with Y the midpoint between two oxygen atoms) (Fig. 5.2).
Figure 5.2: Directions into which potential curves for electron emission are calculated for the tetrahedral anions XO42-/3- with X = S, Se, As, P. Colour code: threefold axis, X-O (positive direction): black, O-X (negative direction): blue; twofold axis, O-Y (Y is the midpoint between two oxygen atoms): red.
To investigate the stabilization of MCAs in solvent, the SCI-PCM22 solvent model was used. All calculations were performed using the GAUSSIAN 0323 package.

4.1 Electronically unstable MCAs
The electronic instability of AsO43-, PO43-, SO42- and SeO42- can be seen from both their positive HOMO energies, in agreement with Roos et al.1 (Table 5.1), and their negative ionisation energies, implying that the anion An-, at its optimal geometry, has a higher energy than the corresponding anion A(n-1)- with lower charge at the same geometry.

Anion | IE | εHOMO
SO42- | -0.0478 | 0.132
SeO42- | -0.0247 | 0.105
PO43- | -0.2636 | 0.332
AsO43- | -0.2357 | 0.298

Table 5.1: Instability of MCAs. Calculated (B3LYP/6-31+G**) ionisation (IE) and HOMO (εHOMO) energies (a.u.) of SO42-, SeO42-, PO43- and AsO43-.
4.2 Calculation of the RCB

As mentioned before, the existence of the RCB stabilizes the MCAs, giving them a metastable nature8,9. The RCBs calculated with the PCM model7,8 (eq. 5.1), the electrostatic model11 (eq. 5.2), and the DFT based approach (eq. 5.3 and 5.8) are given in Table 5.2 and Figure 5.3.
Anion | RCB Electrostatic Model | RCB Point Charge Model | RCB DFT based Model | S(n-1)
SO42- | 3.62 | 3.85 | 11.81 | 16.18
SeO42- | 3.07 | 3.40 | 10.69 | 17.57
PO43- | 8.25 | 8.28 | 25.79 | 17.09
AsO43- | 7.37 | 7.79 | 25.14 | 18.49

Table 5.2: Repulsive Coulomb barriers. RCBs (eV) calculated (B3LYP/6-31+G**) with the electrostatic (eq. 5.2), the PCM (eq. 5.1) and the DFT based (eq. 5.3 and 5.8, with S calculated with eq. 5.10) model, compared with the global softness S (a.u., calculated with eq. 5.14) of the (n-1)-charged system.
The RCB values calculated with the electrostatic and the PCM model are of comparable magnitude. The values calculated with the DFT model are about three times larger than the other estimations of the RCB, but qualitatively, the three calculation methods for the RCB give similar results. From the graphs presented in figure 5.3, it is seen that when the barrier height decreases, the position of the barrier is slightly shifted to longer distance from the nucleus. The RCB values obtained for the di-anionic systems are considerably lower than those for the tri-anionic systems. This is in agreement with the statement by Dreuw et al.8,9 that the magnitude of the RCB strongly depends on the electrostatic repulsion between the excess negative charge and the escaping electron.
Figure 5.3: Study of the RCB potentials. RCB potentials of SO42- (blue), SeO42- (pink), PO43- (green) and AsO43- (light blue) in gas phase obtained with 1. the electrostatic model (eq. 5.2), 2. the PCM model (eq. 5.1) and 3. the DFT based model (eq. 5.3 and 5.8), with S calculated with eq. 5.10. In the case of nonspherically averaged potentials, results are given A. along the threefold axis and B. along the twofold axis.
Therefore, more highly charged species, e.g. tri-anions, in general give rise to higher repulsive Coulomb barriers than less charged species, e.g. di-anions. For a given ionic charge, the RCB values decrease when the global softness S of the considered systems increases. The direction dependence of the RCB can be studied with the PCM and DFT based models, because these models provide direction-dependent potentials. It appears that the difference in molecular environment experienced by the electron emitted along the different directions shown in Figure 5.2 is relatively small when the PCM model is used. A larger anisotropy is seen when the DFT based model is used (Fig. 5.4).
Figure 5.4: Anisotropy of the RCB potentials for SO42-. The RCB values are obtained with A. the PCM model (eq. 5.1) and B. the DFT based model (eq. 5.3 and 5.8, with S calculated with eq. 5.10). Colour code: twofold axis: green; threefold axis, negative direction (O-X): orange; threefold axis, positive direction (X-O): black.
In the literature, RCB values of both mono-atomic and more extended systems were computed with several methods7-9,11. However, to the best of our knowledge, no studies have yet been performed on the highly symmetric tetrahedral MCAs considered in this work, with the exception of SO42-. Hydrated SO42- was extensively studied by Yang et al.11. Their RCB values of water-solvated SO42-, obtained with the electrostatic model, vary from 2.8 eV to 1.2 eV when the number of added water molecules is gradually increased from 4 to 50. These values are in excellent agreement with RCB values derived from photon-energy-dependent PES spectra11, which decrease from 2.4 eV to 1.7 eV when the number of water molecules increases from 4 to 18. Using the electrostatic model, we obtain an RCB value of 3.6 eV for SO42- in the gas phase, in agreement with the stabilization and the decrease of the RCB of MCAs in solvent. The RCB values for SO42- obtained with the PCM and the DFT based model in the gas phase are in line with this principle too.
The PCM model is strongly basis set dependent8: the RCB gradually decreases with increasing diffuseness of the basis set, hindering a direct comparison with values reported in the literature. Nevertheless, the RCB values we obtain for the tetrahedral anions are lower than those reported for mono-atomic systems such as O2- and F2- (calculated with a double-zeta polarized basis set)8,9. This is in accordance with chemical intuition: the larger the system, the more stable it is and thus the lower the RCB value.
4.3 Correlation between the RCB and S

The well-known trend of increasing softness when passing in analogous compounds from the second to the third row12,24 is accompanied by a decrease in the RCB values (Table 5.2). For the electrostatic model, this result is expected, since from eq. 5.2 an expression for α in terms of the RCB, explicitly including the charge Q of the (N-1)-system, can be written11:

$$\alpha = 629.86\,\mathrm{RCB}^{-3}\,Q^{4} \qquad (5.12)$$
with α the polarizability of the (N-1)-system, given in Å^3, and the RCB given in eV. Introducing the proportionality between the polarizability α and the global softness S, put forward first by Politzer19 and later derived by Vela and Gázquez17, one obtains from eq. 5.12:

$$S \sim \mathrm{RCB}^{-3}\,Q^{4} \qquad (5.13)$$
with S the global softness of the (n-1)-charged system. The expected correlation was not found when S is calculated with eq. 5.10 combined with eq. 5.11. To overcome this, IE and EA of eq. 5.10 can be approximated by the negative of the energy of the highest occupied molecular orbital (HOMO) and the negative of the energy of the lowest unoccupied molecular orbital (LUMO), respectively, making use of a Koopmans25 type of approximation, yielding for S:

$$S = \frac{1}{\varepsilon_{LUMO} - \varepsilon_{HOMO}} \qquad (5.14)$$
This methodology was proven before, among others by our group, to give reliable results1,26 in applications of conceptual DFT.
When S is calculated with eq. 5.14, a fair correlation, as expected from eq. 5.13, is now found between S and RCB^-3 (Fig. 5.5, R^2 = 0.69). The problem discussed here illustrates the inherent difficulties of evaluating S for highly negatively charged systems.
Figure 5.5: Correlation between S and RCB-3. S (a. u.) of the (n-1) charged system is calculated with eq. 5.14 and RCB-3 (eV-3) obtained with the electrostatic model.
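The prefactor of eq. 5.12 can be traced back to eq. 5.2: setting dW/dR = 0 gives R^3 = 2α/Q, the barrier height at that point is 0.75 x 14.4 Q/R, and eliminating R yields α = (10.8^3 / 2) Q^4 RCB^-3. The sketch below checks this number and applies eq. 5.12 to the electrostatic-model values of Table 5.2. The function name is hypothetical, and Q is assumed to be the charge magnitude of the (n-1)-charged system (1 for the di-anions, 2 for the tri-anions), following the definition given with eq. 5.2.

```python
def polarizability_from_rcb(rcb_ev, charge):
    """Eq. 5.12: alpha (Angstrom^3) = 629.86 * Q**4 / RCB**3, with RCB in eV."""
    return 629.86 * charge**4 / rcb_ev**3


# Prefactor recovered from the maximum of eq. 5.2: (0.75 * 14.4)**3 / 2.
prefactor = (0.75 * 14.4) ** 3 / 2.0
print(f"prefactor from eq. 5.2: {prefactor:.3f}")

# Electrostatic-model RCBs (eV) of Table 5.2 with the assumed charge Q.
table_5_2 = {"SO4(2-)": (3.62, 1), "SeO4(2-)": (3.07, 1),
             "PO4(3-)": (8.25, 2), "AsO4(3-)": (7.37, 2)}
for anion, (rcb, q) in table_5_2.items():
    print(f"{anion}: alpha = {polarizability_from_rcb(rcb, q):.1f} A^3")
```

The polarizabilities printed here are simply those implied by the tabulated barriers through eq. 5.12; they are not independent results.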
By introducing the proposed27 correlation between α and S^3 in eq. 5.12, the following relationship between the RCB and S^-1 can be written:

$$\mathrm{RCB} \sim S^{-1}\,Q^{4/3} \qquad (5.15)$$

Using this proportionality, a better linear correlation between the RCB and S^-1 is found (Fig. 5.6, R^2 = 0.86), arguing in favour of the proportionality between α and S^3, in agreement with Ghanty et al.27, on the condition that eq. 5.12 holds. Turning to the PCM model, eq. 5.1 gives no direct evidence for a correlation between the RCB and the global softness S of the (N-1)-system. Nevertheless, the similarity between the results obtained with the PCM model and those obtained with the electrostatic model provides evidence that the inverse correlation between the RCB and S remains valid. A very good linear relationship between the RCB values of the di-anionic and tri-anionic systems and the inverse of the global softness (calculated with eq. 5.14) of their (n-1)-charged counterparts is indeed obtained (Fig. 5.6, R^2 = 0.93 for di-anions, R^2 = 0.99 for tri-anions). Note that, given the charge dependency of the RCB values, di- and tri-anionic systems are treated separately here. As a consequence, for the tri-anionic systems only three points are available, implying that the obtained correlation has to be treated with the necessary caution. Moreover, the discussed correlation curves are obtained for the studied anions supplemented with an extra series of
anionic systems (S2-, O2-, N3-, HAsO42- and HPO42-), to have a broader range over which the correlation can be made.
[Figure 5.6 plot. Panel A: electrostatic model (R2 = 0.86). Panel B: PCM (R2 = 0.93 for Q = -2, R2 = 0.99 for Q = -3). Axes: RCB (eV) versus S-1 (a.u.-1).]
Figure 5.6: Correlation between the RCB and S-1. The RCB (eV) is calculated with A. the electrostatic model (eq. 5.2) and B. the PCM model (eq. 5.1). S (a.u.) of the (n-1)-charged system is calculated with eq. 5.14. Numbering of the data points: 1 = AsO43-, 2 = SO42-, 3 = PO43-, 4 = SeO42-, 5 = HAsO42-, 6 = HPO42-, 7 = N3-, 8 = S2-, 9 = O2-.
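As a rough numerical illustration of eq. 5.15, the sketch below uses only the four electrostatic-model entries of Table 5.2, divides each RCB by Q^(4/3) and correlates the result with S^-1. It is not a reproduction of Figure 5.6, which is based on nine anions. The helper `r_squared` is hypothetical, and Q is assumed to be the charge magnitude of the (n-1)-charged system (1 for di-anions, 2 for tri-anions).

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# (electrostatic RCB in eV, assumed Q of the (n-1)-system, S(n-1) in a.u.), Table 5.2
table_5_2 = [(3.62, 1, 16.18),   # SO4 2-
             (3.07, 1, 17.57),   # SeO4 2-
             (8.25, 2, 17.09),   # PO4 3-
             (7.37, 2, 18.49)]   # AsO4 3-

inverse_softness = [1.0 / s for _, _, s in table_5_2]
# Eq. 5.15: RCB ~ S^-1 * Q^(4/3), so RCB / Q^(4/3) should scale with S^-1.
scaled_rcb = [rcb / q ** (4.0 / 3.0) for rcb, q, _ in table_5_2]

r2 = r_squared(inverse_softness, scaled_rcb)
print(f"R^2 of RCB/Q^(4/3) versus 1/S for four anions: {r2:.2f}")
```

Dividing by Q^(4/3) places di- and tri-anions on a common scale, which is an alternative to treating the two charge classes separately.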
In all three approximations used in this work, the non-locality and energy dependence of the RCB are not considered. With Green's function theory, non-local and energy-dependent RCBs can be obtained. Exact GF results are, however, very difficult to calculate and moreover very hard to interpret using the softness concept. Since the relation between the RCB and the global softness of the system forms the aim of our work, we opted for a methodology involving approximations for the RCB from which a direct relationship with the global softness emerges. Although the results obtained with the DFT based model are qualitatively similar to those obtained with the electrostatic and the PCM model, the numerical correlation between the RCB calculated with the DFT based model and S^-1 is lost. The importance of the DFT based model lies in the explicit presence of the global softness of the (N-1)-system in the equation. Numerically, the S-dependent term dominates in eq. 5.3. Although showing qualitative agreement, the behaviour with R of the potential calculated with the DFT based model deviates quantitatively from that of the potential curves calculated with the two other models (Fig. 5.3). In the DFT based model, the shape of the potential curve is largely dependent on the polarization term of eq. 5.3, which relies on a local model of the softness kernel s(r,r'). It is also not excluded that these discrepancies are due to the inherent difficulties of evaluating S for highly
negatively charged systems. In addition, the relation between S and α is not exact, from which further deviations can arise. Softness, or its inverse, hardness, describes the ability to retain electronic charge once the charge has been acquired. Further, S can be approximated as the inverse of the ionization energy (when very small or negative values of the electron affinity are neglected in eq. 5.10), which is a measure of the system's stability towards electron emission. As a consequence, through eq. 5.3, the RCB is expressed in terms of properties describing the ease of the evolution of a system from N to (N-1) electrons.
4.4 Stabilization of MCAs

The significant reduction of the RCB of solvated MCAs relative to bare ones, and the continuous decrease of the RCB as more solvent is added, have been described before11,28. In analogy to these studies, the stabilisation of the prototypical MCA SO42- in solvent is followed. Instead of adding solvent molecules in a discrete model, the dielectric constant of the continuum model is changed. RCB values are calculated with both the electrostatic model (eq. 5.2) and the PCM model (eq. 5.1). Both methods show a decreasing RCB when the dielectric constant increases (Fig. 5.7), implying that the higher the dielectric constant, the more the multiply charged anion is stabilized. Although the same behaviour of the RCB as a function of the dielectric constant is found for both calculation methods, the RCB decreases faster and finally disappears when the PCM model is used. The stabilization of the multiply charged SO42- in solvent is in line with the literature11,28 and in agreement with the positive and increasing ionisation energies, and the negative and decreasing HOMO energies, found when the dielectric constant of the solvent increases (Table 5.3).
ε | IE | εHOMO | RCB Point Charge Model | RCB Electrostatic Model
0 | -0.048 | 0.132 | 3.847 | 3.627
1.43 | 0.035 | -0.157 | 1.154 | 3.305
2.247 | 0.105 | -0.203 | 0.638 | 2.973
4.9 | 0.172 | -0.248 | 0.157 | 2.451
10.36 | 0.202 | -0.267 | 0.041 | 2.158
20.7 | 0.215 | -0.276 | 0.000533 | 1.929
78.39 | 0.225 | -0.283 | 0.000132 | 1.675

Table 5.3: Solvent effects on the RCB. Ionisation (IE) and HOMO (εHOMO) energies (a.u.) and RCB values (eV) evaluated at the B3LYP/6-31+G** level in the positive direction of the threefold X-O axis of SO42- with the electrostatic (eq. 5.2) and the PCM (eq. 5.1) models for different values of ε.
Figure 5.7: Solvent effects on the RCB. Study of the RCB potential of SO42- obtained with A. the electrostatic model (eq. 5.2) and B. the PCM model (eq. 5.1) (threefold axis, X-O direction) for different values of ε. Colour code: ε = 1.43 (argon): blue; ε = 2.247 (benzene): pink; ε = 4.9 (chloroform): orange; ε = 10.36 (dichloromethane): light blue; ε = 20.7 (acetone): red; ε = 78.39 (water): green.
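The different rates at which the two models lose their barrier can be read off Table 5.3 by expressing each RCB as a fraction of its gas-phase value. The sketch below does this with the tabulated numbers. The helper name is hypothetical, and the first row of the table (listed with a value of 0) is taken as the gas-phase reference because its entries coincide with those of Tables 5.1 and 5.2.

```python
# Gas-phase reference (first row of Table 5.3): (RCB PCM, RCB electrostatic) in eV.
gas_phase = (3.847, 3.627)

# Table 5.3: dielectric constant -> (RCB PCM, RCB electrostatic), in eV.
table_5_3 = {1.43: (1.154, 3.305), 2.247: (0.638, 2.973), 4.9: (0.157, 2.451),
             10.36: (0.041, 2.158), 20.7: (0.000533, 1.929), 78.39: (0.000132, 1.675)}


def remaining_fraction(rcb, reference):
    """Fraction of the gas-phase barrier that survives in the dielectric."""
    return rcb / reference


for epsilon, (pcm, electrostatic) in sorted(table_5_3.items()):
    print(f"eps = {epsilon:6.2f}: "
          f"PCM {remaining_fraction(pcm, gas_phase[0]):.4f}, "
          f"electrostatic {remaining_fraction(electrostatic, gas_phase[1]):.4f}")
```

Listing fractions rather than absolute values makes the contrast explicit: in water the PCM barrier has essentially vanished, while the electrostatic model retains a substantial part of its gas-phase barrier.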
5. Conclusion
RCBs have been calculated for the biologically important tetrahedral MCAs AsO43-, PO43-, SO42- and SeO42-. Values of comparable magnitude are found with the electrostatic and the PCM model. Qualitatively, the RCB obtained with the DFT based model displays the same pattern, but quantitatively it depends strongly on the calculation of the global softness S, a property that is non-trivial to evaluate for MCAs. The RCB values for the di-anionic systems are considerably lower than those for the tri-anionic systems, in accordance with the strong dependence of the RCB on the electrostatic repulsion exerted by the excess negative charge within an MCA. An inverse correlation between the RCB and the global softness of the (n-1)-charged system is deduced from eq. 5.2. A numerical relationship is found between the RCB and S^-1 when the RCB is calculated with the electrostatic and the PCM model. This relation is not found when the RCB values are obtained with the DFT based model. This approach, however, has the advantage of expressing the RCB in terms of molecular reactivity descriptors, describing the evolution of a system when passing from N to (N-1) electrons and vice versa. A decreasing RCB, implying an increasing stabilization of MCAs, is found when increasing the dielectric constant in a continuum solvent model.
References
1. Roos, G., Loverix, S., De Proft, F., Wyns, L., Geerlings, P., J. Phys. Chem. A 2003, 107, 6828.
2. Roos, G., Messens, J., Loverix, S., Wyns, L., Geerlings, P., J. Phys. Chem. B 2004, 108, 17216.
3. Boldyrev, A., Simons, J., J. Phys. Chem. 1994, 98, 2298.
4. a. Lindblow-Kull, C., Kull, F. J., Shrift, A., J. Bact. 1985, 163, 1267. b. Pflugrath, J. W., Quiocho, F. A., J. Mol. Biol. 1988, 200, 163. c. Zhang, Z-Y., Critical Reviews in Biochemistry and Molecular Biology 1998, 33, 1. d. Bebien, M., Kirsch, J., Mejean, V., Vermeglio, A., Microbiology 2002, 148, 3865.
5. Wang, X.-B., Nicholas, J. B., Wang, L.-S., J. Chem. Phys. 2000, 113, 653.
6. Skurski, P., Simons, J., Wang, X.-B., Wang, L.-S., J. Am. Chem. Soc. 2000, 122, 4499.
7. Simons, J., Skurski, P., Barrios, R., J. Am. Chem. Soc. 2000, 122, 11893.
8. Dreuw, A., Cederbaum, L. S., Phys. Rev. A 2000, 63, 049904.
9. Dreuw, A., Cederbaum, L. S., Chem. Rev. 2002, 102, 181.
10. See for example: Asthagiri, D., Dillet, V., Liu, T., Noodleman, L., Van Etten, R., Bashford, D., J. Am. Chem. Soc. 2002, 124, 10225.
11. Yang, X., Wang, X-B., Wang, L-S., J. Phys. Chem. A 2002, 106, 7607.
12. a. Parr, R. G., Yang, W., Density-Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1989. b. Parr, R. G., Yang, W., Ann. Rev. Phys. Chem. 1995, 46, 701. c. Geerlings, P., De Proft, F., Langenaeker, W., Adv. Quant. Chem. 1999, 33, 303. d. Chermette, H., J. Comp. Chem. 1999, 20, 129. e. Geerlings, P., De Proft, F., Int. J. Quant. Chem. 2000, 80, 227. f. De Proft, F., Geerlings, P., Chem. Rev. 2001, 101, 1451. g. Geerlings, P., De Proft, F., Langenaeker, W., Chem. Rev. 2003, 103, 1793.
13. Yang, W., Parr, R. G., Proc. Natl. Acad. Sci. U.S.A. 1985, 82, 6723.
14. Langenaeker, W., De Proft, F., Tielens, F., Geerlings, P., Chem. Phys. Lett. 1998, 228, 628.
15. Murray, J. S., Sen, K., ed. Molecular Electrostatic Potentials Concepts and Applications, Elsevier, Amsterdam, Lausanne, New York, Oxford, Shannon, Tokyo, 1996.
16. Berkowitz, M., Parr, R. G., J. Chem. Phys. 1988, 88, 2554.
17. Vela, A., Gázquez, L. J., J. Am. Chem. Soc. 1990, 112, 1490.
18. a. Cohen, M. H., Ganduglia-Pirovano, M. V., Kudrnovsky, J., J. Chem. Phys. 1994, 101, 8988. b. Cohen, M. H., Ganduglia-Pirovano, M. V., Kudrnovsky, J., J. Chem. Phys. 1995, 103, 3543. c. Geerlings, P., De Proft, F., Balawender, R., Invited contribution to Reviews of Modern Quantum Chemistry, A Celebration to the Contributions of Parr, R. G., Sen, K. D., ed., World Scientific, Singapore, 2002.
19. Politzer, P., J. Chem. Phys. 1987, 86, 1072.
20. a. Lee, C., Yang, W., Parr, R. G., Phys. Rev. 1998, 37, 2. b. Becke, A. D., J. Chem. Phys. 1993, 98, 5648.
21. For a detailed account of this type of basis set see, for example, Hehre, W., Radom, L., Schleyer, P. v. R., Pople, J. A., Ab Initio Molecular Orbital Theory, Wiley, New York, 1986.
22. a. Wiberg, K. B., Keith, T. A., Frisch, M. J., Murcko, M., J. Phys. Chem. 1995, 99, 9072. b. Foresman, J. B., Keith, T. A., Wiberg, K. B., Snoonian, J., Frisch, M. J., J. Phys. Chem. 1996, 100, 16096. c. Foresman, J. B., Frisch, AE. Exploring Chemistry with Electronic Structure Methods, 2nd ed. Gaussian, Inc, Pittsburgh, 1996. d. Safi, B., Choho, K., De Proft, F., Geerlings, P., J. Phys. Chem. 1998, 102, 5253.
23. Gaussian 03, Revision A.1, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A. Pople, Gaussian, Inc., Pittsburgh PA, 2003.
24. De Proft, F., Langenaeker, W., Geerlings, P., J. Phys. Chem. 1993, 97, 1826.
25. Koopmans, T. A., Physica 1933, 1, 104.
26. Nguyen, L. T., De Proft, F., Casas Amat, M., Van Lier, G., Fowler, P. W., Geerlings, P., J. Phys. Chem. A 2003, 107, 6837.
27. a. Ghanty, T. K., Ghosh, S. K., J. Phys. Chem. 1993, 97, 4951. b. Nagle, J. K., J. Am. Chem. Soc. 1990, 112, 1490.
28. Stefanovich, E. V., Boldyrev, A. I., Truong, T. N., Simons, J., J. Phys. Chem. B 1998, 102, 4205.
Chapter VI Intermezzo Origin of the pKa perturbation of N-terminal cysteine in α- and 310-helices: a computational DFT study
The great tragedy of science ... the slaying of a beautiful theory by an ugly fact.
(T. H. Huxley)
Chapter VI: pKa perturbation of N-terminal cysteine

It is well documented that helices in proteins can decrease the pKa of residues located at the N-terminus, but the real nature of this perturbation remains unclear. In the present work, the origin of the effect of 310- and α-polyalanine helices on the pKa of an N-terminal cysteine residue is examined in the gas phase as well as in aqueous solution, by means of Density Functional Theory. In a systematic study of the helix dipole, the proton affinity (PA) and the pKa of the N-terminal cysteine in relation to both the helix length and the strength of the hydrogen bonds between the helix backbone amides and Sγ of the N-terminal cysteine, a direct relation between the terminal hydrogen bonds and the pKa perturbation is revealed.
1. Introduction
In Chapter IV it is shown that the α-helix possessed by pI258 arsenate reductase decreases the pKa of the N-terminal nucleophilic Cys10 by 0.2 units1. This is in accordance with other experimental and computational studies, which document that helices can influence the pKa of residues located at either the N- or C-terminus. Among others, a pKa increase of 0.6, 1.6 and 2.2 units was measured for the C-terminal histidine residue in triosephosphate isomerase2, barnase3 and human hemoglobin4, respectively. Similar studies on an N-terminal cysteine residue found a pKa decrease of 1.8 and 2.0 units in rhodanese5 and human thioredoxin6, respectively. The pKa of N-terminal aspartate in an experimentally designed dodecapeptide is suppressed by 0.6 units7. Earlier quantum chemical studies on papain have shown that a helix near the active site facilitates the proton transfer from the N-terminal serine to the catalytic histidine8. What is the origin of the effect of 310- and α-polyalanine helices on the pKa? The present work gives an answer by examining the effect of 310- and α-polyalanine helices on the properties of an N-terminal cysteine residue in the gas phase and in aqueous solution. Elements of secondary structure such as 413 (or α-) and 310-helices are ubiquitous and important structural features in proteins9,10,11. α-helices are the most common type of secondary structure, while 310-helices are the fourth most common type. Most α-helices in proteins contain 10-15 residues12, while 310-helices usually form short sequences of 4-6 amino acids12. The symbols 310 and 413 imply that the intramolecular hydrogen bonds between the backbone carbonyl oxygens of residue i and the amide protons of residue i + 3 or i + 4 form a ring containing 3 or 4 sequential carbonyl oxygens, consisting of 10 or 13 atoms, respectively. Consequently, for the same number of amino acids, 310-helices have the larger number of hydrogen bonds.
Nevertheless, the α-helix is dominant in protein structures13,14,15. The helix macro-dipole is the vector sum of the micro-dipole moments of the individual peptide units and is oriented along the helix axis. A single peptide unit has a considerable dipole moment due to the partial double bond character of the N-C bond15,16. A commonly accepted value for this dipole moment is 3.5 D17. The direction of the dipole is parallel to the C=O and N-H bonds (Fig. 6.1).
Figure 6.1: Dipole of one peptide unit.
In an α-helix, the peptide units are aligned in such a manner that ~97% of the peptide dipole moments point in the direction of the helix axis18. The 3₁₀-conformation is disfavoured because the H-bond geometry is less optimal, leading to a less favourable orientation of the micro-dipoles with respect to the 3₁₀-helix axis14. The effect of the α-helix dipole has been suggested to be equivalent to that of a -0.5 unit charge at the C-terminus plus a +0.5 unit charge at the N-terminus of the helix17,19. The dipolar nature of α-helices has been invoked to explain the structure of ligand binding sites17, the relative disposition of helices in proteins20,21 and the clustering of positive and negative charges toward the C- and N-termini of the helices22. The interaction between α-helix dipoles and charged residues in small peptides23,24 as well as in proteins4,25,26 has been acknowledged to contribute to stability. The macro-dipole concept has been re-examined several times in the literature. Because the often-used term "helix macro-dipole" might have given the impression that both ends of the helix contribute significantly to the overall effect, the short-range nature of the helix effect has not been appreciated for a long time27. Electrostatic free energy calculations suggest that the first turn of the helix (e.g. by providing hydrogen bonds) accounts for about 80% of the overall charge-stabilization effect27, while a mutagenesis study gives further indication of the importance of the terminal hydrogen bonds in pKa perturbation19. Similarly, a Hartree-Fock study on papain attributes more than half of the helical effect to hydrogen bonds with the backbone rather than to the macro-dipole28. Our major objective is to investigate the individual roles of the hydrogen bonds with the N-terminal cysteine and of the helix dipole in pKa perturbation.
Hence, DFT calculations of the helix dipole, the proton affinity (PA) and the pKa of the N-terminal cysteine were carried out as a function of both the helix length and the strength of the hydrogen bonds between the helix backbone amides and Sγ of the N-terminal cysteine.
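The vector-sum picture of the macro-dipole described above can be illustrated with a minimal geometric sketch. It is not part of the DFT calculations of this chapter. It assumes that the "~97%" figure can be read as the axial fraction of each 3.5 D micro-dipole, and that successive peptide units are rotated by 100 degrees around the axis (an ideal α-helix with 3.6 residues per turn). The function name and constants are illustrative only, and the resulting numbers are not expected to reproduce the calculated dipoles.

```python
import math

PEPTIDE_DIPOLE_D = 3.5        # micro-dipole of one peptide unit (Debye), value quoted in the text
AXIAL_FRACTION = 0.97         # assumption: share of each micro-dipole along the helix axis
TURN_PER_RESIDUE_DEG = 100.0  # assumption: ideal alpha-helix, 3.6 residues per turn

def macro_dipole(n_units):
    """Return (axial, perpendicular, total) dipole in Debye for n_units peptide units."""
    axial_each = PEPTIDE_DIPOLE_D * AXIAL_FRACTION
    perp_each = PEPTIDE_DIPOLE_D * math.sqrt(1.0 - AXIAL_FRACTION ** 2)
    axial = n_units * axial_each
    px = py = 0.0
    for k in range(n_units):
        # Perpendicular components rotate around the axis and largely cancel.
        angle = math.radians(k * TURN_PER_RESIDUE_DEG)
        px += perp_each * math.cos(angle)
        py += perp_each * math.sin(angle)
    perp = math.hypot(px, py)
    return axial, perp, math.hypot(axial, perp)

if __name__ == "__main__":
    for n in (2, 4, 8, 18):
        axial, perp, total = macro_dipole(n)
        print(f"{n:2d} units: axial {axial:6.2f} D, perpendicular {perp:5.2f} D, total {total:6.2f} D")
```

In this idealised model the axial component grows linearly with the number of peptide units, whereas the perpendicular components never add up to more than a small residual.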

(Ala)n (with n = 2, 3, 4, 6, 8) and Cys1-(Ala)n (with n = 1, 2, 3, 5, 7) polypeptide chains with an α- and a 3₁₀-helical conformation were constructed. The torsional angle φ defines the rotation of the plane containing Cαi, Ci and Oi (and Ni+1) around the Ni-Cαi bond, while the angle ψ defines the rotation of the plane containing Ci, Oi and Ni+1 around the Cαi-Ci bond. Right-handed α-helices have typical torsion angles of φ = -57 degrees and ψ = -47 degrees, whereas 3₁₀-helices have torsion angles of φ = -49 degrees and ψ = -26 degrees. Full relaxation of the native geometries of clear α- and 3₁₀-helical forms converges to very similar intermediate structures, as judged by their dihedral angles (Table 6.1). This is especially true for n = 4 and in accordance with ref 13. When the number of amino acids is increased, or in the presence of the N-terminal cysteine, the geometries of the optimised structures diverge more, but the optimised φ and ψ angles still deviate significantly from their starting values (Table 6.1). Therefore, during optimization, the torsional φ and ψ angles were kept fixed to ensure that the desired helix type was preserved.
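As an aside, the φ and ψ angles discussed above are ordinary dihedral angles over four consecutive backbone atoms: C(i-1), N(i), Cα(i), C(i) for φ, and N(i), Cα(i), C(i), N(i+1) for ψ. The following sketch shows one common way to evaluate such an angle from Cartesian coordinates. The helper names are hypothetical, and the coordinates in the usage line are toy values, not taken from the optimised helices.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    # Toy geometry: the fourth atom is rotated a quarter turn out of the plane.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a real helix, φ of residue i would be obtained by passing the coordinates of C(i-1), N(i), Cα(i) and C(i) in that order.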
4 amino acids
Structure | φ (°), residues 1-4 | ψ (°), residues 1-4
Intermediate 1 | -120, -68, -70, -104 | 16, -22, -8, -11
Intermediate 2 | -121, -68, -68, -104 | 16, -22, -8, -11
S_Intermediate 1 | -74, -64, -67, -98 | -79, -23, -11, 7
S_Intermediate 2 | -78, -68, -79, -115 | -75, -15, -14, -27

6 amino acids
Structure | φ (°), residues 1-6 | ψ (°), residues 1-6
Intermediate 1 | -114, -65, -62, -63, -71, -103 | 9, -26, -18, -20, -6, -11
Intermediate 2 | -113, -65, -62, -63, -70, -103 | 9, -26, -18, -19, -8, -11
S_Intermediate 1 | -75, -62, -59, -63, -70, -99 | -83, -25, -22, -21, -8, 7
S_Intermediate 2 | -77, -60, -63, -77, -62, -77 | 77, -31, -33, -37, -24, -11

Table 6.1: Structural data of the optimised α- and 3₁₀-helices. φ and ψ torsional angles (°) of fully optimised α-helices (Intermediate 2) and 3₁₀-helices (Intermediate 1) of 4 and 6 amino acids.
For convenience the helices are named as follows: type of helix + number of amino acids; when the N-terminal residue is a cysteine, an S is put in front of the name, e.g. α-4: α-helix composed of 4 amino acids; S-α-4: α-helix composed of 4 amino acids of which the first is a cysteine. Helices obtained after full optimization starting from the 3₁₀- and α-helical forms are named Intermediate 1 and Intermediate 2, respectively. Since dipoles of charged species are origin dependent, the N-terminal nitrogen atom was chosen as origin to ensure a meaningful comparison between the helix dipoles in the presence of a charged N-terminal cysteine. To translate changes in the NH proton donor stretching frequencies into changes in H-bond strength, we calculated the stretching frequencies of a series of five CH3-CO-NH(---SX)-CH3 hydrogen bonds (with X = -CH3, -CH2F, -CHF2, -CF3 and -CH2CH2F) (Fig. 6.2) and plotted these values against the calculated hydrogen bond strengths. The basis set superposition error (BSSE) was taken into account by the counterpoise correction (CP) proposed by Boys and Bernardi29. The CH3-CO-NH(---SX)-CH3 structures were fully optimised (Fig. 6.3).
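The counterpoise scheme mentioned above corrects the interaction energy by evaluating both monomers in the full basis set of the complex (with ghost functions on the partner). A minimal sketch of the resulting arithmetic is given below. The function name is hypothetical and the energies in the usage line are placeholder numbers, not results of this work.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor

def counterpoise_interaction_energy(e_complex, e_donor_ghost, e_acceptor_ghost):
    """Counterpoise-corrected interaction energy in kcal/mol.

    All three energies are in hartree and are assumed to be computed in the
    full basis set of the complex (monomers with ghost atoms of the partner).
    """
    delta_hartree = e_complex - e_donor_ghost - e_acceptor_ghost
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL

if __name__ == "__main__":
    # Placeholder energies (hartree), chosen only to illustrate the call.
    print(counterpoise_interaction_energy(-1.0, -0.6, -0.39))
```

A negative value corresponds to a stabilizing hydrogen bond, in line with the sign convention used for the hydrogen bond strengths in this chapter.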
[Figure 6.2: plot of hydrogen bond strength (kcal/mol, axis from -15 to -25) against N-H stretching frequency (cm-1, axis from 2800 to 3300), with data points labelled -CF3, -CHF2, -CH2F and -CH3; linear fit with R2 = 0.97.]
Figure 6.2: Correlation between the stretching frequency and hydrogen bond strength. Stretching frequencies of the bond between the hydrogen donor (nitrogen atom) and the hydrogen atom are calculated at the B3LYP/6-31G level. The hydrogen bond strengths are calculated at the B3LYP/6-31+G** level, with the counterpoise method29 to correct for basis set superposition errors. Five CH3-CO-NH(---SX)-CH3 hydrogen bonds are considered (with X = -CH3, -CH2F, -CHF2, -CF3 and -CH2CH2F).
Figure 6.3: Example of a CH3-CO-NH(---SX)-CH3 hydrogen bond. Colour code: yellow: sulfur; blue: nitrogen; white: hydrogen; red: oxygen; grey: carbon; green: fluorine.
To translate changes in NPA charge to changes in the acid dissociation constant (pKa), we calculated the NPA charges on the sulfur atom of a series of seven thiolates (methanethiol, benzenemethanethiol, mercaptoethanol, cysteine, trifluoroethanethiol, thioacetic acid and trifluoromethanethiol) in gas phase as well as in aqueous solution and plotted these values against experimental pKa values (Fig. 6.4 and Fig. 6.5). The thiolate structures are fully optimised.
Figure 6.4: NPA-charge-pKa calibration curve in gas phase. This curve is obtained for a series of substituted thiolates (methanethiol (1), benzenemethanethiol (2), mercaptoethanol (3), cysteine (4), trifluoroethanethiol (5), thioacetic acid (6) and trifluoromethanethiol (7)). NPA charges were calculated on the sulfur atom of the thiolates.
Figure 6.5: NPA-charge-pKa calibration curve in aqueous solution. This curve is obtained for a series of substituted thiolates (methanethiol (1), benzenemethanethiol (2), mercaptoethanol (3), cysteine (4), trifluoroethanethiol (5), thioacetic acid (6) and trifluoromethanethiol (7)). NPA charges were calculated on the sulfur atom of the thiolates.
The resulting linear relationship was used to assess the pKa of the N-terminal cysteine in the helices from the calculated NPA charge on the Sγ atom of the cysteine. To account for solvent effects, the PCM30 solvent model was used. All optimizations were performed at the B3LYP level using the 6-31G* basis set, and subsequent calculations were performed at the B3LYP level with a 6-31+G** basis set. The Gaussian 03 package31 was used throughout.

3.1 Gas phase study
3.1.1 Hydrogen bonds formed with the N-terminal cysteine
The N-terminal cysteine accepts two hydrogen bonds from nearby helix backbone amides. For an equal number of residues, these hydrogen bonds are shorter in the α-helices than in the 3₁₀-helices (Table 6.2). When the number of amino acids of the helix increases, the hydrogen bond length slightly decreases. In order to give a quantitative assessment of the intra-molecular hydrogen bonds, the stretching frequencies of the NH bonds involved in hydrogen bonding were calculated. Upon formation of a hydrogen bond, a red shift in these NH stretching frequencies occurs. To translate the magnitude of this red shift into a hydrogen bond strength, a series of CH3-CO-NH(---SX)-CH3 hydrogen-bonded model systems was used. In this series, plotting the calculated H-bond strengths versus the calculated NH bond stretching frequencies yields a linear correlation (Fig. 6.2). This curve was used to estimate the hydrogen bond strength in the helices from the calculated stretching frequencies (Table 6.2). For both the α- and 3₁₀-helices, the hydrogen bond strength increases with increasing number of amino acids. These results also indicate that the hydrogen bonds between the helix backbone and Sγ of the cysteine are stronger in the α-helices than in the 3₁₀-helices.
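The frequency-to-strength conversion described above amounts to evaluating a straight line. The numerical data of the five model complexes behind Fig. 6.2 are not listed in the text, so the sketch below instead recovers an approximate line from the NH1---S entries of the S-α helices in Table 6.2, which were themselves obtained with that calibration. The function names are hypothetical and the fitted coefficients are only an approximation to the calibration line actually used.

```python
# (NH stretching frequency in cm-1, H-bond strength in kcal/mol), NH1---S bond
# of the S-alpha helices with 2, 3, 4, 6 and 8 residues, copied from Table 6.2.
NH1_SERIES = [(3138, -18.7), (3068, -19.9), (3001, -21.0), (2929, -22.2), (2891, -22.9)]

def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def hbond_strength(frequency, slope, intercept):
    """Estimated H-bond strength (kcal/mol) for a given NH stretching frequency."""
    return slope * frequency + intercept

if __name__ == "__main__":
    slope, intercept = fit_line(NH1_SERIES)
    print(f"strength = {slope:.4f} * frequency + ({intercept:.1f})")
    # Cross-check against an entry of the 3_10 series that was not used in the fit.
    print(f"3405 cm-1 -> {hbond_strength(3405, slope, intercept):.1f} kcal/mol")
```

A lower (more red-shifted) frequency maps onto a more negative, i.e. stronger, hydrogen bond.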
3.1.2 Helical length
The macro-dipole increases with the number of amino acids (and consequently with the number of micro-dipoles) (Table 6.3). The less optimal orientation of the hydrogen bonds and micro-dipoles in a 3₁₀-helix14 can explain the lower value of the macro-dipole of a 3₁₀-helix compared to an α-helix of the same length. The dipoles of the fully optimised 3₁₀- and α-helices (Intermediate 1 and Intermediate 2) of the same length differ only very slightly (Table 6.3), illustrating the convergence of the structures during a full optimization (see also Table 6.1). When the N-terminal cysteine is present, the value of the macro-dipole increases due to the electrostatic interaction between the charged sulfur atom and the helix backbone (Table 6.3). This interaction seems to cause a better conservation of the initial helical structure during a full optimization (Tables 6.1 and 6.3).
Each cell lists the value for the NH(N-terminus)---S bond followed by the value for the NH1---S bond.

A. Pure 3₁₀- and α-helices
Helix | Residues | l (Å) | a (°) | ν (cm-1) | H-bond strength (kcal/mol)
S-3₁₀ | 2AA | 3.01 / 3.41 | 119 / 125 | 3277 / 3405 | -16.3 / -14.2
S-3₁₀ | 3AA | 3.01 / 3.37 | 120 / 126 | 3272 / 3378 | -16.4 / -14.6
S-3₁₀ | 4AA | 3.01 / 3.34 | 119 / 125 | 3272 / 3320 | -16.4 / -15.6
S-3₁₀ | 6AA | 3.01 / 3.31 | 118 / 128 | 3282 / 3303 | -16.3 / -15.9
S-3₁₀ | 8AA | 3.01 / 3.29 | 119 / 128 | 3270 / 3293 | -16.5 / -16.1
S-α | 2AA | 3.04 / 3.20 | 119 / 142 | 3329 / 3138 | -15.5 / -18.7
S-α | 3AA | 3.04 / 3.19 | 119 / 144 | 3336 / 3068 | -15.3 / -19.9
S-α | 4AA | 3.04 / 3.16 | 119 / 142 | 3322 / 3001 | -15.6 / -21.0
S-α | 6AA | 3.03 / 3.13 | 119 / 145 | 3308 / 2929 | -15.8 / -22.2
S-α | 8AA | 3.03 / 3.12 | 119 / 145 | 3305 / 2891 | -15.9 / -22.9

B. Intermediate structures
Helix | Residues | l (Å) | a (°) | ν (cm-1) | H-bond strength (kcal/mol)
S_Intermediate 1 | 4AA | 3.08 / 3.10 | 116 / 149 | 3388 / 2892 | -14.5 / -22.8
S_Intermediate 1 | 6AA | 3.10 / 3.07 | 116 / 149 | 3388 / 2838 | -14.5 / -23.8
S_Intermediate 2 | 4AA | 3.09 / 3.09 | 116 / 149 | 3383 / 2856 | -14.5 / -23.5
S_Intermediate 2 | 6AA | 3.09 / 3.06 | 116 / 149 | 3381 / 2798 | -14.6 / -24.4

Table 6.2: Characteristics of the hydrogen bonds formed between Sγ of Cys10 and the helix backbone in gas phase. Length (l) (distance measured in Å from H-donor to H-acceptor) and donor-hydrogen-acceptor angle a (°) of the hydrogen bonds formed between Sγ of the cysteine and the helix backbone. Stretching frequencies ν (cm-1) of the bond between the hydrogen donor and the hydrogen, and hydrogen bond strengths (kcal/mol) obtained from these frequencies via the linear relationship of Figure 6.2, A. for the pure 3₁₀- and α-helices and B. for the intermediate structures, as a function of the number of residues.

Dipole (D), B3LYP/6-31G*
Residues | Alanine | Cysteine | 3₁₀ | α | S_3₁₀ | S_α | Intermediate 1 | Intermediate 2 | S_Intermediate 1 | S_Intermediate 2
1 AA | 3.6407 | 4.3963 | | | | | | | |
2 AA | | | 3.60 | 4.78 | 10.63 | 9.67 | | | |
3 AA | | | 7.74 | 8.20 | 12.28 | 11.29 | | | |
4 AA | | | 10.47 | 11.88 | 16.01 | 13.18 | 8.23 | 8.22 | 10.72 | 11.87
6 AA | | | 18.70 | 19.26 | 24.50 | 23.32 | 16.33 | 16.38 | 20.32 | 21.16
8 AA | | | | | 34.33 | 32.39 | - | | |
Table 6.3: Macro-dipole moments of the different helical types with different length.
Surprisingly, in the presence of an N-terminal cysteine, the α-helix macro-dipole is smaller than that of a 3₁₀-helix. The hydrogen bonds formed between the α-helical backbone and Sγ of the cysteine are stronger than those formed between the 3₁₀-helical backbone and Sγ (Table 6.2). It would be interesting to know to what extent the dipole effect and the terminal hydrogen bonds contribute to the pKa perturbation of the cysteine.
3.1.3 Proton affinity and pKa
From Table 6.4 it can be seen that the proton affinity (PA) of the N-terminal cysteine decreases when the number of amino acids increases. In parallel, the macro-dipole increases, as does the strength of the backbone amide-Sγ hydrogen bonds. For helices of the same length, the PA of the N-terminal cysteine is lower in α-helices than in 3₁₀-helices. This observation can be explained by the stronger backbone amide-Sγ hydrogen bonds found in the α-helices (Table 6.2). Although the dipole moment, and its interaction with the charged cysteine residue, is larger in the 3₁₀-helix, the net effect on the PA is larger in the α-helix. This indicates that it is mainly the hydrogen bonds with Sγ that influence the PA and, by extension, the pKa, since a relation exists between PA and pKa32. Among others, the NPA charge has been shown to be an effective descriptor for the pKa1,32. For a series of thiolates, a linear relationship is obtained between the NPA charge of the sulfur atom and the experimental pKa (Fig. 6.4). The more negative the NPA charge on the sulfur atom, the higher the tendency to bind a proton and, as a result, the more basic (i.e. higher pKa) the compound is. This linear relationship can be used as a calibration curve to quantify the pKa perturbing effect.
Structure | PA Sγ (a.u.) | NPA Sγ (a.u.) | pKa | Additional pKa decrease per amino acid
Cysteine | -0.556 | -0.684 | 8.3 |
3₁₀-helices
S_3₁₀_2 | -0.528 | -0.642 | 6.8 | -1.9
S_3₁₀_3 | -0.519 | -0.633 | 6.4 | -0.4
S_3₁₀_4 | -0.514 | -0.626 | 6.1 | -0.3
S_3₁₀_6 | -0.503 | -0.621 | 5.9 | -0.2
S_3₁₀_8 | - | -0.616 | 5.7 | -0.2
α-helices
S_α_2 | -0.519 | -0.623 | 6.0 | -2.7
S_α_3 | -0.510 | -0.610 | 5.5 | -0.5
S_α_4 | -0.502 | -0.600 | 5.1 | -0.4
S_α_6 | -0.478 | -0.586 | 4.6 | -0.5
S_α_8 | - | -0.578 | 4.3 | -0.3
Intermediate structures
S_Intermediate 1_4 | -0.508 | -0.601 | 5.2 |
S_Intermediate 1_6 | -0.496 | -0.593 | 4.8 |
S_Intermediate 2_4 | -0.501 | -0.594 | 4.9 |
S_Intermediate 2_6 | -0.497 | -0.584 | 4.5 |

Table 6.4: Proton affinity (PA), NPA charge and pKa of the N-terminal Sγ of cysteine in 3₁₀-helices, in α-helices and in Intermediate structures in gas phase.
When calibrating the calculated NPA charges of the Sγ atoms of the N-terminal cysteines, a pKa decrease with the number of amino acids is found. For helices of the same length, a lower pKa value of the N-terminal cysteine residue is found in the α-helix than in the 3₁₀-helix (Table 6.4). This is in accordance with the trends indicated for the PA and in agreement with the higher hydrogen bond strengths found for hydrogen bonds with the N-terminal Sγ atom in the α-helices as compared to the 3₁₀-helices. Compared to isolated cysteine, the addition of one extra amino acid causes the formation of two hydrogen bonds between the backbone amides and Sγ (Table 6.2). These two hydrogen bonds are responsible for a substantial decrease in pKa (1.9 units in 3₁₀-helices and 2.7 units in α-helices) (Table 6.4). The addition of a third, fourth or later residue in the conformation of a 3₁₀- or α-helix strengthens these hydrogen bonds (Table 6.2). The more residues the helix counts, the larger the macro-dipole and, apparently, the stronger the hydrogen bonds to Sγ. The latter may be expected from the electrostatic nature of hydrogen bonds. Per extra residue, a further pKa decrease is found, but to a lesser extent (Table 6.4). This decrease diminishes for every additional residue, leading to a plateau value. As a result, additional residues in a helix strengthen the terminal hydrogen bonds, but have only a subordinate effect on the pKa. Interestingly, the macro-dipole increases linearly with the number of residues (Table 6.3), in contrast to the pKa effect, suggesting a subordinate effect of the helix macro-dipole. The pKa and PA values of the N-terminal cysteines present in the intermediate helical structures obtained after full optimization (Intermediate 1 and Intermediate 2) are lower than in clear 3₁₀- or α-helices with the same number of amino acids. The hydrogen bonds formed between the backbone amides and Sγ in Intermediate 1 and Intermediate 2 are stronger than those found in the 3₁₀- and α-helices of the same length (Table 6.2). On the other hand, the macro-dipoles of the intermediates are smaller than those of the 3₁₀- or α-helices (Table 6.3). Here again, the terminal hydrogen bonds have the major influence on the pKa perturbation.
3.2 Study in aqueous solution

Because helices in biological systems (e.g. proteins) are for the most part exposed to solvent, a solvent model is used to test the validity of the gas-phase observations. To this end, the S-α- and S-3₁₀-helices were optimised in water with fixed φ and ψ angles (vide supra), using a PCM model30. All results obtained in gas phase remain valid for the solvated helices. For helices of the same length, the macro-dipoles of the solvated 3₁₀-helices are larger than those of the solvated α-helices, while the NPA charges of the N-terminal Sγ atom in the α-helices are higher (less negative) than those in the 3₁₀-helices (Table 6.5).
Structure | Dipole (D) | NPA Sγ (a.u.) | pKa | Additional pKa decrease per amino acid
Cysteine | 6.45 | -0.824 | 8.3 |
3₁₀-helices
S_3₁₀_2 | 14.33 | -0.791 | 8.06 | -0.64
S_3₁₀_3 | 14.38 | -0.786 | 7.92 | -0.14
S_3₁₀_4 | 17.97 | -0.782 | 7.80 | -0.12
α-helices
S_α_2 | 11.64 | -0.763 | 7.22 | -1.48
S_α_3 | 12.77 | -0.758 | 7.04 | -0.18
S_α_4 | 15.32 | -0.753 | 6.90 | -0.14
S_α_6 | 26.65 | -0.754 | 6.93 | +0.03

Table 6.5: Helix macro-dipole, NPA charge and pKa of the N-terminal Sγ of cysteine obtained in aqueous solution in 3₁₀-helices and in α-helices.
As was done in gas phase, a linear relationship between the NPA charge on the sulfur atoms of the substituted thiolates and their experimental pKa is obtained in aqueous solution (Fig. 6.5). When calibrating the NPA charge of Sγ of the cysteine in the solvated helices, a decrease of the pKa with the helical length is found. As was the case in gas phase, this decrease is accompanied by an increase of the backbone amide-Sγ hydrogen bond strength, as judged from frequency calculations (Table 6.6).
Each cell lists the value for the NH(N-terminus)---S bond followed by the value for the NH1---S bond.

Helix | Residues | l (Å) | a (°) | ν (cm-1)
S-3₁₀ | 2AA | 3.19 / 3.70 | 110 / 104 | 3310 / 3504
S-3₁₀ | 3AA | 3.18 / 3.68 | 111 / 105 | 3353 / 3464
S-3₁₀ | 4AA | 3.17 / 3.69 | 108 / 110 | 3316 / 3503
S-α | 2AA | 3.13 / 3.36 | 112 / 131 | 3318 / 3427
S-α | 3AA | 3.13 / 3.34 | 112 / 131 | 3314 / 3404
S-α | 4AA | 3.14 / 3.33 | 112 / 131 | 3313 / 3406
S-α | 6AA | 3.17 / 3.33 | 110 / 129 | 3315 / 3372

Table 6.6: Characteristics of the hydrogen bonds formed between Sγ of Cys10 and the helix backbone in aqueous solution. Length (l) (distance measured in Å from H-donor to H-acceptor) and donor-hydrogen-acceptor angle a (°) of the hydrogen bonds formed between Sγ of the cysteine and the helix backbone in aqueous solution. Stretching frequencies ν (cm-1) of the bond between the hydrogen donor and the hydrogen.
The extra decrease of the pKa for every additional residue diminishes, which means that once the hydrogen bonds are formed (when residue 2 is present), the following residues, which strengthen the hydrogen bonds, have only a subordinate effect on the pKa perturbation. The very small pKa increase of 0.03 units found in the presence of a 6-residue α-helix is likely due to reaching the plateau pKa value, rather than representing a relevant effect. For helices of the same length, the dipoles are larger in water than in gas phase. In gas phase, the pKa value of 4.3 obtained for a cysteine residue at the N-terminus of an α-helix with 8 residues is too low in comparison with literature values1,5,6. In aqueous solution, on the other hand, a decrease of the N-terminal cysteine pKa by 1.8 units due to a 6-residue α-helix is found. This is in the range of the pKa decreases of N-terminal cysteines obtained in experimental studies5,6. This more realistic pKa decrease calculated in aqueous solution is consistent with the diminished electrostatic effect of the hydrogen bonds in solvent in comparison to gas phase. Although 3₁₀-helices are very important structural features in protein functioning (e.g. in cyclin-dependent kinase33 and in HIV antibodies34), to the best of our knowledge no role in pKa perturbation has been described yet for 3₁₀-helices. The present theoretical study shows that 3₁₀-helices can indeed lower the pKa, albeit to a lesser degree than α-helices. The calculations in gas phase and in solvent point out that the pKa lowering effect of an α- or 3₁₀-helix on the N-terminal cysteine largely finds its origin in the terminal hydrogen bonds formed with this N-terminal residue. These results strengthen the view of the short-range nature of the helical effect on pKa perturbation and of the significance of the terminal hydrogen bonds herein, as was proposed previously from electrostatic free energy calculations27, mutagenesis studies19 and quantum chemical studies28. There is no reason why the results of the present study on an N-terminal cysteine residue would not be valid for other N- or C-terminal residues, provided that hydrogen bonds with donor or acceptor atoms of the helix can be formed. However, extension of the present results to systems with N- or C-terminal residues other than cysteine needs further confirmation. In the absence of such hydrogen bonds, the macro-dipole effect alone is responsible for charge stabilization at these residues, as has been concluded before for the Cys-His ion pair in papain35.
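The plateau behaviour discussed above can be made explicit by taking the differences between successive pKa values of Table 6.5. The sketch below does this for the helix entries only; the first decrement of each series (relative to isolated cysteine) is left out, because it depends on the reference pKa adopted for cysteine. The dictionary layout and function name are illustrative choices.

```python
# pKa of the N-terminal cysteine in aqueous solution, keyed by the number of
# residues in the helix (values copied from Table 6.5).
PKA_WATER = {
    "S_alpha": {2: 7.22, 3: 7.04, 4: 6.90, 6: 6.93},
    "S_310": {2: 8.06, 3: 7.92, 4: 7.80},
}

def stepwise_changes(pka_by_length):
    """pKa change between successive tabulated helix lengths, rounded to 0.01."""
    lengths = sorted(pka_by_length)
    return {
        (shorter, longer): round(pka_by_length[longer] - pka_by_length[shorter], 2)
        for shorter, longer in zip(lengths, lengths[1:])
    }

if __name__ == "__main__":
    for helix, values in PKA_WATER.items():
        print(helix, stepwise_changes(values))
```

The shrinking magnitude of the successive changes, ending in a value close to zero for the 6-residue α-helix, is what the text refers to as reaching a plateau.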
4. Conclusion
We conclude from this study that the pKa perturbing effect of helices on an N-terminal cysteine largely finds its origin in the terminal hydrogen bonds, which are strengthened with every extra amino acid added to the helix. Each additional residue has a subordinate effect on the pKa, which may lead to a final plateau value of the pKa. Similar trends are found in gas phase and in solvent, but calculations in solvent are found to be necessary to obtain reliable values for the pKa decrease.
References
1. Roos, G., Messens, J., Loverix, S., Wyns, L., Geerlings, P., J. Phys. Chem. B 2004, 108, 17216.
2. Lodi, P. J., Knowles, J. R., Biochemistry 1993, 32, 4338.
3. Sali, D., Fersht, A. R., Bycroft, M., Nature 1988, 335, 6192.
4. Perutz, M. F., Gronenborn, A. M., Clore, G. M., Fogg, J. H., Shih, D. T.-b., J. Mol. Biol. 1985, 183, 491.
5. Schlesinger, P., Westley, J., J. Biol. Chem. 1973, 240, 780.
6. Forman-Kay, J. D., Clore, G. M., Gronenborn, A. M., Biochemistry 1992, 31, 3442.
7. Joshi, H. V., Meier, M. S., J. Am. Chem. Soc. 1996, 118, 12038.
8. Van Duijnen, P. T., Thole, B. T., Hol, W. G., Biophys. Chem. 1979, 9, 273.
9. a. Richardson, J. S., Adv. Protein Chem. 1981, 34, 168. b. Creighton, T. E., Proteins: Structures and Molecular Properties, 2nd ed., Freeman, New York, 1993.
10. Hunter, T., Pines, J., Cell 1994, 79, 573.
11. Hashimoto, Y., Kohri, K., Kaneko, Y., Morisaki, H., Kato, T., Ikeda, K., Nakanishi, M., J. Biol. Chem. 1998, 273, 16544.
12. a. Wu, Y.-D., Zhao, Y.-L., J. Am. Chem. Soc. 2001, 123, 5313. b. Pal, L., Basu, G., Chakrabarti, P., Proteins: Struct., Funct., Genet. 2002, 48, 571.
13. Topol, I. A., Burt, S. K., Deretey, E., Tang, T.-H., Perczel, A., Rashin, A., Csizmadia, I. G., J. Am. Chem. Soc. 2001, 123, 6054.
14. Tran, T. T., Zeng, J., Treutlein, H., Burgess, A. W., J. Am. Chem. Soc. 2002, 124, 5222.
15. Schulz, G. E., Schirmer, R. H., Principles of Protein Structure, Springer-Verlag, New York, 1979.
16. Pauling, L., The Nature of the Chemical Bond, Cornell University Press, Ithaca, New York, 1960.
17. Hol, W. G. J., Van Duijnen, P. T., Berendsen, H. J. C., Nature 1978, 294, 443.
18. Wada, A., Adv. Biophys. 1976, 9, 1.
19. Sancho, J., Serrano, L., Fersht, A. R., Biochemistry 1992, 31, 22.
20. Sheridan, R. P., Levy, R. M., Salemme, F. R., Proc. Natl. Acad. Sci. U. S. A. 1982, 79, 4545.
21. Presnell, S. R., Cohen, F. E., Proc. Natl. Acad. Sci. U. S. A. 1989, 86, 6592.
22. Richardson, J. S., Richardson, D. C., Science 1988, 240, 1648.
23. Shoemaker, K. R., Kim, P. S., Brems, D. N., Marqusee, S., York, E. J., Chiken, I. M., Steward, J. M., Baldwin, R. L., Proc. Natl. Acad. Sci. U. S. A. 1985, 82, 2349.
24. Fairman, R., Shoemaker, K. R., York, E. J., Steward, J. M., Baldwin, R. L., Proteins: Struct., Funct., Genet. 1989, 5, 1.
25. Sali, D., Bycroft, M., Fersht, A. R., Nature 1988, 335, 740.
26. Nicholson, H., Becktel, W. J., Matthews, B. W., Nature 1988, 336, 651.
27. Åqvist, J., Luecke, H., Quiocho, F. A., Warshel, A., Proc. Natl. Acad. Sci. U. S. A. 1991, 88, 2026.
28. Rullmann, J. A., Bellido, M. N., van Duijnen, P. T., J. Mol. Biol. 1989, 206, 101.
29. a. Boys, S. F., Bernardi, F., Mol. Phys. 1970, 19, 553. b. Simon, S., Duran, M., Dannenberg, J. J., J. Chem. Phys. 1996, 105, 11024.
30. a. Miertus, S., Scrocco, E., Tomasi, J., Chem. Phys. 1981, 55, 117. b. Mennucci, B., Tomasi, J., J. Chem. Phys. 1997, 106, 5151. c. Cammi, R., Mennucci, B., Tomasi, J., J. Phys. Chem. A 2000, 104, 5631. d. Cossi, M., Scalmani, G., Rega, N., Barone, V., J. Chem. Phys. 2002, 117, 43.
31. Gaussian 03, Revision A.1, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A. Pople, Gaussian, Inc., Pittsburgh PA, 2003.
32. Gross, K. C., Seybold, P. G., Peralta-Inga, Z., Murray, J. S., Politzer, P., J. Org. Chem. 2001, 66, 6919.
33. Hashimoto, Y., Kohri, K., Kaneko, Y., Morisaki, H., Kato, T., Ikeda, K., Nakanishi, M., J. Biol. Chem. 1998, 273, 16544.
34. Biron, Z., Khare, S., Samson, A. O., Hayek, Y., Naider, F., Anglister, J., Biochemistry 2002, 41, 12687.
35. van Duijnen, P. T., Thole, B. T., Broer, R., Nieuwpoort, W. C., Int. J. Quant. Chem. 1980, 17, 651.
Chapter VII The activation of electrophile, nucleophile and leaving group during the reaction catalysed by pI258 arsenate reductase
I have an old belief that a good observer really means a good theorist.
(Charles Darwin)
Chapter VII: Ground state activation by pI258 ArsC

The reduction of arsenate to arsenite by pI258 ArsC combines a nucleophilic displacement reaction with a unique intra-molecular disulfide cascade. Within this reaction mechanism the oxidative equivalents are translocated from the active site to the surface of ArsC. The first reaction step in the reduction of arsenate by pI258 ArsC consists of a nucleophilic displacement reaction carried out by Cys10 on di-anionic arsenate. The second step involves the nucleophilic attack of Cys82 on the Cys10-arseno intermediate formed during the first reaction step. The onset of the second step is studied here using quantum chemical calculations in a DFT context. The optimised geometry of the Cys10-arseno adduct in the ArsC catalytic site (sequence motif: C10-T11-G12-N13-S14-C15-R16-S17) forms the starting point for all subsequent calculations. Thermodynamic data and an HSAB reactivity analysis show a preferential nucleophilic attack on a mono-anionic Cys10-arseno adduct, which is stabilized by Ser17. The P-loop active site of pI258 ArsC activates first a hydroxyl and subsequently arsenite as leaving group, as is clear from an increase in their calculated nucleofugality upon going from gas phase to solvent to the enzymatic environment. Further, the enzymatic environment stabilizes the thiolate form of the nucleophile Cys82 by 3.3 pH units via the presence of the 8-residue helix flanked by Cys82 and Cys89 (redox helix) and via a hydrogen bond with Thr11. The importance of Thr11 in the pKa regulation of Cys82 was confirmed by the observed decrease in the kcat of the Thr11Ala mutant compared to wild-type ArsC. During the final reaction step, Cys89 is activated as a nucleophile by structural alterations of the redox helix, which functions as a pKa control switch of Cys89 and is necessary to expose a Cys82-Cys89 disulfide.
1. Introduction
Following the study of the first reaction step1 (Chapter IV), this chapter addresses the second reaction step in the reduction of arsenate to arsenite by pI258 arsenate reductase (ArsC) and the looping-out of the redox helix (Fig. 2.3, Chapter II), by combining quantum chemical calculations in a computational and conceptual Density Functional Theory (DFT)2,3 context with kinetic studies. Knowledge of the correct protonation state of the covalent Cys10-arseno adduct, the product of the first reaction step (Fig. 2.3, Chapter II), is crucial for understanding the second reaction step of the ArsC mechanism. This adduct is expected to be di-anionic, since energetic arguments and a reactivity analysis have pointed out that during the first reaction step the nucleophilic attack of Cys10 occurs on di-anionic arsenate1,4. However, in view of the acid dissociation constants of arsenate (pKa = 2.2, 6.76, 11.53), a mono-anionic as well as a di-anionic adduct are likely to exist at the pH of maximum activity (pH = 8.0)5 of pI258 ArsC. In this study, arguments are given to discern between the two possible protonation states of the enzyme-arseno adduct. The leaving group capacity of a functional group is described by the nucleofugality7. We have used the nucleofugality to determine the effect of the ArsC environment on both the hydroxyl and the arsenite leaving group during the first and second catalysed reaction step, respectively (Fig. 2.3, Chapter II, Step 1 and 2). Besides leaving group stabilization, leaving group expulsion can also be aided by weakening of the scissile bond in the reactant state. Among others, the Wiberg bond order8 provides a chemical interpretation of covalent bonds. Here, this index is used to gauge the effect of ArsC on the covalency of the scissile S-As bond. As mentioned in previous chapters (II and IV), at the pH optimum of enzymatic catalysis of pI258 ArsC (pH = 8.0)5, a substantial amount of free cysteine (pKa = 8.3) is present in the thiol form, which is a far poorer nucleophile than the thiolate form9. We envisage two means by which ArsC can lower the pKa of the Cys82 nucleophile and thereby favour the nucleophilic thiolate form. Firstly, Cys82 accepts a hydrogen bond from the hydroxyl group of Thr11 (3.09 Å, 1LK0)10. Secondly, Cys82 is positioned at the N-terminus of a short α-helix, the so-called redox helix, extending from amino acid 82 to 89 (ref. 10,11). High-level quantum chemical calculations are employed to estimate the effects of both Thr11 and the redox helix macro-dipole on the pKa of Cys82, since the experimental determination of the pKa of the Cys10 thiol group is not straightforward (cfr. Chapter II). When applicable, these results are discussed in the light of experimental data, among which novel kinetic data for the ArsC Thr11Ala mutant. After the second reaction step (Fig. 2.3, Chapter II), when the Cys10-Cys82 disulfide intermediate has been formed (S-S distance of 2.03 Å, 1LK0)10, the conformation of the redox helix has changed into a transitional conformation between α-helix and loop (Fig. 2.3, Chapter II)10. The subsequent third reaction step (Fig. 2.3, Chapter II, Step 3) consists of the nucleophilic attack of Cys89 on the Cys10-Cys82 disulfide to form the Cys82-Cys89 disulfide and to regenerate Cys1010,11.
Cys82 and Cys89 have to execute their nucleophilic attacks in a successive way. Since Cys82 performs the first nucleophilic attack, the pKa of Cys89 has to be maintained high, ensuring a protonated, non-active nucleophile. After the second reaction step, Cys89 has to be activated in order to enable the progress of the enzymatic reaction cascade. This activation mechanism is investigated by a pKa analysis of Cys89 during the successive reaction steps.

pI258 ArsC is a relatively small enzyme (131 amino acids)11, but it still contains too many electrons for high-level quantum chemical calculations. The enzyme therefore needs to be described by an adequate model that combines accuracy with computational tractability. The entire enzyme cannot be taken into account for technical reasons (lack of parameters for the As atom in force fields and semi-empirical methods). Since all essential intermediates in the reaction mechanism of ArsC have been visualized with X-ray crystallography supplemented with NMR10, there is a unique opportunity to use reliable geometries, in which all the important interactions with the system of interest are taken into account, as the starting point for computational studies.
2.1 Electrophile
The model system for the quantum mechanical calculations was constructed starting from the X-ray structure of the Cys15Ala mutant of ArsC complexed with arsenite (PDB 1LJU)10. As in the study of the first reaction step1, the computational model for the second step included the catalytic sequence motif Cys10-X-X-Asn13-X-X-Arg16-Ser17. Amino acids 10 and 17 were terminated with NHCOH and CONH2, respectively. The side chains of residues 11, 12, 14 and 15 were terminated at C, since they are positioned at the periphery of the substrate-binding loop where no interactions with the substrate occur. The model also contains four well-positioned water molecules present in the active site of 1LJU10, one of which can be considered as the leaving group of arsenate during the first reaction step (Fig. 2.3, Chapter II, Step 1).

The structure was fully optimized using a two-layer QM/QM ONIOM12-15 approach. The most relevant parts, namely the covalent Cys10-(mono- or di-anionic)-arseno adduct (Cys10S-HAsO3- or Cys10S-AsO32-) and the leaving water molecule (H2O (L)), form the inner layer and are treated at the B3LYP/6-311++G** level. The remaining part of the system, the ligand-binding loop, constitutes the outer layer and is described by the computationally less demanding HF/3-21G method. The resulting model is called wild type (WT) throughout. The Ser17Ala, Arg16Ala and Asn13Ala mutants were built in silico1,16 starting from the coordinates of the WT model. They were fully optimized using the same approach as described above.

An electrostatic model in which the optimized WT and mutant structures are represented as ChelpG point charges17,18 was used as an approximation for the enzymatic environment of ArsC when the properties of the Cys10-arseno adduct were calculated. The ChelpG charges were calculated at the B3LYP/6-31G** level, which is a satisfactory level of calculation since these charges have to represent the electrostatic effect of the ArsC active site18,19.
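For orientation, the two-layer ONIOM extrapolation is commonly written as shown below. This is the standard general expression, not a formula quoted from this chapter; the labels "model" (inner layer) and "real" (complete system) are the usual ONIOM terminology and are assumed here to correspond to the inner layer and to the full active-site model described above, with "high" standing for B3LYP/6-311++G** and "low" for HF/3-21G.

```latex
E^{\mathrm{ONIOM2}}
  = E^{\mathrm{high}}_{\mathrm{model}}
  + E^{\mathrm{low}}_{\mathrm{real}}
  - E^{\mathrm{low}}_{\mathrm{model}}
```

The low-level energy of the inner layer is subtracted so that this region is counted only once, at the high level.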
The reactivity of Cys82 towards the Cys10-arseno adduct is assessed by the difference in local softness (HSAB principle)3 of the interacting parts, Δs(r) = |s+(As) - s-(S)| (eq. 3.94, Chapter III). Free SCH3-AsO32-/SCH3-HAsO3- and CH3S- were optimized at the B3LYP/6-311++G** level in gas phase and in solvent (water or ε = 20.7) using the PCM20 model. A dielectric constant (ε) of 20.7 maximizes the agreement between calculated and measured pKa values of surface groups21,22 and is therefore appropriate to represent the solvent-exposed11 active site of ArsC. All subsequent single point calculations were performed at the B3LYP/6-311++G** level, which includes diffuse functions, since the As atom is involved.
2.2 Nucleophile
The enzymatic environment of the Cys82 nucleophile was built using the coordinates from free wild type ArsC (PDB 1LJL)10. Thr11 was modelled by HOCH3, and the 82-89 α-helix was taken as a whole and terminated on both sides with CONH2. For the Cys89 nucleophile, the coordinates of the partially unfolded 82-89 helix were taken from the Cys89Leu mutant (A chain in PDB 1LK0)10. In both structures hydrogen atoms were placed and optimized at the B3LYP/6-31G level. The side chains of Cys89 and Cys82 or the Cys10-Cys82 disulfide were subsequently optimised at the B3LYP/6-31G* level. To explore and quantify the effects of Thr11 and the redox helix on the basicities of Cys82 and Cys89, the linear relationship between the NPA charge and the pKa, presented in Chapter VI23, was used. NPA charges were calculated at the B3LYP/6-31+G** level in gas phase.
2.3 Leaving group activation

The nucleofugalities of the hydroxyl group (OH-, with q = 0) and the arsenite group (HAsO32-, with q = -1) were assessed at the B3LYP/6-311++G** level in gas phase and in solvent (water or ε = 20.7), using a PCM20 model, and subsequently in the ArsC enzymatic environment, using a point charge model (vide supra). The working equations were

$$\Delta E_{\mathrm{nucleofuge}} = \frac{(IE - 3\,EA)^{2}}{8\,(IE - EA)}$$ (eq. 3.91, Chapter III)

combined with

$$\lambda = e^{-\beta\,\Delta E_{\mathrm{nucleofuge}}}, \qquad \beta = 1.841\ \mathrm{eV}^{-1}$$ (eq. 3.91, Chapter III).

All calculations were performed using the GAUSSIAN 03 package24.
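The working equations of this section can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used with GAUSSIAN 03: the function names, the kcal/mol-to-eV conversion factor and the reading of the nucleofugality as exp(-β ΔE_nucleofuge) with β = 1.841 eV^-1 are assumptions made here. The exponential form is consistent with the paired values reported in Table 7.4 (for example 16.4 kcal/mol and 0.270).

```python
import math

KCAL_PER_EV = 23.0605  # kcal/mol per eV (assumed conversion factor)
BETA = 1.841           # eV^-1, the value quoted in the text


def delta_e_nucleofuge(ie_ev, ea_ev):
    """Leaving group energy (IE - 3 EA)^2 / (8 (IE - EA)); IE and EA in eV."""
    return (ie_ev - 3.0 * ea_ev) ** 2 / (8.0 * (ie_ev - ea_ev))


def nucleofugality(delta_e_kcal):
    """Nucleofugality exp(-beta * dE) for a leaving group energy in kcal/mol."""
    return math.exp(-BETA * delta_e_kcal / KCAL_PER_EV)


# Leaving group energies of the OH-group taken from Table 7.4A (kcal/mol).
for label, energy in [("Gas phase", 16.4), ("Wild Type", 9.3), ("Arg16Ala", 39.4)]:
    print(f"{label:10s} {nucleofugality(energy):.3f}")
```

A lower leaving group energy gives a larger nucleofugality, which is why the wild type environment ranks above both the gas phase and the Arg16Ala mutant for the OH-group.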

3.1 Protonation state of the covalent arseno-enzyme adduct
3.1.1 Structural considerations
As already mentioned in the introduction, the Cys10-arseno adduct, the product of the first reaction step (Fig. 2.3, Chapter II, Step 1), can be either mono- or di-anionic. Structural details (e.g. bond lengths and angles) are often indicative of the particular protonation state of a molecule. In the covalent Cys10-arseno adduct of the X-ray structure 1LJU, the arsenic and oxygen atoms of the arsenite group lie in one plane, with the attacking sulfur atom and the hydroxyl leaving group at the apices of a trigonal bipyramid10. This geometry is reminiscent of the transition state for phosphoryl transfer reactions10. After full optimisation of the ArsC active site with either a mono- (Fig. 7.1) or a di-anionic sulfur-arseno adduct (ONIOM B3LYP/6-311++G**//HF/3-21G), the planar geometry of the -AsO32-/-HAsO3- group has disappeared and the oxygen atoms take up tetrahedral positions around the central As atom, as is the case in optimised (B3LYP/6-311++G**) free CH3S-AsO32-/CH3S-HAsO3- in gas phase and aqueous solution. The S-As bond in the di-anionic adduct is longer than in the mono-anionic adduct by 0.24 Å in gas phase, 0.07 Å in solvent phase, and 0.004 Å in the ArsC active site model (Table 7.1).
Model               l (Å) (S-As) CH3S-AsO32-    l (Å) (S-As) CH3S-HAsO3-
Gas phase           2.55                        2.31
ε = 20.7            2.35                        2.28
Aqueous solution    2.34                        2.27
ArsC (Wild Type)    2.22                        2.22
Asn13Ala            -                           2.23
Ser17Ala            -                           2.22
Arg16Ala            -                           2.23

Table 7.1: Scissile S-As bond length. S-As bond lengths l (Å) in CH3S-AsO32- and CH3S-HAsO3- in gas phase, solution and in the ArsC active site model.
The bond lengths of the mono- and di-anionic adducts in the optimised wild type model are remarkably close to each other and lie in the range of those observed in R773 E. coli arsenate reductase: 2.18 Å/2.22 Å (PDB 1JZW)25 and 1.87 Å/2.21 Å (PDB 1SK1)26. For the latter two structures, the authors assume a mono-anionic sulfur-arseno adduct, in line with a pH of 4.8 in the crystal. However, the S. aureus pI258 ArsC structure was obtained at pH 8.25 (close to the pH of maximum activity), suggesting a di-anionic adduct. No decisive answer concerning the protonation state of the Cys10-As adduct in pI258 ArsC can be given from these structural data. Therefore, we compared the interactions formed between the leaving arsenite group and the active site (Fig. 7.1) for the two protonation states. The mono-anionic adduct forms two fewer hydrogen bonds with the P-loop than its di-anionic counterpart, and the hydrogen bonds in the mono-anionic adduct are in general longer.
Figure 7.1: Stereo view of the optimized (2-layer ONIOM scheme: B3LYP/6-311++G**//HF/3-21G) wild type product structure with Cys10-HAsO3-. The figure was generated by using MacPyMol (Delano Scientific LLC 2005) by Messens, J.
3.1.2 Thermodynamic considerations
Despite the above-mentioned structural observations, the mono-anionic adduct complex is stabilized by 324 kcal/mol (B3LYP/6-31G**) compared to the di-anionic adduct. This stabilization energy is the proton affinity of the di-anionic adduct, which is 7 kcal/mol higher than in aqueous solution. Since the proton affinity is linearly related to the pKa (refs. 1, 16, 27), ArsC increases the Cys10-arseno basicity and drives the Cys10-AsO32-/Cys10-HAsO3- equilibrium toward the mono-anionic form as compared to aqueous solution.

The ArsC active site is readily solvent accessible11. In principle, the protonation of a di-anionic Cys10-arseno adduct by a water molecule could be feasible. An energetic evaluation of reactions A and B (below) in aqueous solution (PCM, B3LYP/6-311++G**) gives an indication of which reaction pathway (with its specific protonation state) is more likely:

A: CH3S- + CH3S-AsO32- + H2O → CH3S- + CH3S-HAsO3- + OH- → CH3S-SCH3 + HAsO32- + OH-
B: CH3S- + CH3S-AsO32- → CH3S-SCH3 + AsO33-

From the calculated reaction energies in aqueous solution, reaction A (+26 kcal/mol), in which the Cys10-AsO32- group is protonated, is more favourable than reaction B (+35 kcal/mol), in which the Cys10-AsO32- group does not become protonated.

3.1.3 Reactivity analysis
Apart from thermodynamics, in which the reactants and products of a given reaction are compared, it is possible to distinguish between reaction pathways by assessing the reactivity between two interacting partners in the reactant state via the HSAB principle28. In this context, the smaller the difference in local softness (calculated at the B3LYP/6-311++G** level) of the interacting partners, the higher the reactivity. In the gas phase, the lower difference in local softness (Δs) favours the nucleophilic attack of CH3S- on CH3S-AsO32- rather than on CH3S-HAsO3- (Table 7.2). In a continuum solvent model representing the ArsC environment, this sequence is reversed and the attack on a mono-anionic adduct is favoured. This result is also obtained when the active site of ArsC is modelled through point charges (see 2.1 Electrophile), implying that the HSAB principle predicts the nucleophilic attack of Cys82 on a mono-anionic Cys10-HAsO3- adduct.

Taken together, thermodynamic and reactivity considerations both argue for the protonation of the di-anionic covalent adduct formed after the first reaction step (Fig. 2.3, Chapter II, Step 1), before the second reaction step (Fig. 2.3, Chapter II, Step 2) occurs.
Model               Δs (au) CH3S-AsO32-    Δs (au) CH3S-HAsO3-
Gas phase           8.167                  8.258
ε = 20.7            4.364                  4.044
Aqueous solution    4.344                  4.041
Wild Type           6.745                  6.144
Asn13Ala            6.736                  6.287
Arg16Ala            6.695                  6.247
Ser17Ala            6.694                  7.139

Table 7.2: Reactivity analysis. Reactivity of Cys82 towards CH3S-AsO32- and CH3S-HAsO3- in gas phase and solvent and in ArsC as measured by the difference in local softness (Δs(r) = |s+(As) - s-(S)|).
The local softness difference between Cys82 and CH3S-HAsO3- is optimal in the presence of the complete active site environment. Removal of the side chains of Asn13, Arg16 or Ser17 results in a larger Δs and hence a lower reactivity (Table 7.2). Ser17 has the strongest influence on the reactivity, which seems reasonable since it is the only residue that interacts directly with the arsenite leaving group. The Ser17 hydroxyl group donates a hydrogen bond to one of the arsenite oxygens and appears to be the main activator of the electrophile at the onset of the second reaction step.
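The HSAB comparison discussed above amounts to picking, for each environment, the adduct with the smaller Δs. The sketch below illustrates this with the Δs values of Table 7.2; the pairing of each value with an environment follows the row order of the table as read here, and the dictionary and function names are hypothetical.

```python
# Delta-s values (au) from Table 7.2: (di-anionic adduct, mono-anionic adduct).
DELTA_S = {
    "Gas phase": (8.167, 8.258),
    "eps = 20.7": (4.364, 4.044),
    "Aqueous solution": (4.344, 4.041),
    "Wild Type": (6.745, 6.144),
    "Asn13Ala": (6.736, 6.287),
    "Arg16Ala": (6.695, 6.247),
    "Ser17Ala": (6.694, 7.139),
}


def preferred_adduct(delta_s_di, delta_s_mono):
    """HSAB criterion: the smaller softness difference marks the more reactive pair."""
    return "di-anionic" if delta_s_di < delta_s_mono else "mono-anionic"


preferences = {env: preferred_adduct(*pair) for env, pair in DELTA_S.items()}

# Increase of delta-s for the mono-anionic adduct when a side chain is removed.
wt_mono = DELTA_S["Wild Type"][1]
mutant_penalty = {
    mutant: round(DELTA_S[mutant][1] - wt_mono, 3)
    for mutant in ("Asn13Ala", "Arg16Ala", "Ser17Ala")
}

for env, choice in preferences.items():
    print(f"{env:17s} -> attack on the {choice} adduct is favoured")
print(mutant_penalty)
```

The largest increase in Δs is found for Ser17Ala, in line with the identification of Ser17 as the main activator of the electrophile.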
3.1.4 Optimized product structure and structural comparison with the ArsC-arsenate complex
Starting from the X-ray structure of ArsC complexed with arsenite (PDB 1LJU)10, a model of the Cys10-arseno adduct of ArsC (i.e. the product of the first reaction step) was optimized using a 2-layer QM/QM ONIOM scheme (B3LYP/6-311++G**//HF/3-21G). Kinetic and 1H-15N heteronuclear single quantum correlation nuclear magnetic resonance (HSQC NMR) spectroscopy experiments5 demonstrated the dynamic character of the active site P-loop in the absence of tetrahedral oxyanions. This large flexibility is also seen in the range of conformations adopted by the P-loop. A configuration change (- flip from -L to conformation) of the Gly12-Asn13 bond is observed in X-ray structures of the thioredoxin-coupled family of arsenate reductases (B. subtilis ArsC and S. aureus pI258 ArsC)11 and is discussed in Chapter VIII29. The dynamic character of the active site P-loop was also shown with NMR for LMW PTPases30 and Cdc25A/B31.

A full relaxation of the Cys10-arseno covalent adduct (product of the first reaction step) is necessary, given the flexibility of the active site and because the starting X-ray structure 1LJU shows the release of arsenite (after the second reaction step has occurred) rather than a true enzyme-arseno covalent adduct. Superposition of the optimised Cys10-arseno structure on 1LJU gives a root mean square deviation (r.m.s.d.) of 0.7 Å for 8 pairs of Cα atoms (PyMol, Delano Scientific LLC, 2006). The overall loop conformation of the ArsC active site is preserved. The main difference with the 1LJU structure is the shift from the direct Arg16-arsenite interaction to an indirect interaction, through water molecules, between Arg16 and the arsenite group of the Cys10-arseno adduct in the optimized product. The dihedral angles of the peptide bonds found in the optimized enzyme-substrate complex of ArsC deviate on average by eight degrees from planarity (with a maximum deviation of 23 degrees and a minimum of 0.3 degrees).
This is within the range of reported experimental statistical data32. In the ArsC-arsenate complex, a Cys10-K+ interaction network (going from SCys10, over OSer17 to NAsn13 and terminated by a potassium ion) is involved in the nucleophilic activation of Cys10 (ref. 1) (Fig. 7.2A). The structural integrity of this network is maintained in the adduct complex, with the exception of the hydrogen bond between Cys10 and Ser17. The interaction between HOSer17 and SCys10 shifts to an interaction between HOSer17 and one of the oxygen atoms of the arsenite group of Cys10-HAsO3- (Fig. 7.2B). In the reactant complex, NH and NH of Arg16 each form a hydrogen bond with the oxygen atoms of arsenate (Fig. 7.2A)1. In the adduct complex, on the other hand, Arg16 forms two hydrogen bonds with water molecules (Fig. 7.2B). In the ArsC-arsenate complex, the hydrogen bonds between the backbone amides of the substrate-binding loop and arsenate play a major role in charge stabilization1. These interactions are not present in the arseno-adduct, with the exception of that with the backbone amide group of serine 14 (Table 7.3). The only direct interactions with the arseno moiety that are present in both the arsenate complex and the adduct complex are those with the structural water molecules (Table 7.3).

In the second reaction step, Cys82 prefers a nucleophilic attack on a mono-anionic Cys10-HAsO3- adduct, with the expulsion of a HAsO32- leaving group (Fig. 2.3, Chapter II, Step 2). In view of the pKa values of arsenite (9.2, 14.2, 19.2), this group will be protonated. A good candidate proton donor is the structural water molecule accepting two hydrogen bonds from Arg16.
Figure 7.2: Structural comparison between A. the ArsC-arsenate and B. ArsC-Cys10-HAsO3- complex.
Donor--Acceptor       l (Å) (donor-acceptor)    a (°)
HOH(1)--O(3)          2.56                      174
N11H--O(3)            2.79                      160
HOH(3)--O(1)          2.65                      174
OH(Ser17)--O(1)       2.65                      154
HOH(4)--O(2)(H)       2.87                      173

Table 7.3: Interactions with the leaving group -HAsO3- in the wild type product structure. l gives the distance between donor and acceptor and a gives the donor-proton-acceptor angle. NxH stands for the backbone amide group of residue x.
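The two geometric descriptors listed in Table 7.3, the donor-acceptor distance l and the donor-proton-acceptor angle a, can be obtained from Cartesian coordinates as sketched below. The coordinates in the example are invented for illustration only and are not taken from the optimized structure; the function names are hypothetical.

```python
import math


def donor_acceptor_distance(donor, acceptor):
    """Distance l between donor and acceptor atoms (same unit as the input)."""
    return math.dist(donor, acceptor)


def donor_proton_acceptor_angle(donor, proton, acceptor):
    """Angle a (degrees) at the proton between the donor and acceptor atoms."""
    v1 = [d - p for d, p in zip(donor, proton)]
    v2 = [a - p for a, p in zip(acceptor, proton)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_a = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


# Hypothetical, perfectly linear O-H...O arrangement (coordinates in angstrom).
donor, proton, acceptor = (0.0, 0.0, 0.0), (0.98, 0.0, 0.0), (2.65, 0.0, 0.0)
print(donor_acceptor_distance(donor, acceptor))
print(donor_proton_acceptor_angle(donor, proton, acceptor))
```

Angles close to 180 degrees combined with short donor-acceptor distances, as in most rows of Table 7.3, indicate strong and nearly linear hydrogen bonds.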
3.2 Leaving group activation

One of the ways to enhance the reaction rate of nucleophilic displacement reactions is by facilitating the expulsion of the leaving group. Generally, this can be accomplished by general acid catalysis, in which the leaving group is protonated by an acidic group on the enzyme. However, in ArsC no general acid candidate is found in the vicinity of the adduct. Alternatively, stabilization of the leaving group and/or weakening of the scissile bond can aid leaving group expulsion. Both these features can be quantified via theoretical methods. Stabilization of the leaving group can be computed by the nucleofugality7, a group property independent of the bonding partner or the bond strength. Weakening of the scissile bond before cleavage can be quantified via the bond order. Both quantities are calculated at the B3LYP/6-311++G** level.
3.2.1 Nucleofugality
During the first reaction step catalysed by ArsC, a hydroxyl group is released from arsenate (Fig. 2.3, Chapter II, Step 1). A crucial hydrogen bond between the leaving OH-group and Arg16NH is observed (2.82 Å). The nucleofugality of an OH-group in gas phase is notably lower than in the wild type enzyme (Table 7.4), indicating that the enzymatic environment activates the leaving OH-group. In comparison to gas phase, OH- is much more stabilized in aqueous solution. Since the nucleofugality of the OH-group in the Arg16Ala mutant decreases significantly in comparison to the wild type enzyme (Table 7.4), Arg16 can be identified as a leaving group activator. This is in agreement with previous work (Chapter IV)1, in which Arg16 was found to lengthen the As-OH bond.

The wild type enzyme increases the leaving group ability of HAsO32- in comparison to gas phase and (aqueous) solution (Table 7.4B). This is accomplished through a direct interaction with Ser17 (2.65 Å) and through indirect interactions with both Asn13 (via Ser17, Ser17O-Asn13N: 2.81 Å) and, in particular, Arg16 (via two water molecules: Arg16N-H2O(1): 2.68 Å; Arg16N-H2O(3): 2.66 Å) (Fig. 7.2): the nucleofugality of HAsO32- in Arg16Ala ArsC is even lower than in the gas phase. The relative effects of the mutations on the stability of the leaving group are in accordance with the effects found for these mutations on the rate constant kcat (ref. 10). For the Ser17Ala and Asn13Ala mutants, kcat drops by a factor of 5 and 7, respectively (ref. 10). The much larger drop in nucleofugality in the absence of Arg16 is in accordance with the observed lack of activity of the Arg16Lys mutant.

Nucleofugality is a characteristic of the leaving group itself, independent of bond cleavage. In the next paragraph we discuss leaving group activation from the perspective of the assisted S-As bond cleavage.

A. OH-
Model        ΔEnucleofuge (kcal/mol)    λ
Gas Phase    16.4                       0.270
Water        -4.0                       1.381
Wild Type    9.3                        0.478
Arg16Ala     39.4                       0.043

B. HAsO32-
Model        ΔEnucleofuge (kcal/mol)    λ
Gas Phase    69.3                       0.004
ε = 20.7     28.2                       0.105
Water        40.8                       0.034
Wild Type    11.6                       0.396
Ser17Ala     17.8                       0.240
Asn13Ala     18.2                       0.234
Arg16Ala     44.6                       0.028

Table 7.4: Leaving group capacity of OH- and HAsO32-. A. Leaving group energies (ΔEnucleofuge) (kcal/mol) and nucleofugality (λ) of the OH-group in gas phase, solvent and in the ArsC active site model. B. Leaving group energies (ΔEnucleofuge) (kcal/mol) and nucleofugality (λ) of HAsO32- in gas phase, solvent and in the ArsC active site model.
3.2.2 Strength of the scissile S-As bond
We have compared the strength of the S-As bond in CH3S-HAsO3- in gas phase, aqueous solution and in the ArsC environment by calculating the Wiberg bond order (WBO)8: the higher the WBO, the larger the covalency and the stronger the bond (Table 7.5). In general, the S-As bond is rather weak: the bond strength is lower than that of a true single covalent bond, which gives a WBO of 1 (ref. 8). The WBO of the S-As bond in CH3S-HAsO3- is larger in ArsC (0.87) than in gas phase (0.72) or aqueous solution (0.80) (Table 7.5), but can still be considered rather low. In parallel, the SCys10-As bond in SCys10-HAsO3- in the optimized wild type product structure is shorter than the S-As bonds in free CH3S-HAsO3- in gas phase or in solvent (Table 7.1). This is somewhat surprising, since one of the means to initiate enzyme-catalyzed cleavage of a covalent bond is to diminish the strength of the scissile bond in the reactant state. Given the intrinsically weak character of such S-As bonds, ArsC-induced bond weakening may not be necessary to initiate enzymatic bond cleavage. The SCys10-As bond lengths in Cys10-HAsO3- in the optimized Ser17Ala, Arg16Ala and Asn13Ala mutants are respectively 0.003 Å, 0.011 Å and 0.001 Å larger than in the optimized wild type structure, which does not represent a significant change (Table 7.1). The corresponding decrease of the WBO in the optimized mutant structures is also minimal (Table 7.5).
CH3S-HAsO3-         WBO
Gas phase           0.721
ε = 20.7            0.794
Aqueous solution    0.799
Wild Type           0.872
Asn13Ala            0.866
Ser17Ala            0.850
Arg16Ala            0.849

Table 7.5: S-As bond strength. Wiberg Bond Order (WBO) of the S-As bond in CH3S-HAsO3- in gas phase, solvent, and in the ArsC active site model.
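For readers unfamiliar with the index, the Wiberg bond order between atoms A and B is the sum of the squared density matrix elements linking the basis functions on A with those on B, with the density matrix expressed in an orthonormal basis. The sketch below is a toy illustration under that assumption and is unrelated to the actual B3LYP calculations: the function name and the two-orbital example are hypothetical. It reproduces the reference value of 1 for an ideal single bond.

```python
def wiberg_bond_order(density, basis_on_a, basis_on_b):
    """Sum of squared density matrix elements between the basis functions of two atoms.

    `density` is a square matrix (list of lists) in an orthonormal basis;
    `basis_on_a` and `basis_on_b` hold the basis function indices of each atom.
    """
    return sum(density[mu][nu] ** 2 for mu in basis_on_a for nu in basis_on_b)


# Toy example: two atoms with one orthonormal orbital each and a doubly
# occupied bonding orbital with coefficients (1/sqrt(2), 1/sqrt(2)).
coefficients = [2 ** -0.5, 2 ** -0.5]
density = [[2.0 * ci * cj for cj in coefficients] for ci in coefficients]

wbo = wiberg_bond_order(density, [0], [1])
print(round(wbo, 3))
```

Values below 1, such as those in Table 7.5, therefore point to a bond with less than full single-bond covalency.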
3.3 Activation of Cys82

At the optimum pH for enzymatic catalysis by S. aureus ArsC (pH = 8.0)5, a substantial amount of free cysteine (pKa = 8.3) is present in the thiol form, which is a much weaker nucleophile than the thiolate form9. The enzyme could enhance the second reaction step (Fig. 2.3, Chapter II, Step 2) either by deprotonating Cys82 in the transition state (general base catalysis) or by preferentially populating the thiolate form of Cys82 in the reactant state (nucleophilic catalysis). Thr11 is the only residue that interacts directly, via a hydrogen bond, with Cys82 (ref. 10). In addition, Cys82 is the N-terminal residue of a short α-helix spanning residues 82 to 89 (the redox helix, Fig. 2.3, Chapter II). Since neither Thr11 nor the redox helix can function as a general base, general base catalysis can be excluded.

Hydrogen bonds and helices are known to have a pKa lowering effect on the acceptor molecule and the N-terminal residue, respectively (refs. 1, 23, 33). The pKa of a thiolate compound can be effectively estimated via the NPA (natural population analysis) charge on the S atom23,34 of the nucleophilic thiolate. The calibration curve between the NPA charge on the sulfur atoms of a series of substituted thiolates and the experimental pKa, presented in Chapter VI23, was used here to translate changes in NPA charge into changes in the acid dissociation constant. Accordingly, the HOThr11 --- SCys82 hydrogen bond lowers the pKa of Cys82 by 2.2 units with respect to free cysteine (pKa = 8.3) (Table 7.6). Indeed, the Thr11Ala mutant displays an overall kcat value of 90 min-1 (ref. 35), which is lower than the kcat for wild type ArsC (215 min-1)5. Thr11 interacts only with the catalytic residue Cys82, which is the nucleophile in the second reaction step. We may consequently expect the largest effect of the Thr11Ala mutation on the second reaction step. The presence of the redox helix (structure from 1LJL)11 provides an additional pKa decrease of 1.1 units.
This value may well be an underestimate, since the gas phase model of the helix includes two negatively charged Asp residues (Asp84 and Asp86) whose electrostatic effect is sure to be dampened in aqueous solution. All in all, the accumulated effects of Thr11 and the redox helix largely favour the thiolate form of Cys82 in the reactant state at pH 8.0, thereby contributing to nucleophilic catalysis in ArsC.
Model                                                  NPA (au)    ΔpKa
SCys82 + Thr11                                         -0.700      -2.2
SCys82 + α-helix (PDB: 1LJL)                           -0.655      -1.1
SCys89 + α-helix (PDB: 1LJL)                           -0.752      +2.7
SCys89 + partially unfolded α-helix (PDB: 1LK0)        -0.586      -3.4

Table 7.6: Activation of the nucleophilic cysteines. NPA charge (au) and ΔpKa of Cys82 and Cys89 with respect to free cysteine (pKa = 8.3) in the presence of several structural elements of ArsC.
Cys89, at the C-terminus of the redox helix, functions as a nucleophile in the third reaction step (Fig. 2.3, Chapter II, Step 3), in which a Cys82-Cys89 disulfide is formed and Cys10 is regenerated for a subsequent catalytic cycle. To prevent Cys89 from performing an unwanted nucleophilic attack prior to the third reaction step, it has to be kept in the inactive thiol form until the Cys10-Cys82 disulfide is formed. The macro-dipole of the redox helix raises the Cys89 pKa by 2.7 units with respect to free cysteine (Table 7.6). After the formation of the Cys10-Cys82 disulfide bridge in the second reaction step, the 82-89 redox helix partially unfolds (PDB 1LK0)10 (Fig. 2.3, Chapter II). The resulting absence of a helix macro-dipole lowers the Cys89 pKa by 3.4 units compared to the pKa of free cysteine (Table 7.6), generating the nucleophilic thiolate form. Though these values may be an overestimate because of the presence of the Asp residues in the helix model (vide supra), the successive increase and decrease of the Cys89 pKa is clear. Accordingly, we propose that the transition of the redox helix to an intermediate structure between helix and loop during the second reaction step provokes the third reaction step. Hence, ArsC provides a logical sequence of reaction steps in its disulfide cascade.
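To make the consequence of these pKa shifts concrete, the fraction of cysteine present as thiolate at a given pH follows from the Henderson-Hasselbalch relation. The sketch below applies it at pH 8.0 to the shifts of Table 7.6; treating each shift as acting on its own on the free cysteine pKa of 8.3 is a simplifying assumption made here, and the function and variable names are hypothetical.

```python
FREE_CYS_PKA = 8.3  # pKa of free cysteine, as quoted in the text


def thiolate_fraction(pka, ph=8.0):
    """Fraction of deprotonated thiol (thiolate) at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa shifts relative to free cysteine, taken from Table 7.6.
shifts = {
    "free cysteine": 0.0,
    "Cys82 + Thr11": -2.2,
    "Cys89 + helix": +2.7,
    "Cys89 + partially unfolded helix": -3.4,
}

fractions = {
    label: thiolate_fraction(FREE_CYS_PKA + shift) for label, shift in shifts.items()
}

for label, fraction in fractions.items():
    print(f"{label:34s} {100.0 * fraction:6.2f} % thiolate")
```

Under these assumptions roughly one third of free cysteine is deprotonated at pH 8.0, whereas Cys82 with the Thr11 hydrogen bond is almost fully in the thiolate form and Cys89 on the intact helix is almost fully protonated.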
4. Conclusion
Both the HSAB reactivity analysis and the calculated thermodynamics point to a mono-anionic Cys10-arseno adduct in ArsC prior to the nucleophilic attack by Cys82. The HSAB analysis indicates Ser17 as the major activator of the electrophilic Cys10-arseno adduct. Calculation of the nucleofugality indicates that the enzyme increases the leaving group capacity of OH- (first reaction step) and of HAsO32- (second reaction step). In addition, Thr11 and the Cys82-Cys89 redox helix activate Cys82 by decreasing its pKa, in agreement with the lowered kcat for the Thr11Ala mutant. Prior to the third reaction step, Cys89 is kept in the non-active, high-pKa form by the presence of the Cys82-Cys89 redox helix. This helix partially unfolds when the Cys10-Cys82 disulfide is formed, favouring the thiolate form of Cys89 and enabling the third reaction step to occur.
References
1. Roos, G., Messens, J., Loverix, S., Wyns, L., Geerlings, P., J. Phys. Chem. B 2004, 108, 17216.
2. a. Parr, R. G., Yang, W., Density-Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, 1998. b. Koch, C. W., Holthausen, M. C., A Chemist's Guide to Density Functional Theory, Second Edition, Wiley VCH, Weinheim, Germany, 2001.
3. a. Parr, R. G., Yang, W., Ann. Rev. Phys. Chem. 1995, 46, 701. b. Chermette, H., J. Comp. Chem. 1999, 20, 129. c. Geerlings, P., De Proft, F., Int. J. Quant. Chem. 2000, 80, 227. d. De Proft, F., Geerlings, P., Chem. Rev. 2001, 101, 1451. e. Geerlings, P., De Proft, F., Langenaeker, W., Chem. Rev. 2003, 103, 1793.
4. Roos, G., Loverix, S., De Proft, F., Wyns, L., Geerlings, P., J. Phys. Chem. A 2003, 107, 6828.
5. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
6. Fersht, A., Enzyme Structure and Mechanism, W. H. Freeman and Company, New York, 1984.
7. a. Ayers, P. W., Anderson, J. S. M., Rodriguez, J. I., Jawed, Z., Phys. Chem. Chem. Phys. 2005, 7, 1918. b. Ayers, P. W., Anderson, J. S. M., Bartolotti, J. L., Int. J. Quant. Chem. 2005, 101, 520.
8. Wiberg, K. B., Tetrahedron 1968, 24, 1083.
9. Dantzman, C. L., Kiessling, L. L., J. Am. Chem. Soc. 1997, 118, 11715.
10. Messens, J., Martins, J. C., Van Belle, K., Brosens, E., Desmyter, A., De Gieter, M., Wieruszeski, J-M., Willem, R., Wyns, L., Zegers, I., Proc. Natl. Acad. Sci. USA 2002, 99, 8506.
11. Zegers, I., Martins, J. C., Willem, R., Wyns, L., Messens, J., Nat. Struct. Biol. 2001, 8, 843.
12. Svensson, M., Humbel, S., Froese, R. D. J., Sieber, S., Morokuma, K., J. Phys. Chem. 1996, 100, 19357.
13. Dapprich, S., Komáromi, I., Byun, K. S., Morokuma, K., Frisch, M. J., J. Mol. Struct. (Theochem) 1999, 462, 1.
14. a. Torrent, M., Vreven, T., Musaev, G., Morokuma, K., Farkas, Ö., Schlegel, H. B., J. Am. Chem. Soc. 2002, 124, 192. b. Vreven, T., Morokuma, K., J. Phys. Chem. A 2002, 106, 6167.
15. Vreven, T., Morokuma, K., Farkas, Ö., Schlegel, B. J., Frisch, M. J., J. Comp. Chem. 2003, 24, 760.
16. a. Mignon, P., Steyaert, J., Loris, R., Geerlings, P., Loverix, S., J. Biol. Chem. 2002, 277, 36770. b. Mignon, P., Loverix, S., Steyaert, J., Geerlings, P., Int. J. Quant. Chem. 2004, 99, 53. c. Versees, W., Loverix, S., Vandemeulebroeke, A., Geerlings, P., Steyaert, J., J. Mol. Biol. 2004, 338, 1.
17. a. Breneman, C. M., Wiberg, K. B., J. Comp. Chem. 1990, 11, 361. b. Hayes, D. M., Kollman, P. A., J. Am. Chem. Soc. 1976, 98, 3335.
18. Sigfridsson, E., Ryde, U., J. Comp. Chem. 1998, 19, 377.
19. Baeten, A., Maes, D., Geerlings, P., J. Theoret. Biol. 1998, 195, 27.
20. a. Miertus, S., Scrocco, E., Tomasi, J., Chem. Phys. 1981, 55, 117. b. Mennucci, B., Tomasi, J., J. Chem. Phys. 1997, 106, 5151. c. Cammi, R., Mennucci, B., Tomasi, J., J. Phys. Chem. A 2000, 104, 5631. d. Cossi, M., Scalmani, G., Rega, N., Barone, V., J. Chem. Phys. 2002, 117, 43.
21. a. Schutz, C., Warshel, A., Proteins: Struct., Funct. and Genet. 2001, 44, 400. b. Fitch, C. A., Karp, D. A., Lee, K. K., Stites, W. E., Lattman, E. E., Garcia-Moreno, E. B., Biophys. J. 2002, 82, 3289.
22. Dillet, V., Van Etten, R., Bashford, D., J. Phys. Chem. B 2000, 104, 11321.
23. Roos, G., Loverix, S., Geerlings, P., J. Phys. Chem. B 2006, 110, 557.

24. Gaussian 03, Revision A.1, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A. Pople, Gaussian, Inc., Pittsburgh PA, 2003.
25. Martin, P., Demel, S., Shi, J., Gladysheva, T., Gatti, D. L., Rosen, B. P., Edwards, B. F., Enzyme Structure 2001, 9, 1071.
26. Demel, S., Shi, J., Martin, P., Rosen, B. P., Edwards, B. F., Protein Sci. 2004, 13, 2330.
27. Jeffrey, G. A., An Introduction to Hydrogen Bonding, Oxford University Press, New York, 1997.
28. Pearson, R. G., Chemical Hardness, Wiley-VCH, Weinheim, Germany, 1997.
29. Roos, G., Buts, L., Van Belle, K., Brosens, E., Geerlings, P., Loris, R., Wyns, L., Messens, J., J. Mol. Biol. 2006, 360, 826.
30. Gustafson, C. L. T., Stauffacher, C. V., Hallenga, K., Van Etten, R. L., Protein Sci. 2005, 14, 2515.
31. a. Fauman, E. B., Cogswell, J. P., Lovejoy, B., Rocque, W. J., Holmes, W., Montana, V. G., Piwnica-Worms, H., Rink, M. J., Saper, M. A., Cell 1998, 93, 617. b. Reynolds, R. A., Yem, A. W., Wolfe, C. L., Deibel, M. R. Jr., Chidester, C. G., Watenpaugh, K. D., J. Mol. Biol. 1999, 293, 559. c. Buhrman, G., Parker, B., Sohn, J., Rudolph, J., Mattos, C., Biochemistry 2005, 44, 5307.
32. a. Ramachandran, G. N., Biopolymers 1968, 6, 1494. b. Scarsdale, J. N., Van Alsenoy, C., Klimkowski, V. J., Schaefer, L., Momany, F. A., J. Am. Chem. Soc. 1983, 105, 3438. c. Schubert, H. L., Fauman, E. B., Stuckey, J. A., Dixon, J. E., Saper, M. A., Protein Sci. 1995, 4, 1904.
33. a. Hol, W. G. J., Van Duijnen, P. T., Berendsen, H. J. C., Nature 1978, 273, 443. b. Sancho, J., Serrano, L., Fersht, A. R., Biochemistry 1982, 31, 2253.
34. Gross, K. C., Seybold, P. G., Peralta-Inga, Z., Murray, J. S., Politzer, P., J. Org. Chem. 2001, 66, 6919.
35. The kinetic parameters of ArsC Thr11Ala were determined by Karolien Van Belle, Elke Brosens and Joris Messens as described in Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
Chapter VIII Interplay between ion binding and catalysis in the thioredoxin-coupled arsenate reductase family
The mathematical predictions of quantum mechanics yield results that are in agreement with experimental findings. That is the reason we use quantum theory. That quantum theory fits experiment is what validates the theory, but why experiment should give such peculiar results is a mystery. This is the shock to which Bohr referred.
(Marvin Chester with slight modifications)
Chapter VIII: K+ and SO42- affect Trx-coupled ArsC

In the thioredoxin (Trx)-coupled arsenate reductase family, the arsenate reductases from Staphylococcus aureus plasmid pI258 (Sa_ArsC) and from Bacillus subtilis (Bs_ArsC) are structurally related detoxification enzymes. Catalysis of the reduction of arsenate to arsenite involves a P-loop (Cys10-Thr11-Gly12-Asn13-Ser14-Cys15-Arg16) structural motif and a disulfide cascade between three conserved cysteines (Cys10, Cys82 and Cys89). For its activity, Sa_ArsC benefits from the binding of tetrahedral oxyanions in the P-loop active site and from the binding of potassium in a specific cation-binding site. In contrast, the steady-state kinetic parameters of Bs_ArsC are not affected by sulphate or potassium. The commonly occurring mutation of a histidine (H62), located about 6 Å from the potassium-binding site in Sa_ArsC, to a glutamine uncouples the kinetic dependency on potassium. In addition, the binding affinity for potassium is affected by the presence of a lysine (K33) or an aspartic acid (D33) in combination with two negative charges (D30 and E31) on the surface of Trx-coupled arsenate reductases. In the P-loop of the Trx-coupled arsenate reductase family, the peptide bond between Gly12 and Asn13 can adopt two distinct conformations. The unique geometry of the P-loop with Asn13 in the β conformation, which is not observed in the structurally related LMW PTPases, is stabilized by tetrahedral oxyanions and decreases the pKa of Cys10 and Cys82. Tetrahedral oxyanions thus stabilize the P-loop in its catalytically most active form, which might explain the observed increase in kcat for Sa_ArsC. Therefore, a subtle interplay of potassium and sulphate dictates the kinetics of thioredoxin-coupled arsenate reductases.
1. Introduction
In Chapter IV we have seen that a K+-Cys10 hydrogen bonding network via Asn13 and Ser17 activates the nucleophile and stabilizes the thiolate form of Cys10 by lowering its pKa to 6.2 [1]. The presence of K+ in a specific potassium-binding site is an interesting feature of Staphylococcus aureus arsenate reductase from plasmid pI258 (Sa_ArsC) that has not been observed among other members of the same arsenate reductase family, nor among the structurally related LMW PTPases [2]. The binding of K+ is an enthalpy-driven process. K+ binding stabilizes Sa_ArsC and increases the specific activity by a factor of 5 [2]. Furthermore, Sa_ArsC is the only arsenate reductase for which the kinetics is characterized by a very unusual biphasic Michaelis-Menten profile [3]. Tetrahedral oxyanions essentially eliminate this behaviour at millimolar concentrations and increase the kcat of Sa_ArsC by a factor of approximately 5 [4]. Also, for the complete resonance assignment in NMR, the binding of sulphate to residues located in the P-loop was shown to be necessary to arrest the dynamic character of the active site [4,5].
A PSI-BLAST [6] search in the non-redundant SwissProt sequence database [7] with Sa_ArsC as an entry identified 21 sequence-related arsenate reductases (sequence identity higher than 60%) (Fig. 8.1). A similar search against the PDB protein database [8] results in 2 arsenate reductases: Sa_ArsC and ArsC from Bacillus subtilis (Bs_ArsC). Their crystal structures are similar (root mean square deviation (r.m.s.d.) of 0.87 Å for 127 Cα atoms) [9,10] and both arsenate reductases use a similar thioredoxin (Trx)-coupled reduction mechanism with a reversible conformational switch of the short α-helix [10,11]. However, in the published Bs_ArsC structure no K+ is present in the site equivalent to the K+-binding site of Sa_ArsC. The only difference in the neighbourhood of the K+-binding site between the structures of Bs_ArsC and Sa_ArsC is the presence of a glutamine (Gln62) at the position of a histidine (His62) in Sa_ArsC, a conserved mutation within this family (Fig. 8.1). Another difference is localized about 9 Å away from the K+-binding site at the surface of the protein: an asparagine (Asn33) and a glycine (Gly31) in Sa_ArsC instead of a lysine (Lys33) and a glutamate (Glu31) in Bs_ArsC (Fig. 8.1 and 8.2).
In this chapter, the impact of the non-conserved histidine, lysine and glycine within the Trx-coupled family of arsenate reductases on the binding of K+ and on catalysis is studied. The structures of the Sa_ArsC mutants H62Q and C10SC15A with sulphate in the active site are solved. Further, in the P-loop of the Trx-coupled arsenate reductase family, the peptide bond between Gly12 and Asn13 can adopt two distinct conformations, which is not observed in the structurally related LMW PTPases. This unique geometry of the P-loop with Asn13 in the β conformation is analysed and discussed. Quantum chemical calculations elucidate the K+-binding properties of the cation-binding pocket by itself and provide insight into the effect of the geometry change of the P-loop on the pKa of the catalytic cysteines. Altogether, this study shows that mutations within the Trx-coupled family of arsenate reductases lead to subtly different ion-dependent kinetic features.
Figure 8.1: The Trx-coupled arsenate reductase family. Arsenate reductase sequences from the SwissProt [16] database after a PSI-BLAST [15] search with Sa_ArsC as entry. Multiple sequence alignment computed with CLUSTAL W [50], with a numbering based on the numbering of Sa_ArsC. Sa_ArsC and Bs_ArsC are indicated with an arrow. The percentage of conservation within this family is indicated by a colour change from black to red (below 70%). The redox active cysteines are indicated with an asterisk and the positions 30-33 and 62 are boxed in grey. The consensus sequence is an artificial sequence that reflects the most frequent amino acid at each position. Abbreviations: Bc: Bacillus cereus; Bt: Bacillus thuringiensis; Ba: Bacillus anthracis; Bm: Bacillus megaterium; Bl: Bacillus licheniformis; Bs: Bacillus subtilis; Oi: Oceanobacillus iheyensis; Bcl: Bacillus clausii; Bh: Bacillus halodurans; Sa: Staphylococcus aureus; Sx: Staphylococcus xylosus; Sh: Staphylococcus haemolyticus; Ss: Staphylococcus saprophyticus; Se: Staphylococcus epidermidis and Sa_ac: Staphylococcus aureus subsp. aureus col. The locus name is added whenever the source name abbreviation is not unique. Figure created by Messens, J.
Figure 8.2: Stereoview of the refined structure of Bs_ArsC. Lysine 33 (blue), aspartate 30 and glutamate 31 (red), the P-loop active site (residue 10-16) (red tube) with asparagine 13 (red stick representation), the redox active cysteines (yellow), sulphate (atom type), the cation-binding site residues (green) and sodium (magenta) are shown. The figure was generated by using MacPyMol (Delano Scientific LLC 2005) by Messens, J.

In order to study the binding affinity of K+ in wild type Sa_ArsC (PDB 1LJL), in Sa_ArsC C10SC15A harbouring perchlorate (PDB 1JFV), in Sa_ArsC H62Q (PDB 2CD7) and in Bs_ArsC in complex with sulphate (PDB 1JL3) after an extra round of refinement (vide infra), a model system including the so-called K+-binding pocket was constructed. This pocket consists of Asp65 and Glu21, both modelled as CH3COO-; of Ser36, modelled as HOCH2-CH2-CONH2; of His62/Gln62 and Thr63, which were fully incorporated and terminated on both the C- and N-side by CO-NH; and of Asn13, modelled as NH2-COCH3 (Fig. 8.3). In a second model system constructed from 1LJL and 1JFV, Asn13 was fully incorporated together with the peptide bonds between Asn13 and Gly12 and between Asn13 and Ser14, to see the effect of the α-β flip on the binding of K+. A potassium ion and two water molecules completed the model system. In the pocket of Bs_ArsC, no water molecules other than the central water molecule were observed in the X-ray structure. This central water molecule was replaced by K+ and by Na+. The Na+ ion of the 2CD7 pocket was replaced by K+. In all the model systems, hydrogen atoms were placed and optimized at the B3LYP/6-31G* level in the gas phase, while the heavy atoms, except for the cations, were kept fixed. For the energy calculations, a continuum solvation shell with a dielectric constant (ε) of 20 was added to represent the protein environment [1].
Figure 8.3: Model systems for calculation of the binding energy. Coordinates were obtained from wild type Sa_ArsC (1LJL).
To calculate the binding affinity of the cations (Eint) in their binding pockets, the following scheme is adopted:
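\begin{equation}
\begin{array}{ccc}
\text{Pocket}_{\text{gas phase}} + \text{Cation}_{\text{gas phase}} & \longrightarrow & \text{Complex}_{\text{gas phase}} \\
\uparrow E_{\text{des pocket}} \qquad \uparrow E_{\text{des cation}} & & \downarrow E_{\text{sol complex}} \\
\text{Pocket}_{\varepsilon = 20} + \text{Cation}_{\varepsilon = 20} & \longrightarrow & \text{Complex}_{\varepsilon = 20}
\end{array}
\tag{8.1}
\end{equation}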
We formally start from the solvated educts and calculate their desolvation energy using PCM at the B3LYP/6-31+G** level. Next, the total energy difference between educts and products is calculated in the gas phase, also at the B3LYP/6-31+G** level. Here, the basis set superposition error (BSSE) was taken into account by the counterpoise correction (CP) proposed by Boys and Bernardi [12]. Then, the solvation energy of the complex is calculated (PCM, B3LYP/6-31+G**).
The energy needed to bring the cations from water to ε = 20.0 (Edes K+ = 3.3 kcal/mol; Edes Na+ = 4.1 kcal/mol; Edes H2O = 0.08 kcal/mol) was added. This leads to the following equation, in which Eint denotes the values presented:
\begin{equation}
E_{\text{int}} = E_{\text{int, gas phase, BSSE corrected}} + E_{\text{des pocket}} + E_{\text{des cation}} + E_{\text{sol complex}} + E_{\text{des}}
\tag{8.2}
\end{equation}
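As a minimal sketch of how the terms of Eq. 8.2 combine, the snippet below simply sums the five contributions. The function name `interaction_energy`, its argument names and the component energies passed to it are hypothetical and serve only as an illustration; only the three water-to-ε = 20 transfer energies are taken from the text.

```python
def interaction_energy(e_gas_cp, e_des_pocket, e_des_cation,
                       e_sol_complex, e_des_transfer):
    """Sum the five terms of Eq. 8.2; all energies in kcal/mol."""
    return (e_gas_cp + e_des_pocket + e_des_cation
            + e_sol_complex + e_des_transfer)


# Transfer energies from water to eps = 20 as quoted in the text (kcal/mol).
E_DES_TRANSFER = {"K+": 3.3, "Na+": 4.1, "H2O": 0.08}

# Hypothetical component energies, for illustration only (not thesis data).
example_eint = interaction_energy(
    e_gas_cp=-120.0,       # BSSE-corrected gas-phase interaction energy
    e_des_pocket=150.0,    # desolvation of the empty pocket
    e_des_cation=60.0,     # desolvation of the cation
    e_sol_complex=-140.0,  # solvation of the complex
    e_des_transfer=E_DES_TRANSFER["K+"],
)
print(f"E_int = {example_eint:.1f} kcal/mol")
```

Keeping the transfer term in a lookup keyed by ligand makes it easy to reuse the same sum for K+, Na+ and water.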
To investigate the influence of sulphate binding and the accompanying Gly12-Asn13 α-β flip on the pKa drop of Cys10 and Cys82 induced by Arg16, model systems consisting of Arg16, Ser17, Cys10, Thr11, Asp105 (represented as CH2COO-), the peptide bond between Phe103 and Asp104, Cys82 (represented as CH3S-) and one or two water molecules were constructed from the X-ray structures 1JL3 and 1LJL, respectively (Fig. 8.4).
Figure 8.4: Model systems for pKa estimation in presence (A) and absence (B) of Arg16. Coordinates were obtained from wild type Sa_ArsC (1LJL).
Hydrogen atoms were placed and optimized at the B3LYP/6-31G* level, while the heavy atoms were kept fixed. To explore and quantify the effects of Arg16 on the basicities of Cys10 and Cys82, a natural population analysis (NPA) on the sulfur atoms of Cys10 and Cys82 in the presence and absence of Arg16 was performed at the B3LYP/6-31G* level in a continuum solvent model (PCM) with ε = 20. A linear relationship between the calculated NPA charge and the pKa, similar to the one presented in Chapter VI [13], was used to translate the changes in NPA charge on the sulfur atom into changes in the pKa (Fig. 8.5).

[Figure 8.5 plot: NPA charge (au) on the x-axis (-0.9 to -0.6) against pH on the y-axis (2 to 11) for the thiolates 1-7, with the linear fit y = -34.891x - 19.11, R = 0.99.]
Figure 8.5: NPA-pKa correlation curve. NPA-charge (of the sulfur atoms of the thiolates)-pKa calibration curve obtained for a series of substituted thiolates (methanethiol (1), benzenemethanethiol (2), mercaptoethanol (3), cysteine (4), trifluoroethanethiol (5), thioacetic acid (6) and trifluoromethanethiol (7)) in solution (ε = 20.7), obtained at the B3LYP/6-31G* level on the gas phase geometry.
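The calibration line of Fig. 8.5 can be applied as in the short sketch below, which converts an NPA charge on a thiolate sulfur into a pKa estimate and a charge difference into a pKa shift. Only the slope and intercept come from the figure; the function names and the two example charges are hypothetical and chosen purely for illustration.

```python
# Linear NPA-pKa calibration read from Fig. 8.5: pKa = SLOPE * q + INTERCEPT.
SLOPE = -34.891      # pKa units per unit of NPA charge (au)
INTERCEPT = -19.11   # pKa extrapolated to zero NPA charge


def pka_from_npa(charge):
    """Estimate the pKa from the NPA charge on the thiolate sulfur."""
    return SLOPE * charge + INTERCEPT


def pka_shift(charge_with, charge_without):
    """pKa change when the sulfur charge goes from charge_without to charge_with."""
    return SLOPE * (charge_with - charge_without)


# Hypothetical sulfur charges without and with a nearby arginine (not thesis data).
q_without_arg = -0.800
q_with_arg = -0.790

print(f"pKa without Arg: {pka_from_npa(q_without_arg):.2f}")
print(f"pKa shift due to Arg: {pka_shift(q_with_arg, q_without_arg):+.2f}")
```

Because the relation is linear, the intercept cancels in the shift, so only the difference in sulfur charge matters when comparing model systems with and without Arg16.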
The P-loops (for a detailed model system, see ref. 1) of the following arsenate reductase structures: 2FXI, 1JF8, 1JL3 and 1JFV were fully optimised with and without a sulphate ligand at the HF/3-21G level. Subsequent energy calculations were performed at the B3LYP/6-31+G** level. All calculations were performed using the GAUSSIAN 03 package [14].
The experiments reported in this chapter were designed and coordinated by Joris Messens and performed by Lieven Buts (isothermal titration calorimetry (ITC)), Elke Brosens (site-directed mutagenesis, expression and purification), Karolien Van Belle (kinetic analysis), Remy Loris (crystallization, data collection and structure determination) and Joris Messens (structure determination). Details on the methodology can be found in Roos et al. [15]. The results of this experimental work support the quantum chemical study and are consequently included in the results and discussion sections (sections 3 and 4).
3. Results
3.1 The kinetics of Sa_ArsC and Bs_ArsC
The impact of potassium and the tetrahedral oxyanion sulphate on the steady-state kinetic parameters of both the S. aureus and the B. subtilis Trx-coupled arsenate reductases is studied (Table 8.1).
Comparison of the steady-state kinetic data of Sa_ArsC with those of Bs_ArsC in the optimal conditions for Sa_ArsC (50 mM Tris/HCl, pH 8.0, 50 mM K2SO4 and 0.1 mM EDTA) does not show any extreme differences in absolute kinetic parameter values (Table 8.1). Changing potassium for sodium in the presence of sulphate has no effect on the specificity constants of Sa_ArsC and Bs_ArsC (Table 8.1). However, when potassium is replaced by sodium in the absence of sulphate, the difference between Sa_ArsC and Bs_ArsC becomes more pronounced (Table 8.1). The kinetic parameters of Bs_ArsC are potassium independent, while the specificity constant of Sa_ArsC decreases by a factor of 4 in the absence of potassium. In Sa_ArsC, removing sulphate, either in the presence or in the absence of potassium, results in a significant drop of the kcat. A corresponding difference in kcat is absent in Bs_ArsC. As such, the activity of Bs_ArsC is sulphate independent and there is no extreme biphasic Michaelis-Menten profile as observed in Sa_ArsC [4]. From thermodynamic cycle analysis [16] it becomes clear that the insignificant coupling energy between the replacement of potassium and sulphate (Gtotal = 0.5 kJ/mol) makes Bs_ArsC a potassium and sulphate independent member of the Trx-coupled family of arsenate reductases. For comparison, the coupling energy (Gtotal) for Sa_ArsC is 3.5 kJ/mol (further details can be found in Roos et al. [15]).
Table 8.1: Ion-dependent steady-state kinetics of the Trx-coupled arsenate reductases

Enzyme              Conditions   KM (μM)     kcat (min-1)   kcat/KM (x10^4 M-1 s-1)
Sa_ArsC wild type   K2SO4        81 ± 14     219 ± 12       4.5 ± 1.8
                    KCl          9 ± 1       54 ± 2         10.0 ± 1.5
                    Na2SO4       61 ± 4      172 ± 8        4.7 ± 0.53
                    NaCl         22 ± 4      35 ± 2         2.7 ± 0.64
Bs_ArsC wild type   K2SO4        47 ± 4      95 ± 2         3.3 ± 0.35
                    KCl          54 ± 7      96 ± 3         3.0 ± 0.48
                    Na2SO4       64 ± 6      120 ± 3        3.1 ± 0.37
                    NaCl         58 ± 5      80 ± 3         2.3 ± 0.28
Sa_ArsC H62Q        K2SO4        131 ± 11    182 ± 6        2.3 ± 0.27
                    KCl          87 ± 14     42 ± 2         0.8 ± 0.16
                    Na2SO4       80 ± 12     119 ± 4        2.5 ± 0.46
                    NaCl         123 ± 18    45 ± 2         0.6 ± 0.11
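The coupling energies quoted in the text can be approximated from the specificity constants of Table 8.1 with a standard thermodynamic cycle. The sketch below is an illustration under stated assumptions, not the procedure of ref. 15: it assumes that the coupling energy is RT times the logarithm of the cross ratio of the kcat/KM values, that the temperature is 310.15 K, and that the three blocks of Table 8.1 belong to Sa_ArsC wild type, Bs_ArsC wild type and Sa_ArsC H62Q as inferred from the kinetics described in the text. The names `SPECIFICITY` and `coupling_energy` are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

# kcat/KM values (x10^4 M^-1 s^-1) from Table 8.1; the enzyme assignment of
# the three blocks is an assumption based on the kinetics described in the text.
SPECIFICITY = {
    "Sa_ArsC wild type": {"K2SO4": 4.5, "KCl": 10.0, "Na2SO4": 4.7, "NaCl": 2.7},
    "Bs_ArsC wild type": {"K2SO4": 3.3, "KCl": 3.0, "Na2SO4": 3.1, "NaCl": 2.3},
    "Sa_ArsC H62Q": {"K2SO4": 2.3, "KCl": 0.8, "Na2SO4": 2.5, "NaCl": 0.6},
}


def coupling_energy(k, temperature=310.15):
    """Magnitude of the K+/sulphate coupling energy in kJ/mol.

    Assumed cycle: replacing K+ by Na+ and replacing sulphate by chloride.
    The temperature (310.15 K) is an assumption, not a value from the text.
    """
    ratio = (k["K2SO4"] * k["NaCl"]) / (k["KCl"] * k["Na2SO4"])
    return abs(R_KJ * temperature * math.log(ratio))


for enzyme, constants in SPECIFICITY.items():
    print(f"{enzyme}: {coupling_energy(constants):.2f} kJ/mol")
```

Under these assumptions the magnitudes come out close to the values quoted in the text: roughly 3.5 kJ/mol for wild type Sa_ArsC, about 0.5 kJ/mol for Bs_ArsC and just below 1 kJ/mol for Sa_ArsC H62Q.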
3.2 The kinetic parameters of Sa_ArsC H62Q

Although both Sa_ArsC and Bs_ArsC belong to the Trx-coupled arsenate reductases and use the same mechanism to reduce arsenate to arsenite [9,10], subtle local structural differences seem to determine the sulphate and cation dependency. The only residue observed to differ between Sa_ArsC and Bs_ArsC in the direct environment of the K+-binding site is located at position 62. In Sa_ArsC, a histidine is found at position 62, while in wild type Bs_ArsC a glutamine is observed. The histidine to glutamine mutation at position 62 is a conserved mutation in the Sa_ArsC-related Trx-coupled arsenate reductase family (Fig. 8.1). Half of the arsenate reductase entries of this family in the SwissProt data bank [7] have the histidine replaced by a glutamine at this position, which suggests that this mutation is evolutionarily important. Wild type Sa_ArsC was "subtilised" by introducing a glutamine at position 62. We determined the steady-state kinetic parameters of Sa_ArsC H62Q in the presence and absence of potassium and sulphate (Table 8.1). In Sa_ArsC H62Q, as in wild type Sa_ArsC, we observe a significant increase in kcat in the presence of sulphate, whereas in wild type Bs_ArsC no sulphate dependency is observed. On the other hand, exchanging potassium for sodium does not affect the kinetic parameters of Sa_ArsC H62Q. The H62Q point mutation thus uncouples the sulphate effect from the potassium effect on the kinetics of Sa_ArsC. The total free energy of coupling (Gtotal) becomes lower than 1 kJ/mol.
3.3 Structure of Sa_ArsC H62Q

After purification and crystallization, the X-ray structure of Sa_ArsC H62Q was determined to a resolution of 1.5 Å (Table 8.2). Only the crystals obtained in the presence of 100 mM NaCl diffracted, while similar crystals obtained in the presence of 100 mM KCl showed no useful diffraction. Superimposing this structure on the reduced form of wild type Sa_ArsC (PDB 1LJL) using the SSM algorithm [17] gives a main chain r.m.s.d. of 0.6 Å, which shows that both structures are very similar. ArsC H62Q is also characterized by a well-ordered redox helix flanked by the catalytically important cysteines 82 and 89 (Fig. 8.2), typical for the reduced form of thioredoxin-coupled arsenate reductases [18]. Looking in detail at the typical potassium-binding site next to the P-loop (Fig. 8.6A), we observe spherical electron density that can be interpreted as a sodium ion. This is the most obvious choice, as the crystals were obtained from a protein in a buffer solution containing 100 mM NaCl. Further evidence for Na+ binding came from a refinement with a potassium ion instead of sodium. This resulted in a significant excess of negative Fo-Fc difference density, indicating that an atom or group with significantly fewer electrons than potassium occupies the site.
Figure 8.6. The cation-binding site. A. A stereo view of the 2Fo-Fc electron density map, contoured at 1σ, of the potassium-binding site in Sa_ArsC H62Q. A sodium ion (blue sphere) coordinated by five protein oxygen atoms (stick representation) and one ordered water molecule (red sphere) is shown. B. Corresponding view of the sodium-binding site in chain D of Bs_ArsC after sodium was placed in the electron density and after refinement with REFMAC [19]. Black stippled lines indicate coordination bonds. Both figures were generated using MacPyMol (Delano Scientific LLC 2005) by Messens, J.
Table 8.2: Data collection and refinement statistics. Values between parentheses correspond to the highest resolution shell. $R_{\text{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| \, / \, \sum_{hkl}\sum_{i} I_i(hkl)$

                          Sa_ArsC H62Q                     Sa_ArsC C10SC15A + sulphate
PDB                       2CD7                             2FXI
Resolution limits (Å)     15.0 - 1.5 (1.55 - 1.5)          20.0 - 1.8 (1.86 - 1.8)
Space group               P212121                          P212121
Unit cell (Å)             a = 33.6; b = 33.2; c = 100.6    a = 33.6; b = 33.6; c = 99.6
Wavelength (Å)            0.8123                           0.934
Measured reflections      518229                           36130
Unique reflections        16688                            10780
Completeness (%)          89.2 (91.0)                      97.9 (83.9)
Rmerge (%)                3.4 (21.9)                       3.7 (8.0)
I/σ(I)                    20.8 (6.3)                       15.5 (6.6)
Refinement statistics
Rcryst/Rfree              0.183 / 0.218                    0.189 / 0.233
r.m.s. bond lengths (Å)   0.009                            0.005
r.m.s. angles (°)         1.192                            1.131
Number of waters          188                              168
Ligands                   Na+                              K+, SO42-
The presence of a water molecule is unlikely, because the maximum coordination number of a water molecule is four, while we observe an octahedral coordination with six binding partners around this electron density (Fig. 8.6A). Six oxygen atoms coordinate the Na+ ion, one of which belongs to a water molecule. The coordinating water oxygen is located at a distance of 2.12 Å from Na+. Threonine 63 coordinates through its main chain carbonyl oxygen (Na+ - O: 2.44 Å), whereas S36 provides two coordination partners: the main chain carbonyl oxygen (Na+ - O: 2.47 Å) and the side chain hydroxyl oxygen (Na+ - O: 2.81 Å). The carboxylate group of D65 forms a single interaction with a Na+ - O distance of 2.41 Å. Finally, asparagine 13 coordinates through the oxygen of the side chain amide (Na+ - O: 2.51 Å). Within error, the distances are equal to the theoretical ideal distance for a Na+-O pair in an octahedral coordination shell with 6 coordination partners (2.46 Å) [20]. The observed coordination bond lengths of sodium towards its ligands in the Sa_ArsC H62Q mutant structure are on average 0.34 Å shorter than the bond lengths that coordinate potassium in the other Sa_ArsC structures [2]. In wild type Sa_ArsC, seven oxygen atoms, belonging to six ligands, coordinate the K+ ion: two well-defined water molecules, Asn13, Ser36, Thr63 and Asp65 [2].
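The comparison with the ideal Na+-O distance can be checked with a few lines of arithmetic. The sketch below averages the six coordination distances quoted above and compares the mean with the ideal octahedral value of 2.46 Å; the dictionary keys and variable names are illustrative labels, not identifiers from the structure file.

```python
from statistics import mean

# Na+ - O coordination distances (in angstrom) in Sa_ArsC H62Q, as quoted in the text.
NA_O_DISTANCES = {
    "water O": 2.12,
    "Thr63 main chain O": 2.44,
    "Ser36 main chain O": 2.47,
    "Ser36 side chain OH": 2.81,
    "Asp65 carboxylate O": 2.41,
    "Asn13 side chain O": 2.51,
}

IDEAL_NA_O = 2.46  # ideal Na+-O distance for six-fold coordination (angstrom)

mean_distance = mean(list(NA_O_DISTANCES.values()))
deviation = mean_distance - IDEAL_NA_O

print(f"mean Na+-O distance: {mean_distance:.2f} A")
print(f"deviation from ideal: {deviation:+.2f} A")
```

The individual distances scatter between 2.12 Å and 2.81 Å, but their mean coincides with the ideal value to two decimals.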
3.4 The cation-binding site in Bs_ArsC

In the structure of Bs_ArsC (PDB 1JL3) [9], water molecules occupy the cation-binding site. However, in view of the coordination environment and the fact that the crystals of Bs_ArsC were grown in the presence of 100 mM Na citrate [21], we may still consider it a sodium ion binding site. After changing the rotamer conformation of the side chain of Asn13 and reconsidering the electron density, a consistent site with five protein oxygen atoms and two well-defined water molecules coordinating a sodium ion was created (Fig. 8.2 and 8.6B). An extra refinement round against the deposited X-ray data, analogous to the Sa_ArsC refinement, resulted in a decrease of Rcryst from 22.6 % to 21.5 % and of Rfree from 24.6 % to 23.8 %, supporting our interpretation of the binding site density. Remarkably, in Bs_ArsC the central electron density in the cation-binding site has other oxygen partners, and the coordination distances towards the different oxygen partners are not the same as in Sa_ArsC (Fig. 8.6). The interacting water molecules are located at different positions compared to the water in the cation-binding site of Sa_ArsC H62Q, and the average distance between the oxygen ligands and the ion is larger (2.96 Å). In comparison to the binding of Na+ (-20 kcal/mol), quantum chemical calculations indicate that the binding of water (+11 kcal/mol) in the cation-binding site of Bs_ArsC is highly unlikely.
3.5 The binding of potassium

The question remains, however, whether this site is more likely to harbour a sodium ion than a potassium ion. To address this question, we used quantum chemical (QC) calculations to evaluate the cation-binding site by itself and isothermal titration calorimetry (ITC) to evaluate binding to the cation-binding site in the protein environment. To calculate the binding of potassium, models were built from the cation-binding sites of several arsenate reductase crystal structures (Fig. 8.3). The ITC titration curves agree with a 1:1 binding model for the binding of K+ to the different arsenate reductase variants. Sodium showed no significant binding to wild type Sa_ArsC [2], to Sa_ArsC H62Q or to Bs_ArsC (Table 8.3).
Table 8.3: Ligand binding in ArsC. Binding of cations and water in the cation-binding site of different arsenate reductases (ITC measured dissociation constants) and in a model based on the respective cation-binding sites (quantum chemical calculated binding energies, Eint).

arsenate reductases: Sa_ArsC wild type; Sa_ArsC H62Q; Sa_ArsC H62Q N33K; Sa_ArsC H62Q N33K E30D G31E; Bs_ArsC wild type; Bs_ArsC K33N; Bs_ArsC K33D
ligand: K+; K+; K+; K+; K+; H2O; K+; K+
KD (μM) (b): 263; 500; 170; 4000; > 10 mM; 217; 1400
binding energy Eint (kcal/mol): -51; -12; n.c. (a); n.c. (a); n.c. (a); +11; -31; n.c. (a)

(a) n.c., not calculated
(b) KD for Na+ was > 10 mM for all the mutants
Potassium was found to bind in the higher micromolar range, with Sa_ArsC H62Q having a two times lower affinity than wild type Sa_ArsC. Quantum chemical calculations point in the same direction. On the other hand, the ITC measurement shows that Bs_ArsC has no detectable binding for potassium. That Bs_ArsC does not bind potassium in the micromolar range was expected based on the kinetic data. Remarkable, however, is the fact that according to the quantum chemical calculations the cation-binding site itself is a good harbour for potassium (Table 8.3). That Sa_ArsC H62Q still binds potassium better than sodium was unexpected, because the cation-binding site of Bs_ArsC was introduced in Sa_ArsC. So, there must be another important residue that determines the binding of potassium in Bs_ArsC.
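To put the two-fold affinity difference in energetic terms, a dissociation constant can be converted into a standard binding free energy with the textbook relation ΔG° = RT ln(KD), relative to a 1 M standard state. The sketch below is an illustration only: the temperature of 298.15 K is an assumption (the ITC temperature is not given here), and the function and variable names are hypothetical. The two KD values are those of wild type Sa_ArsC and Sa_ArsC H62Q from Table 8.3.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

# ITC dissociation constants for K+ from Table 8.3, in micromolar.
KD_MICROMOLAR = {"Sa_ArsC wild type": 263.0, "Sa_ArsC H62Q": 500.0}


def binding_free_energy(kd_micromolar, temperature=298.15):
    """Standard binding free energy in kJ/mol (1 M standard state).

    The default temperature is an assumption, not a value from the text.
    """
    kd_molar = kd_micromolar * 1e-6
    return R_KJ * temperature * math.log(kd_molar)


dg_wt = binding_free_energy(KD_MICROMOLAR["Sa_ArsC wild type"])
dg_h62q = binding_free_energy(KD_MICROMOLAR["Sa_ArsC H62Q"])

print(f"wild type: {dg_wt:.1f} kJ/mol")
print(f"H62Q:      {dg_h62q:.1f} kJ/mol")
print(f"difference: {dg_h62q - dg_wt:.1f} kJ/mol")
```

A factor of two in KD corresponds to a difference of less than 2 kJ/mol at this temperature, which illustrates how small the energetic effect of the H62Q mutation on K+ binding is.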
3.6 Lysine 33 on the surface of Bs_ArsC

Based on an analysis of the SwissProt database [7], Bs_ArsC was found to be the only representative structure within the Trx-coupled arsenate reductase family with a positive charge in the form of a lysine at position 33 (Fig. 8.1). Recent structural genomics projects have made new structures available, such as the structure of arsenate reductase from Archaeoglobus fulgidus Dsm 4304 (Af_ArsC) (PDB 1Y1L). Although this 15 kDa enzyme has only 26% sequence identity with Sa_ArsC, their structures are similar (r.m.s.d. of 1.5 Å for 119 pairs of Cα atoms) and the position of the 3 catalytic cysteines is conserved (Fig. 8.6). From the structural point of view, Af_ArsC is therefore likely to be a member of the Trx-coupled family. Important here is that on the surface of Af_ArsC, in exactly the same position as in Bs_ArsC, a conserved lysine is observed. The presence of a conserved positive charge at this site might prevent the cation from entering the cation-binding site of Bs_ArsC. The corresponding residue in Sa_ArsC is asparagine 33 (Fig. 8.1). Therefore, we constructed the Bs_ArsC K33N mutant and measured the binding of potassium with ITC (Table 8.3). Remarkably, after removing the positive charge on the surface of wild type Bs_ArsC, the enzyme binds K+ in the micromolar range (Bs_ArsC K33N in Table 8.3). As such, a lysine in this position at the surface of Bs_ArsC (Fig. 8.1) acts as a gatekeeper against high affinity binding of potassium in its cation-binding site. Both Sa_ArsC and Bs_ArsC K33N, which contain the consensus sequence (DEWN) in the 30-33 region (Fig. 8.1), bind potassium in the micromolar range. This observation confirms the quantum chemical calculations (Table 8.3), where the binding energy of potassium in a model based on the cation-binding site of Bs_ArsC is in the same range as the K+ binding energy of Sa_ArsC H62Q. Introducing a lysine on the surface of ArsC H62Q did not abolish the binding affinity for potassium (Sa_ArsC H62Q N33K in Table 8.3). Besides lysine 33, other amino acids therefore seem to be essential for preventing the binding of potassium in Bs_ArsC.
3.7 Negative charges on the surface of Trx-coupled arsenate reductases

A PSI-BLAST [6] comparison study (Fig. 8.1) within this family of arsenate reductases argues for the influence of a negative charge at position 31. Bs_ArsC has two negatively charged residues next to each other, in the form of an aspartate and a glutamate, followed by the conserved tryptophan (W32) (Asp30-Glu31-Trp32-Lys33) (Fig. 8.2). About half (11 out of 21) of the arsenate reductases in the Trx-coupled family have evolved in this direction, while Sa_ArsC (Asp30-Gly31-Trp32-Asn33) has a glycine at position 31 (Fig. 8.1). To study the influence of the negative charge in the 30-33 region, we constructed two mutants, Sa_ArsC E30D G31E N33K H62Q and Bs_ArsC K33D, and measured the binding of K+ with ITC (Table 8.3). Both enzymes failed to bind potassium in the micromolar range. Based on the observation for the quadruple Sa_ArsC mutant, we conclude that the combination of a negative charge at position 31 and a positive charge at position 33 reduces the binding affinity for potassium. The presence of only a positive charge, as in Sa_ArsC H62Q N33K, was not sufficient to prevent micromolar binding of potassium (Table 8.3). Further, members of the Trx-coupled family with three negative charges in the 30-33 region, represented by the Bs_ArsC K33D mutant, also do not bind potassium in the micromolar range.
3.8 The link between the P-loop and the cation-binding site
The only residue of the P-loop that coordinates K+ or Na+ with its side chain in the Trx-coupled family of arsenate reductases is asparagine 13 (Fig. 8.2). A Sa_ArsC N13A mutant was constructed and its kinetic parameters were determined. The low kcat values of 20 min-1 and 15 min-1 in the presence and absence of sulphate, respectively, show that the mutation makes the kinetics of Sa_ArsC sulphate independent. This mutation further decreases the kcat by a factor of about 10 compared with the wild type enzyme. The P-loop conformations of arsenate reductases from the Trx-coupled family [9,10,18,22], of the structurally related LMW tyrosine phosphatases [23,24,25] and of the cell cycle controlling phosphatases (Cdc25A and Cdc25B) [26,27,28] were analysed. In some P-loops the backbone conformation between the second and third residue following the nucleophilic cysteine is flipped from the left-handed αL to a β conformation. The peptide flip is absent in the P-loop of LMW PTPases (Cys-Leu-Gly-Asn-Ile-Cys-Arg), where the conserved asparagine exists in a left-handed αL conformation. In the cell cycle controlling phosphatases (for example Cdc25A (PDB 1C25), φ of S433: +42°) the first serine of the P-loop sequence motif (Cys-Glu-Phe-Ser-Ser-Glu-Arg) is also in the left-handed conformation. In the Trx-coupled arsenate reductase family, there are structures with and without the α-β flip. In the sulphate harbouring P-loop (PDB 2FXI) of Sa_ArsC C10SC15A (Table 8.2 and Fig. 8.7), asparagine 13 is in the left-handed αL conformation, as in phosphatases (φ of N13: +40°). On the other hand, in the P-loop of Sa_ArsC C10SC15A occupied by the oxyanion perchlorate (φ of N13: -134°) and in wild type Bs_ArsC occupied by sulphate, Asn13 has a β conformation with a negative φ angle of -133° (Fig. 8.7). As such, a direct link between kinetic activation by oxyanions and the α-β flip in the active site cannot be established on the basis of the X-ray structures alone.
Figure 8.7: The conformational change in the P-loop of Sa_ArsC. A. The 2Fo-Fc electron density map, contoured at 1σ, of the P-loop of Sa_ArsC C10SC15A (PDB: 2FXI). B. The P-loop active site of Sa_ArsC C10SC15A harbouring a sulphate molecule (PDB: 2FXI) (green) with the peptide bond between Gly12 and Asn13 in a left-handed αL conformation. Superposed on it is the P-loop of PDB 1JFV [10] (salmon), which binds a perchlorate (not shown), with the peptide bond between Gly12 and Asn13 flipped to a β conformation. Structures were superposed with the SSM algorithm [17]. Figures were generated by using MacPyMol (Delano Scientific LLC 2005) by Messens, J.
As the P-loop structural motif is not visible with NMR in the absence of tetrahedral oxyanions [4,11], we investigated the intrinsic stability of the P-loop with Asn13 in the β and in the αL conformation, in the presence and absence of sulphate. In the absence of tetrahedral oxyanions both conformations are about equally stable, in agreement with the observed flexibility of the empty P-loop (Table 8.4). In the presence of tetrahedral oxyanions, the β conformation of Asn13 is the most stable. Tetrahedral oxyanions thus stabilize the β conformation of Asn13 in the P-loop.
The function of the α-β flip was investigated with quantum chemical calculations. An α-β flip as in Sa_ArsC C10SC15A (PDB entry 1JFV) changes the binding energy for potassium by only 3 kcal/mol (-48 kcal/mol, compared with -51 kcal/mol for wild type Sa_ArsC (PDB entry 1LJL), where no α-β flip is present). As such, the α-β flip has no significant effect on the binding of potassium, which was expected because the side chain of Asn13 is in exactly the same position (Fig. 8.7). However, the pKa of the two catalytic cysteines in the neighbourhood of the P-loop (Cys10 and Cys82) is affected by the α-β flip. With wild type Sa_ArsC (1LJL) and Bs_ArsC (1JL3) as models (Fig. 8.4), the contribution of Arg16 to the hydrogen bonding with Cys10 (3.5 Å) and Cys82 (3.8 Å) induced by the α-β flip results in a significant drop in the Cys-S pKa of 0.32 and 0.95 pKa units, respectively. In the absence of a flip, with Asn13 in the αL conformation, the hydrogen bonding distances from Arg16 to Cys10 and Cys82 are considerably larger (5.6 Å and 4.7 Å, respectively). Here, the effect of the presence of Arg16 is only 0.16 and 0.03 pKa units for Cys10 and Cys82, respectively.
Table 8.4: Stability of the β conformation compared to αL. Calculated energy differences (ΔE) in kcal/mol between P-loops of arsenate reductases (PDB entry codes) with Asn13 in the β and in the αL conformation, in the presence and absence of sulphate. A more positive value of ΔE means that the αL conformation of Asn13 is more stabilized relative to β. A more negative value of ΔE means that the β conformation of Asn13 is more stabilized relative to αL.

arsenate reductase      arsenate reductase       ΔE (a) in absence    ΔE (a) in presence
with Asn13 in β         with Asn13 in αL         of sulphate          of sulphate
1JFV                    2FXI                     1                    -14
1JFV                    1JF8                     -3                   -26
1JL3                    2FXI                     0.1                  -113
1JL3                    1JF8                     -4                   -26

(a) ΔE = E(β) - E(αL)
In the Trx-coupled family of arsenate reductases, the - flip activates the catalytic cysteines through the relative position of the cysteines to Arg16. The role of arginines in reducing the pKa of the nucleophilic cysteine is also observed in the structure of the glutaredoxin-coupled arsenate reductase from E. coli plasmid R773 (a cluster of 3 arginines lowers the pKa to 6.4)22,29. As such, the involvement of an arginine in arsenate reductases as a general activator of the nucleophilic cysteine seems to be a common mechanism.
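To put the pKa shifts quoted above on an energy scale, a pKa difference can be converted into a free energy with ΔG = ln(10)·R·T·ΔpKa. The sketch below is only an illustration of that arithmetic; the temperature of 298.15 K and the function name are assumptions, not part of the calculations reported in this chapter.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def pka_shift_to_free_energy(delta_pka, temperature=298.15):
    """Free energy (kcal/mol) equivalent to a pKa shift: ln(10) * R * T * delta_pKa."""
    return math.log(10) * R_KCAL * temperature * delta_pka

# pKa contributions of Arg16 quoted in the text (flip present / flip absent)
shifts = {
    "Cys10, flip present": 0.32,
    "Cys82, flip present": 0.95,
    "Cys10, flip absent": 0.16,
    "Cys82, flip absent": 0.03,
}
for label, delta_pka in shifts.items():
    print(f"{label}: {pka_shift_to_free_energy(delta_pka):.2f} kcal/mol")
```

At the assumed temperature one pKa unit corresponds to about 1.36 kcal/mol, so the shifts discussed here are all of the order of 1 kcal/mol or less.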
4. Discussion
Our results indicate that the arsenate reductase cation-binding sites as observed in B. subtilis and in S. aureus are designed to bind potassium. These are the only members of the Trx-coupled arsenate reductase family for which a protein structure is known. However, the positive charge in the form of lysine 33 on the surface of Bs_ArsC, as also observed in the structure of arsenate reductase from Archaeoglobus fulgidus, significantly decreases the binding affinity for potassium, with a consequential effect on the kinetics. When Lys is replaced by Asn as in Sa_ArsC, the dissociation constant for potassium drops to the micromolar range. Therefore, we believe that electrostatic repulsion by the positive charge of lysine 33 on the surface of Bs_ArsC (Fig. 8.1) is responsible for decreasing the efficiency of potassium binding. From the ITC measurements on the Sa_ArsC quadruple mutant, we can conclude that the repulsive effect of a lysine at position 33 only works in combination with two negatively charged residues at positions 30 and 31. A PSI-BLAST6 search of the SwissProt data bank7 with Sa_ArsC as entry results in 4 arsenate reductases with a sequence identity higher than 60% and with a negative charge at positions 30, 31 and 33 in the form of an aspartate and a glutamate. In these arsenate reductases, represented in our study by Ba_K33D, it is not electrostatic repulsion but electrostatic attraction that increases the KD for K+ out of the micromolar range. Extrapolating these results within the Trx-coupled arsenate reductase family (Fig. 8.1), we may conclude that the majority of arsenate reductases will bind potassium in the micromolar range. Quantum chemical calculations on a model based on the structure of Bs_ArsC (PDB entry 1JL3)9 and reanalysis of the deposited X-ray data indicate that the cation-binding site is highly unlikely to harbour a water molecule.
Crystals of Bs_ArsC were grown in the presence of 100 mM Na+ (ref. 21), making sodium the best guest ion under this condition, although its affinity for the cation-binding site is more than 50 times lower than that of potassium. After refinement of Bs_ArsC (1JL3), the Na+-O pair distances are much closer to the average distance of a K+-O pair (2.8 Å) than to the average distance of a Na+-O pair (2.4 Å)30, in agreement with a site designed for K+. Unlike K+ binding sites, Na+ binding sites are often overlooked20 in crystal structures of proteins, because this metal ion carries the same number of electrons as a water molecule. Hence, electron density due to a bound Na+ ion is often erroneously attributed to a water oxygen atom in the refinement procedure20. Enzymes activated by metal ions evolved to take advantage of the large availability of Na+ outside the cell and K+ inside the cell to optimize their catalytic function. Indeed, a strong correlation exists between the preference for K+ or Na+ and the intracellular or extracellular localization of such enzymes31. As the arsenate reductases of the Trx-coupled family are cytoplasmic redox enzymes, occupancy of the cation site by potassium is more likely than by sodium. In LMW PTPases, although structurally related, the cation-binding site is occupied by the Nε2 of a histidine instead of a cation.
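The distance argument above can be phrased as a simple nearest-reference comparison. The following is a minimal sketch under stated assumptions: the two reference distances are the averages quoted in the text, while the function name and the example distances are hypothetical and are not taken from the 1JL3 refinement.

```python
from statistics import mean

# Average ion-oxygen distances quoted in the text (in angstrom)
REFERENCE_DISTANCE = {"K+": 2.8, "Na+": 2.4}

def closest_ion(ligand_distances):
    """Return the ion whose reference ion-oxygen distance is nearest to the observed mean."""
    observed = mean(ligand_distances)
    return min(REFERENCE_DISTANCE, key=lambda ion: abs(REFERENCE_DISTANCE[ion] - observed))

# Hypothetical ion-oxygen distances (angstrom) for one site, for illustration only
example_site = [2.7, 2.8, 2.9, 2.75, 2.85, 2.8]
print(closest_ion(example_site))
```

Such a comparison only indicates which ion the geometry of a site resembles; it is no substitute for the binding measurements and calculations described in this chapter.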
The effect of potassium on the kinetics might be due to an overall structure-scaffolding role of potassium towards the active site, or potassium might orient Asn13 in the optimal position for polarizing the electrons of Ser17, which in turn assists in keeping Cys10 in its thiolate form necessary for nucleophilic attack of arsenate1. The tetrahedral oxyanion sulphate has been shown to stabilize wild type Sa_ArsC by binding in the active site (P-loop) with a dissociation constant of activation of 16 mM4,32. While structuring the substrate binding site in its active conformation, sulphate competes with the tetrahedral substrate arsenate (increase in KM). The structuring of the P-loop upon the addition of a compound that is known to bind in the active site has also been observed for other enzymes containing a CX5R active site motif. Arsenate reductase from E. coli33, which belongs to the glutaredoxin (Grx)-coupled family of arsenate reductases, and LMW PTPases share the same feature34,35,36,37. Although not absolutely required, potassium as well as a tetrahedral oxyanion optimizes the catalytic function of arsenate reductase from the Trx-coupled family. The low KM of 9 µM for Sa_ArsC in the presence of KCl results in the highest specificity constant (Table 8.1). The low KM for an enzyme that is part of a heavy metal detoxification system might be important in vivo. Trapping arsenate the moment it enters the cell might prevent blocking of the essential oxidative phosphorylation. On the other hand, a high kcat will reduce arsenate to arsenite, so that it will be recognized by the arsenite-specific efflux system38. The fact that arsenate is taken up by the phosphate transport system makes uptake of other tetrahedral oxyanions feasible38. As such, tetrahedral oxyanions will be present in the direct environment of the arsenate reductases and their structuring effect on the P-loop active site will increase the kcat (Table 8.1)4.
In the P-loops of the Trx-coupled arsenate reductase family, asparagine 13 (Fig. 8.6) shows conformational flexibility linked to its catalytic function. Some arsenate reductase structures with bound tetrahedral oxyanions show - flipping of the bond between the second and the third residue following the nucleophilic cysteine; others do not. As such, the -angle shift of asparagine 13 could not directly be linked to the presence at millimolar levels of tetrahedral oxyanions. Removing sulphate has a significant effect on the kcat of Sa_ArsC (Table 8.1). It is however tempting to conclude that tetrahedral oxyanions at millimolar concentrations might help to induce a - flip with a decrease of the pKa of Cys10 and Cys82 as a consequence. Such an induction mechanism might explain the P-loop structuring feature in the presence of sulphate with an increase in KM, due to the competition for the same catalytic site, but with an important increase in kcat for Sa_ArsC (Table 8.1)4. The catalytically most active form is the one with Asn13 in β conformation, which is stabilized by the presence of a tetrahedral oxyanion. A conclusive link between oxyanion induction and - flipping in the P-loop in solution must await a detailed study of the effect of tetrahedral oxyanions on the P-loop dynamics in Sa_ArsC.
After the P-loop has taken its most active conformation (Arg16 in hydrogen bonding position with Cys10 and Cys82 due to the - flip of the Gly12-Asn13 bond), the stabilized thiolate of the nucleophilic cysteine is ready to perform a nucleophilic attack.
5. Conclusion
We conclude that the cation-binding site of the Trx-coupled family of arsenate reductases is designed to harbour potassium. Based on X-ray structure analysis and quantum chemical models, the main chain conformational change between Gly12 and Asn13 in the active site P-loop seems to be typical for arsenate reductases as in other P-loops from structurally related enzymes - flipping does not occur. This - flip activates ArsC by bringing two cysteines of the disulfide cascade reaction mechanism within hydrogen bonding distance of Arg16. Tetrahedral oxyanions stabilize - flipping and as such bring the P-loop in its catalytically most active form, which might explain the observed increase in kcat for Sa_ArsC in the presence of millimolar concentration of sulphate. The subtle interplay between ions and enzymes during catalysis as observed in thioredoxin-coupled arsenate reductases invites the revisiting of other catalytic mechanisms.
References
1. Roos, G., Messens, J., Loverix, S., Wyns, L., Geerlings, P., J. Phys. Chem. B 2004, 108, 17216.
2. Lah, N., Lah, J., Zegers, I., Wyns, L., Messens, J., J. Biol. Chem. 2003, 278, 24673.
3. Ji, G., Garber, E. A., Armes, L. G., Chen, C. M., Fuchs, J. A., Silver, S., Biochemistry 1994, 33, 7294.
4. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
5. Jacobs, D. M., Messens, J., Wechselberger, R. W., Brosens, E., Willem, R., Wyns, L., Martins, J. C., J. Biomol. NMR 2001, 20, 95.
6. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., Lipman, D. J., Nucleic Acids Res. 1997, 25, 3389.
7. Bairoch, A., Apweiler, R., Nucleic Acids Res. 1999, 27, 49.
8. Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., Bourne, P. E., Nucleic Acids Res. 2000, 28, 235.
9. Bennett, M. S., Guan, Z., Laurberg, M., Su, X. D., Proc. Natl. Acad. Sci. USA 2001, 98, 13577.
10. Messens, J., Martins, J. C., Van Belle, K., Brosens, E., Desmyter, A., De Gieter, M., Wieruszeski, J. M., Willem, R., Wyns, L., Zegers, I., Proc. Natl. Acad. Sci. USA 2002, 99, 8506.
11. Guo, X., Li, Y., Peng, K., Hu, Y., Li, C., Xia, B., Jin, C., J. Biol. Chem. 2005, 280, 39601.
12. Boys, S. F., Bernardi, F., Mol. Phys. 1970, 19, 553.
13. Roos, G., Loverix, S., Geerlings, P., J. Phys. Chem. B 2006, 110, 557.
14. Frisch, M. J., Trucks, G. W., Schlegel, H. B., Scuseria, G. E., Robb, M. A., Cheeseman, J. R., Montgomery, J. A., Jr., Vreven, T., Kudin, K. N., Burant, J. C., Millam, J. M., Iyengar, S. S., Tomasi, J., Barone, V., Mennucci, B., Cossi, M., Scalmani, G., Rega, N., Petersson, G. A., Nakatsuji, H., Hada, M., Ehara, M., Toyota, K., Fukuda, R., Hasegawa, J., Ishida, M., Nakajima, T., Honda, Y., Kitao, O., Nakai, H., Klene, M., Li, X., Knox, J. E., Hratchian, H. P., Cross, J. B., Adamo, C., Jaramillo, J., Gomperts, R., Stratmann, R. E., Yazyev, O., Austin, A. J., Cammi, R., Pomelli, C., Ochterski, J. W., Ayala, P. Y., Morokuma, K., Voth, G. A., Salvador, P., Dannenberg, J. J., Zakrzewski, V. G., Dapprich, S., Daniels, A. D., Strain, M. C., Farkas, O., Malick, D. K., Rabuck, A. D., Raghavachari, K., Foresman, J. B., Ortiz, J. V., Cui, Q., Baboul, A. G., Clifford, S., Cioslowski, J., Stefanov, B. B., Liu, G., Liashenko, A., Piskorz, P., Komaromi, I., Fox, D. J., Keith, T., Al-Laham, M. A., Peng, C. Y., Nanayakkara, A., Challacombe, M., Gill, P. M. W., Johnson, B., Chen, W., Wong, M. W., Gonzalez, C., Pople, J. A., Gaussian 03, Revision A.1, Gaussian, Inc., Pittsburgh, PA, 2003.
15. Roos, G., Buts, L., Van Belle, K., Brosens, E., Geerlings, P., Loris, R., Wyns, L., Messens, J., J. Mol. Biol. 2006, 360, 826.
16. Horovitz, A., Fold. Des. 1996, 1, R121.
17. Krissinel, E., Henrick, K., Acta Crystallogr. D 2004, 60, 2256.
18. Zegers, I., Martins, J. C., Willem, R., Wyns, L., Messens, J., Nat. Struct. Biol. 2001, 8, 843.
19. Collaborative Computational Project, Number 4, Acta Crystallogr. D 1994, 50, 760.
20. Nayal, M., Di Cera, E., J. Mol. Biol. 1996, 256, 228.
21. Guan, Z., Hederstedt, L., Li, J., Su, X. D., Acta Crystallogr. D 2001, 57, 1718.
22. DeMel, S., Shi, J., Martin, P., Rosen, B. P., Edwards, B. F., Protein Sci. 2004, 13, 2330.
23. Zhang, Z. Y., Palfey, B. A., Wu, L., Zhao, Y., Biochemistry 1995, 34, 16389.
24. Wang, S., Tabernero, L., Zhang, M., Harms, E., Van Etten, R. L., Stauffacher, C. V., Biochemistry 2000, 39, 1903.
25. Su, X. D., Taddei, N., Stefani, M., Ramponi, G., Nordlund, P., Nature 1994, 370, 575.
26. Reynolds, R. A., Yem, A. W., Wolfe, C. L., Deibel, M. R., Jr., Chidester, C. G., Watenpaugh, K. D., J. Mol. Biol. 1999, 293, 559.
27. Buhrman, G., Parker, B., Sohn, J., Rudolph, J., Mattos, C., Biochemistry 2005, 44, 5307.
28. Fauman, E. B., Cogswell, J. P., Lovejoy, B., Rocque, W. J., Holmes, W., Montana, V. G., Piwnica-Worms, H., Rink, M. J., Saper, M. A., Cell 1998, 93, 617.
29. Shi, J., Mukhopadhyay, R., Rosen, B. P., FEMS Microbiol. Lett. 2003, 227, 295.
30. Harding, M. M., Acta Crystallogr. D 2002, 58, 872.
31. Di Cera, E., J. Biol. Chem. 2006, 281, 1305.
32. Messens, J., Martins, J. C., Zegers, I., Van Belle, K., Brosens, E., Wyns, L., J. Chromatogr. B 2003, 790, 217.
33. Stevens, S. Y., Hu, W., Gladysheva, T., Rosen, B. P., Zuiderweg, E. R., Lee, L., Biochemistry 1999, 38, 10178.
34. Logan, T. M., Zhou, M. M., Nettesheim, D. G., Meadows, R. P., Van Etten, R. L., Fesik, S. W., Biochemistry 1994, 33, 11087.
35. Zhou, M. M., Logan, T. M., Theriault, Y., Van Etten, R. L., Fesik, S. W., Biochemistry 1994, 33, 5221.
36. Gustafson, C. L., Stauffacher, C. V., Hallenga, K., Van Etten, R. L., Protein Sci. 2005, 14, 2515.
37. Laurence, J. S., Hallenga, K., Stauffacher, C. V., J. Biomol. NMR 2004, 29, 417.
38. Silver, S., Keach, D., Proc. Natl. Acad. Sci. USA 1982, 79, 6114.
Chapter IX The conserved active site proline determines the reducing power of S. aureus thioredoxin
Experiments are the only means of knowledge at our disposal. The rest is poetry, imagination.
(Max Planck)
Chapter IX: Why is Trx a reducing catalyst?

Nature uses thioredoxin-like folds in several disulfide bond oxidoreductases. Each of them has a typical active site Cys-X-X-Cys sequence motif, the hallmark of thioredoxin being Trp-Cys-Gly-Pro-Cys. The intriguing role of the highly conserved proline in the ubiquitous reducing agent thioredoxin was studied by site-specific mutagenesis of Staphylococcus aureus thioredoxin (Sa_Trx). We present X-ray structures, redox potentials, pKa values, steady-state kinetic parameters, and thermodynamic stabilities. Replacing the central proline by a threonine or serine introduces no extra hydrogen bonds with the sulfur of the nucleophilic cysteine. The only structural difference is that the immediate chemical surrounding of the nucleophilic cysteine becomes more hydrophilic. The pKa of the nucleophilic cysteine decreases by approximately one pH unit and its redox potential increases by 30 mV. Thioredoxin becomes more oxidizing and the efficiency of catalysing substrate reduction (kcat/KM) decreases 7-fold relative to wild type Sa_Trx. The oxidized form of wild type Sa_Trx is far more stable than the reduced form over the whole temperature range. The driving force to reduce substrate proteins is the relative stability of the oxidized versus the reduced form (Δ(T1/2)ox/red). This driving force is decreased in the Sa_Trx P31T mutant: Δ(T1/2)ox/red drops from 15.5 °C (wild type) to 5.8 °C (P31T mutant). In conclusion, the active site proline in thioredoxin determines the driving potential for substrate reduction.
1. Introduction
After the third reaction step, thioredoxin (Trx) regenerates the reduced form of arsenate reductase (ArsC) for a subsequent catalytic cycle (Fig. 2.3, Chapter II, Step 4)1. Details about the ArsC-Trx interaction can be found in Messens et al.2 In this chapter, we discuss the reducing power of thioredoxin. Thioredoxins reduce disulfide bonds in proteins quickly. For example, thioredoxin from E. coli (Ec_Trx) reduces the disulfide bonds of insulin 10⁴-fold faster than dithiothreitol does3. Thioredoxin has been implicated in many pathways and protects against many different types of damaging stresses4,5. Thioredoxin is a small, 12 kDa protein found in all living cells from archaea to humans. The structures of both the oxidized and reduced form of Ec_Trx were solved by NMR6. In the PDB, the reduced form of thioredoxin is always a solution structure. Many X-ray structures in the oxidized state exist. All thioredoxins have similar three-dimensional structures comprising a central core of five β-strands surrounded by four α-helices. All feature a conserved active-site loop containing two redox-active cysteine residues in the sequence Trp-Cys-Gly-Pro-Cys7. The oxidized form, thioredoxin-S2, contains a disulfide bridge between those two cysteines that is reduced to a dithiol by the NADPH-dependent flavoprotein, thioredoxin reductase. The reduced form, thioredoxin-(SH)2, is a powerful and general protein disulfide bond oxidoreductase, with a redox potential of -268 mV for Sa_Trx2 and -270 mV for Ec_Trx8. For comparison, DsbA, which is the strongest natural oxidant, has a redox potential of -119 mV9. The pKa of the N-terminal cysteine of Ec_Trx (~7) is significantly lower than the pKa of a cysteine in the absence of a structured protein environment (~8.3)10,11,12. This low pKa enables thioredoxin to act as a nucleophile and to attack disulfides in proteins. Thioredoxin catalyses the reduction of many exposed disulfides in proteins13 such as ribonucleotide reductase, insulin, methionine sulfoxide reductase14, and S. aureus pI258 arsenate reductase15.
The SWISSPROT database16 harbours 97 thioredoxin sequences (Fig. 9.1A). Some of these proteins have an overall sequence identity lower than 30% as compared to Sa_Trx, but even then the DF*A*WCGPC active site motif is highly conserved (Fig. 9.1A and B). Besides the active site cysteines, a proline and a tryptophan are highly conserved. This tryptophan is also present in the Trx-like domains of protein disulfide isomerase (PDI)17 and of phosphoinositide-specific phospholipase C18. In two other members of the thioredoxin superfamily, DsbA and glutaredoxin, the CXXC-preceding tryptophan is absent (Fig. 9.1B).
Here we investigate the role of the active site proline in Sa_Trx. We mutated the active site proline and solved the crystal structures of wild type Sa_Trx and of the Sa_Trx mutants P31T, P31S, and P31TC32S (Table 9.1). The redox potential, the pKa of the active site cysteines, the reducing activity, and the thermodynamic stability of these mutants were determined. Based on these results, we have qualified the presence of a proline in the active site of this ubiquitous reducing agent.
Figure 9.1: Sequence conservation in the thioredoxin family. A. The active site proline and tryptophan are highly conserved in thioredoxin. The multiple sequence alignment was computed with the program BLASTP (BLOSUM62 as matrix) with the complete Sa_Trx sequence as entry in the SWISSPROT database and resulted in 103 sequences, of which 97 are thioredoxins. The amount of conservation was plotted for each residue and the highly conserved residues are indicated in red. B. The active site tryptophan is absent in DsbA and glutaredoxin (Grx). Comparison of the conservation of the active site in Trx, DsbA and Grx within the SWISSPROT database. The size of the amino acid single-letter code is proportional to the occurrence of that amino acid at each position. Graphs were made with CLC Protein Workbench 2.0.2 by Messens, J.
                              wild type                 P31T                      P31S                      P31TC32S
Resolution limits (Å)         15.0 - 2.2 (2.3 - 2.2)    15.0 - 2.2 (2.3 - 2.2)    15.0 - 2.4 (2.5 - 2.4)    15.0 - 2.55 (2.65 - 2.55)
Space group                   P212121                   P212121                   P212121                   P212121
Unit cell (Å)                 a=41.0 b=49.7 c=54.5      a=41.7 b=49.5 c=55.6      a=41.2 b=49.6 c=54.8      a=41.3 b=49.2 c=54.9
Wavelength (Å)                0.812                     0.918                     0.813                     0.934
Measured reflections          43223 (3076)              29248 (2788)              18019 (1806)              14609 (1378)
Unique reflections            6123 (587)                6140 (590)                4694 (464)                3928 (376)
Completeness (%)              97.7 (98.5)               99.4 (99.0)               99.7 (99.6)               99.4 (99.2)
Rmerge                        0.093 (0.449)             0.058 (0.431)             0.071 (0.311)             0.094 (0.454)
I/σ(I)                        14.8 (4.3)                16.89 (4.2)               13.17 (4.3)               9.85 (3.5)
Refinement statistics
Rcryst/Rfree                  0.203/0.249               0.196/0.246               0.193/0.259               0.202/0.255
Ramachandran profile
  Core (%)                    86.7                      92.3                      90.1                      89.9
  Allowed (%)                 12.2                      7.7                       9.9                       11.0
  Disallowed (%)              1.1                       0                         0                         0
r.m.s. deviation bonds (Å)    0.006                     0.006                     0.007                     0.007
r.m.s. deviation angles (°)   1.305                     1.359                     1.324                     1.344
Number of waters              22                        40                        11                        0
PDB entry                     2O7K                      2O85                      2O87                      2O89
Table 9.1: Data collection and refinement statistics of Sa_Trx. Values between parentheses correspond to the highest resolution shell. $R_{merge} = \sum_{hkl}\sum_{i}\left|I_i(hkl) - \langle I(hkl)\rangle\right| / \sum_{hkl}\sum_{i} I_i(hkl)$
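As an illustration of the Rmerge statistic listed in Table 9.1, the sketch below assumes the conventional definition: the summed absolute deviations of repeated intensity measurements from their mean, divided by the summed intensities. The function name and the toy intensities are hypothetical; data processing programs compute this statistic internally.

```python
def r_merge(observations):
    """R_merge for a mapping {hkl: [I_1, I_2, ...]} of repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        # Sum of absolute deviations from the mean intensity of this reflection
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

# Hypothetical intensities for two reflections, for illustration only
toy_data = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 1): [50.0, 50.0]}
print(f"R_merge = {r_merge(toy_data):.3f}")
```

A reflection measured with perfect agreement contributes nothing to the numerator, which is why well-measured data sets give low overall values.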
2. Material and Methods

2.1 Expression and purification19,20
Wild type thioredoxin was transformed and expressed in E. coli strain BL21 (DE3). All Trx mutants were transformed and expressed in E. coli strain BL21 (AI). Cells were grown for 4 h at 37 °C in Terrific broth (TB) with ampicillin (100 µg/ml). Induction at a cell density of OD600 = 0.9 was carried out overnight with 1 mM IPTG (wild type) or with 0.2% arabinose (mutants) at 28 °C. Cultures were harvested at a cell density of OD600 = 20 and suspended in 20 mM Tris/HCl (pH 7.9), 5 mM imidazole, 1 M NaCl, 1 mM DTT, 0.1 mg/ml AEBSF and 1 µg/ml leupeptin (argon flushed). Following French press disruption, 50 µg/ml DNase I (EC 3.1.21.1; Sigma, St. Louis, MO) and 20 mg MgCl2 were added to the lysate. After 30 min at room temperature, cell debris was removed by centrifugation for 30 min at 12,000 × g at 4 °C, fresh 1 mM DTT was added and the lysate was filtered. The lysate was purified at 10 ml/min on a 3 ml Ni2+-NTA Superflow (Qiagen, Valencia, Calif.) immobilized-metal-affinity column equilibrated with 20 mM Tris/HCl (pH 7.9), 1 M NaCl, 1 mM DTT and eluted with a linear gradient to 1 M imidazole in the same buffer. The proteins under the main elution peak were concentrated, fresh 1 mM DTT was added and the sample was further purified on a Superdex75 PG (16/90) gel filtration column (APB) in 20 mM Tris/HCl (pH 8.0), 150 mM NaCl, 1 mM DTT. Under these conditions a final yield of approximately 11 mg/l culture was obtained. Purity was checked on SDS-PAGE and Trx was stored frozen. All buffers were argon flushed and runs were performed on an FPLC (APB) at room temperature. Prior to crystallization the N-terminal His6 tag was removed by adding thrombin protease (5 units/mg recombinant protein, Calbiochem) to the protein. After incubation for 24 h at 26 °C, thrombin was removed on a benzamidine Sepharose column (GE Healthcare, Uppsala, Sweden). To remove residual His-tagged proteins, the sample was applied to a Ni2+-Sepharose column (GE Healthcare, Uppsala, Sweden). The flow-through fractions were pooled, dialyzed against 20 mM Tris/HCl pH 8.0, 100 mM KCl, 1 mM DTT and concentrated to 15 mg/ml.
2.2 Crystallization20
Crystallization conditions for wild type Sa_Trx and for the P31S mutant were screened using a series of commercially available sparse matrix screens: Hampton Research Crystal Screen, Crystal Screen II, Natrix screen, PEG-ion screen and malonate grid screen. All crystallization experiments were performed at 20 °C by the hanging-drop vapour-diffusion method in 24-well plates. In each trial, a hanging drop of 1.0 µl of protein solution (15 mg/ml) mixed with 1.0 µl of precipitant solution was equilibrated against a reservoir containing 500 µl of precipitant solution. The initial crystallization conditions for the P31T mutant were screened by the sparse-matrix method using the Hampton Screen kit.
2.3 Crystal structure determination

The structures were determined by molecular replacement using AMORE21 with Alicyclobacillus acidocaldarius thioredoxin (40% sequence identity) as starting model (PDB entry 1NW2). Refinement against structure factor amplitudes was done using the maximum likelihood function of CNS 1.0 (ref. 22). Bulk solvent and anisotropic B-factor corrections were used throughout the refinement. After an initial rigid body refinement followed by a slow cool stage, electron density maps were generated and the model was rebuilt manually with the graphics program TURBO23. From then on, rounds of refinement of atomic positions and individual B-factors were alternated with model building until no further improvement could be obtained. The final coordinates of the wild type structure were then used as starting model for refinement against the data of the mutant proteins using the same protocol as described above. The quality of the structures was checked with PROCHECK24. All coordinates and structure factor data have been deposited in the Protein Data Bank.
2.4 Determination of the pKa of the nucleophilic cysteine

The thiolate ion has a higher absorption at 240 nm than the un-ionised thiol group, allowing the determination of the thiol pKa by monitoring UV absorption during pH titration25. The pH of a Sa_Trx solution (0.5 mg/ml) was buffered with a poly-buffer solution containing 10 mM sodium citrate, borate and phosphate. Sa_Trx was first reduced with 100 mM DTT for 15 min at room temperature. As a reference, oxidized Sa_Trx was incubated for 30 min at room temperature with 40 mM diamide. Excess DTT and diamide were removed on a Superdex75 HR column (GE Healthcare, Uppsala, Sweden) equilibrated with poly-buffer solution at pH 8.5 (for wild type, P31S, P31T, and P31TC32S Sa_Trx) or 9.5 (for C32A Sa_Trx). The redox state of Sa_Trx was checked on a Vydac C18 column. For titration experiments, 100 mM HCl was added stepwise to the Sa_Trx solution in portions of 20 to 50 µl. The absorbance at 280 nm and 240 nm was recorded on a Cary 100 Bio UV-visible spectrophotometer (Varian, Palo Alto, CA). As the absorbance of Sa_Trx at 280 nm proved to be pH independent, (A240red/A280red)/(A240ox/A280ox) was used as a measure of the fraction of the Cys29 thiolate.
The pH-dependent absorption was fitted according to the Henderson-Hasselbalch equation:

$$A_{exp} = A_{SH} + \frac{A_{S^-} - A_{SH}}{1 + 10^{(pK_a - pH)}} \tag{9.1}$$

in which $A_{exp}$ is the experimentally determined $A_{240}/A_{280}$ value, $A_{SH}$ is the $A_{240}/A_{280}$ value for the protonated form and $A_{S^-}$ is the $A_{240}/A_{280}$ value for the deprotonated form.
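A fit of equation 9.1 can be sketched as follows. This is a minimal illustration, not the fitting routine used for the titrations in this chapter: it assumes a plain grid scan over pKa, exploits the fact that for a fixed pKa the two plateau values follow from linear least squares, and uses synthetic noise-free data generated from assumed parameters. All function and variable names are hypothetical.

```python
def henderson_hasselbalch(ph, a_sh, a_s, pka):
    """Equation 9.1: signal of a titrating thiol at a given pH."""
    return a_sh + (a_s - a_sh) / (1.0 + 10.0 ** (pka - ph))

def fit_pka(ph_values, signals, pka_min=4.0, pka_max=11.0, step=0.001):
    """Scan pKa on a grid; for each pKa the plateaus follow from linear least squares."""
    best = None
    n = len(ph_values)
    mean_y = sum(signals) / n
    steps = int(round((pka_max - pka_min) / step))
    for k in range(steps + 1):
        pka = pka_min + k * step
        # Thiolate fraction at each pH for this trial pKa
        x = [1.0 / (1.0 + 10.0 ** (pka - ph)) for ph in ph_values]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue
        slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signals)) / sxx
        a_sh = mean_y - slope * mean_x
        residual = sum((a_sh + slope * xi - yi) ** 2 for xi, yi in zip(x, signals))
        if best is None or residual < best[0]:
            best = (residual, pka, a_sh, a_sh + slope)
    _, pka, a_sh, a_s = best
    return pka, a_sh, a_s

# Synthetic, noise-free titration generated from assumed parameters (not thesis data)
ph_grid = [5.0 + 0.25 * i for i in range(21)]  # pH 5.0 to 10.0
data = [henderson_hasselbalch(ph, 1.00, 1.35, 7.1) for ph in ph_grid]
pka_fit, a_sh_fit, a_s_fit = fit_pka(ph_grid, data)
print(f"fitted pKa = {pka_fit:.2f}")
```

With real, noisy data a non-linear least squares routine would normally replace the grid scan, but the model function stays the same.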
2.5 Fluorescence spectroscopy

The unfolding of wild type and mutant Sa_Trx was induced in 0-7.9 M urea (high purity enzyme grade, Rose chemicals, London, UK). Fluorescence emission spectra were recorded on a LS 55 luminescence spectrometer (Perkin Elmer, Wellesley) equipped with a thermally controlled cell holder and a cuvette with a 10 mm path length. Urea-induced unfolding was measured using the changes in intrinsic emission fluorescence between 300 and 400 nm upon excitation at 275 nm. The measurements were performed at different temperatures, varying from 10 °C to 55 °C, in 50 mM Na phosphate, pH 7.8. Each sample contained 2.8-5 µM of protein and reduced samples additionally contained 2.5 mM DTT. Samples were incubated for at least four hours before the measurements in order to achieve equilibrium. The reversibility of the unfolding transitions was checked by diluting the samples from 7.9 M urea to pre-transition urea concentrations and re-scanning the emission spectrum. All urea-induced unfolding curves were analyzed by fitting the two-state unfolding model N ⇌ D (equilibrium constant K) to the experimental data.
2.6 Analysis of urea-induced unfolding data

All the data were analyzed assuming a two-state process:

$$N \rightleftharpoons D \quad \text{with} \quad K = \frac{\alpha}{1-\alpha} = e^{-\Delta G_T/RT} \tag{9.2}$$

where $K$ is the equilibrium denaturation constant, $R$ is the universal gas constant and $\alpha$ is the fraction of protein in the denatured state, which depends on the temperature or the denaturant concentration. According to this model, one can express an average of a physical property, $F$ (in our case, $F$ is the partial molar enthalpy of the protein or the fluorescence intensity (FL)), in terms of the corresponding contributions $F_N$ and $F_D$, which characterize the native (N) and denatured (D) states, respectively26:

$$F = F_N + (F_D - F_N)\,\alpha \tag{9.3}$$

For the transitions followed by fluorescence, $F_N$ and $F_D$ are assumed to be linear functions of the concentration of urea. The conformational stability of a protein that unfolds in a two-state fashion, expressed in terms of the corresponding standard free energy change, $\Delta G_T$, can be obtained by applying the Gibbs-Helmholtz equation:

$$\Delta G_T = T\left[\Delta H_{T_{1/2}}\left(\frac{1}{T} - \frac{1}{T_{1/2}}\right) + \Delta C_p\left(1 - \frac{T_{1/2}}{T} - \ln\frac{T}{T_{1/2}}\right)\right] \tag{9.4}$$

In this expression $T_{1/2}$ is the melting temperature at which $\alpha = 0.5$, $\Delta H_{T_{1/2}}$ is the standard enthalpy of denaturation at $T_{1/2}$, and $\Delta C_p$ is the difference in heat capacity between the folded and unfolded state, assumed to be temperature independent. The urea-induced unfolding profiles at a given temperature were described by equations 9.2 and 9.3 combined with the empirical relation:

$$\Delta G_T = \Delta G_{H_2O,T} - m[\text{urea}] \tag{9.5}$$

Here, $\Delta G_{H_2O,T}$ is the standard Gibbs free energy of unfolding in the absence of denaturant and $m$ is a proportionality coefficient. These values were then fitted to eq. 9.4 using the Levenberg-Marquardt non-linear $\chi^2$ regression procedure27 to extract the thermodynamic parameters $\Delta H_{T_{1/2}}$, $\Delta C_p$ and $T_{1/2}$ in the absence of denaturant.
Details on the site-directed mutagenesis (Elke Brosens), Differential Scanning Calorimetry (DSC) (Abel Garcia-Pino), determination of the redox potential and kinetic assays (Karolien Van Belle), and mass spectrometry (Guy Vandenbussche) can be found in Roos et al.28 The results of this experimental work support the other parts and are consequently included in the results and discussion section (sections 3 and 4).
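The two-state model of section 2.6 can be written out as a short numerical sketch. It assumes the usual reading of equations 9.2 to 9.5, with the fraction denatured obtained from the equilibrium constant, baselines linear in urea, and the Gibbs-Helmholtz temperature dependence. The parameter values, the units (kJ/mol) and all names are hypothetical and serve only to show how the pieces fit together; they are not results from this chapter.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

def delta_g_temperature(t, dh_half, dcp, t_half):
    """Equation 9.4: unfolding free energy at temperature t (K) without denaturant."""
    return t * (dh_half * (1.0 / t - 1.0 / t_half)
                + dcp * (1.0 - t_half / t - math.log(t / t_half)))

def fraction_denatured(delta_g, t):
    """Equation 9.2 solved for the denatured fraction: K = exp(-dG / RT)."""
    k = math.exp(-delta_g / (R * t))
    return k / (1.0 + k)

def signal(urea, t, dg_water, m, f_native, f_denatured):
    """Equations 9.3 and 9.5 with baselines given as (intercept, slope) in urea."""
    alpha = fraction_denatured(dg_water - m * urea, t)
    f_n = f_native[0] + f_native[1] * urea
    f_d = f_denatured[0] + f_denatured[1] * urea
    return f_n + (f_d - f_n) * alpha

# Hypothetical parameters, chosen only to illustrate the shape of the model
T_HALF, DH_HALF, DCP, M = 340.0, 400.0, 6.0, 5.0  # K, kJ/mol, kJ/(mol K), kJ/(mol M)
temperature = 298.15
dg_water = delta_g_temperature(temperature, DH_HALF, DCP, T_HALF)
midpoint_urea = dg_water / M  # urea concentration at which half the protein is denatured
print(f"dG(H2O) = {dg_water:.1f} kJ/mol, midpoint = {midpoint_urea:.2f} M urea")
```

In a fit, the free energy in water and the m value are adjusted per temperature against the measured fluorescence, after which the resulting free energies are fitted against temperature.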
3. Results
3.1 The X-ray structures of Sa_Trx
Sa_Trx in its oxidized state displays the expected thioredoxin fold consisting of a central core of five β-strands enclosed by four α-helices (Fig. 9.2). It is structurally most similar to the corresponding E. coli enzyme (PDB entry 2TRX), with an r.m.s.d. of 0.9 Å for 95 superimposable Cα atoms29. The active site containing Trp28-Cys29-Gly30-Pro31-Cys32 is located on the surface of the protein. In the oxidized form, the disulfide bond between residues 29 and 32 is oriented towards the interior of the molecule. This active site disulfide is located at the amino end of the α2-helix in a short segment that is separated from the rest of the helix by a kink due to the presence of proline 37. Behind the disulfide, a conserved Pro73 is located, which is typically in cis conformation (Fig. 9.2). Further, the conserved positive charge, often present in the form of an arginine in thioredoxins from other species, is a lysine in Sa_Trx (Lys33) (Fig. 9.1A). P31S and P31T are structurally conservative mutations (r.m.s.d. of 0.12 Å and 0.15 Å for P31T and P31S, respectively, for all 104 Cα atoms superimposed on wild type Sa_Trx). This is equivalent to what was observed for the proline to serine mutation in the E. coli enzyme30,31. In Sa_Trx, the Cα and the Cβ atoms of Pro31, Ser31 and Thr31 in wild type and the respective mutants are in exactly the same position. The hydroxyl groups of Ser31 (Oγ) and Thr31 (Oγ1) point in opposite directions in their respective structures. Only the Oγ of Ser31 is in contact with the C of Pro73 (2.9 Å); Thr31 has no contacting neighbours. As it was not possible to reduce Sa_Trx in the crystal or to crystallize its reduced state, we determined the crystal structure of the double mutant P31T/C32S as a mimic of reduced Sa_Trx. Yet again, the C32S mutation is a conservative mutation that does not perturb the structure globally or locally. Sa_Trx P31TC32S also contains the essential features observed in reduced Ec_Trx (PDB entry 1XOB)6 (r.m.s.d. of 0.85 Å for 97 equivalent Cα atoms). The side chains of Cys29 and Ser32 are oriented towards the interior of the protein, suggesting the need for a conformational change during its attack on a substrate protein. The orientation of the SH group of Cys29 in the Sa_Trx P31TC32S mutant and that of the SH group of the nucleophilic cysteine in Ec_Trx are identical. The side chain sulfur atom of Cys29 is tilted away from Ser32 to accommodate the increase in the S-S distance that occurs upon reduction of the wild-type protein (Fig. 9.3).
Figure 9.2: Sa_Trx has a typical thioredoxin-fold. Ribbon diagram of the overall structure of Sa_Trx visualized from two different positions. The residues responsible for proper kinks (yellow), structure elements that influence the pKa of the nucleophilic Cys29 (red), the conserved hydrophobic area (purple) and the CGPC active site motif (green atom type) are shown. The figure was generated using MacPyMol (Delano Scientific LLC 2005) by Messens, J.
No additional hydrogen bond is introduced between the backbone amides (Thr31, Ser32) and the nucleophilic cysteine 29 (Fig. 9.3B). An important difference with wild type Sa_Trx is that by replacing the proline by a threonine or serine, the environment of the nucleophilic Cys29 becomes more hydrophilic.
Figure 9.3: The active site of A. oxidized wild type and B. reduced P31TC32S Sa_Trx. The distances (Å) from Sγ of Cys29 and Nε1 of Trp28 to neighbouring atoms are indicated. The figure was generated using MacPyMol (Delano Scientific LLC 2005) by Messens, J.
3.2 The pKa of the cysteines

Active site residues determine the pKa of the nucleophilic cysteine in Sa_Trx (Table 9.2, Fig. 9.4A). The pKa values of the nucleophilic Cys29 and the buried Cys32 were determined by pH titration at 240 nm, since the thiolate ion has a higher absorption at this wavelength than the thiol group25. Compared to wild type Sa_Trx (pKa ~7.1), the pKa of Cys29 decreases to 6.0 and 6.2 for the P31T and P31S mutants, respectively.
Sa_Trx         P31T    P31S    P31TC32S   Wild type6   C32A    error
pKa Cys29      6.0     6.2     6.4        7.1          7.7     0.1
E (mV)         -236    -244    n.d.*      -268         n.d.*   1
Table 9.2: pKa and redox potential of Sa_Trx. n.d.*: not determined
The influence of the C-terminal Cys32 (measured pKa ~ 9) of the WCGPC motif on the pKa of the nucleophilic Cys29 was measured by replacing Cys32 with an alanine and a serine. Sa_Trx C32A and P31TC32S show an increase of the pKa of 0.6 and 0.4 units relative to wild type and Sa_Trx P31T, respectively. As such, the C-terminal cysteine contributes to keeping the pKa of the nucleophilic cysteine low.
Figure 9.4: Titration curves for wild type and the Sa_Trx mutants. A. The pKa of the nucleophilic cysteine. The specific absorption of the thiolate ion at 240 nm as a function of pH is shown. All data were fitted with the Henderson-Hasselbalch equation. B. Redox equilibrium with glutathione. The percentage of thioredoxin active site in the reduced form, calculated from the reversed-phase chromatographic profile after peak integration, is shown as a function of redox potential.
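To make the consequence of the pKa shifts concrete, the following minimal sketch evaluates the Henderson-Hasselbalch expression for the fraction of Cys29 present as thiolate at a chosen pH. The pKa values are those of Table 9.2; the function name and the choice of pH 7.0 are illustrative assumptions, and the actual fits of the 240 nm titration curves additionally involve the absorption baselines of the thiol and thiolate forms, which are not modelled here.

```python
def thiolate_fraction(ph: float, pka: float) -> float:
    """Fraction of a thiol present as thiolate at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values of the nucleophilic Cys29 as listed in Table 9.2
PKA_CYS29 = {
    "wild type": 7.1,
    "P31T": 6.0,
    "P31S": 6.2,
    "P31TC32S": 6.4,
    "C32A": 7.7,
}

if __name__ == "__main__":
    ph = 7.0  # illustrative pH, not a condition taken from the experiments
    for variant, pka in PKA_CYS29.items():
        print(f"{variant:10s} pKa {pka:.1f}  thiolate fraction at pH {ph}: "
              f"{thiolate_fraction(ph, pka):.2f}")
```

At the same pH, a lower pKa translates into a larger thiolate population, which is the sense in which the P31T and P31S mutations stabilize the thiolate of Cys29.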
3.3 The redox potential of the proline and tryptophan mutants

By replacing the active site proline 31 of wild type Sa_Trx by a threonine or a serine, Sa_Trx P31T/S becomes a more oxidizing enzyme: the redox potential increases (Table 9.2, Fig. 9.4B). This observation is in line with a decrease of the pKa of the nucleophilic Cys29 (vide supra). The relative oxidizing power of P31S and P31T Sa_Trx was determined using glutathione as a standard. The glutathione redox scale compares the ability of proteins to transfer their disulfides to reduced glutathione32. The differential elution on reversed-phase chromatography of the oxidized and reduced forms of P31S and P31T Sa_Trx as a function of [GSH]²/[GSSG] (i.e. Eh') allows the determination of the midpoint redox potential of the active-site disulfide (Cys29-Cys32). The molecular mass of the proteins associated with the two elution peaks was determined by mass spectrometry and showed a difference of 2 Da, consistent with the formation of a disulfide bridge between Cys32 and the nucleophilic Cys29.
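The relation between the glutathione ratio, the ambient potential and the fraction of reduced thioredoxin can be sketched with the Nernst equation for a two-electron couple. In the sketch below, the midpoint potentials are those of Table 9.2; the standard potential of the glutathione couple (-240 mV), the temperature of 298.15 K and all function names are assumptions made for illustration and are not taken from the experimental protocol.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def glutathione_potential(gsh_molar, gssg_molar, e0_volt=-0.240, temp_k=298.15):
    """Ambient potential Eh = E0' - (RT/2F) ln([GSH]^2/[GSSG]).

    e0_volt is an ASSUMED standard potential of the GSH/GSSG couple.
    """
    return e0_volt - (R * temp_k) / (2.0 * F) * math.log(gsh_molar ** 2 / gssg_molar)


def fraction_reduced(eh_volt, e_mid_volt, temp_k=298.15, n=2):
    """Fraction of an n-electron redox couple in the reduced form at potential Eh."""
    return 1.0 / (1.0 + math.exp(n * F * (eh_volt - e_mid_volt) / (R * temp_k)))


# Midpoint potentials (V) from Table 9.2
E_MID = {"wild type": -0.268, "P31T": -0.236, "P31S": -0.244}

if __name__ == "__main__":
    eh = glutathione_potential(gsh_molar=0.010, gssg_molar=0.001)  # illustrative buffer
    for variant, e_mid in E_MID.items():
        print(f"{variant:10s} fraction reduced at Eh = {eh * 1000:.0f} mV: "
              f"{fraction_reduced(eh, e_mid):.2f}")
```

At a given ambient potential, the variant with the higher (less negative) midpoint potential is reduced to a larger extent, which is what is meant by a more oxidizing enzyme.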
3.4 Disulfide reducing activity of Sa_Trx mutants

Replacing the active site proline by a threonine or a serine in Sa_Trx decreases the efficiency of catalysis by a factor of about seven (Table 9.3).
Sa_Trx        KM (µM)   kcat (min⁻¹)   kcat/KM (M⁻¹ s⁻¹)
P31T          32        17             0.9 × 10⁴
P31S          142       65             0.8 × 10⁴
Wild type6    33        114            5.8 × 10⁴
Table 9.3: Kinetic parameters of thioredoxin and its active site proline mutants
To determine the Michaelis-Menten steady-state kinetic parameters of the active site proline mutants, the reaction was followed indirectly by observing the rate of NADP+ formation2. The coupling enzyme in the cascade towards NADPH is thioredoxin reductase, which delivers reducing equivalents to the oxidized thioredoxin. Previously, we have determined the conditions that ensure that thioredoxin reductase and NADPH are not rate limiting2. With oxidized pI258 arsenate reductase as the substrate, the Sa_Trx P31T and P31S mutants display a specificity constant (kcat/KM) of 0.8-0.9 × 10⁴ M⁻¹ s⁻¹ (Table 9.3). As oxidized thioredoxin itself has to be reduced by thioredoxin reductase in the cascade reaction, we also analysed the oxidized Sa_Trx mutants as substrates. In the presence of an excess of oxidized P31T and P31S Sa_Trx (100 µM), we found that they are about twice as good a substrate for S. aureus thioredoxin reductase as oxidized wild type Sa_Trx.
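As a consistency check on Table 9.3, the specificity constants can be recomputed from the tabulated KM and kcat values. The sketch below assumes that KM is expressed in µM and kcat in min⁻¹ (the unit prefix of KM is inferred from the fact that it reproduces the tabulated kcat/KM values); the dictionary and function names are illustrative.

```python
# Kinetic parameters from Table 9.3 (KM assumed to be in micromolar, kcat in min^-1)
KINETICS = {
    "P31T": {"km_uM": 32.0, "kcat_per_min": 17.0},
    "P31S": {"km_uM": 142.0, "kcat_per_min": 65.0},
    "wild type": {"km_uM": 33.0, "kcat_per_min": 114.0},
}


def specificity_constant(km_uM: float, kcat_per_min: float) -> float:
    """Return kcat/KM in M^-1 s^-1 after converting min^-1 to s^-1 and µM to M."""
    kcat_per_s = kcat_per_min / 60.0
    km_molar = km_uM * 1e-6
    return kcat_per_s / km_molar


if __name__ == "__main__":
    wild_type = specificity_constant(**KINETICS["wild type"])
    for variant, params in KINETICS.items():
        value = specificity_constant(**params)
        print(f"{variant:10s} kcat/KM = {value:.2e} M^-1 s^-1 "
              f"(wild type / variant = {wild_type / value:.1f})")
```

The recomputed values round to the tabulated 0.9 × 10⁴, 0.8 × 10⁴ and 5.8 × 10⁴ M⁻¹ s⁻¹, and the ratios to wild type show the roughly sevenfold loss in catalytic efficiency.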
3.5 Thermal stability

The presence of a proline in the active site of wild type Sa_Trx keeps the difference in stability between the oxidized and reduced form high (Fig. 9.5A). DSC measurements are scan-rate independent for oxidized as well as reduced wild type and P31T Sa_Trx (Fig. 9.5B). All transitions show at least 90% reversibility. Furthermore, the ratio of the calorimetric enthalpy (ΔHc) to the van't Hoff enthalpy (ΔHvh), ΔHc/ΔHvh, is essentially one in all cases (Table 9.4).
Figure 9.5: Thermodynamic stability of Sa_Trx. A. Thermodynamic stabilities as a function of temperature are shown for the oxidized (full symbols) and reduced (open symbols) states of wild type (black) and P31T (red). Curves were calculated with the Gibbs-Helmholtz equation (eq. 9.4) based on urea-unfolding data (full lines) and on temperature-induced unfolding (dotted lines) (Table 9.4). The data points shown are calculated from the fluorometric urea-unfolding profiles (Fig. 9.6). The differences between the melting temperatures of the oxidized and reduced forms are indicated (double-headed arrows). B. The thermal denaturation of wild type and P31T Sa_Trx monitored by DSC.
Sa_Trx variant                      wild type              P31T                   Error*
redox state                         oxidized   reduced     oxidized   reduced
T1/2 (°C)                   DSC     73.2       57.6        70.3       64.5        0.4
                            urea    74.5       55.7        69.7       62.9
ΔCp° (kcal/mol/°C)          DSC     1.7        1.9         1.9        1.9         0.1
                            urea    1.5        1.9         1.8        2.1
ΔHvh(T) (kcal/mol)          DSC     77.7       57.2        72.4       63.9        2.3
                            urea    71.6       57.6        70.4       66.3
ΔHc(T) (kcal/mol)           DSC     72.7       59.3        69.5       63.1        3.2
ΔHc/ΔHvh                    DSC     0.93       1.02        0.96       0.99
Δ(T1/2)ox/red (°C)          DSC     15.5                   5.8
                            urea    18.7                   6.8
Table 9.4: Redox state dependent thermodynamic parameters of Sa_Trx obtained from the model analysis of DSC endotherms and the fluorometric urea unfolding profiles. *The relative parameter errors were estimated from repeated experiments by varying the possible base-line positions, and are higher than those derived from fitting the model function to the individual unfolding curves.
Insights into the nature of the structural changes upon Sa_Trx unfolding are provided by the model-dependent ΔHvh and model-independent ΔHc values. If both enthalpies are equal within the experimental error, the observed transition proceeds in a two-state manner, i.e., the concentration of intermediates between the native and the unfolded state is negligibly small. This is the case here (Table 9.4), which suggests a simple two-state equilibrium of a monomeric protein between a folded and an unfolded state. Non-linear fitting of the two-state unfolding model to the data provided the transition temperatures T1/2 and enthalpies for the endotherms (Table 9.4). Removing the proline results in a drop of Δ(T1/2)ox/red from 15.5 °C (wild type) to 5.8 °C (P31T mutant).
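The two-state criterion and the stability gap between the redox states can be recomputed from the DSC entries of Table 9.4. The sketch below is only an illustration: the values are copied from the table as read here, and the data layout and function names are assumptions.

```python
# DSC-derived values as read from Table 9.4:
# (T1/2 in deg C, van't Hoff enthalpy, calorimetric enthalpy; enthalpies in kcal/mol)
DSC = {
    ("wild type", "oxidized"): (73.2, 77.7, 72.7),
    ("wild type", "reduced"): (57.6, 57.2, 59.3),
    ("P31T", "oxidized"): (70.3, 72.4, 69.5),
    ("P31T", "reduced"): (64.5, 63.9, 63.1),
}


def enthalpy_ratio(variant: str, state: str) -> float:
    """Calorimetric over van't Hoff enthalpy; a value near 1 indicates two-state unfolding."""
    _, dh_vh, dh_cal = DSC[(variant, state)]
    return dh_cal / dh_vh


def delta_t_half(variant: str) -> float:
    """Difference in transition temperature between the oxidized and the reduced form."""
    return DSC[(variant, "oxidized")][0] - DSC[(variant, "reduced")][0]


if __name__ == "__main__":
    for (variant, state) in DSC:
        print(f"{variant:10s} {state:9s} dHc/dHvh = {enthalpy_ratio(variant, state):.2f}")
    for variant in ("wild type", "P31T"):
        print(f"{variant:10s} delta T1/2 (ox - red) = {delta_t_half(variant):.1f} deg C")
```

All four ratios stay close to one, and the gap in transition temperature shrinks from roughly 15.5 °C in wild type to 5.8 °C in the P31T mutant, in line with the tabulated values.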
3.6 Urea-induced unfolding

Chemical unfolding of the oxidized and reduced forms of Sa_Trx gives thermodynamic parameters similar to those from temperature unfolding (compare the full and dotted lines in Fig. 9.5A, Table 9.4). The oxidized form is more stable than the reduced form, and the stability difference between the oxidized and reduced form decreases when the active site proline is replaced by a threonine (compare the difference between the red and the black lines in Fig. 9.6).
Figure 9.6: Fraction unfolded of oxidized (filled symbols) and reduced (open symbols) wild type Sa_Trx (black) and Sa_Trx P31T (red) at 25 °C as a function of urea concentration. The fraction unfolded was calculated from equation 9.3 with data from the fluorescence emission spectra recorded for oxidized and reduced Sa_Trx at 356 nm and 358 nm, respectively (Fig. 9.7).
We monitored urea-induced unfolding of oxidized and reduced wild type Sa_Trx and the P31T mutant by measuring the intrinsic fluorescence spectra upon excitation at 275 nm. The residues that contribute to the emission spectra are tyrosine and tryptophan residues. Sa_Trx contains two tryptophan residues, W28 and W25, which are rather exposed to the bulk solvent in the folded state, and two tyrosine residues, Y46 and Y67, of which Y67 is fairly buried. The fluorescence spectra of wild type Sa_Trx and the P31T mutant were similar. As such, we only discuss the results obtained for oxidized and reduced wild type at one single temperature (Fig. 9.7). In the folded state of oxidized wild type Sa_Trx, the tryptophan signal is completely quenched by the presence of the disulfide bridge in the active site33, and the otherwise barely detectable emission maximum of tyrosine at 308 nm becomes visible. In the reduced form (Fig. 9.7), we observe a red shift of the emission maximum of about 5 nm between the folded and unfolded state: the already surface-exposed tryptophans move to an even more polar environment upon unfolding. The difference spectrum between the folded and unfolded states shows a maximum at 356 nm for oxidized and 358 nm for reduced Sa_Trx. For both the oxidized and reduced states of wild type and mutant Sa_Trx, the unfolding was completely reversible, as checked by diluting the sample to pre-transition urea concentrations. A two-state unfolding model consisting of the equilibrium between a folded and an unfolded monomer adequately describes the data.
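For a two-state transition, a fluorescence signal recorded at a fixed wavelength can be converted into a fraction unfolded and from there into an unfolding free energy. The sketch below illustrates this generic conversion; it is not a transcription of equation 9.3, the signal values in the usage example are hypothetical, and the folded and unfolded baselines are taken as constants, whereas in practice they are usually allowed to depend linearly on the urea concentration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal mol^-1 K^-1


def fraction_unfolded(signal: float, folded_signal: float, unfolded_signal: float) -> float:
    """Two-state fraction unfolded from a signal and its folded/unfolded baselines."""
    return (signal - folded_signal) / (unfolded_signal - folded_signal)


def unfolding_free_energy(f_unfolded: float, temp_k: float = 298.15) -> float:
    """dG = -RT ln K with K = fU / (1 - fU); valid only inside the transition (0 < fU < 1)."""
    equilibrium_constant = f_unfolded / (1.0 - f_unfolded)
    return -R_KCAL * temp_k * math.log(equilibrium_constant)


if __name__ == "__main__":
    # Hypothetical fluorescence intensities (arbitrary units) across a urea titration
    folded, unfolded = 100.0, 400.0
    for signal in (130.0, 250.0, 370.0):
        f_u = fraction_unfolded(signal, folded, unfolded)
        print(f"signal {signal:5.1f}  fU = {f_u:.2f}  "
              f"dG = {unfolding_free_energy(f_u):+.2f} kcal/mol")
```

At the midpoint of the transition the fraction unfolded is 0.5 and the free energy of unfolding is zero; extrapolating the free energies within the transition region to zero denaturant gives the stability in water.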
Figure 9.7: Fluorescence emission spectra for urea-induced unfolding of A. oxidized and B. reduced wild type Sa_Trx measured at 30 °C, pH 7.8.
4. Discussion
Each of the conserved prolines (Fig. 9.1) in thioredoxin plays a different role. The conserved proline in the α2 helix (P37 in Sa_Trx) is not essential for its redox function (Fig. 9.1 and 9.3), but stabilizes the protein34. The conserved cis-proline located on the opposite side of the WCGPC active site motif (P73 in Sa_Trx) is important for maintaining the conformation of the active site and the redox potential of thioredoxin (Fig. 9.1 and 9.3)35. Replacing it by an alanine affects the efficiency of catalysis. The conserved active site proline (P31 in Sa_Trx) (Fig. 9.1 and 9.3) is the key residue that determines the reducing power of thioredoxin. Replacing the proline by a histidine in Ec_Trx increases the redox potential by 35 mV relative to wild type and decreases the catalytic activity in disulfide reduction36. However, it was also shown in Ec_Trx that replacing the active site proline by serine has only a small effect on the redox activity30,35. We showed with S. aureus thioredoxin that replacing the central proline of the WCGPC active site motif by a serine or a threonine has a large effect: it results in a decrease of the pKa of the nucleophilic cysteine, an increase in redox potential, a decrease in the efficiency of catalysis, and a decrease in the stability difference between the oxidized and reduced form. The pKa of the nucleophilic cysteine is determined not only by the WCGPC active site motif, but also by the presence of helix 2 and the side chains of Asp23, Lys54, and Pro7337-40 (Fig. 9.2). All these elements are structurally conserved between wild type and mutant Sa_Trx and therefore do not seem responsible for modifying the pKa of the mutants. The only recognizable difference between wild type Sa_Trx and the mutants is a change to a more hydrophilic environment around the sulfur of the nucleophilic Cys29, caused by the introduction of a free backbone amide group at position 31 and the conversion of a hydrophobic to a hydrophilic side chain. 
No clear hydrogen bonds are introduced. As such, in contrast to DsbA and Grx41-43, the local effect of the central residues of the active site on the pKa may not be understood in terms of the number of hydrogen bonds that these residues can provide to stabilize the charge on the nucleophilic thiolate. Although no hydrogen bond interactions were observed in the X-ray structures, it is tempting to speculate that local flexibility might induce hydrogen bond interactions on a short time scale, as for Grx from E. coli42,43. The structure of Sa_Trx reveals a hydrophobic patch close to the accessible nucleophilic cysteine region, consisting of W25, A26 and W28 (Fig. 9.2). Multiple sequence alignment reveals that A26 and W28 are conserved in the thioredoxin family (Fig. 9.1). The indole side chain of Trp28 is turned towards the interior of the protein and covers an important part of the active site surface (Fig. 9.2). This particular positioning of the indole side chain is not due to the crystal packing, since the same position is observed in the solution structures of Ec_Trx (PDB code entries 1XOA, 1XOB)4. The indole ring of Trp28 is in van der Waals contact with the methyl group of the conserved Ala26. There are no direct interactions of Trp28 with the nucleophilic Cys29. Trp28 interacts with Asp58 (W28 Nε1 - D58 Oδ1: 3.3 Å and W28 Nε1 - D58 Oδ2: 2.6 Å) and with Ala26 (Nε1 - C: 3.3 Å) (Fig. 9.3). Although no interaction with the nucleophilic cysteine is observed, the presence of this conserved active site tryptophan (Fig. 9.3) keeps the pKa of this cysteine low and favours the thiolate anion form in thioredoxin h from Chlamydomonas reinhardtii44. The electron-rich indole ring in the vicinity of a thiol might act as a base to help the thiol to deprotonate. In the structure of C. reinhardtii thioredoxin h45, the Trp to Ala mutation results in an increased solvent exposure of the nucleophilic cysteine, with an increased pKa as a consequence. We could confirm these results with a W28A mutant of S. aureus thioredoxin: the pKa of the nucleophilic cysteine increases to a value typical for a free thiol, i.e. 8.3 ± 0.1, and its redox potential decreases to -284 ± 6 mV. Replacing the bulky indole side chain of the tryptophan by a methyl group structurally destabilises Sa_Trx in such a way that it shows a strong tendency to form multimers and shows the characteristics of a partially unfolded population (unpublished results). The correlation between a decreasing redox potential and an increasing pKa is expected from the theory of Szajewski and Whitesides, which predicts the rate constants of disulfide exchange reactions from the pKa values of all thiols involved46. When the redox potential increases, the pKa of the nucleophilic cysteine decreases. Wild type, P31T and P31S Sa_Trx, and even DsbA, follow the predicted correlation, which results in a slope of -42 ± 3 mV/pH unit (r² = 0.99) (Fig. 9.8). The Sa_Trx W28A mutant is an outlier. When Sa_Trx W28A is added to this series, r² drops to 0.94; as such, the effect of the W28A mutation is not translated into a similar redox potential versus pH slope. This is not surprising, given the molten-globule behaviour of this mutant. 
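The correlation between redox potential and pKa can be retraced with an ordinary least-squares fit. In the sketch below, the Sa_Trx points come from Table 9.2 and the W28A point from the text above; the DsbA point (pKa about 3.5, redox potential about -122 mV) is an assumption, entered as an approximate literature value because the chapter only cites Grauschopf et al. for it. The helper function is illustrative and not the fitting routine used for Fig. 9.8.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns slope, intercept and r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# (pKa of the nucleophilic cysteine, redox potential in mV)
POINTS = {
    "wild type": (7.1, -268.0),   # Table 9.2
    "P31T": (6.0, -236.0),        # Table 9.2
    "P31S": (6.2, -244.0),        # Table 9.2
    "DsbA": (3.5, -122.0),        # ASSUMED approximate literature values
}
W28A = (8.3, -284.0)              # values quoted in the text

if __name__ == "__main__":
    xs, ys = zip(*POINTS.values())
    slope, _, r2 = linear_fit(xs, ys)
    print(f"without W28A: slope = {slope:.1f} mV per pH unit, r^2 = {r2:.2f}")
    slope_all, _, r2_all = linear_fit(xs + (W28A[0],), ys + (W28A[1],))
    print(f"with W28A:    slope = {slope_all:.1f} mV per pH unit, r^2 = {r2_all:.2f}")
```

Under these assumptions the four-point fit gives a slope close to -42 mV per pH unit, and including W28A lowers r² to about 0.94, consistent with its description as an outlier.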
What exactly links the stability of the oxidized and reduced forms of Sa_Trx with disulfide bond reduction? Thioredoxin most likely reacts with glutathione according to the following scheme:
Trx(SH)2 + GSSG ⇌ TrxSSG + GSH ⇌ Trx(SS) + 2 GSH
where TrxSSG represents the mixed disulfide formed between glutathione and Sa_Trx. This scheme might be considered as a model for any disulfide-containing substrate. Since the oxidized form of wild type Sa_Trx is far more stable than the reduced form over the whole temperature range (Fig. 9.5A), the equilibrium of the reaction is driven to the right and GSSG is reduced. Moreover, the pKa of the nucleophilic cysteine makes the thiolate unstable at neutral pH and allows thioredoxin to form mixed disulfide bonds with substrate proteins through a thermodynamically driven reaction towards the oxidized form. This predisposes thioredoxin to act as a reductant. For the natural oxidase DsbA the opposite is true41. Here, a low pKa of its nucleophilic cysteine and a higher stability of the reduced form compared to the oxidized one establish DsbA's oxidizing power.
Figure 9.8: The redox potential of the active site cysteines correlates with the pKa of the nucleophilic cysteine. The redox potential decreases by 42 ± 3 mV per pH unit (r² = 0.99). *DsbA value was obtained from Grauschopf et al.41.
In P31T Sa_Trx, a more hydrophilic chemical environment of the nucleophilic cysteine results in a decreased pKa of Cys29 (Table 9.2, Fig. 9.4A) and makes the thiolate more reactive. However, the decreased difference in stability between the oxidized and reduced form compared to wild type (Fig. 9.5A) makes this P31T mutant a weaker reducing agent. This was clearly observed with oxidized arsenate reductase as a substrate: the mutation has no effect on substrate recognition (the KM is unchanged compared to wild type), but the rate of disulfide reduction (kcat) is about 7 times lower than for wild type Sa_Trx (Table 9.3). To our knowledge, this is the first time that the stability of oxidized and reduced thioredoxin has been analyzed with temperature and chemical unfolding over the whole temperature range (Fig. 9.5A). In general, wild type Sa_Trx is less stable than wild type Ec_Trx. Oxidized Ec_Trx has a ΔG(H2O,T) of 9.9 kcal/mol (obtained with GdmHCl-induced unfolding at 25 °C, pH 7.0)47 and a T1/2 of 90 °C48. Oxidized Sa_Trx displays at 25 °C a ΔG(H2O,T) of 4.6 kcal/mol and a T1/2 of 74.5 °C (Fig. 9.5A, Table 9.4). At the same temperature, reduced Ec_Trx has a ΔG(H2O,T) of 5.8 kcal/mol46, while reduced Sa_Trx displays a ΔG(H2O,T) of 2.6 kcal/mol (Fig. 9.5A). Based on the stability differences at 25 °C, these data strongly suggest that Ec_Trx is not only more stable than Sa_Trx, but also a better reducing agent.
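The stabilities quoted for Sa_Trx at 25 °C can be retraced from the urea-derived parameters of Table 9.4 with the Gibbs-Helmholtz relation. The sketch below uses the standard form with a temperature-independent heat capacity change; equation 9.4 is not reproduced in this section, so this form is an assumption, as are the function and variable names.

```python
import math


def gibbs_helmholtz(temp_c: float, t_half_c: float, dh_half: float, dcp: float) -> float:
    """Unfolding free energy (kcal/mol) at temp_c.

    dG(T) = dH(T1/2) * (1 - T/T1/2) - dCp * [(T1/2 - T) + T * ln(T/T1/2)]
    with T in kelvin, dH in kcal/mol and dCp in kcal/mol/K (assumed constant).
    """
    t = temp_c + 273.15
    t_half = t_half_c + 273.15
    return dh_half * (1.0 - t / t_half) - dcp * ((t_half - t) + t * math.log(t / t_half))


# Urea-derived parameters of wild type Sa_Trx as read from Table 9.4:
# (T1/2 in deg C, van't Hoff enthalpy in kcal/mol, heat capacity change in kcal/mol/deg C)
WILD_TYPE_UREA = {
    "oxidized": (74.5, 71.6, 1.5),
    "reduced": (55.7, 57.6, 1.9),
}

if __name__ == "__main__":
    for state, (t_half_c, dh_half, dcp) in WILD_TYPE_UREA.items():
        dg = gibbs_helmholtz(25.0, t_half_c, dh_half, dcp)
        print(f"wild type {state:9s} dG(25 deg C) = {dg:.1f} kcal/mol")
```

With these inputs the relation returns about 4.6 kcal/mol for the oxidized and 2.6 kcal/mol for the reduced form, matching the values quoted above; at T = T1/2 the free energy is zero by construction.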
5. Conclusion
This study reveals two features of thioredoxin that explain its reducing behaviour: the large stabilization of the oxidized form, and the relatively high pKa of its nucleophilic cysteine compared to the oxidase DsbA. We have stressed the importance of a conserved proline in the WCGPC motif in keeping the Δ(T1/2)ox/red value high enough to drive the reduction reaction to completion. Mutation of the conserved proline alters the redox properties of Sa_Trx in a predictable way: stabilisation of the thiolate of the nucleophilic cysteine shifts the equilibrium more towards the reduced form, by which thioredoxin becomes a more oxidizing agent. The idea first suggested by Krause et al.49 and later explored by Mossner et al.47, that a thioredoxin can be converted to an oxidant or reductant by rational means, is thereby extended. As such, the evolutionary choice of a proline in the active site determines the redox properties of thioredoxin as a reducing agent.
References
1. Zegers, I., Martins, J. C., Willem, R., Wyns, L., Messens, J., Nature Struct. Biol. 2001, 8, 843.
2. Messens, J., Van Molle, I., Vanhaesebrouck, P., Limbourg, M., Van Belle, K., Wahni, K., Martins, J. C., Loris, R., Wyns, L., J. Mol. Biol. 2004, 339, 527.
3. Holmgren, A., J. Biol. Chem. 1979, 254, 9627.
4. Holmgren, A., Proc. Natl. Acad. Sci. USA 1976, 73, 2275.
5. Cha, M. K., Kim, H. K., Kim, I. H., J. Biol. Chem. 1995, 270, 28635.
6. Jeng, M. F., Campbell, A. P., Begley, T., Holmgren, A., Case, D. A., Wright, P. E., Dyson, H. J., Structure 1994, 2, 853.
7. Eklund, H., Gleason, F. K., Holmgren, A., Proteins 1991, 11, 13.
8. Aslund, F., Berndt, K. D., Holmgren, A., J. Biol. Chem. 1997, 272, 30780.
9. Zapun, A., Bardwell, J. C., Creighton, T. E., Biochemistry 1993, 32, 5083.
10. Kallis, G. B., Holmgren, A., J. Biol. Chem. 1980, 255, 10261.
11. Dyson, H. J., Tennant, L. L., Holmgren, A., Biochemistry 1991, 30, 4262.
12. Chivers, P. T., Prehoda, K. E., Volkman, B. F., Kim, B. M., Markley, J. L., Raines, R. T., Biochemistry 1997, 36, 14985.
13. Holmgren, A., Methods Enzymol. 1984, 107, 295.
14. Holmgren, A., Annu. Rev. Biochem. 1985, 54, 237.
15. Messens, J., Hayburn, G., Desmyter, A., Laus, G., Wyns, L., Biochemistry 1999, 38, 16857.
16. Bairoch, A., Apweiler, R., Nucleic Acids Res. 1999, 27, 49.
17. Edman, J. C., Ellis, L., Blacher, R. W., Roth, R. A., Rutter, W. J., Nature 1985, 317, 267.
18. Bennett, C. F., Balcarek, J. M., Varrichio, A., Crooke, S. T., Nature 1988, 334, 268.
19. Messens, J., Martins, J. C., Brosens, E., Van Belle, K., Jacobs, D. M., Willem, R., Wyns, L., J. Biol. Inorg. Chem. 2002, 7, 146.
20. Roos, G., Brosens, E., Wahni, K., Desmyter, A., Spinelli, S., Wyns, L., Messens, J., Loris, R., Acta Crystallogr. F 2006, 62, 1255.
21. Navaza, J., Acta Crystallogr. A 1994, 50, 157.
22. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., Warren, G. L., Acta Crystallogr. D 1998, 54 (Pt 5), 905.
23. Roussel, A., Cambillau, C., TURBO-FRODO in Silicon Graphics Geometry Partner Directory, pp. 71-78, Silicon Graphics, Mountain View, California, 1989.
24. Laskowski, R. A., MacArthur, M. W., Moss, D. S., Thornton, J. M., J. Appl. Cryst. 1993, 26, 283.
25. Nelson, J. W., Creighton, T. E., Biochemistry 1994, 33, 5974.
26. Pace, C. N., Methods Enzymol. 1986, 131, 266.
27. Press, W. H., Vetterling, W. T., In Numerical Recipes in C: The Art of Scientific Computing. 2nd edit, 2, Cambridge University Press, Cambridge, 1992.
28. Roos, G., Garcia Pino, A., Van Belle, K., Brosens, E., Wahni, K., Vandenbussche, G., Wyns, L., Loris, R., Messens, J., J. Mol. Biol. 2007, accepted.
29. Katti, S. K., LeMaster, D. M., Eklund, H., J. Mol. Biol. 1990, 212, 167.
30. Gleason, F. K., Lim, C. J., Gerami-Nejad, M., Fuchs, J. A., Biochemistry 1990, 29, 3701.
31. Kelley, R. F., Richards, F. M., Biochemistry 1987, 26, 6765.
32. Gilbert, H. F., Adv. Enzymol. Relat. Areas Mol. Biol. 1990, 63, 69.
33. Holmgren, A., J. Biol. Chem. 1972, 247, 1992.
34. de Lamotte-Guery, F., Pruvost, C., Minard, P., Delsuc, M. A., Miginiac-Maslow, M., Schmitter, J. M., Stein, M., Decottignies, P., Protein Eng. 1997, 10, 1425.
35. Gleason, F. K., Protein Sci. 1992, 1, 609.
36. Lundstrom, J., Krause, G., Holmgren, A., J. Biol. Chem. 1992, 267, 9047.
37. Chivers, P. T., Prehoda, K. E., Raines, R. T., Biochemistry 1997, 36, 4061.
38. Dyson, H. J., Jeng, M. F., Tennant, L. L., Slaby, I., Lindell, M., Cui, D. S., Kuprin, S., Holmgren, A., Biochemistry 1997, 36, 2622.
39. Kortemme, T., Creighton, T. E., J. Mol. Biol. 1995, 253, 799.
40. Jacobi, A., Huber-Wunderlich, M., Hennecke, J., Glockshuber, R., J. Biol. Chem. 1997, 272, 21692.
41. Grauschopf, U., Winther, J. R., Korber, P., Zander, T., Dallinger, P., Bardwell, J. C., Cell 1995, 83, 947.
42. Foloppe, N., Nilsson, L., Structure 2004, 12, 289.
43. Foloppe, N., Sagemark, J., Nordstrand, K., Berndt, K. D., Nilsson, L., J. Mol. Biol. 2001, 310, 449.
44. Krimm, I., Lemaire, S., Ruelland, E., Miginiac-Maslow, M., Jacquot, J. P., Hirasawa, M., Knaff, D. B., Lancelin, J. M., Eur. J. Biochem. 1998, 255, 185.
45. Menchise, V., Corbier, C., Didierjean, C., Saviano, M., Benedetti, E., Jacquot, J. P., Aubry, A., Biochem. J. 2001, 359, 65.
46. Szajewski, R., Whitesides, G., J. Am. Chem. Soc. 1980, 102, 2011.
47. Mossner, E., Huber-Wunderlich, M., Glockshuber, R., Protein Sci. 1998, 7, 1233.
48. Perez-Jimenez, R., Godoy-Ruiz, R., Ibarra-Molero, B., Sanchez-Ruiz, J. M., Biophys. Chem. 2005, 115, 105.
49. Krause, G., Lundstrom, J., Barea, J. L., Pueyo de la Cuesta, C., Holmgren, A., J. Biol. Chem. 1991, 266, 9494.
In conclusion
This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.
(Winston Churchill)
Conclusion
For the reduction of arsenate, arsenate reductase (ArsC) combines a phosphatase-like nucleophilic displacement reaction in the P-loop active site with a unique intramolecular disulfide bond cascade. Three redox active cysteines are involved (Cys10, Cys82 and Cys89). After a single catalytic arsenate reduction event, oxidized ArsC exposes a disulfide between Cys82 and Cys89 on a looped-out redox helix. Thioredoxin converts oxidized ArsC back to its initial reduced state. The structural environment created by the active site of ArsC, together with the presence of the correct ions, was found to be crucial for the reduction of arsenate to arsenite.
Protonation state of the enzyme-bound substrate and the Cys10-arseno adduct

Gas phase and solvent reactivity studies indicate that the nucleophilic attack of a thiolate during the first catalytic step of ArsC occurs on di-anionic arsenate. The observed charge transfer from the bound arsenate to the P-loop strengthens the evidence for a di-anionic substrate in the presence of the negatively charged nucleophilic Cys10. Reactivity analysis and calculated thermodynamics point to a mono-anionic Cys10-arseno adduct in ArsC prior to the nucleophilic attack of Cys82.
Leaving group activation and activation of the electrophile

Activation of the electrophile in the Michaelis-complex of pI258 ArsC takes place by charge transfer from arsenate to the P-loop during the formation of the enzyme-substrate complex. Especially the central arsenic atom becomes more positively charged rendering the substrate more electrophilic and more susceptible to nucleophilic attack. During the second reaction step, Ser17 serves as the major activator of the electrophilic Cys10-arseno adduct. Arg16 is identified as the residue that destabilizes the ground state of the enzyme-substrate complex. The interaction of arsenate with N of Arg16 increases the bond length of the As-OH(LG) bond and will finally lead to the breaking of this bond in the reaction state. The P-loop active site activates first a hydroxyl and subsequently arsenite as leaving group, as is clear from an increase in their calculated nucleofugality upon going from gas to solvent phase to enzymatic environment.
How redox cysteines are activated

Cysteine 10
Three structural elements come into play to activate the nucleophilic Cys10. Firstly, Cys10 is located at the N-terminus of an α-helix, which induces a positive macro-dipole. Secondly, ArsC has a specific potassium binding site. After potassium has bound, a potassium-Cys10 interaction network is created. This network starts at potassium, runs over Asn13 and Ser17, and has Cys10 as final hydrogen bond acceptor. Thirdly, the presence of arsenate or another tetrahedral oxyanion in the P-loop active site flips its structure and brings Arg16 within hydrogen bonding distance of Cys10.
Cysteine 82
For the next step of the disulfide cascade (arsenite release with the formation of a Cys10-Cys82 disulfide), Cys82 needs to be stabilized in its thiolate form to attack Cys10. The enzymatic environment stabilizes the thiolate form of the nucleophilic Cys82 via the presence of the 8-residue α-helix flanked by Cys82 and Cys89 (redox helix) and via a hydrogen bond with Thr11. Further, tetrahedral oxyanions not only bring Arg16 closer to Cys10 by a structural flip in the P-loop, but also bring Cys82 within hydrogen bonding distance of Arg16. As such, the conserved Arg16 plays a double role and activates both Cys10 and Cys82 as nucleophiles in the presence of tetrahedral oxyanions.
Cysteine 89
Prior to the third reaction step, Cys89 is kept in the non-active high pKa form by the presence of the Cys82-Cys89 redox helix. During the final reaction step, Cys89 has to attack Cys82. Cys89 is activated as a nucleophile by structural alterations of the redox helix, which functions as a pKa control switch for Cys89. This redox helix partially unfolds when the Cys10-Cys82 disulfide is formed, favouring the thiolate form of Cys89 and enabling the third reaction step to occur.
The reducing power of thioredoxin

There are two features of thioredoxin that explain its reducing behaviour: the large stabilization of the oxidized versus the reduced form, and the relatively high pKa (~7.1) of its nucleophilic cysteine. A conserved proline in the WCGPC active site motif is crucial to keep the Δ(T1/2)ox/red value high enough to drive the reduction reaction to completion. Mutation of the conserved proline alters the redox properties of thioredoxin in a predictable way: the thiolate of the nucleophilic cysteine is stabilized and the Δ(T1/2)ox/red value is decreased compared to wild type, by which thioredoxin becomes a more oxidizing agent.
The story continues

In our opinion, the ArsC project has reached a satisfactory level of completeness. We have discussed the onset of the subsequent reaction steps catalysed by pI258 ArsC. As such, we have disentangled how the enzyme activates the leaving group, the electrophilic centre of the substrate, and the nucleophilic cysteines. Concerning the latter, a valuable tool based on a linear relation between charge and pKa has been worked out to assess pKa values of thiolate groups that are experimentally difficult to titrate. Starting from a good model system based on high-resolution X-ray structures, this method might be universally applied. Transition state calculations are lacking in this work. To estimate reaction rates quantitatively, one would have to locate the transition states and compute activation energies, which is a time-consuming and computationally expensive task. Therefore, in this work, the HSAB principle, which describes the characteristics of a reaction (mainly kinetic aspects) in terms of the properties of the reagents in the ground state, is used to deduce valuable information on the exact role of the catalytically important Cys10, Thr11, Asn13, Arg16 and Ser17 residues. Via the redox enzymes ArsC and thioredoxin, the ArsC project has opened a door to the fundamental study of redox reactions, which until very recently were not addressed in the field of conceptual DFT. Yet, even more than acid/base reactions, redox reactions are clean examples of the energy versus number of electrons description, the key relation on which reactivity indices are based in the framework of conceptual DFT. In a first contribution, the expression of redox potentials of oxo acids in terms of the electrophilicity, nucleofugality and electrofugality, forming a complete set of reactivity descriptors, has led to a chemical understanding of the energetics of this redox process1. Further work will concentrate on the interpretation of disulfide forming reactions, one of the biologically most important redox reactions.
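The energy versus number of electrons description can be illustrated with the usual finite-difference expressions of conceptual DFT. The sketch below assumes the convention in which the chemical potential is -(IE + EA)/2 and the hardness is IE - EA, for which the nucleofugality and electrofugality reduce to the electrophilicity shifted by -EA and +IE, respectively; the conventions used in the cited contribution may differ, and the numerical IE and EA values are purely hypothetical.

```python
def reactivity_descriptors(ie: float, ea: float) -> dict:
    """Finite-difference conceptual DFT descriptors from IE and EA (same energy unit).

    ASSUMED convention: mu = -(IE + EA)/2 and eta = IE - EA.
    """
    mu = -(ie + ea) / 2.0                        # electronic chemical potential
    eta = ie - ea                                # chemical hardness
    omega = mu ** 2 / (2.0 * eta)                # electrophilicity
    nucleofugality = (mu + eta) ** 2 / (2.0 * eta)
    electrofugality = (mu - eta) ** 2 / (2.0 * eta)
    return {
        "mu": mu,
        "eta": eta,
        "omega": omega,
        "nucleofugality": nucleofugality,
        "electrofugality": electrofugality,
    }


if __name__ == "__main__":
    # Hypothetical ionisation energy and electron affinity in eV, for illustration only
    for name, value in reactivity_descriptors(ie=10.0, ea=1.0).items():
        print(f"{name:16s} {value:8.3f} eV")
```

Within this quadratic model, the three descriptors follow from only two ground-state quantities, which is why they lend themselves to expressing redox energetics.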
1 Moens, J., Geerlings, P., Roos, G., Chem. Eur. J. 2007, accepted.
Appendices
Know what's weird? Day by day, nothing seems to change. But pretty soon, everything's different.
(Bill Watterson)
Appendix
Abbreviation list
a.u.             Atomic unit
ArsC             Arsenate reductase
B3LYP            Becke 3 Lee Yang Parr
Bs_ArsC          B. subtilis arsenate reductase
BSSE             Basis set superposition error
B. subtilis      Bacillus subtilis
DFT              Density Functional Theory
DSC              Differential scanning calorimetry
DsbA             Disulfide binding protein A
EA               Electron affinity
Ec_Trx           E. coli thioredoxin
E. coli          Escherichia coli
GRX              Glutaredoxin
GSH              Glutathione
HF               Hartree-Fock
HOMO             Highest occupied molecular orbital
HSAB             Hard and soft acids and bases
IE               Ionisation energy
ITC              Isothermal titration calorimetry
LUMO             Lowest unoccupied molecular orbital
NADPH            Nicotinamide adenine dinucleotide phosphate
NPA              Natural population analysis
PCM              Polarizable continuum model
PDB              RCSB Protein data bank (http://www.rcsb.org/pdb/home/home.do)
(LMW) PTPase     (Low molecular weight) Phosphatase
QC               Quantum chemistry
RCB              Repulsive Coulomb barrier
r.m.s.d.         Root mean square deviation
Sa_ArsC          S. aureus arsenate reductase
Sa_Trx           S. aureus thioredoxin
S. aureus        Staphylococcus aureus
SCI-PCM          Self consistent isodensity polarizable continuum model
S. cerevisiae    Saccharomyces cerevisiae
Trx              Thioredoxin
181
182
Appendix
Publication list
1. Roos, G., Loverix, S., De Proft, F., Wyns, L., Geerlings, P., A computational and conceptual DFT study of the reactivity of anionic compounds: implications for enzymatic catalysis. J. Phys. Chem. A 2003, 107, 6828.
2. Roos, G., Messens, J., Loverix, S., Wyns, L., Geerlings, P., A computational and conceptual DFT study on the Michaelis Complex of pI258 arsenate reductase: structural aspects and activation of the electrophile and nucleophile. J. Phys. Chem. B 2004, 108, 17216.
3. Roos, G., De Proft, F., Geerlings, P., Gas-phase stability of tetrahedral multiply charged anions: a conceptual and computational DFT study. J. Phys. Chem. A 2005, 109, 652.
4. Roos, G., Loverix, S., Geerlings, P., Origin of the pKa perturbation of N-terminal cysteine in alpha- and 3(10)-helices: a computational DFT study. J. Phys. Chem. B 2006, 110, 557.
5. Roos, G., Loverix, S., Brosens, E., Van Belle, K., Wyns, L., Geerlings, P., Messens, J., The activation of electrophile, nucleophile and leaving group during the reaction catalysed by pI258 arsenate reductase. ChemBioChem 2006, 7, 981.
6. Roos, G., Buts, L., Van Belle, K., Brosens, E., Geerlings, P., Loris, R., Wyns, L., Messens, J., Interplay between ion binding and catalysis in the thioredoxin-coupled arsenate reductase family. J. Mol. Biol. 2006, 360, 826.
7. Roos, G., Brosens, E., Wahni, K., Desmyter, A., Spinelli, S., Wyns, L., Messens, J., Loris, R., Combining site-specific mutagenesis and seeding as a strategy to crystallize 'difficult' proteins: the case of Staphylococcus aureus thioredoxin. Acta Cryst. F 2006, F62, 1255.
8. Moens, J., Geerlings, P., Roos, G., A conceptual DFT approach for the evaluation and interpretation of redox potentials. Chem. Eur. J. 2007, accepted.
9. Roos, G., Garcia-Pino, A., Van Belle, K., Brosens, E., Wahni, K., Vandenbussche, G., Wyns, L., Loris, R., Messens, J., Conserved active site residues determine the reducing power of S. aureus thioredoxin. J. Mol. Biol. 2007, accepted.
