
UNIVERSITE PARIS DIDEROT (Paris 7)
ECOLE DOCTORALE – COMPLEXITE DU VIVANT

DOCTORATE
Discipline – Life Sciences
Specialty – Genetics

Guillaume CAMBRAY

– EVOLUTIVITE –
LE CAS DES INTEGRONS ET UTILISATION DE SEQUENCES
SYNONYMES EN EVOLUTION DIRIGEE

Thesis supervised by Dr. Didier MAZEL

Defended on 10 July 2009

JURY

Prof. Isabelle Martin-Verstraete, President
Prof. Pierre Capy, Reviewer
Prof. Fernando De La Cruz, Reviewer
Dr. Antoine Danchin, Examiner
Dr. Ivan Matic, Examiner
Dr. Didier Mazel, Thesis supervisor

– EVOLVABILITY –
THE INTEGRON CASE AND THE USE OF SYNONYMOUS
SEQUENCES FOR DIRECTED EVOLUTION
“The beauty of the cosmos is given not only by unity into diversity,
but also by diversity into unity.”

– Umberto Eco, in The Name of the Rose

“Science in fact has two aspects.
What one might call day science and night science.

Day science brings into play reasoning that meshes like gears, and results that have the
force of certainty. One admires its majestic arrangement like that of a painting by da Vinci
or a fugue by Bach. One strolls through it as through a French formal garden. Conscious of its
approach, proud of its past, sure of its future, day science advances in light and glory.

Night science, on the contrary, wanders blindly. It hesitates, stumbles, falls back, sweats,
wakes with a start. Doubting everything, it searches for itself, questions itself, constantly
pulls itself together. It is a sort of workshop of the possible, where what will become the
material of science is worked out…
What guides the mind then is instinct, intuition. It is the need to see clearly. It is the
stubborn will to live. It is courage…”

– François Jacob
RESUME

Phenotypic stability is essential to the success of organisms evolving under constant
conditions. The environment is nevertheless subject to perpetual stochastic variations, to
which living beings must constantly adapt. Evolvability characterizes the capacity of a
population to respond to such selective pressures by generating heritable phenotypic
modifications. Because most mutations are deleterious, processes that limit the production of
such variations to periods of stress alone, or confine it to well-defined loci and phenotypes,
have been selected over the course of evolution.
Integrons provide a particularly sophisticated illustration. Initially identified as vectors
of resistance to multiple antibiotics, these bacterial genetic systems, specialized in the
exchange, collection and expression of accessory genes, constitute an important source of
genetic diversity. This work shows that integrons are directly coupled to a major bacterial
stress response pathway, the SOS system. By making it possible to generate phenotypic
variability during periods of stress without affecting the rest of the genome, integrons thus
constitute a paradigmatic example of evolvability.
Another aspect of this work demonstrates that synonymous coding sequences, although they
specify identical proteins, can access different regions of phenotypic space through point
mutations. Used appropriately, this property makes it possible to extend the evolvability of
any protein in the context of biotechnological applications.

KEY WORDS
EVOLUTION ; ADAPTABILITY ; STRESS ; BACTERIA ; ANTIBIOTICS ; RESISTANCE ; PHASE VARIATION ;
CONTINGENCY LOCI ; BET HEDGING ; LAMARCKISM ; GENETIC CODE ; SYNTHETIC BIOLOGY ;
GENETIC SPACE ; PHENOTYPIC SPACE

ABSTRACT

Phenotypic stability is essential to the success of organisms evolving under steady
conditions. However, the environment is subject to perpetual stochastic variations, to which
living beings must constantly adapt. Evolvability characterizes the ability of a population to
respond to such selective pressures through the generation of heritable phenotypic changes.
Because most mutations are deleterious, processes that confine mutations to periods of stress,
or to specific loci and well-defined phenotypes, have been selected over the course of evolution.
Integrons constitute a particularly sophisticated illustration of such processes. Initially
identified through their involvement in multi-resistance to antibiotics, these bacterial genetic
systems are specialized in the exchange and stockpiling of accessory genes and therefore
constitute an important source of genetic diversity. This work shows that integrons are directly
coupled with the SOS system, a major bacterial stress response. By allowing the generation of
significant phenotypic diversity during periods of stress without impacting the rest of the
genome, integrons constitute a paradigmatic example of evolvability.
Another aspect of this work demonstrates that synonymous coding sequences, although
specifying identical proteins, can access different areas of the phenotypic space through
point mutations. When properly exploited, this property can enhance the evolvability of any
protein in the context of biotechnological applications.

KEY WORDS
EVOLUTION ; ADAPTABILITY ; STRESS ; BACTERIA ; ANTIBIOTICS ; RESISTANCE ; PHASE VARIATION ;
CONTINGENCY LOCI ; BET HEDGING ; LAMARCKISM ; GENETIC CODE ; SYNTHETIC BIOLOGY ;
GENETIC SPACE ; PHENOTYPIC SPACE

REMERCIEMENTS

I wish first and foremost to thank the Great Didier Mazel for welcoming me into his
laboratory and its wonderfully warm atmosphere; for allowing me to work on fascinating
subjects; and also for leaving me (rightly or wrongly… probably a bit of both!) great freedom
of thought and action. All my thanks also go to my colleagues, present and past, who
contributed greatly to the scientific and human atmosphere of a workplace where one
ultimately ends up spending most of one's time!
A big scientific thank-you to my collaborators, and most particularly to: Luis-Miguel
Chevin, for work that we will probably never publish but that was nonetheless worth it;
Olivier Tenaillon and Pierre-Alexis Gros, for their advice and help on a paper that in its day
had a hard time getting accepted; Emilie Guérin, for identifying the mode of regulation of the
integrase and thereby giving the initial impetus to a superb (and the word is weak!) piece of
work; Ivan Erill, for our long-distance bioinformatics collaboration; Pierre Lechat, for
supervising part of the development of IRMA, an integron detection algorithm (which is
unfortunately not discussed in this thesis); Eduardo Rocha, for his sound bioinformatics advice.

I of course thank my whole family: my mother, my father, my brothers and sisters; my
grandparents, whether they are still with us or live on in my heart; my Mounia, and all my
in-laws (my "belle famille", who well deserve the adjective). I also wish to express here all
my gratitude to those (colleagues, friends, family…) who listened to me, comforted me and
supported me during somewhat difficult periods of doubt and frustration (the "night science"
mentioned a few pages above).

Finally, I am very grateful to the members of my jury and, more generally, to all those who
will read this work and will not confine themselves, as is often the case, to reading only the
acknowledgements...
To the others, I cannot recommend too strongly that they do so without further delay!

TABLE OF CONTENTS

RESUME
ABSTRACT
REMERCIEMENTS
TABLE OF CONTENTS
ABBREVIATIONS
INTRODUCTION
I. Control of genetic diversity
I.1. Spontaneous mutations
I.1.1. Effects and origins of mutations
a - General overview
b - Types of mutations
c - The origins of mutations
I.1.2. Genome-wide mutation rates
a - Pattern of spontaneous mutation rates
b - Mechanisms of genetic maintenance
c - The lowest… the best
I.1.3. General mutators and the ambiguity of repair systems
a - Natural occurrence of mutators
b - The short-term advantages of increased mutation rates
c - Long-term consequences of increased mutation rates
d - Lessons from the mutator phenomenon
I.2. Stress-induced mutagenesis
I.2.1. The SOS paradigm
a - The SOS response to DNA damage
b - Survival and variability during SOS induction
c - Extending the SOS response
I.2.2. Other examples of stress-induced mutagenesis
a - Mutagenesis in aging colonies
b - The competence state

I.3. Programmed generation of genetic variations
I.3.1. Localized mutation through slipped-strand mispairing
a - Replication slippage
b - Simple Sequence Repeats (SSRs) are variable loci
c - Phenotypic impact
d - SSRs as localized mutators
I.3.2. Mutation by intragenomic recombination
a - Meiotic sex
b - Gene conversion
c - Transposition
d - Site-specific recombination
I.3.3. Epigenetics
a - Definition
b - Bistable regulatory switch
c - DNA methylation patterns in bacteria
d - The yeast prion PSI+
II. Phenotypic plasticity, genetic variations and physiological regulation
II.1. Genetic versus physiological changes
II.1.1. Cybernetic genomes
II.1.2. Individual and populational adaptation
II.1.3. Stochastic switches as a bet-hedging strategy
a - Contingency loci
b - Bet-hedging
c - Genetic switches as crude regulatory controls
d - Link with the lifestyle of organisms
II.2. Links between genetic changes and regulation
II.2.1. Impact of expression strength on sequence evolution
II.2.2. Genetic assimilation of long lasting regulation
II.2.3. Evolution of regulatory patterns
a - Regulatory networks as evolutionary target
b - Switching regulation patterns
II.2.4. Physiological regulation of mutagenesis
a - Stress-induced mutagenesis can be spatially confined
b - Targeted mutagenesis can be regulated
II.3. Evolvability and robustness
II.3.1. Evolvability
II.3.2. Robustness

II.3.3. Links between robustness and evolvability
III. The integron genetic system
III.1. Overview of the system
III.1.1. A brief historical perspective
III.1.2. Structure of integrons
a - The functional platform
b - The cassette array
III.1.3. Different flavors of integrons
a - Mobile integrons
b - Chromosomal integrons
III.2. Functional organization of integrons
III.2.1. A unique site-specific recombination mechanism
a - Double and single stranded recombination substrates
b - The different recombination reactions
c - Accessory factors
III.2.2. Expression of cassettes’ genes
a - Transcription
b - Translation
III.3. Integrons and evolution
III.3.1. Chromosomal integrons as the source of mobile integrons
a - Chromosomal integrons are ancient and widespread structures
b - Mounting of mobile integrons
c - Resistance genes and chromosomal integrons
d - The generation of cassettes
III.3.2. A central role in horizontal gene transfer
a - Evidence for interspecies cassette exchanges
b - Mobile integrons and the spread of cassettes
c - A cassette metagenome
III.3.3. Integrons as sophisticated contingency loci
a - Working model
b - An unknown recombination dynamic

RESULTS
I. Evolution of recombination rate in integrons
Background
Methods
Results and discussion
Article I

II. Recombination in integrons is controlled by the SOS response to stress
Background
Methods
Results and discussion
Article II
Article III
III. Intrinsic evolutionary potential of genes
Background
Methods
Results and discussion
Article IV
DISCUSSION
I. Integrons are powerful adaptive systems
I.1. The expression of gene cassettes
I.1.1. Coupling between recombination and expression
I.1.2. The integron system mimics inducible promoters
I.1.3. Increased rate of cassette evolution
I.2. Responsive and oriented mutagenesis
I.2.1. Responsive versus constant mutation rates
I.2.2. Integrons and adaptive mutagenesis
I.2.3. A clear case of stress-induced mutagenesis
I.3. A deep connection with SOS triggers
I.3.1. Single stranded DNA: a bridge between two systems
I.3.2. Potential SOS triggers relevant to integrons
I.3.3. SOS-controlled accessory factors?
I.4. Are integrons really successful?
II. Implications for health
III. Biotechnological considerations
III.1. The ELP principle
III.2. Synthetic integrons
APPENDIX
Epistemological considerations on the role of variations in biology
Maintenance versus variability: a major evolutionary trade-off

The purpose of evolution
Form, function and the watchmaker
Adaptation, teleonomy and blindness
Impact of the environment
What is the environment?
The inheritance of acquired characteristics
The Neo-Darwinian focus on selection
Anticipating and responding to environmental changes
REFERENCES

TABLE OF FIGURES

Figure 1 – Distribution of fitness effects of mutations
Figure 2 – Spontaneous mutation rates
Figure 3 – Intrinsic replication error rates
Figure 4 – Nucleotide excision repair (NER)
Figure 5 – Schematic pathways of base excision repair (BER)
Figure 6 – Main homologous recombination (HR) pathways
Figure 7 – Schematic functioning of the SOS system
Figure 8 – Disruption of the SOS response dramatically affects survival
Figure 9 – Survival and mutation of E. coli after starting antibiotic therapy in mice
Figure 10 – Variation of aging-induced mutagenesis among natural E. coli isolates
Figure 11 – Polymerase slippage
Figure 12 – Phase variation in the biosynthesis of the LPS molecule of H. influenzae
Figure 13 – Flocculation controlled through slipped strand mispairing in S. cerevisiae
Figure 14 – Morphological impact of repeat length variation
Figure 15 – Outcomes of gene conversion
Figure 16 – Molecular models of recombination involved in gene conversion
Figure 17 – Gene conversion models for B. hermsii, A. marginale and T. brucei
Figure 18 – Mating-type switching in S. cerevisiae
Figure 19 – Stress-controlled targeting of the Ty5 retrotransposon
Figure 20 – The three possible outcomes of site-specific recombination
Figure 21 – Types of specific inversion identified in B. fragilis
Figure 22 – Phase variation of type 1 fimbrial expression by DNA inversion

Figure 23 – Hysteresis and bistability in the lac operon
Figure 24 – Epigenetic inheritance of the sporulation signal in Bacillus subtilis
Figure 25 – DNA methylation-dependent phase variation of the pap operon in E. coli
Figure 26 – Physiological and genetic adaptation in evolution
Figure 27 – Number of articles dealing with integron-mediated antibiotic resistance
Figure 28 – Organization of the integron recombination system
Figure 29 – Phylogenetic distribution of attC sites found in Vibrio species
Figure 30 – Functional distribution of cassette-encoded proteins in Vibrionales
Figure 31 – Structure of attI sites
Figure 32 – Structure of attC sites
Figure 33 – Model of atypical recombination in integrons
Figure 34 – The Pc promoter of class 1 integrons
Figure 35 – Collapsed phylogenetic tree of integrases
Figure 36 – Phylogenetic tree of integron integrases
Figure 37 – Evolution of mobile clinically derived class 1 integrons
Figure 38 – Working model of integron functioning
Figure 39 – Comparative organization of three V. cholerae chromosomal integrons
Figure 40 – The floral architecture of Linaria vulgaris and Linaria peloria
Figure 41 – Comparison of Lamarck's theory of transformation and a phylogenetic tree

TABLE OF TABLES

Table 1 – Spontaneous mutation rate in DNA-based microbes..............................................................................31


Table 2 – MMR components and their functions...................................................................................................36
Table 3 – Simple sequence repeats in the genome of H. influenzae ......................................................................63
Table 4 – A representative list of bacterial species containing chromosomal integrons ......................................122
Table 5 – Gene cassettes shared between the integrons of Vibrio species ...........................................................147

ABBREVIATIONS

DSB ......................double strand break


EP..........................error-prone polymerases
HGT ......................horizontal gene transfer
HJ..........................Holliday junction
ICE........................integrative conjugative element
IS...........................insertion sequence
kb ..........................kilobase
LPS .......................lipopolysaccharide
MMR ....................mismatch repair
NER ......................nucleotide excision repair
nt ...........................nucleotide
ORF ......................open reading frame
bp ..........................base pair
ssDNA ..................single stranded DNA
SSR .......................simple sequence repeats
TA.........................toxin-antitoxin
TE .........................transposable element
TLS .......................translesion synthesis
UV ........................ultraviolet

INTRODUCTION

Introduction – Control of genetic diversity

The modern evolutionary synthesis is the unifying paradigm in biology. The main
lines of this theory were drawn around the 1940s, following the development of population
genetics. Since then, our knowledge of the molecular basis underlying genetic phenomena has
rapidly expanded, and some key concepts have been refined. The mechanisms governing the
generation of mutations constitute an important area of investigation. Indeed, the availability
of genetic diversity determines the adaptation rate of populations – their evolvability.
Contradicting one of the fundamental tenets of the neo-Darwinian theory, research over the last
decades has brought numerous examples showing that the occurrence of mutations is sometimes
not a completely random process. In many respects, the theory of evolution is more than an
academic discipline. It also offers an explanatory edifice for a wide range of metaphysical
issues and is an essential component of a materialistic worldview, almost a religion to some
extent (Ruse, 2003). An epistemological account of the importance of biological variations is
thus presented in the appendix to complement the scientific introduction (see p254).
This thesis focuses on evolvability. The first chapter of the introduction describes the
different types of mutations and the diverse molecular mechanisms that affect their generation.
Despite my efforts to include examples from all kingdoms of life, most focus on bacteria – but
after all, are we not living in a bacterial world (Gould, 1996)? The second chapter discusses
the respective impacts of genetic and physiological variations in the production of phenotypic
diversity. The last chapter is dedicated to the presentation of the integron system, whose
sophisticated adaptive properties constitute the main subject of this work.

I. CONTROL OF GENETIC DIVERSITY

The availability of heritable phenotypic diversity determines the evolvability of biological
populations. However, most mutations are deleterious and thus detrimental to the
maintenance of individuals in the short term. Although this situation sets up an important
evolutionary trade-off, the early neo-Darwinian theory regarded mutational events as purely
random accidents. Advances in molecular biology led to the description of sophisticated
mechanisms to cope with alterations affecting DNA molecules. Alongside these, many ingenious
systems able to regulate, modify and restructure the genetic information with minimal risk to
ongoing adaptation were discovered. Understanding these specific processes of
mutagenesis is particularly important from both fundamental and applied standpoints. The
realization that evolvability is itself an evolvable trait is an essential refinement of the theory of
evolution: evolution per se can then be considered as a biological function. Many
health-threatening phenomena – such as antibiotic resistance, microbial pathogenesis, tumor
progression, genetic diseases, and resistance to radiation and chemotherapy – result from the
capacity of cells to tune their adaptation rate. Such processes may be countered by the design of
anti-evolution drugs that would short-circuit mutagenesis responses.
This chapter first provides an overview of the processes responsible for spontaneous
mutagenesis and briefly addresses the sophisticated mechanisms developed over evolution to
keep mutations in check. Then, I describe how these mechanisms can be subverted to increase
the generation of genetic diversity. While a constitutive increase in mutation rate is
necessarily deleterious in the long term, its coupling with cellular responses to stress enables a
refined mechanism wherein genetic novelties are created specifically when individuals are
maladapted. The last part presents a detailed review of the genetic systems involved in localized
and phenotypically oriented mutations.

I.1. Spontaneous mutations

I.1.1. Effects and origins of mutations

a - General overview

Genetic information is encoded in the DNA sequences that form the genome of
living entities. These sequences constitute information because they interact in a very specific
manner with the cellular machinery (see Cybernetic genomes, p100). Generally speaking, a
mutation is a heritable alteration of the DNA sequence. Heritable modifications that do not
affect DNA sequences constitute a specific case of mutation – epimutations (see Epigenetics,
p79). Although this blunt statement will be nuanced later, mutations are a priori random
events. Indeed, no one can precisely predict when and where a genetic alteration will occur, nor
can one foresee the exact functional impact of a mutation.
There are distinct forms of mutations, ranging from single base substitutions to whole
genome duplications. As we will see, these different types arise from diverse causes and their
generation may involve sophisticated repair mechanisms. The phenotypic impact of a
mutation depends considerably on its type, but also on its location (Streisinger et al., 1966; and
see below). Mutations are distributed along a continuum of fitness effects, which determines
their fate in a given population. Though the exact shape of this distribution is debated,
several general principles can be outlined (Eyre-Walker and Keightley, 2007).
Most mutations are deleterious to the organism, and thus discarded by natural selection; some
mutations do not produce effects strong enough to permit selection – and are thus neutral;
while only very few are adaptive (see Figure 1). Importantly, the fate of mutations depends on
the effective size of the population: mutations that are deleterious or advantageous in a large
population may be essentially neutral in a small population, wherein random drift outweighs
selection. As a rule of thumb, it is generally considered that a mutation with a
selection coefficient s smaller than the inverse of the effective population size Ne – i.e.
Ne·s < 1 – is effectively neutral (Kimura, 1983). In mutation accumulation and mutagenesis
experiments carried out in yeast (Wloch et al., 2001) and in the vesicular stomatitis virus
(Wloch et al., 2001) respectively, it was estimated that 30–40% of mutations are lethal in
laboratory conditions (see Figure 1). One can easily see that introducing entirely random
modifications in a complex system is much more likely to impair it than to enhance it. Consider a
watch, for instance: removing a screw, adding a spring or slightly altering a cog will almost
certainly break the fine arrangement of the mechanism and prevent the device from being
functional. What is more, a second random alteration is very unlikely to restore the system
to its functional state. In most cases, mutations are thus irreversible and accumulate in
genomes if they are not counter-selected – a process known as Muller’s ratchet (Muller,
1964).
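
The rule of thumb given above (Ne·s < 1) can be written as a one-line test. The following is only a minimal sketch: the function name is hypothetical, the numerical values are arbitrary illustrations rather than measurements, and taking the absolute value of s so that deleterious mutations are handled in the same way is an assumption added here.

```python
def is_effectively_neutral(s: float, ne: float) -> bool:
    """Rule-of-thumb test: a mutation is effectively neutral when Ne * |s| < 1.

    s  : selection coefficient (negative for a deleterious mutation; the use
         of abs() is an assumption of this sketch)
    ne : effective population size
    """
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    return ne * abs(s) < 1.0


# Illustrative values only: the same mutation in a small and a large population.
s = 1e-4
for ne in (1e3, 1e6):
    label = "effectively neutral" if is_effectively_neutral(s, ne) else "visible to selection"
    print(f"Ne = {ne:.0e}, s = {s:.0e}: {label}")
```

With these arbitrary numbers, the same mutation falls below the threshold in the small population and above it in the large one, which is the point made in the text about random drift.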

b - Types of mutations

Different types of mutation have different phenotypic effects, depending on the loci
affected. In this respect, one can mainly distinguish point mutations from
chromosomal rearrangements.

i. Point mutations

Point mutations include substitutions, insertions and deletions of nucleotides. At worst, a
substitution can change a key amino acid that is essential to the protein function; introduce
a termination codon; or affect the affinity of a DNA region for its cognate proteins. At best, it may
fall into a non-informative region and be virtually neutral. When located in a coding region,
insertions and deletions almost always dramatically impact the protein sequence by shifting
the reading frame of the gene (Streisinger et al., 1966).
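
The contrast between substitutions and frame-shifting insertions or deletions can be illustrated with a toy translation routine. This is a sketch only: the 15-nucleotide sequence is invented for the example, the helper name `translate` is hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


wild_type = "ATGGCTAAAGGTTGA"                  # invented sequence
variants = {
    "wild type": wild_type,
    "synonymous substitution": "ATGGCCAAAGGTTGA",   # GCT -> GCC
    "nonsense substitution": "ATGGCTTAAGGTTGA",     # AAA -> TAA
    "1-nt insertion": wild_type[:3] + "C" + wild_type[3:],
    "1-nt deletion": wild_type[:3] + wild_type[4:],
}

for name, seq in variants.items():
    print(f"{name:26s} {seq:17s} {translate(seq)}")
```

In this toy case, the synonymous substitution leaves the peptide unchanged, the nonsense substitution truncates it, and both the insertion and the deletion change every codon downstream of the event.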

ii. Chromosomal rearrangements

Chromosomal mutations are structural changes of higher order that result from
illegitimate recombination events occurring during replication or repair of DNA molecules.
These alterations include deletions, inversions, duplications and translocations. In this
context, a deletion corresponds to the loss of a whole region of a chromosome. Obviously, all
genes present in this region will subsequently be absent from the genome, and this can deeply
affect the organism’s phenotype. Usually, an inversion has less dramatic consequences. It
mainly affects the regions at the ends of the inverted segment, though the expression of genes
located within the segment can be subtly altered (Rocha, 2004). A duplication event refers to
the repetition of a whole genomic region and results in increased gene dosage. This can be
deleterious if a toxic gene is over-expressed, or if it destabilizes the functioning of cellular
networks. Notably, paralogous genes resulting from duplication events can evolve in
divergent ways and enrich the genome with new functions (Taylor and Raes, 2004; Conant and
Wolfe, 2008a; and see Impact of expression strength on sequence evolution, p106). The same
mechanisms that cause inversions, deletions or duplications can lead to translocations in
multichromosomal genomes when fragments of DNA are exchanged between different
chromosomes. All chromosomal rearrangements can alter DNA topology and therefore indirectly
affect expression of the surrounding regions (Reymond et al., 2007).

iii. Genetic exchanges

In prokaryotes, the acquisition of exogenous DNA is an important source of genetic
novelty, and a complete set of functional traits can be instantaneously gained through
horizontal gene transfer (HGT) (Ochman et al., 2000; Redfield, 2001). In some eukaryotes, sexual
reproduction allows mixing the genetic information of two individuals during meiosis, which leads
to the emergence of new trait combinations (Otto and Lenormand, 2002). In both cases, the
rearrangement of pre-existing genetic variations can result in large and swift phenotypic
changes.
In eukaryotes that evolved specialized reproductive tissues, mutations have different
evolutionary impacts depending on whether they affect the soma or the germline (see Appendix
– The inheritance of acquired characteristics, p262). Mutations in somatic cells may affect the
ability of the organism to survive and reproduce, but are limited to the individual and not
transmitted to the next generation. In contrast, mutations affecting the germline are largely
silent in parental individuals and are only expressed in their offspring. Thus, only mutations
established in the germline are hereditary and have evolutionary consequences.

c - The origins of mutations

DNA molecules are continuously assaulted by a variety of factors including
environmental influences, mutagenic chemicals and replication. The proliferation of
transposable elements is also a significant source of structural variations. Most mutations are
subject to advanced mechanisms of repair (see p33). Although these processes primarily
evolved as caretakers of genome integrity in order to increase individual survival, their modus
operandi can also be a substantial source of mutations.

i. Mutagenic radiations

Most environments are exposed to physical radiations. Despite the protective effect of
the atmosphere, organisms are frequently exposed to mutagenic ultraviolet (UV) light. UV light
principally induces the formation of covalent bonds between adjacent pyrimidine bases on a
single DNA strand. Around 80–90% of the resulting photoproducts are cyclobutane
pyrimidine dimers, while the remaining 10–20% correspond to more mutagenic
pyrimidine-pyrimidone (6–4) photoproducts (Sancar, 2008). Both dimers alter the conformation of the
DNA double helix, thereby preventing normal transcription and replication. UV light
also promotes the hydrolysis of cytosines, which ultimately results in C:G→T:A transitions
through mispairing with adenine during replication (Clancy, 2008a). In addition, organisms are
exposed to more energetic and penetrating radiations. Ionizing radiations, such as γ-rays and
X-rays, can induce double strand breaks (DSB) either directly or indirectly through the
massive production of free oxygen radicals (Cadet et al., 2003).

ii. Chemical mutagens

Organisms are often exposed to specific chemical mutagens of biotic or abiotic
origin. Such agents interfere with the normal behavior of the DNA molecule by preventing
correct base pairing, substituting for standard bases or disrupting the integrity of the DNA helix
(Clancy, 2008b). Oxidative environments are a major source of DNA alterations. In particular,
oxidation of guanines into 8-hydroxyguanine frequently leads to G:C→T:A and A:T→C:G
transversions. Spontaneous hydrolysis can remove purine bases from the sugar-phosphate
backbone of the corresponding nucleotides. For instance, the N7 position of guanine is
particularly vulnerable to alkylation, and this alteration frequently results in spontaneous
depurination (Mishina et al., 2006). If not repaired, such damage results in the incorporation of an
incorrect base during the next round of replication. Some bases are also subject to
spontaneous loss of their amine group. The most common deamination converts cytosine to uracil, which
can pair with adenine instead of the required guanine, leading to the fixation of a G:C→A:T
transition upon replication (Clancy, 2008b).
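
The transition and transversion labels used in the preceding paragraphs follow from a simple rule: a substitution within the same chemical class (purine to purine, or pyrimidine to pyrimidine) is a transition, and a substitution between classes is a transversion. The sketch below, with a hypothetical function name, applies that rule to the changes mentioned in the text, reading each base-pair change from its first base.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref!r} -> {alt!r}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Base-pair changes cited in the text, read on the first strand.
for ref, alt, context in [
    ("C", "T", "C:G→T:A"),
    ("G", "T", "G:C→T:A"),
    ("A", "C", "A:T→C:G"),
    ("G", "A", "G:C→A:T"),
]:
    print(f"{context}: {classify_substitution(ref, alt)}")
```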
Aside from direct environmental exposure to mutagens, unfavorable chemical
conditions can also arise endogenously from the normal functioning of the cell. Indeed, numerous
metabolic by-products – notably from respiration – are free radicals or reactive oxygen
species that can alter DNA through base oxidation, alkylation, or hydrolysis (Cadet et al., 2003).
Thus, mutations can occur spontaneously without explicit insults from the external
environment.

iii. Transcription and replication

To some extent, the conformation of DNA duplexes protects their sequence from
mutations. Indeed, the inward orientation of the nitrogenous bases toward the axis of the helix limits their
exposure to the cellular milieu (Watson and Crick, 1953). However, the two complementary
strands are separated during replication and transcription. The consequence of single stranded
DNA (ssDNA) exposure is particularly evident in bacterial genomes, wherein a sharp
inversion in strand composition is observed around the single origin and terminus of replication.
This bias reflects both the enrichment of the leading strand in coding sequences and the longer
time spent in single stranded form by the lagging strand (Lobry, 1996; Lobry and Sueoka,
2002; Rocha, 2004).
Like every process involving transfer of information, replication is intrinsically
inaccurate: it is subject to occasional copying errors and misincorporations – e.g. uracil in place
of thymine (but see Replication in Hi-Fi, p33). The availability and relative concentrations of the
different nucleotides in the cell can affect error frequencies.

iv. Transposition

Structural instability due to the mobilization of transposable elements (TEs) was
initially observed by B. McClintock (1902–1992) in Zea mays (McClintock, 1950, 1984). TEs
are mobile DNA segments that contain the information required to produce self-copies and/or
change their genomic location. These elements have been found in nearly all genomes
investigated so far. One can distinguish two major classes of TEs. Class I retroelements (e.g.
LTR-retrotransposons and LINEs) encode a reverse transcriptase that is used to process
transcribed RNA copies prior to reinsertion in a new location. This mechanism mediates a
“copy-and-paste” transposition process. Class II elements are flanked by terminal inverted
repeats that are processed by a dedicated transposase encoded in the elements. This mediates a
conservative “cut-and-paste” mechanism. While both classes are present in eukaryotes,
prokaryotic genomes only contain class II elements. TEs are generally autonomous – though
either the reverse transcriptase (e.g. SINEs) or the transposase must be supplied in trans to
some altered elements (Miller and Capy, 2004).

TEs can essentially be regarded as selfish parasites that spread within genomes – and
even between them, through transposition into plasmids or viruses (Dawkins, 1976). In some
eukaryotes, they can represent more than half of the genome. TEs play an important role in the
generation of genetic rearrangements. Because their target sites generally consist of only a few base
pairs, they are expected to insert randomly in a genome. However, some exceptions have
been reported (see for instance Parks and Peters, 2009). Their mobilization can result in gene
inactivations, chimeric genes and altered expression patterns (Bushman, 2004), as well
as chromosomal deletions, inversions and translocations (Kazazian, 2004; Miller and Capy,
2004). Some TEs excise themselves in a precise and accurate manner while others leave scars
behind them, thereby permanently altering their target sites. As described below,
chromosomal rearrangements can also arise as an indirect consequence of the proliferation of TEs in the
genome.

v. Repair- and replication-mediated rearrangement

The occurrence of mutations is not always a purely random event. Instead, some
patterns can influence the appearance of mutations. Notably, the very structure of genomes can
promote the occurrence of chromosomal rearrangements through recombination between
repeated sequence motifs (Rocha, 2003; Achaz et al., 2003; Cooper et al., 2007a). Such events
may result from misannealing of the replicating strand upon rescue of a stalled replication
fork or through more sophisticated repair mechanisms, such as homologous recombination
(HR, see Overview of repair mechanisms, p36 and Replication slippage, p61). In most
genomes, the proliferation of TEs is an important source of large repeated sequences that can
drive chromosomal rearrangements (Kazazian, 2004). In this process, the orientation of the
repeats matters: direct repeats result in deletion or duplication, while inverted repeats generate
inversions. Tandem repeats result in specific patterns of facilitated duplication-deletion that
can prove adaptive. Transient gene amplifications can mediate a specific increase in gene
dosage (Roth et al., 2006; Hastings, 2007). In contrast, the expansion-retraction of repeated
nucleotide tracts can alter the expression of functional proteins (see Localized mutation through
slipped-strand mispairing, p61).
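
The effect of repeat orientation on the outcome of recombination can be mimicked on plain strings. The sketch below is a deliberately simplified model, not a description of the molecular mechanism: the sequences are invented, the function names are hypothetical, only the first pair of repeats is considered, and only the deletion outcome of direct-repeat recombination is shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def recombine_direct(seq: str, repeat: str) -> str:
    """Direct repeats: one repeat copy and the intervening segment are lost."""
    i = seq.find(repeat)
    j = seq.find(repeat, i + len(repeat)) if i >= 0 else -1
    if j < 0:
        raise ValueError("two direct copies of the repeat are required")
    return seq[:i + len(repeat)] + seq[j + len(repeat):]


def recombine_inverted(seq: str, repeat: str) -> str:
    """Inverted repeats: the segment bounded by the repeats is inverted."""
    i = seq.find(repeat)
    j = seq.find(revcomp(repeat), i + len(repeat)) if i >= 0 else -1
    if j < 0:
        raise ValueError("a repeat and its reverse complement are required")
    end = j + len(repeat)
    return seq[:i] + revcomp(seq[i:end]) + seq[end:]


repeat = "GATTC"                                               # invented motif
direct = "AAAA" + repeat + "CCCCTT" + repeat + "GGGG"
inverted = "AAAA" + repeat + "CCCCTT" + revcomp(repeat) + "GGGG"

print(recombine_direct(direct, repeat))      # intervening CCCCTT is deleted
print(recombine_inverted(inverted, repeat))  # intervening CCCCTT is inverted
```

In this model the inversion is its own inverse: applying `recombine_inverted` twice returns the starting sequence, whereas the deletion cannot be undone.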

vi. Meiotic sex and horizontal gene transfer

Broadly speaking, sex is the combination of genetic material from two distinct origins
to form a new genotype. In sexual eukaryotes, the process of meiosis regularly mixes the
complete sets of maternal and paternal genes, thereby randomly reassorting the alleles into
new individuals. Although the exact evolutionary role of meiotic sex is far from being fully
understood (Otto and Lenormand, 2002), this process is driven by a dedicated molecular
machinery which must have evolved to promote efficient homologous recombination
(Cavalier-Smith, 2002; Benavente and Volff, 2009; Wilkins and Holliday, 2009). In contrast, sex occurs
in prokaryotes through processes that are non-reciprocal and fragmentary. Horizontal gene
transfer (HGT) is not a regular component of these organisms’ life cycles (Redfield, 2001).
HGT can occur through i) transduction by phages; ii) conjugation by plasmids; and iii)
transformation in naturally competent bacteria. It is also potentiated by other mobile elements such
as TEs. The repair machinery of the recipient cells is often implicated in the genomic integration
of incoming DNA (see p36, HR and MMR). As phages, conjugative plasmids and TEs are
essentially selfish elements, their involvement in genetic exchanges is likely to be an
unselected side effect of processes that evolved for more immediate functions. In contrast, natural
competence involves specialized and highly regulated machinery. Its implication in the
generation of genetic diversity is thus more ambiguous and will be discussed in detail later (see
The competence state, p58).

I.1.2. Genome-wide mutation rates

a - Pattern of spontaneous mutation rates

i. Mutation rate

The mutation rate comprises all kinds of mutations occurring in a mutational target
during a given amount of time, including point mutations and complex chromosomal
rearrangements. This rate is expected to vary greatly depending on target length, expression
strength (if the target is a gene) and other idiosyncrasies such as exact base sequence and the
presence of repetitive elements. The calculation of mutation rates cannot rely on post-hoc
comparison of natural sequences, because these generally result from complex and unknown
evolutionary histories. Such comparisons actually measure substitution rates, which depend
on the mutation rate but also reflect the specific selective pressures exerted on the target, the
linkage with other genetic determinants, as well as demographic factors. One must dissociate
mutational events from any evolutionary process in order to obtain an unbiased picture of the
spontaneous mutation rates.

ii. DNA-based microbes display a constant genomic mutation rate

When measured in controlled experimental settings, the mutation rates exhibited by
representative organisms reveal remarkable taxonomic patterns. The most striking and most
accurate illustration concerns DNA-based microbes – which include representatives of
bacteria, archaea, bacteriophages and unicellular eukaryotes. When expressed on a per-nucleotide
basis, mutation rates (µb) are collectively very low but vary over four orders of magnitude
between different organisms (see Table 1 and Figure 2). Interestingly, genome sizes (G) are
inversely correlated with µb. As a direct consequence, the mutation rates of diverse DNA-based
microbes are outstandingly steady when extrapolated to the genome as a whole. The
genomic mutation rates (µg) average 3.4·10⁻³ mutations per genome per generation and their
distribution is very narrow (2.5–4.6·10⁻³) (Drake et al., 1998). Hence, a single mutation is
expected to occur anywhere in the genome about once every 300 generations, irrespective of the actual
microbial species. Overall, the measure of µg is biologically sound because the individual is the
entity on which selection operates.
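
The relationship described above can be made explicit: the genomic rate is the per-nucleotide rate multiplied by the genome size (µg = µb·G), and the expected number of generations between mutations is 1/µg. The following sketch uses hypothetical function names; the two "microbes" are invented round numbers chosen only to show how an inverse relation between µb and G yields a constant µg, and are not taken from Table 1.

```python
def genomic_mutation_rate(mu_b: float, genome_size: float) -> float:
    """mu_g = mu_b * G, in mutations per genome per generation."""
    return mu_b * genome_size


def generations_per_mutation(mu_g: float) -> float:
    """Expected number of generations between two mutations in one lineage."""
    return 1.0 / mu_g


# Hypothetical organisms (illustrative values, not measured data):
# a 100-fold larger genome with a 100-fold lower per-nucleotide rate.
examples = {
    "small genome": (7e-8, 5e4),    # (mu_b per bp, G in bp)
    "large genome": (7e-10, 5e6),
}
for name, (mu_b, g) in examples.items():
    print(f"{name}: mu_g = {genomic_mutation_rate(mu_b, g):.1e}")

# Average value quoted in the text (Drake et al., 1998).
print(f"1 / 3.4e-3 = {generations_per_mutation(3.4e-3):.0f} generations")
```

The last line gives roughly 294 generations, which is the origin of the figure of about 300 generations quoted in the text.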
The measure of mutation rates relies on the scoring of altered phenotypes, in cases
where the mutational target is presumably well defined. Because not all mutations
impact the phenotype equally, the measured rate must be corrected. Furthermore, the extrapolation to
a genomic mutation rate relies on the assumption that the mutation-reporter gene is
representative of the whole genome (Drake, 1991). In spite of these unavoidable approximations – and
considering the general paucity of constant values in evolutionary processes – the observation
of such a conserved constant is particularly meaningful. Specifically, this suggests that
mutation rates would naturally evolve toward an optimal equilibrium. This issue will be further
discussed below.

iii. Mutation rates in other organisms

Consistent measures of mutation rates are difficult to obtain experimentally in
organisms other than DNA-based microbes and the resulting estimates must be considered
cautiously (see Figure 2). The mean µg calculated for lytic RNA viruses is ca. 1–2, but individual
values are considerably scattered. In retro-elements, including retroviruses and retrotransposons,
the mean µg is ca. 0.1–0.2, again with several outliers. The fact that genome sizes do not
readily reflect genome contents complicates the situation in many eukaryotes. Some genomes are
indeed mostly composed of introns, TEs and non-coding sequences whose functions are difficult
to appreciate. Most mutations in these regions may be neutral and are therefore unlikely to
affect the second-order selection of mutation rate (see The lowest… the best, p39). A proper
estimate of µg in these species may then be based on the effective genome size, by only taking
the functional parts of the genome into account. In such cases, the mean mutation rate per
effective genome has been estimated to be ca. 6·10⁻³ (range 4–14·10⁻³) – values that are strikingly
close to the one observed in DNA-based microbes (see Figure 2). Nevertheless, these
calculations derive from very imprecise data. Notably, the actual extent of the so-called junk
DNA is far from being known. Another significant caveat is that the values mentioned above
correspond to mutation rates per effective genome per cellular generation. The strict
equivalent of the DNA-based microbes’ µg would arguably be better expressed per sexual
generation. The corresponding mutation rates in higher eukaryotes vary widely in the range
3.6·10⁻²–1.6 mutations per effective genome per sexual generation (Drake et al., 1998).

b - Mechanisms of genetic maintenance

Despite relentless environmental injuries and repeated rounds of genetic information
copying, most organisms achieve strikingly low mutation rates. Accurate maintenance of
genomic integrity is mediated by dedicated mechanisms that ensure both the fidelity of the
replication process and the repair of DNA lesions.

i. Replication in Hi-Fi

Like any copying process, DNA replication is very sensitive to noise. One would
therefore expect it to be prone to a high rate of error. In the absence of environmental injuries and
repair mechanisms, in vivo substitution rates are in the range of 10⁻⁷–10⁻⁸ errors per base pair in
both prokaryotes and eukaryotes (Kunkel, 2004 and see Figure 3). The intrinsic fidelity of
replicative polymerases is mediated by the topology of their active sites, which ensures a
stringent geometric selection for the shape and size of correct base pairs. Moreover, these
polymerases are unable to replicate past lesions or extend from mismatches, thereby dampening
the fixation of mutations during replication. The inability to extend mismatches further
provides polymerases with the opportunity to proofread. Replicative polymerases are indeed
endowed with a 3’→5’ exonuclease function that allows them to edit the last base replicated, and
readily correct 90–99.9% of erroneous pairings (Kunkel, 2004; and see Figure 3). Besides,
organisms evolved various strategies to control the balance and availability of nucleotides and
to limit exposure to mutagens.
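
The proofreading efficiencies quoted above can be restated as fold improvements in fidelity: if a fraction e of erroneous pairings is corrected, a fraction 1 − e remains, so the error rate is divided by 1/(1 − e). The sketch below uses a hypothetical function name and assumes that correction acts independently on each error.

```python
def fold_improvement(correction_efficiency: float) -> float:
    """Fold reduction in error rate when a fraction of errors is corrected."""
    if not 0.0 <= correction_efficiency < 1.0:
        raise ValueError("efficiency must be in the interval [0, 1)")
    return 1.0 / (1.0 - correction_efficiency)


# Bounds of the 90-99.9% range quoted in the text.
for efficiency in (0.90, 0.999):
    print(f"{efficiency:.1%} corrected -> ~{fold_improvement(efficiency):.0f}-fold fewer errors")
```

Under this assumption, the 90–99.9% range corresponds to an approximately 10- to 1000-fold gain in fidelity.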

ii. Overview of repair mechanisms

Despite the overall fidelity of the replicative process, a substantial number of mispairs
arise during replication. Moreover, damaged bases arise continuously in a time-dependent and
replication-independent manner (Drake et al., 1998). If not repaired, these mutations are
fixed in the genome during replication, while their expression may lead to deleterious
phenotypes in the meantime. Both prokaryotic and eukaryotic cells have evolved a number of
mechanisms to detect and repair various types of DNA damage. Many of the proteins
involved in these mechanisms are highly conserved between extremely remote organisms,
illustrating their fundamental importance. Nevertheless, the molecular details of the underlying
pathways have considerably diversified over evolutionary time. Besides, different organisms
exhibit varying sets of mechanisms. Different types of damage are processed by specific and
sometimes functionally redundant systems. Four major strategies can be distinguished: i) in
situ reversal of mutations; ii) resynthesis using the undamaged opposite strand; iii)
recombination; and iv) transient tolerance of mutations.
(i) Some altered bases can be repaired in situ by specialized enzymes, thereby directly
reversing mutations. This form of repair requires neither cleavage of the DNA backbone nor
polymerization, and thus limits the risk of producing breaks and replication errors. However,
the associated mechanisms are inherently limited in scope. So far, this strategy has been
reported for the repair of UV-induced cyclobutane pyrimidine dimers and (6–4)-photoproducts
by photoreactivation (Sancar, 2008), and for the repair of some alkylation damage (Mishina
et al., 2006). While photoreactivation uses light as a source of energy to break chemical bonds,
the removal of alkyl groups involves the stoichiometric consumption of the dedicated
alkyltransferase, which is metabolically costly.
(ii) Three repair systems rely on the more or less precise excision of erroneous bases
from the damaged strand, followed by resynthesis using the information carried on the opposite
strand. This strategy can deal with a wide variety of damage, but also promotes the
occurrence of DSBs by introducing nicks and exposing ssDNA. The nucleotide excision repair
(NER) pathway specifically targets mutations introducing bends in the DNA helix, such
as those produced by UV (see Figure 4). In E. coli, this function is carried out by the UvrABCD
proteins (Truglio et al., 2006). Importantly, NER can be coupled to transcription, which fa-
vors the repair of regions that are likely to be of phenotypic importance in both prokaryotes
and eukaryotes (Deaconescu et al., 2007). Base excision repair (BER) fixes lesions that
are similar in size and shape to normal bases. It is the predominant mechanism to handle
spontaneous DNA damage caused by free radicals and other reactive species. The initiation
of BER essentially relies on the recognition of lesions by specific DNA glycosylases (see
Figure 5). The glycosylase repertoire of a given genome hence specifies the range of damage
that can be addressed (Baute and Depicker, 2008). The mismatch repair (MMR) is dedicated
to the processing of erroneous base pairs as well as various other types of damage. Its activity alone
results in a 50- to 1000-fold increase in fidelity. The core MMR machinery consists of the MutSLH
proteins (Li, 2008; and see Table 2).
(iii) DSBs resulting from the cleavage of both DNA strands in the same DNA region
constitute one of the most hazardous types of genomic damage. Indeed, no proximal source of information
is available to direct the religation of broken ends. This situation notably arises from
exposure to ionizing radiation, replication of nicked sites, collapse of replication forks and
spontaneous cleavage of ssDNA exposed in the course of other repair mechanisms. Two main
recombination pathways are used to deal with such damage: homologous recombination
(HR) and non-homologous end joining (NHEJ). In E. coli, the RecA recombinase is the central
protein of the HR machinery. RecA specifically binds and coats ssDNA to form recombinogenic
nucleoprotein filaments. RecA-coated nucleofilaments can invade duplex DNA
regions and search for sequence homologies in an ATP-dependent manner. The identification
of a homology results in the formation of a four-stranded DNA structure, called Holliday
junction (HJ) (see Figure 6A). At best, the resolution of HJs is only associated with the non-reciprocal
exchange of genetic information, a process called gene conversion (see p71).
However, if the contacted homology is not the true counterpart of the damaged region, resolu-
tion can lead to chromosomal rearrangements (see p29). Through its activity, the MMR ma-
chinery is involved in controlling mitotic recombination in prokaryotes and eukaryotes, as
well as crossing over during meiotic sex (see Table 2). In particular, MMR promotes intragenomic
stability by preventing recombination between overly divergent sequences. In HGT-prone
prokaryotes, this function is a major determinant of the species barrier (Ishino et al., 2006;
Dillingham and Kowalczykowski, 2008).
NHEJ is a straightforward mechanism to deal with two-sided DSBs, but is far less
faithful than HR. During this process two broken DNA ends are simply joined together after
limited processing of the DNA ends, resulting in a quick but error-prone repair. The central
player in NHEJ is the DNA-end binding protein Ku. This pathway was first demonstrated in
eukaryotes (Burma et al., 2006). It is not present in E. coli, but has been found in a variety of
other bacteria and archaea (Shuman and Glickman, 2007).
(iv) The last strategy consists of tolerating damage to buy time for other repair
mechanisms to operate on the lesion. As mentioned above, the replicative polymerase cannot
accommodate most altered base pairs, leading replication forks to pause. The HR machinery
is essential to the recovery of stalled forks through the formation of a reversed structure,
whereby the annealing of the two nascent strands forms a four-way junction similar to a HJ
(see Figure 6B). In this structure, the interrupted nascent strand can be replicated using the
other nascent strand as a template, thereby allowing the lesion to be bypassed indirectly. Al-
ternatively, HJ resolution can reinstate the DNA duplex containing the lesion, while produc-
ing a one-sided DSB that can be further processed by the HR machinery as described above.
Yet, this latter process is a risky endeavor because it creates an intermediate that may promote
chromosomal rearrangements. Stalled replication forks can also recruit specialized error-prone
(EP) polymerases to replicate past the lesion (see Figure 3, p34). These poorly processive
enzymes lack editing properties but can accommodate various altered templates
and incorporate complementary bases with varying degrees of accuracy – a process referred to
as translesion synthesis. This strategy is a double-edged sword: while accurately replicated lesions
are transiently tolerated, the others directly result in the fixation of mutations.

c - The lowest… the best

The production of genetic variation can be regarded as a necessary evil. Most muta-
tions are deleterious (Eyre-Walker and Keightley, 2007) and their continuous appearance
jeopardizes the maintenance of an organism in the short term. Nevertheless, the production of
genetic novelty is required in the long term to keep pace with ever-changing environments
(see Appendix, p260). The existence of taxonomic patterns of mutability strongly suggests
that genomic mutation rates might be adjusted to an evolutionary trade-off between these two
trends. Because repair mechanisms are genomic caretakers, they constitute privileged agents
to affect mutation rates. Accordingly, the action of this impressive arsenal of mechanisms
could be fine-tuned to let through just the amount of mutations necessary to
ensure proper evolutionary power, without hampering instantaneous survival. Evolution,
however, is a shortsighted process: phenotypes – and their underlying determinants – are se-
lected on the basis of their immediate reproductive advantage, not for the sake of potential fu-
ture benefits.
In this context, what would be the fate of an allele that modifies the global mutation
rate? This question was first asked by A. H. Sturtevant (1891-1970) some 70 years ago
(Sturtevant, 1937) and has been the object of numerous studies since then (Kondrashov, 1995;
Sniegowski et al., 2000; De Visser, 2002). From a theoretical point of view, a gene affecting
the mutation rate is called a modifier. A modifier allele per se has no effect on fitness, and is
thus not directly subjected to selection. Instead, it is involved in altering other genes, thereby
producing mutations which may affect fitness. The prospective selective regime undergone by
modified genes then reflects on the modifier, because all are present in the same genome.
Modifier alleles are thus selected indirectly through their effects on other genes to which they
are genetically linked – a process termed genetic hitchhiking (Maynard-Smith and Haigh,
1974). The effectiveness of hitchhiking depends on the linkage disequilibrium between the
loci considered, i.e. on the propensity of the physical link between genes to be broken by re-
combination. Increased recombination rates decrease the average time during which the modi-
fier benefits from indirect selection by hitchhiking. The indirect selection of modifier genes is
often referred to as second-order selection, because it relates to the evolution of the capacity
to evolve.
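The sensitivity of hitchhiking to recombination can be made concrete with a minimal sketch. It assumes the textbook result that the statistical association between two loci decays by a factor (1 − r) per generation, where r is the recombination fraction between the modifier and the locus it hitchhikes on. The function names and the values of r are hypothetical and serve only as illustration.

```python
import math

def linkage_remaining(r: float, generations: int) -> float:
    """Fraction of the initial modifier/target association left after some
    generations, assuming it decays by a factor (1 - r) per generation."""
    return (1.0 - r) ** generations

def association_half_life(r: float) -> float:
    """Number of generations after which half of the association is lost."""
    if r <= 0.0:
        return math.inf      # no recombination: linkage is never broken
    if r >= 1.0:
        return 0.0           # degenerate case, association lost at once
    return math.log(0.5) / math.log(1.0 - r)

# Hypothetical recombination fractions, from nearly clonal to freely recombining.
for r in (1e-6, 1e-3, 0.5):
    print(f"r = {r:g}: half-life of about {association_half_life(r):.3g} generations")
```

Under these assumptions, a modifier in a nearly clonal genome stays associated with the mutations it produced for a very large number of generations, whereas free recombination dissolves the association almost immediately.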
An individual bearing a modifier allele associated with a decreased mutation rate has
a higher probability of maintaining its genetic integrity. Because most mutations are deleterious in
steady conditions (Eyre-Walker and Keightley, 2007), its mean fitness is on average higher
than that of its surrounding competitors, ensuring its evolutionary success. At the population
level, the rise in frequency of the modifier reduces the overall genetic load of deleterious
alleles which is nonetheless maintained by mutation-selection balance. Thus, proximal evolu-
tionary forces tend to decrease the mutation rate as much as possible. Because recombination
erodes genetic linkage, there is theoretically a much stronger selection for the reduction of
mutation rates in asexual or selfing organisms than in sexual species. Hence, the indirect se-
lection of the weakest modifier is expected to be more significant in prokaryotes.
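The link between the mutation rate and mean fitness at mutation-selection balance can be sketched numerically. The toy model below is an assumption of this illustration, not a model taken from the studies cited above: an asexual population, deleterious mutations of equal effect s acting multiplicatively, and a Poisson-distributed number of new mutations per genome per generation with mean U. Under these assumptions a classical population-genetics result states that mean fitness settles at exp(−U), whatever the value of s. All parameter values are hypothetical.

```python
import math

def mean_fitness_at_balance(U, s, classes=60, generations=300):
    """Iterate selection then mutation on the distribution of deleterious
    mutation counts and return the mean fitness that is reached.
    The defaults suit s around 0.1; smaller s needs more generations."""
    # Fitness of an individual carrying k deleterious mutations.
    fitness = [(1.0 - s) ** k for k in range(classes)]
    # Poisson probability of acquiring j new mutations in one generation.
    new_mutations = [math.exp(-U) * U ** j / math.factorial(j)
                     for j in range(classes)]
    # Start from a mutation-free population.
    freq = [1.0] + [0.0] * (classes - 1)
    for _ in range(generations):
        mean_w = sum(f * w for f, w in zip(freq, fitness))
        selected = [f * w / mean_w for f, w in zip(freq, fitness)]
        mutated = [0.0] * classes
        for k, f in enumerate(selected):
            for j in range(classes - k):
                mutated[k + j] += f * new_mutations[j]
        total = sum(mutated)  # renormalize the negligible truncated tail
        freq = [m / total for m in mutated]
    return sum(f * w for f, w in zip(freq, fitness))

# Two hypothetical modifier alleles differing 100-fold in genomic mutation rate.
for U in (0.003, 0.3):
    w = mean_fitness_at_balance(U, s=0.1)
    print(f"U = {U}: simulated mean fitness {w:.4f}, exp(-U) = {math.exp(-U):.4f}")
```

In this sketch the lineage carrying the low-mutation-rate modifier ends up with the higher mean fitness, which is the indirect advantage described in the paragraph above.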
If selection systematically favors a lower mutation rate, the following question arises:
why do mutation rates not fall to zero? There are three major answers to this question: i)
the observed level of fidelity has reached a maximum level, and cannot be heightened for
physicochemical reasons; ii) the sophisticated mechanisms required to achieve high fidelity
are costly in both energy and time, and this cost trades off against the production of deleterious
mutations; and iii) the previous reasoning assumes the primacy of deleterious mutations, but the
rare generation of advantageous mutations can introduce a counterbalancing selection for an
increased mutation rate – thereby leading to an equilibrium.
The measures of mutation rates per base pair per generation (µb) presented earlier are
disparate and must be normalized by the genome size to reveal a constant pattern (see Figure
2, p32). Thus, different genomes achieve different absolute levels of fidelity, and lower mutation
rates are likely to be attainable. This suggests that proposition (i) must be rejected, at least
in some taxa. Hypothesis (ii) was first posited by M. Kimura (1924–1994) (Kimura,
1967), but remains difficult to test experimentally. A proper demonstration requires the establishment
of a direct link between altered mechanisms of fidelity and effective fitness. However,
the mechanisms of fidelity are responsible for mutations that result in indirect phenotypic
effects and strongly affect fitness measures. Impairing repair mechanisms may indeed
result in energy savings and increased growth rate, but the mutational load incurred by the
population might rapidly outweigh these effects. It is difficult to disentangle these direct and
indirect effects to obtain a clear picture. Nevertheless, the assumption that fidelity imposes a
physiological cost is perfectly sound. Indeed, repair mechanisms involve the synthesis of spe-
cialized machineries and their functioning is very costly in ATP. Furthermore, repair func-
tions often introduce new mutations at the expense of overall fidelity to avoid individual cell
death. Some of these mutations (e.g. illegitimate recombinations) are irreversible, while others
(tolerated mismatches) have a chance to be further repaired. That the processes of mainte-
nance and survival produce genetic alterations by themselves results in an infinite regression.
Intuitively, the zero mutation point is thus unattainable, and the costs of repair are likely to
rise sharply with the level of fidelity. Despite the lack of decisive experimental evidence, hypothesis
(ii) is generally given most credit. Then, selection would favor the highest practical
level of fidelity. Broadly speaking, the equilibrium between the shortsighted fight against
mutations and the inherent physiological cost of doing so would thus provide enough mutations
for successful evolution to occur.
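The normalization by genome size invoked at the start of this discussion amounts to multiplying the per-base-pair rate µb by the genome size to obtain a per-genome rate. The sketch below only illustrates that arithmetic. The two genomes and their rates are hypothetical placeholders chosen for the example, not measurements; Figure 2 remains the reference for actual values.

```python
def genomic_mutation_rate(mu_b: float, genome_size_bp: float) -> float:
    """Mutations per genome per generation from a per-base-pair rate."""
    return mu_b * genome_size_bp

# Hypothetical placeholder values (NOT measurements): (mu_b, genome size in bp).
examples = {
    "small genome (hypothetical)": (8e-8, 5e4),
    "large genome (hypothetical)": (6e-10, 5e6),
}

for name, (mu_b, size) in examples.items():
    mu_g = genomic_mutation_rate(mu_b, size)
    print(f"{name}: mu_b = {mu_b:.1e} per bp, mu_g = {mu_g:.4f} per genome")
```

With these placeholder numbers the per-base-pair rates differ by more than two orders of magnitude, while the per-genome rates differ by less than a factor of two. A comparison of this kind is what suggests that the absolute level of fidelity is not capped at a single physicochemical limit.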
However – as we will see in the next sections – some increases in mutation rates can
be advantageous when the basal mutation rate does not suffice to drive efficient adaptation.
Yet such increases are better kept transient in time or restricted to specific loci to be
adaptive. The necessity for a temporal containment will first be illustrated by a very rough
mechanism (see Lessons from the mutator phenomenon, p46). Well described examples of
fine-tuned global processes achieving this goal will then be presented (see Stress-induced
mutagenesis, pp 47–60). Genetic mechanisms allowing spatial targeting and phenotypic orien-
tation of mutations will be extensively described in the next chapter (see Programmed genera-
tion of genetic variations, pp 60–99). Such spatio-temporal containments of mutations do not
significantly affect the mean genomic mutation rates, which argues for rejecting hypothesis (iii).
Alterations that increase mutation rates have been identified easily (see next section). The corresponding
mutator alleles generally result from impairment of the repair machinery. If
mutation rates were fine-tuned to an equilibrium between production and avoidance of alterations,
it would likewise be possible to identify alleles responsible for decreases in mutation
rates. Nevertheless, evidence for such anti-mutators is very sparse. EP polymerases can be
regarded as a source of anti-mutators because their inactivation may decrease the mutation rate in certain conditions.
However, these enzymes are deeply integrated in repair mechanisms and their
impairment affects immediate cell survival. The case of translesion polymerases is thus
somewhat controversial and will be further discussed later (see Survival and variability during
SOS induction, p50). Apart from these specialized polymerases, very few anti-mutator alleles
have been described and their phenotypes are generally unclear (Schaaper, 1998; Schaaper
and Dunn, 2001; Dzidic and Petranovic, 2003). While this scarcity may simply reflect the dif-
ficulty of isolation, it rather highlights the invalidity of hypothesis (iii) with respect to hy-
pothesis (ii).

I.1.3. General mutators and the ambiguity of repair systems

a - Natural occurrence of mutators

Mutator strains displaying high mutation rates have been found at significant frequen-
cies (0.1%–60%) in natural populations of pathogenic bacteria, including Escherichia coli,
Salmonella enterica, Neisseria meningitidis, Haemophilus influenzae, Staphylococcus aureus,
Helicobacter pylori, Streptococcus pneumoniae, and Pseudomonas aeruginosa (Denamur and
Matic, 2006). Furthermore, among the twelve parallel cultures of E. coli experimentally
propagated for decades by Lenski and co-workers, three fixed a mutator phenotype
(Sniegowski et al., 1997). Altogether, these observations strongly suggest that global increases
in mutation rates can be selected in particular conditions. In E. coli, ca. 20 different
genes that are typically involved in maintaining genome integrity can confer mutator phenotypes
of different strengths (Horst et al., 1999). Nevertheless, almost all natural mutators correspond
to the inactivation of the MMR genes mutS and mutL (Denamur and Matic, 2006).

b - The short-term advantages of increased mutation rates

When a population is perfectly adapted to its current environment most – if not all –
mutations have negative or at best neutral effects on fitness (Eyre-Walker and Keightley,
2007). As highlighted above, mutators are counter-selected in these conditions. However, the
likelihood that a particular mutation may prove advantageous increases when the population
is under-adapted. Increased mutation rate may then speed up adaptation when the environment
is heterogeneous, or when it changes continually through time (De Visser, 2002). This effect
was observed in experimental infection of axenic mice. The mutator showed a strong advan-
tage in colonizing the mouse gut when the initial inoculum was a poorly adapted laboratory
strain. However, when the inoculated bacteria were already well-adapted to the mouse environment,
mutator derivatives were slightly counterselected (Giraud et al., 2001). The examples
mentioned above fit within this framework. Indeed, pathogens are exposed to
ever-changing environments corresponding to colonization of new hosts or new niches in the
same host, and are constantly challenged by the host immune system. In Lenski’s long-term
experiment, bacteria are confronted with particularly unusual conditions with which they have
not been selected to cope in nature (continuous exponential growth in a nutrient-limiting
medium).
When a population is challenged with a new environment, a mutator has a greater opportunity
to produce an advantageous mutation. The corresponding modifier allele can then
hitchhike on the mutation to reach significant frequency in the population. Because advantageous
mutations are rarer, the indirect selection to increase the mutation rate is far more
sensitive to recombination than the indirect selection to decrease the mutation rate. In organ-
isms subjected to high recombination, the advantageous mutation is therefore rapidly segre-
gated away from the mutator allele that caused it (Tenaillon et al., 2000). Consequently, the
hitchhiking of the modifier allele is very transient, and the genomic mutation rate of the popu-
lation is largely unaffected. In contrast, low levels of recombination allow sustained hitchhik-
ing in bacteria, accounting for the natural occurrence of mutator phenotypes in these
organisms.
Two factors particularly influence the rise of mutators in asexual populations: i) the
strength of the mutator alleles; and ii) the population size. Computer simulations showed that
mutator alleles of large effects have the highest probability of hitchhiking (Taddei et al.,
1997b). Consequently, indirect selection on mutators does not result in fine-tuning the mutation
rates toward an optimal equilibrium level, as proposed in hypothesis (iii) of the previous
section (see p40). In contrast, asexual populations have a propensity to fix sharply elevated
mutation rates through hitchhiking. The role of mutator strength is better understood when
considered in conjunction with the population size. Experimental competitions between muta-
tor and non-mutator bacterial strains showed that a mutator subpopulation requires a mutation
rate which is increased by more than the inverse of its numerical disadvantage to have a sig-
nificant chance to produce the next beneficial mutation, and hence invade the population
(Chao et al., 1983). In the absence of directional selective pressure, the frequency of mutators
in bacterial populations has been estimated at ca. 10⁻⁶–10⁻⁵ (LeClerc et al., 1998; Boe et al.,
2000). Then, a typical mutator subpopulation should exhibit mutation rates increased by
>10⁵–10⁶-fold to consistently take over the population. Yet, mutator phenotypes do not generally
exceed a ca. 10³-fold increase in mutagenesis (Denamur and Matic, 2006). The rise of mutators
in natural populations would then involve accidental enrichments in mutator individuals.
Alternatively, a robust benefit in favor of the mutator subpopulation arises when several
epistatic mutations are required to produce an adaptive phenotype. Indeed, these mutations
would occur in independent non-mutator cells while a single mutator individual has a higher
probability of producing the required sequence of mutations in a given amount of time. This
suggests that larger populations favor the rise of mutators (Tenaillon et al., 1999). However, a
subtle effect may nuance this conclusion. Excessively large increases in the rate of beneficial mutations
may decrease the overall adaptation pace of a population. Indeed, several independent
advantageous clones may arise at the same time in a sufficiently large population with a
strong mutator phenotype. These clones then compete with each other – a phenomenon
known as clonal interference (Gerrish and Lenski, 1998). The relative selective advantage
of any beneficial mutation is decreased in the presence of the others, resulting in
a slower rise in frequency. Due to the hampering effect of clonal interference, large or poorly
adapted mutator populations may thus not adapt faster than similar populations with
a lower mutation rate (De Visser et al., 1999).
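The threshold argument above can be checked with simple arithmetic. The sketch below assumes that the chance of producing the next adaptive genotype is proportional to subpopulation size times mutation rate. For an adaptation that needs several mutations in the same lineage, it assumes, very crudely, that the rate scales as the fold-increase raised to the number of required mutations. The function names are hypothetical, and the model ignores clonal interference and deleterious load.

```python
def prob_next_adaptation_from_mutator(freq: float, fold: float, steps: int = 1) -> float:
    """Probability that the next adaptive genotype arises in the mutator
    subpopulation rather than in the non-mutator majority.

    freq  -- frequency of mutators in the population
    fold  -- fold-increase of the mutator's mutation rate
    steps -- number of mutations that must accumulate in one lineage
             (crude assumption: the supply rate scales as fold ** steps)
    """
    mutator_supply = freq * fold ** steps
    non_mutator_supply = 1.0 - freq
    return mutator_supply / (mutator_supply + non_mutator_supply)

def fold_needed_for_even_odds(freq: float) -> float:
    """Fold-increase giving the mutators a 50% chance for a single mutation."""
    return (1.0 - freq) / freq

freq, fold = 1e-5, 1e3  # orders of magnitude quoted in the text
print(f"fold-increase needed for even odds: {fold_needed_for_even_odds(freq):.3g}")
print(f"one mutation needed:  {prob_next_adaptation_from_mutator(freq, fold):.4f}")
print(f"two mutations needed: {prob_next_adaptation_from_mutator(freq, fold, steps=2):.4f}")
```

Under these assumptions, a 10³-fold mutator present at a frequency of 10⁻⁵ is very unlikely to supply a single beneficial mutation first, but becomes the likely source as soon as two mutations are required in the same cell.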

c - Long-term consequences of increased mutation rates

Apart from the transient situation presented above, wherein the sequential production
of mutations is needed to reach an optimal fitness, the selection of a mutator must be regarded
as an indirect consequence of adaptation, and not an adaptation by itself (Sniegowski et al.,
2000). Indeed, as the population becomes adapted to its new environment, the distribution of
fitness effects is shifted toward an increased prevalence of deleterious mutations (Silander et al.,
2007; Martin and Lenormand, 2006). The selective pressure against deleterious mutations rap-
idly comes to prevail and the relative advantage of being a mutator vanishes. Instead, the mu-
tator allele induces a load of deleterious alleles that accumulate in the population, and
selection for a lower mutation rate is soon renewed. If the population does not revert to a non-mutator
state, the short-term benefit of mutagenesis irremediably turns into a tragic extinction
in the long run. In this light, the rise of mutators in bacteria can be viewed as a deleterious by-
product of asexuality (Sniegowski and Murphy, 2006).
The naturally occurring mutS and mutL mutators are generated by a variety of mutations
including frameshifts, insertions, premature stop codons and deletions. While simple reversion
may occasionally occur in the first cases, mutator populations resulting from deletions
seem doomed. The selection of compensatory mutations is a possible alternative, but is also
improbable. Hence, a mutator population has limited possibilities to reestablish a low mutation
rate. Nonetheless, a particular feature of mutS and mutL mutants may incidentally explain
their prevalence among natural mutators. The mismatch repair system is indeed involved in
setting the specificity of homologous recombination. As a result, mutS and mutL mutator
strains show a 100-fold increase in recombination rate (Denamur and Matic, 2006). Not only
may increased recombination facilitate the association between beneficial mutations through
HGT, but functional alleles of the impaired mismatch repair gene may also be acquired from
surrounding non-mutator bacteria, thereby increasing the odds of reversion toward a standard
mutation rate. In support of this idea, the mutS and mutL genes display a patchy pattern of sequence
polymorphism that indicates frequent events of homologous recombination between
E. coli isolates (Denamur et al., 2000). Besides, the three lines that evolved a mutator phenotype
in the Lenski experiment are all impaired in mutL. A 6-bp repeat present in three copies
in the wild-type gene is present in four copies in one of the three mutator lines and in two copies
in another (Shaver and Sniegowski, 2003). This might constitute a mechanism of facilitated and
reversible switching between mutator and non-mutator states (see Localized mutation through
slipped-strand mispairing, p61). In the same light, eukaryotic mismatch repair genes (Chang
et al., 2001) and bacterial genes involved in stress-responses (Rocha et al., 2002) seem to be
particularly enriched in small repeated units.
In rapidly changing environments, the population is never expected to be really
adapted, and it would continuously benefit from increased genetic variance. However, even in
this specific context, the indirect selection of strong mutators becomes highly problematic
with time and reversion to a non-mutator state seems ultimately mandatory in the long term.
Indeed, in the course of adaptation to a new environment, genomes accumulate many mutations
that are immediately neutral, but may prove deleterious in subsequent environments. Alternatively,
mutations that improve functions that are needed in a given environment may
negatively affect other functions that are essential elsewhere – a principle termed antagonistic
pleiotropy (Cooper and Lenski, 2000). In other words, although increased rates of mutation can raise
the adaptation pace, they also turn organisms into niche specialists at the expense of adaptive
flexibility in the long run (Giraud et al., 2001). This effect is potentiated by recurring selective
sweeps. Indeed, each time a particular individual generates an adaptive mutation, its whole
genotype – including currently neutral and slightly deleterious mutations – increases in frequency
in the population, at the expense of the overall genetic diversity. This phenomenon
reduces the effective population size and quickly results in a mutational meltdown. Passages
through real population bottlenecks – such as those that often occur during colonization of
new hosts by pathogens – are similarly expected to facilitate the action of Muller’s ratchet.
During an experimental study, wild-type and mutS defective cells were subjected to 40 cycles
of single-cell bottlenecks. By the end of the experiment, 4% of mutS lineages had died out,
55% had auxotrophic requirements, 70% had defects in at least one sugar or catabolic path-
way, 33% had a defect in cell motility and 26% became temperature-sensitive lethals. In sharp
contrast, only 3% of the wild-type lineages displayed detectable phenotypes (Funchain et al.,
2000). In a similar experiment involving wild-type and msh2 yeast populations, two mutator
lineages out of twelve had gone extinct by the 175th cycle, while no wild-type lineage had (Zeyl
et al., 2001).
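The extinction figures quoted above can be converted into a rough per-cycle risk. The sketch below assumes that each bottleneck cycle carries the same, independent probability of lineage loss. This is an assumption of the illustration, not a claim made in the cited studies, since the risk is more likely to grow as mutations accumulate. The function name is hypothetical.

```python
def per_cycle_loss(fraction_lost: float, cycles: int) -> float:
    """Constant per-cycle probability of lineage loss that would yield
    `fraction_lost` after `cycles` independent bottleneck cycles."""
    return 1.0 - (1.0 - fraction_lost) ** (1.0 / cycles)

# Figures quoted in the text for the mutator lineages.
bacterial_mutS = per_cycle_loss(0.04, 40)    # 4% of mutS lineages lost in 40 cycles
yeast_msh2 = per_cycle_loss(2 / 12, 175)     # 2 of 12 msh2 lineages lost by cycle 175

print(f"mutS (bacteria): about {bacterial_mutS:.2e} per cycle")
print(f"msh2 (yeast):    about {yeast_msh2:.2e} per cycle")
```

Read this way, both experiments correspond to a loss on the order of one lineage in a thousand per bottleneck. Given the small number of lineages and the different protocols, this should be taken as an order of magnitude only.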

d - Lessons from the mutator phenomenon

The overall picture emerging from the mutator phenomenon is one in which the muta-
tion rate is continually buffeted between its lowest standard and largely increased values, ow-
ing to successive and antagonistic indirect selections on modifiers. Mutator phenotypes can be
transiently advantageous, but soon become deleterious and standard mutation rates must be
restored to ensure long term survival. These observations reveal the evolutionary advantage of
targeted hypermutability in avoiding the accumulation of mutational loads. Despite the inher-
ent cost of producing essentially deleterious mutations, global increase in mutation rate en-
sures maximal genetic “creativity”. Any mechanism that limits the time spent in the
hypermutable state when not necessary would soften the associated genetic burden, without
restricting the range of attainable mutations. This can be achieved by favoring the constitutive
wavering between increased and standard rates of mutation, a process that relies on the indi-
rect selection of mutagenic states by environmental pressures. Alternatively, refined mecha-
nisms can target the genome-wide production of mutations to periods wherein innovation is
needed. This will be discussed in the next section (see Stress-induced mutagenesis, below).
Nevertheless, any increase in genomic mutation rate induces its share of long-term detrimen-
tal mutations. When specific genetic elements particularly benefit from increased diversity, an
advantageous strategy is to specifically target variation to these elements without incurring
deleterious mutations elsewhere in the genome. As will be exemplified throughout the next
chapter, mutations can be targeted to specific loci by a variety of mechanisms (see Pro-
grammed generation of genetic variations, pp 60-99). This strategy radically reduces the mu-
tational load – though at the cost of decreasing the scope of accessible mutations. Obviously,
an ideal adaptation system would combine the best features of the strategies mentioned above.
As we will see, the integron system achieves this goal to some extent.

I.2. Stress-induced mutagenesis

The need for genetic diversity is not constant through time but depends on changing
selective environments. Such varying demands can be fulfilled through the rise and fall of
constitutive mutator alleles. However, this process relies on stochastic cycles of second-order
selection affecting modifier alleles, which is clearly not optimal. In this light, it seems intui-
tively advantageous for organisms to evolve mechanisms whereby the environment could in-
fluence the availability of variations on which selection can act. Because they are keepers of
the genome integrity, existing repair mechanisms – such as the SOS response – stand as privi-
leged agents to mediate this phenomenon.

I.2.1. The SOS paradigm

a - The SOS response to DNA damage

Some of the pathways involved in DNA repair are subjected to specific patterns of
physiological regulation that restrict their action to periods when genomic integrity is jeopardized.
The setup of complex regulatory circuits in place of constitutive expression is probably
motivated by the deleterious interference of some repair enzymes with the functioning of
the cell under normal conditions, as well as by the energetic cost imposed by the synthesis of
the repair machineries. In addition, this control allows the action of different repair
mechanisms to be coordinated. All organisms are endowed with integrated responses to DNA damage,
though with different degrees of complexity.
The recognition that such physiological responses may exist initially came from
the description of the SOS system in E. coli. The SOS system is a regulatory network com-
prising ca. 50 genes in E. coli (Wade et al., 2005). The expression of at least 31 of these genes
is directly controlled by LexA, the negative regulator of the system (Fernandez De Henestrosa
et al., 2000). The other genes are likely to be either secondary targets or regulated in non-
canonical ways. LexA exerts its repressive effect by binding to specific DNA motifs known as
LexA-boxes (consensus 5’-CTGTATATATATACAG-3’). One or several LexA-boxes are
generally located in the promoter region of the regulated gene, preventing access to the poly-
merase complex by steric hindrance when LexA is bound. Derepression of the regulon de-
pends on the auto-cleavage of LexA – a phenomenon mediated by the pleiotropic RecA
protein associated with ssDNA. RecA constitutes the positive regulator of the SOS response,
and its action is conditioned by the presence of excess single-stranded DNA (see Figure 7).
In E. coli, the SOS genes are involved in various functions, including DNA replication
(umuC, umuD, dinB, polB); DNA recombination (recA, ruvA, ruvB, recN); transcription
(lexA, fis, dinI, dea, suhB); nucleoside metabolism (grxA); transport (ycgH, glvB, ybeW, corA)
and cell division (sulA). This list only comprises the direct targets of LexA for which a clear
function has been defined. For a comprehensive table see Courcelle et al., 2001. The induc-
tion level of SOS-controlled genes upon activation of the response relies on several factors:
the number of binding sites, the intrinsic affinity of the sites to LexA and their location in the
regulatory region (Friedberg et al., 2005a). LexA binding may occur cooperatively when several
boxes are present in the same region (Gillor et al., 2008). Accordingly, the level of induction
varies widely in both strength and dynamics among members of the regulon (Courcelle et
al., 2001; Ronen et al., 2002). These induction dynamics reflect different sensitivities to decreased
levels of LexA, the quickest genes being the ones that can be induced with a limited
drop in LexA. Conversely, the genes that are induced late need significant RecA-mediated depletion
of LexA to be activated. Therefore, properties of the LexA-boxes are involved in fine-tuning
the hierarchy of gene induction. Because lexA is part of the SOS regulon, the master
repressor LexA controls its own expression. This negative auto-regulatory pattern typically
ensures tight regulation and rapid recovery of repressing conditions after induction (Alon,
2007). Indeed, standard SOS repression is reestablished within 1.5-2 cell cycles after induction
(Ronen et al., 2002). The positive regulator RecA is also controlled by the response, resulting
in a positive feedback loop provided sufficient amounts of ssDNA are available. This interplay
between antagonistic feedback loops – and maybe other unclear mechanisms implicating a
UmuDC-mediated checkpoint – establishes a multi-peaked pattern of expression from SOS
promoters at the single-cell level (Friedman et al., 2005).
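The induction hierarchy described above can be illustrated with a minimal numerical sketch. It assumes a simple Hill-type repression function, which is a common modeling choice and is not taken from the cited studies; the gene labels, half-repression constants and LexA levels are hypothetical values chosen only for illustration.

```python
def promoter_activity(lexa, k_half, hill=1.0):
    """Fraction of maximal expression of a LexA-repressed promoter.

    Assumed Hill-type repression: activity = 1 / (1 + (lexa / k_half) ** hill).
    k_half is the LexA level giving half-maximal repression: a weak LexA-box
    has a high k_half, a strong LexA-box a low one.
    """
    return 1.0 / (1.0 + (lexa / k_half) ** hill)


def induction_threshold(k_half, target=0.5, hill=1.0):
    """LexA level at which the promoter reaches `target` activity."""
    return k_half * (1.0 / target - 1.0) ** (1.0 / hill)


# Hypothetical promoters (arbitrary units); these are not measured values.
PROMOTERS = {"early_gene (weak box)": 50.0, "late_gene (strong box)": 5.0}

if __name__ == "__main__":
    for lexa in (100.0, 50.0, 20.0, 5.0, 1.0):  # progressive LexA depletion
        row = ", ".join(
            f"{name}: {promoter_activity(lexa, k):.2f}"
            for name, k in PROMOTERS.items()
        )
        print(f"LexA = {lexa:5.1f} -> {row}")
```

Under these assumptions, the promoter with the weaker box crosses any given activity level at a higher LexA concentration, that is, earlier during LexA depletion, whereas the promoter with the stronger box requires a much deeper depletion.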
RecA is primarily involved in homologous recombination by catalyzing the invasion
of duplex DNA by ssDNA (see p36). RecA monomers assemble on single-stranded DNA to
form recombinogenic nucleoprotein filaments. The processing of DSBs by the RecBCD com-
plex is a significant source of RecA nucleofilaments (Rosenberg, 2001). DSBs can arise upon
exposure to mutagenic factors either directly or as side-products of the collapsed replication
forks resulting from various DNA alterations. In addition, DNA damage can uncouple the
replication of the leading and lagging strands (Pagès and Fuchs, 2003), and dissociation of the
helicase from stalled replisomes may result in continued DNA unwinding beyond arrested
forks (Rangarajan et al., 2002). Both phenomena might be non-negligible sources of
RecA-coated ssDNA, probably in a RecFOR-dependent manner (Rangarajan et al., 2002; Fujii et
al., 2006). An additional role of RecA associated with ssDNA is to promote the auto-cleavage of
LexA. Therefore, RecA plays the role of a general and integrating sensor of DNA damage in
the SOS system.
Because DNA alterations occur spontaneously as the result of endogenous metabolism, ca. 0.3%
of a bacterial population cultured in standard conditions undergoes strong SOS induction
(McCool et al., 2004). As increased amounts of DNA lesions often arise from environmental
injuries, DNA-damage stresses are potentially a good indicator of selective environments
(Radman, 1975). Some potent inducers – such as UV radiation and the DNA-damaging
antibiotic mitomycin C – have been known for a long time, while others continue to be
reported. Strong oxidative stresses have been shown to induce the SOS response (Imlay and
Linn, 1987). The response is also promoted in aging colonies in a cAMP-dependent manner,
suggesting links with other physiological responses (Taddei et al., 1995; see also
Mutagenesis in aging colonies, p56). Commonly used antibiotics that are known to interfere
more or less directly with proper replication (such as fluoroquinolones, rifamycins and
trimethoprim) potently induce SOS (Kelley and William, 2006). Other unexpected triggers are
discussed below (see p54). Broadly speaking, any situation generating abnormal amounts of
ssDNA can potentially lead to SOS induction.
The SOS system is referred to as a response because a substantial part of the regulon is
precisely dedicated to repairing the DNA damage that induced its derepression. The proteins
UvrA, UvrB and UvrD are essential components of nucleotide excision repair. RecA,
RuvA and RuvB are implicated in homologous recombination and in the processing of stalled
replication forks. Furthermore, the three SOS-induced polymerases Pol II (polB), Pol IV (dinB)
and Pol V (umuCD) are implicated in translesion synthesis, thereby favoring replication in
times of stress (Napolitano et al., 2000).

b - Survival and variability during SOS induction

Repair mechanisms are devoted to maintaining genome integrity and to ensuring the immediate
survival of individuals. The upregulation of repair proteins by the SOS response is primarily
aimed at providing further resources when increased amounts of DNA damage are sensed in the
cell. For instance, exposure to UV light efficiently triggers the SOS response. Following
exposure, the NER pathway is overwhelmed by the increased number of UV-induced
photoproducts, which then cause more frequent stalling of the replication forks. The RecA
nucleofilament that results from the processing of collapsed forks leads to decreased levels of
functional LexA and to SOS induction. The subsequent increase in UvrABD allows NER to
cope with the excess of photoproducts. In the same light, higher levels of RecA, RecN, RuvA
and RuvB allow cells to overcome increased amounts of DSBs and stalled replication forks.
The expression of the otherwise tightly repressed sulA gene blocks cell division and causes
cell filamentation. This process constitutes a kind of checkpoint that buys time for cells to
perform the required repairs before potential template copies of the genetic information are
separated (Friedberg et al., 2005a). Another checkpoint may implicate Pol V (see below). The
SOS-induced upregulation of LexA ensures the effective repression of the regulon as soon as
the amount of ssDNA has dropped owing to effective repair.
The inactivation of the SOS response leads to tremendously increased lethality upon
exposure to genotoxic stresses such as UVs (see Figure 8 and Figure 9). The response is thus
clearly involved in promoting individual survival. By doing so, increased genetic variability is
incidentally introduced in the population. Indeed, repair mechanisms occasionally produce
mutations, such as illegitimate recombination events. In this respect, the role of the SOS-
induced EP polymerases is particularly ambiguous. Their primary role is probably to alleviate
the workload of other repair mechanisms (Matic et al., 2004). By replicating past lesions that
would otherwise lead to collapse of the replication fork, they limit the production of
problematic DSBs. Besides, damaged substrates can be replicated accurately during translesion
synthesis. Lesions are then transiently tolerated with the prospect that they might be properly
repaired subsequently. However, the derepression of Pol II, Pol IV and Pol V is responsible for
a significant increase in mutation rate.
Pol V is the most mutagenic EP polymerase in E. coli. Its functional expression is
nevertheless subjected to complex control mechanisms. Pol V is constituted by two UmuD’
monomers assembled with UmuC (Goodman, 2002). LexA tightly represses the umuD and
umuC genes, so that no expression is detectable under normal circumstances. Both genes are
strongly induced, though late in the course of the response – and only reach fairly low absolute
levels of expression (Courcelle et al., 2001). To be functional in Pol V, UmuD must undergo a
post-translational activation into UmuD’.
This modification corresponds to a self-cleavage induced by RecA nucleofilaments following
a mechanism similar to the one undergone by LexA (Friedberg et al., 2005a). UmuD and
UmuD’ each form homodimers, and interact with each other to form a heterodimer that is
more stable than either of the homodimers (Matic et al., 2004). Thus, the effective formation
of Pol V requires sufficient amount of RecA nucleofilaments to ensure the predominance of
UmuD’. In addition, interaction with a RecA nucleofilament polymerized downstream of the
stalled forks seems to be a required co-factor of the polymerase activity (Friedberg et al.,
2005b). Overall, only ca. 15 functional Pol V molecules per cell are assembled late after
SOS induction (Fuchs et al., 2004). The requirement for the late presence of RecA
nucleofilaments can be regarded as a kind of checkpoint (Opperman et al., 1999; Friedman et
al., 2005). In this view, Pol V-mediated translesion synthesis is used as a last line of defense to
deal with damage that other mechanisms were not able to overcome.
Contrasting with this situation, Pol IV is expressed at an appreciable concentration under
normal conditions (ca. 250 molecules per cell) (Fuchs et al., 2004). This may reflect the
involvement of this protein in restarting spontaneously arrested forks. Pol IV is encoded by
dinB, the expression of which is upregulated by a factor of ca. 10 relatively rapidly after SOS
induction. Pol IV is mostly implicated in mismatch extension, which constitutes a tolerance
mechanism relying on the subsequent action of MMR. However, this EP polymerase is also
able to replicate past some types of lesions, thereby promoting mutagenesis. Pol II is
occasionally able to replicate past a few lesions, but is mainly implicated in restarting stalled
replication forks by indirectly bypassing the lesion using the other nascent strand as a transient
template for replication (see p38). A wide variety of lesions can be tolerated through this
mechanism. Pol II is encoded by polB and is overexpressed by a factor of 4 very early after SOS
induction. Overall, Pol II and Pol IV seem to compete for rescuing arrested replication forks.
Pol II is overexpressed first – maybe to deal with arrested forks in the least mutagenic fashion.
Pol IV then enters into competition with Pol II, and offers increased opportunities to extend
misaligned strands and bypass certain lesions. If these mechanisms are not sufficient to deal
with the arrested forks, large amounts of DSBs sustaining a high level of RecA nucleofilament
may be produced, thereby favoring the derepression of umuC and umuD. Ultimately,
RecA-coated ssDNA filaments stimulate the expression of Pol V, which bypasses the excess of
lesions in a mostly mutagenic manner.

Even though the hierarchical pattern of regulation imposed on the SOS-induced EP
polymerases may minimize their mutagenic effects, they are responsible for the accumulation of
most mutations during the SOS response. Exposure of E. coli to UV leads to a 100-fold
increase in mutations, which are primarily targeted at damaged sites (Friedberg et al., 2005a). In
umuC or umuD mutant backgrounds, UV-induced mutations are largely reduced (Goodman,
2002). This observation, however, may reflect increased death rates. The important point is that
mutations are not introduced for their own sake, but in order to promote survival – as other
repair mechanisms do. The SOS response promotes the survival of individuals at the cost of
increased mutagenesis. If mutations were to be avoided at any price, the best strategy would be
to let altered cells die instead of repairing them. The evolution of such a behavior would require
selective pressure at the population level, because no individual-based selection can favor
death. Such a phenomenon is conceivable in multicellular organisms through apoptosis. Several
observations suggest that controlled cell death may play an important role in bacterial
populations experiencing some sorts of stressful conditions (Yarmolinsky, 1995; Lewis,
2000; Bayles, 2007; Rice and Bayles, 2008). In all these situations, termination of a
subpopulation promotes survival of the other cells, thereby benefiting the clonal population. In
the case of DNA-damaging stress, the existence of such a benefit is unlikely. Indeed, if a
mutation created while attempting repair proves deleterious, the cell will eventually die,
incurring little cost to the population. The generation of increased variability is thus most likely
a by-product of a desperate effort to survive.
It remains that this shortsighted survival mechanism promotes the generation of heritable
genetic variance in fitness. This can prove adaptive at the population level by occasionally
producing mutations capable of overcoming the triggering stress. This process is particularly
well illustrated by reports on the acquisition of antibiotic resistance. For instance, in a mouse
infection model using E. coli, the appearance of resistant strains can be evidenced soon after
treatment with a widely used fluoroquinolone antibiotic (ciprofloxacin). However, a mutant
strain that is unable to induce the SOS response does not give rise to resistant derivatives in the
same conditions (see Figure 9). Ciprofloxacin is a potent inducer of the SOS response.
Resistance to this antibiotic is easily acquired through point mutations in the gyrA gene, which
codes for the DNA topoisomerase targeted by the drug. Further genetic analyses showed that all
three SOS-inducible polymerases are required for the generation of these point mutations (Cirz
et al., 2005). An increased number of HR events may similarly result in advantageous genome
rearrangements. In this light, the generation of mutations consecutive to the SOS response
incidentally results in a transient mutator phenotype – which can help bacteria to adapt to
stressful situations. Such stress-induced mutagenesis can be subjected to second-order
selection, just as standard mutators are (Bjedov et al., 2003; Tenaillon et al., 2004). However,
clear-cut evidence indicating that this is indeed the case is lacking because the observed
phenomenon is also a side effect of stress-resistance mechanisms emerging from first-order
selection.

c - Extending the SOS response

Two different lines of evidence may strengthen the view that the SOS response is involved
in adaptive mutagenesis: i) the hijacking of the response by elements unrelated to
DNA repair; and ii) its induction by stresses that are not directly linked to DNA damage.
Paralleling the situation in E. coli, 33 genes are regulated by LexA in Bacillus subtilis (Au et al.,
2005). Nevertheless, the corresponding LexA-box consensus is remarkably divergent from E.
coli’s (5'-CGAACN(4)GTTCG-3' versus 5'-CTGTA(10)CAG-3', respectively) (Groban et al.,
2005). Only a handful of genes are shared between the two regulons (recA, lexA, ruvA, ruvB,
uvrA, uvrB, and uvrC). Strikingly enough, all are involved in DNA repair. Because E. coli and
B. subtilis are very distantly related bacteria, this subset of genes may be regarded as the core
of the LexA regulon. However, distinct LexA-boxes and regulons have been identified in
other bacteria, mostly using bioinformatic inferences. When all these data are taken into
account, the core regulon is drastically reduced to lexA alone – even if recA, ruvAB, uvrA and
ssb are commonly found under the control of LexA (Erill et al., 2007). Besides, three other
LexA-regulated genes involved in translesion synthesis (imuA, imuB, and dnaE2) were found
to be widely associated with lexA and may constitute an alternative ancestral operon (Erill et
al., 2006). In this view, the use of easily implementable – though mutagenic – translesion
functions consistently predates the incorporation of more sophisticated repair mechanisms in
the SOS response. In any case, these results highlight the universality and the tremendous
plasticity of the LexA regulon over evolutionary time in both gene content and binding-motif
sequences.
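The bioinformatic inference of LexA-boxes mentioned above can be sketched, in its simplest form, as a motif scan. The example below is only an illustration: it searches for exact matches to the B. subtilis consensus quoted in the text (5'-CGAACN(4)GTTCG-3'), whereas real binding sites are degenerate and are usually scored with position weight matrices. The function name and the test sequence are hypothetical.

```python
import re

# B. subtilis LexA-box consensus as quoted in the text: CGAAC, any 4 bases, GTTCG.
# The lookahead lets overlapping occurrences be reported as well.
BSU_LEXA_BOX = re.compile(r"(?=(CGAAC[ACGT]{4}GTTCG))")


def find_lexa_boxes(sequence):
    """Return (0-based start, matched site) for each exact consensus match.

    The two half-sites (CGAAC / GTTCG) are reverse complements of each other,
    so an exact match on one strand is also an exact match on the other, and
    scanning a single strand is sufficient for this simplified pattern.
    """
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in BSU_LEXA_BOX.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical promoter fragment, written only to exercise the function.
    fragment = "ttgacaCGAACatatGTTCGtataat"
    for start, site in find_lexa_boxes(fragment):
        print(start, site)
```

Exact matching is deliberately strict: it is likely to miss functional but imperfect boxes, which is one reason why regulon predictions of this kind need experimental confirmation.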
(i) In this context, it should not be surprising that new genes plug into the SOS regulon
so as to take advantage of its responsiveness. Notably, several mobile genetic elements have
been more or less directly linked to the response. The lytic cycle of several phages is repressed
by CI, a DNA-binding protein that undergoes a RecA-mediated autocatalytic cleavage similar
to that of LexA and UmuD (Sauer et al., 1982). This suggests either a co-option of the RecA
induction pathway by bacteriophages or a possible bacteriophage-related origin of the lexA
gene. Thus, the same triggers that induce the SOS response also promote prophage excision
in a lexA-independent fashion, allowing prophages to selfishly evade potentially compromised
hosts (Redfield, 2001). The mobilization and dissemination of the SXT integrative conjugative
element (ICE), which carries several determinants of antibiotic resistance, is controlled in
a similar fashion (Beaber et al., 2004). Some prophages are more directly plugged into the SOS
response. Two studies carried out in E. coli (Shearwin et al., 1998) and S. enterica (Bunny et al.,
2002) reported situations wherein prophages are repressed by a non-cleavable version of CI,
but include a gene (tum) encoding a LexA-repressed CI antirepressor. In V. cholerae, the
cholera toxin-encoding phage CTXφ is controlled by a complicated mechanism involving two
repressors, including LexA (Quinones et al., 2005).
Aside from phages, the mobilization of a few TEs has been linked with SOS induction.
Transposition of IS10R, the right-hand module of Tn10, is indirectly increased upon induction
of the SOS response by UV light. The exact mechanism mediating this phenomenon is unknown
but may involve upregulation of the ihfA gene encoding a subunit of the integration host
factor IHF (Eichenbaum and Livneh, 1998). In contrast, LexA directly represses the transposase
carried by the IS50R element of Tn5, and genetic data show that SOS functions result in
increased transposition frequency. However, no such increase could be monitored using
extrinsic inducers such as UV light or mitomycin C (Kuan and Tessman, 1991, 1992). UV light
was nonetheless found to promote the excision of Tn5 and Tn10 in a similar manner, and to
lead to Tn1 excision at higher doses (Aleshkin et al., 1998).
Most of these behaviors can be interpreted as the escape of selfish elements in the face of
adverse conditions (Matic et al., 2004). Nevertheless, they can also increase the adaptive
potential of the population. Although the induction of bacteriophages results in cell lysis, they
can transfer host genes to new recipient cells and may bring together the adequate set of
determinants to overcome the triggering stress. For instance, the mobilization of prophages upon
SOS induction is involved in the specific recruitment, encapsidation and subsequent transfer of
pathogenicity islands present in the genome of S. aureus (Ubeda et al., 2005; Maiques et al.,
2006). In addition, a significant proportion of phages contain factors enhancing the fitness of
their host, such as virulence or antibiotic resistance determinants. This is readily illustrated by
CTXφ, which encodes the cholera toxin (Quinones et al., 2005). The spread of such
determinants is particularly facilitated in the case of the conjugative element SXT (Beaber et al.,
2004). TEs are also frequently associated with such factors. Furthermore, their mobilization
can have a significant impact on the architecture of individual genomes (see Transposition,
p28). The transposition of both Tn5 and Tn10 has been shown to promote adaptation in a
chemostat (Chao et al., 1983). Because it may rely on an SOS-induced host factor, the
increased transposition of Tn10 may provide an example of an SOS function that is primarily
targeted at generating variability. As we will see later, the discovery that LexA controls
recombination in integrons provides a clear-cut example that the responsiveness afforded by the
SOS system can be purposefully co-opted by an adaptive mechanism (see Results –
Recombination in integrons is controlled by the SOS response to stress, p171).
(ii) The adaptive potential of the SOS response can also be enhanced by extending the
range of inducing situations. As mentioned earlier, several antibiotics are able to trigger the
response. All of them impact mechanisms linked with DNA processing, thereby providing a
straightforward causal explanation for their inducing effects. The discovery that β-lactam
antibiotics can induce the response was unexpected. Indeed, this class of antibiotics exerts a
bactericidal effect by inhibiting the synthesis of the peptidoglycan layer of bacterial cell walls
(Koch, 2003). The stress seems to be relayed to the SOS system by the DpiBA two-component
sensor system (Miller et al., 2004). In the same light, acoustic cavitation (Vollmer et al., 1998)
and high-pressure stresses (Aertsen and Michiels, 2005) have been shown to induce the SOS
system. In the latter case, activation of the cryptic type-IV restriction endonuclease Mrr would
mediate induction by introducing DSBs in the genome. These examples illustrate an
evolutionary strategy that consists in diverting an existing diversity-generating system in order
to increase adaptability to unrelated stresses.

I.2.2. Other examples of stress-induced mutagenesis


Other examples of stress-induced mutagenesis have been reported in various bacteria,
yeasts and human cancer cells. A comprehensive review is provided by Galhardo et al., 2007.
Although these processes vary widely in their molecular details, all rely on a common theme,
whereby existing DNA processing machineries are more or less directly subverted by one (or
more) regulatory responses to stress. Two significant processes implicated in different sorts of
mutations are discussed below. A general picture emerging from the coupling of physiological
responses to diversity-generating mechanisms will be further discussed later (see Physiological
regulation of mutagenesis, p110).

a - Mutagenesis in aging colonies

A linear increase in mutagenesis has been observed in aging colonies of E. coli grow-
ing for seven days on agar plates (Taddei et al., 1995). The induction of the SOS response –
and specifically the proteins RecA, UvrB and DNA Pol I – is required in this process (Taddei

56
Stress-induced mutagenesis – Other examples of stress-induced mutagenesis

et al., 1995; Taddei et al., 1997a). Increased mutagenesis is also dependent on the excretion of
cAMP, which is part of the release of catabolite repression in response to carbon starvation
(Taddei et al., 1995). In this light, this phenomenon can be regarded as a cellular response to
starvation stress.
To assess the prevalence of stress-induced mutagenesis in the wild, the impact of aging was
monitored in a large collection of E. coli isolates originating from a wide range of habitats
worldwide (Bjedov et al., 2003). Among 787 natural isolates, >80% exhibited increased
mutagenesis in 7-day-old as compared to 1-day-old colonies. Focusing on a single isolate, the
authors found that the response to aging is mostly LexA-independent and thus does not directly
rely on the SOS response – in contrast to what was observed for the laboratory strain. However,
it relies on genes that are part of the SOS regulon (polB and recA). Consistent with earlier
results, mutagenesis was dependent on the cAMP/CRP regulatory network. The global response
to stress orchestrated by RpoS (σS) was also found to be required. This alternative sigma
factor upregulates ca. 340 genes – many of which play various roles in stress resistance (Weber
et al., 2005). Its expression is induced by a wide variety of conditions, including entry into
stationary phase, starvation, acid pH, osmotic shock, cold shock and oxidative stress
(Hengge-Aronis, 2002). These genetic requirements further strengthen the idea that general
stress responses are implicated in mutagenesis through repair mechanisms.
The genetic discrepancies evidenced between laboratory and natural strains provide a
glimpse of the mechanistic diversity underlying stress-induced mutagenesis. In the
aforementioned E. coli collection, the magnitude of stress-inducible mutability ranged from <1
to >1000-fold among isolates (Bjedov et al., 2003; and see Figure 10). This pattern further
emphasizes the large variability that affects the genetic determinants of this phenomenon among
different strains.

b - The competence state

At least 40 taxonomically diverse bacterial species are naturally transformable (Lorenz
and Wackernagel, 1994). Most of them essentially use the same machinery to acquire naked
ssDNA from the extracellular environment (Chen and Dubnau, 2004). The assembly of this
machinery is tightly controlled, and defines a specialized physiological state termed compe-
tence. The distribution of genes involved in competence suggests that many more species
might be transformable – though the conditions triggering this developmental state are not
known (Claverys and Martin, 2003). Contrasting with other mechanisms involved in HGT –
such as transduction and conjugation – transformation relies on a mechanism that is inherent
to the species, and independent of extrachromosomal elements. Although competence can be
regarded as the only process to be genuinely dedicated to DNA exchange in bacteria, the ex-
act role of DNA uptake remains unclear. Three major hypotheses have been suggested: i) the
incoming DNA may provide a source of genetic diversity after recombination; ii) it may be
used as a template for DNA repair; or iii) it could be used as a food source (Redfield, 2001).
The major model organisms to study natural competence are the Gram-positive Bacillus sub-
tilis and Streptococcus pneumoniae and the Gram-negative Haemophilus influenzae Rd and
Neisseria gonorrhoeae. Although similar overall, their mechanisms of competence differ in
such a way that a generalization concerning their evolutionary role may not be relevant.
In B. subtilis, S. pneumoniae and H. influenzae, the competence regulons (com) have
been identified using microarrays (Claverys et al., 2006). In both Gram-positive species, the
number of induced genes greatly exceeds that required solely for genetic transformation. This
strongly suggests that the competence state is involved in a wider response than transformation
alone. Notably, a few SOS genes are upregulated in the competent state of B. subtilis (Love
et al., 1985; Hamoen et al., 2002). Although the determinism of this pattern is globally
unclear, the competence activator ComK overrides the repressive action of LexA in the
promoter region of recA (Hamoen et al., 2001). S. pneumoniae does not seem to possess an SOS
response, but recA is induced in competence-triggering conditions only if the competence
regulatory cascade is intact (Claverys et al., 2006). Altogether, these data suggest that the
induction of the competence regulon is coupled with the recombination machinery – which
supports a role for transformation in repair and genetic diversity.

However, in H. influenzae the com regulon is essentially composed of genes for DNA
uptake and processing and does not comprise recA. Besides, this regulon is controlled by CRP,
which is responsive to carbon source depletion (Redfield et al., 2005). These data rather
support the idea that DNA is taken up to spare the metabolic cost of nucleotide synthesis or as a
source of carbon, energy, nitrogen and phosphorus (Redfield, 2001). At odds with this
interpretation, H. influenzae is known to selectively take up DNA fragments containing short
and specific signal sequences (Smith et al., 1999). These sequences are highly overrepresented
in its genome compared to other bacterial species. One hardly sees the point in restricting
access to DNA if it is taken up only to be degraded. In contrast, both B. subtilis and S.
pneumoniae efficiently take up DNA from any source – although only homologous DNA
from related species is normally recombined into the cell's chromosome due to the restrictive
action of MMR (see p36).
In the two Gram-positive bacteria, the induction of competence involves a quorum-sensing
mechanism whereby the accumulation of a small peptide pheromone in crowded
populations activates a two-component regulatory system that relays the signal down to the
competence regulon. This can be interpreted as an adaptation favoring genetic exchanges
when DNA from conspecifics is most likely available (Morrison and Lee, 2000). In the same
light, competent S. pneumoniae cells can trigger the lysis of non-competent cells (Steinmoen
et al., 2003; Guiral et al., 2005). This fratricide process may serve to increase the local
availability of DNA sequences from distant strains with a different pheromone type (Claverys et
al., 2006). However, quorum sensing often has well-established roles in nutrient acquisition
and might act as an early warning signal of nutrient shortages, which are likely to result from
high population density (Redfield, 2001). Moreover, the competence state is induced by
starvation in both H. influenzae and B. subtilis. These observations rather support the
“DNA-as-food” hypothesis.
The failure of mitomycin C to induce transformability in B. subtilis and H. influenzae
(Redfield, 1993) and the report that a small genomic fragment had the same effect on UV sur-
vival as total chromosomal DNA upon transformation in H. influenzae (Dubnau, 1999) led to
the rejection of the “competence-for-repair” hypothesis (Redfield, 2001). However, mitomycin
C as well as aminoglycoside and quinolone antibiotics have recently been shown to induce
competence in S. pneumoniae (Prudhomme et al., 2006). Altogether, the available evidence
suggests that the competence state in this species may function as a global stress response,
wherein genetic transformation is used for repair in the absence of an SOS system (Claverys et
al., 2006).

As with the SOS response, the involvement of competence in the production of genetic
diversity may arise as a side effect of repair function. The many contributions of HGT to
modern bacterial genomes evidenced by comparative genomics suggest that these side effects
can prove advantageous (Ochman et al., 2000; Koonin and Wolf, 2008). However,
contemporary genomes only grant us access to successful evolutionary events. New genetic
combinations are likely to be more often harmful than beneficial, just as point mutations are
(Redfield, 2001). In this context, transformation can be seen as a modifier trait that is essentially
subjected to the same constraints as other mutators. To date, there is no evidence supporting
the commitment of transformation to genetic diversity rather than DNA repair in some species
(e.g. S. pneumoniae), and the uptake of DNA as a nutrient source appears to be a sound
hypothesis in others (e.g. H. influenzae). Interestingly, in B. subtilis the competence state has
been found to be required for cells to revert point mutations in auxotrophic alleles when grown
on minimal medium. This process relies on a putative EP polymerase and limiting
concentrations of MMR proteins, but is RecA-independent (Robleto et al., 2007). The
involvement of competence in a process of stress-induced mutagenesis that is not directly linked
with recombination of incoming DNA underlines the complex interconnection existing between
adaptive responses.
In the context of this thesis, it is noteworthy that the marine bacterium Vibrio cholerae
has recently been demonstrated to be naturally transformable. Although the underlying genetic
pathways are not known in detail, the induction of competence has been shown to rely
on: i) the availability of chitin (the most abundant polymer in the marine environment); ii)
stresses, such as nutrient starvation (RpoS general response); and iii) a high bacterial density,
such as in biofilms (quorum sensing) (Meibom et al., 2005). The competence-mediated
acquisition of large genomic segments in these conditions can account for part of the remarkable
genomic plasticity evidenced in Vibrionaceae (Miller et al., 2007a; Thompson et al., 2005).

I.3. Programed generation of genetic variations

In a changeable world, long-term stability of fitness is found in the adaptive variation
that mutability provides. The previous chapter highlighted the deleterious nature of most
mutations, and the consequent advantages of restricting increased mutagenesis to stressful
periods during which innovations are really needed. Although a genome-wide increase in
mutation rate is the most “creative” source of genetic novelty, it always leads to the
accumulation of a

60
Programed generation of genetic variations – Localized mutation through slipped-strand
mispairing
genetic load in the population by affecting traits that should be kept stable (Roth et al., 2006).
When recurrent modifications of the same traits are required over time, natural selection can
favor the emergence of specific and local mechanisms facilitating targeted variations. Discrete
regions of the genome are thus evolutionarily programmed to be inherently hypervariable.
In many cases, mutations are controlled both quantitatively and qualitatively, thereby permitting
variations to be channeled toward a specific region of the phenotypic space. Overall, these
mechanisms enable the generation of mostly adaptive innovations only where they are fre-
quently needed – with little harm to the global integrity of the genome.

I.3.1. Localized mutation through slipped-strand mispairing

a - Replication slippage

The terms replication slippage and slipped-strand mispairing refer to a general
mechanistic process accounting for the higher than expected instability of repeated DNA
stretches during replication. Stalling of the replication fork is a common event which is determined
by exogenous (DNA damaging agents) as well as intrinsic (presence of DNA binding
proteins, unusual DNA structures…) factors in both eukaryotes and prokaryotes (Mirkin
and Mirkin, 2007). It frequently results in the dissociation of the replisome and in the subsequent
separation of the daughter and parental DNA strands. Reannealing of repetitive sequences
can lead to stable misaligned intermediates containing a bulge. Larger numbers of
repeated motifs increase the stability of illegitimate intermediates and permit the formation of
mismatches away from the replication complex, thereby favoring polymerization over proofreading
(Kunkel, 2004). If the resulting mismatches are not corrected in the meantime, a second
round of replication fixes either a decrease or an increase in repeat number, depending on
whether the bulge is on the template strand or the nascent strand, respectively (see Figure 11).
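
The outcome rule stated above (a bulge on the template strand yields a contraction, a bulge on the nascent strand yields an expansion) can be sketched in a few lines of Python. This is only a toy illustration: the function names, the assumption that the bulge always contains a whole number of repeat units, and the assumption that the mismatch escapes correction are ours and do not come from the cited studies.

```python
def repeat_number_after_slippage(n_repeats: int, bulge_strand: str, bulge_units: int = 1) -> int:
    """Repeat number fixed by the next replication round after one slippage event.

    Toy model (assumption): the bulge contains a whole number of repeat units
    and is not corrected by proofreading or mismatch repair.
    """
    if bulge_units < 1:
        raise ValueError("bulge_units must be at least 1")
    if bulge_strand == "template":
        # Looped-out template units are not copied: contraction.
        if bulge_units >= n_repeats:
            raise ValueError("bulge cannot remove the whole tract in this toy model")
        return n_repeats - bulge_units
    if bulge_strand == "nascent":
        # Looped-out nascent units are synthesized a second time: expansion.
        return n_repeats + bulge_units
    raise ValueError("bulge_strand must be 'template' or 'nascent'")


def tract(motif: str, n_repeats: int) -> str:
    """Build the repeat tract, e.g. tract('CAA', 3) gives 'CAACAACAA'."""
    return motif * n_repeats


if __name__ == "__main__":
    parental = 8  # hypothetical starting repeat number
    for strand in ("template", "nascent"):
        n = repeat_number_after_slippage(parental, strand)
        print(strand, n, tract("CAA", n))
```

Because both outcomes arise from the same event type, applying the function with "nascent" and then "template" returns the tract to its initial length, which mirrors the reversibility of slippage-based mutations.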

b - Simple Sequence Repeats (SSRs) are variable loci

Genomic regions made up of contiguous iterations of simple DNA motifs are overrep-
resented in natural DNA sequences. The repeated motifs consist of one to several nucleotides,
forming for example a homopolymeric tract of guanine or tandem repeats of 5′-CAA-3′.
These structures are generically referred to as simple sequence repeats (SSRs). In eukaryotes,
SSRs are often called micro- or mini-satellites (where the unit of repetitive DNA is 1-6 or >6
nucleotides, respectively). Whereas eukaryotic microsatellites often comprise hundreds of repeats,
in bacteria the numbers of repeated units are generally substantially less than 100. SSR
loci are often polymorphic. Indeed, both eukaryotic and bacterial SSRs are prone to
expansions or contractions in the number of repeat units through slipped-strand mispairing.
In eukaryotes, where the number of repeats is higher, unequal homologous recombination
may also be involved. The initial development of an SSR must rely on the fortuitous formation
of a repetitive seed through random substitutions or duplication of a small DNA segment.
The mutation rates of SSR loci generally lie between 10⁻² and 10⁻⁵, numbers that are orders of
magnitude higher than those corresponding to standard spontaneous mutations (see p30). The
spontaneous, stochastic and reversible phenomenon of replication slippage thus endows SSR loci
with a high and specific mutation rate. Importantly, this mechanism does not require any specific
apparatus and can be seen as an emergent property of the DNA replication process.
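
To give a feel for what such rates mean at the population level, the short calculation below compares an SSR locus with a standard locus. It is a back-of-the-envelope sketch under stated assumptions: rates are treated as per locus per generation, cells mutate independently, and the background rate of 10⁻⁹ and the population size are placeholder values chosen for illustration, not figures taken from the text.

```python
def expected_variants(population_size: int, rate: float) -> float:
    """Expected number of new variants at one locus in one generation."""
    return population_size * rate


def prob_at_least_one(population_size: int, rate: float) -> float:
    """Probability that at least one cell carries a new variant (independent cells)."""
    return 1.0 - (1.0 - rate) ** population_size


if __name__ == "__main__":
    population = 10**6          # hypothetical clonal population
    ssr_rates = (1e-2, 1e-5)    # bounds quoted in the text
    background_rate = 1e-9      # ASSUMPTION: placeholder for a standard locus
    for rate in (*ssr_rates, background_rate):
        print(f"rate={rate:g}: "
              f"expected variants={expected_variants(population, rate):g}, "
              f"P(at least one)={prob_at_least_one(population, rate):.3g}")
```

Under these assumptions, even the lowest SSR rate is expected to produce several variants per generation in a population of a million cells, whereas a variant at the placeholder background locus would be rare.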
SSRs are commonly regarded as junk DNA. Widely used techniques such as DNA
fingerprinting, lineage analysis and gene mapping actually rely on the assumption that SSR
mutations are neutral. This biased view is mostly based on observations from eukaryotic mi-
crosatellites. Indeed, SSRs were first described to be rather rare in prokaryotes (Tautz et al.,
1986). The prolificacy of fast-growing organisms such as bacteria is largely dependent on genome
size. In contrast to most eukaryotes, bacterial genomes are thus under strong selection
for compactness, leaving few non-functional intergenic regions where SSR mutations could
actually be neutral. This probably helped to establish a direct link between SSR loci and
relevant phenotypes. The analysis of complete genome sequences provides a systematic ap-
proach to identify putative SSR loci and determine their overall distribution (Medini et al.,
2008). The first study of this kind was carried out on the genome of Haemophilus influenzae
(Hood et al., 1996). All SSR regions identified in this genome are reported in Table 3. Strik-
ingly, most of the repeated motifs are embedded in functional genes. Furthermore, the annota-
tions of these genes are enriched in specific functions, such as lipopolysaccharide (LPS)
synthesis. These examples readily illustrate that SSRs, located either within the reading
frames of genes or their promoters, can impact quantitative traits. When SSRs are located
within reading frames the length of the repeated unit is often a multiple of three. In eukaryotes,
many structural and cell-surface proteins, as well as transcription factors, seem to have
evolved by expansion of minisatellites, with each repeated unit encoding an oligopeptide motif.
Mutations in these SSRs can result in qualitative variations of the proteins by finely modulating
their function (Kashi and King, 2006). However, most mutations affecting SSRs have a
quantitative impact on gene expression. Most expansion-contraction events result in coding
sequence frameshifts, thereby enabling ON-OFF binary phenotypic switches by the production
of a non-functional, truncated protein. The impact of SSR variations on expression can
also result in more nuanced phenotypes. Mutations at the beginning of coding sequences can
shift the translation initiation site, thereby modifying the level of expression of the protein
without altering its function (Kashi et al., 1997; Dawid et al., 1999). Besides, expansion-contraction
of repeats located upstream of the translational start codon can affect the binding
of transcription factors, alter the spacing between regulatory elements and promote competition
between alternate promoters. This provides an effective way to fine-tune phenotypic traits
through the graded, analog-like modulation of transcription rate (Moxon et al., 2006). In eukaryotes,
some SSRs have also been reported to impact the DNA structure and packaging, and to affect
mRNA splicing (Kashi and King, 2006).
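
The frameshift logic behind ON-OFF switching can be made explicit with a small Python sketch. It assumes, for illustration only, that the downstream coding sequence is in frame at a given reference repeat number and that nothing else changes; the function name and the example repeat numbers are hypothetical.

```python
def is_in_frame(unit_length: int, n_repeats: int, n_repeats_on: int) -> bool:
    """True if the downstream coding sequence stays in frame.

    unit_length  -- length of the repeated motif in nucleotides
    n_repeats    -- current number of repeat units
    n_repeats_on -- ASSUMPTION: a repeat number at which the gene is in frame (ON)
    """
    return ((n_repeats - n_repeats_on) * unit_length) % 3 == 0


if __name__ == "__main__":
    # Tetranucleotide repeat: each added or lost unit shifts the frame by one
    # nucleotide, so the gene is ON only for one repeat number out of three.
    for n in range(8, 15):
        state = "ON" if is_in_frame(4, n, n_repeats_on=10) else "OFF"
        print(f"tetranucleotide x{n}: {state}")
    # Trinucleotide repeat: the frame is preserved whatever the repeat number,
    # so length changes alter the protein sequence rather than switch it off.
    print(all(is_in_frame(3, n, n_repeats_on=10) for n in range(8, 15)))
```

This also illustrates why repeated units located within reading frames are often a multiple of three when the repeat is meant to modulate the protein rather than to switch its expression.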

c - Phenotypic impact

Most instances of SSR have been investigated in silico and proper experimental vali-
dations of the potential role of variations are lacking. This section provides detailed examples
and emphasizes the functional impact of SSR on fitness.
In prokaryotes, early-recognized phenomena generally involve a dual and reversible
switching between ON and OFF states, corresponding to full and no (or markedly decreased)
expression of a phenotypic trait. Such processes are known as phase variation and result in the
diversification of a clonal population into two distinct subpopulations. More complex pheno-
types bestowed by the combinatorial effects of several phase varying loci are referred to as
antigenic variation, because they essentially affect cell surface structures implicated in host-
pathogen interactions.
A well documented example of antigenic variation concerns the aforementioned LPS
synthesis in H. influenzae – whereby tetranucleotide repeats within genes alter expression
through alternative transcription start sites or protein inactivation (Moxon et al., 2006). LPS is
the major antigenic structure of the cell envelope in gram-negative bacteria. Its nature determines
the physiological properties of the envelope and is critical in defining the virulence of
the bacteria. The role of three phase-variable proteins in generating different LPS through the
combinatorial addition of core sugars to the molecule backbone has been elucidated in detail
(see Figure 12). Importantly, clear selective advantages have been associated with different
forms (Weiser and Pan, 1998). For instance, addition of phosphorylcholine by the lic1-
encoded protein is associated with more efficient colonization of nasopharyngeal epithelia,
but is also targeted efficiently by innate immunity. In contrast, variants wherein lic1 expression
is switched off are more resistant to host clearance. Besides, switching of several glycosyltransferases
implicated in LPS modification was also shown to promote resistance to antibody-mediated
clearance. Similar modifications of the LPS have been documented in
Neisseria spp. (Yang and Gotschlich, 1996) and Helicobacter pylori (Bergman et al., 2004).
The rapid phenotypic switching may be important for adaptation in the course of infection or
during transmission from host to host.
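
The combinatorial nature of antigenic variation described above can be illustrated by enumerating the expression states of a few independently switching loci. The sketch below is purely illustrative: apart from lic1, the locus names are hypothetical placeholders, and the loci are assumed to switch independently between exactly two states.

```python
from itertools import product


def expression_states(loci):
    """All ON/OFF combinations for a list of independently phase-variable loci."""
    return [dict(zip(loci, states))
            for states in product((True, False), repeat=len(loci))]


if __name__ == "__main__":
    # 'locus_B' and 'locus_C' are hypothetical names used for illustration only.
    loci = ["lic1", "locus_B", "locus_C"]
    states = expression_states(loci)
    print(len(states))  # 2 ** 3 combinations
    for state in states:
        print(" ".join(f"{name}:{'ON' if on else 'OFF'}" for name, on in state.items()))
```

With n such loci, a clonal population can in principle explore 2ⁿ surface phenotypes, which is how a handful of phase-variable genes can generate a diverse set of LPS forms.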
Most studies of phase variation have focused on pathogens, and therefore the association
with virulence and immune evasion has been emphasized. A metabolic involvement of
SSRs has nevertheless been identified with the ahpC gene of E. coli. A one-motif expansion of a (TCT)₄
tract has been shown to drive the functional conversion of the encoded protein from a peroxidase
to a glutathione-glutaredoxin reductase. Interconversion between the two alleles should
mediate alternate survival to oxidative and disulfide-mediated stress with a single phase-variable
gene (Ritz et al., 2001).
The biological role of most eukaryotic microsatellites remains uncertain, but some have
been elucidated. Paralleling the bacterial case, gene-associated SSRs seem to affect predominantly
cell-surface proteins involved in cell adhesion and flocculation in Saccharomyces cerevisiae
(see Figure 13). In this organism, two alleles of the ras2 gene that differ by the presence
of A9 and A10 poly-A tracts in the promoter region were shown to confer high and very
low sporulation frequency, respectively. As sporulation efficiency is a significant life-history
trait for yeast, this polymorphism is likely to be of adaptive significance (Kashi and King,
2006).
In higher eukaryotes, non-neutral SSRs have long been associated with deleterious effects.
The first phenotypes that were associated with SSR polymorphism were related to diseases.
In particular, the so-called triplet repeat diseases include well-known hereditary
pathologies (Fragile X, Huntington’s disease, spinocerebellar ataxia, cleidocranial dysplasia…),
many of which are caused by homopolymeric amino acid stretches within proteins. These disorders are
characterized by a peculiar pattern of inheritance that has been referred to as genetic anticipation,
because symptoms become more severe and tend to appear earlier in successive generations.
This exemplifies the potential graded, analog-like effect of SSR-based mutations.
More recently, SSR polymorphism has been implicated in diverse adaptive phenotypes.
In Drosophila melanogaster, the per gene is implicated in circadian clock control and
contains hexanucleotide repeats encoding (Thr-Gly) iterations. One allele containing 17 repeats
yields a circadian period closer to 24 hours, whereas a 20-repeat variant is less adjusted but
proves less sensitive to temperature fluctuations. The geographical distribution of the two alleles
correlates with temperature so that the buffering allele is associated with colder regions,
a pattern that is indicative of positive selection (Sawyer et al., 1997). Strong correlations between
selected phenotypes and SSRs have been reported in dogs. Emblematically, the presence
of extra toes in the Great Pyrenees breed is consistently linked with a 51-bp contraction
of a hexanucleotide repeat in Alx-4. Reinforcing the observation, this gene was previously associated
with polydactyly in mice (see Figure 14). The fact that dogs were actively subjected
to intense artificial breeding has certainly provided good material to identify such relationships.
This illustrates how mutations at SSR loci might be important in morphological adaptation
to natural selection (Fondon and Garner, 2004). The last example extends the range of
phenotypes altered by SSRs to include social behaviors. Indeed, different species of voles have
been observed to display distinct social behaviors ranging from highly social and monogamous
to definitely asocial. These differences have been linked to the expression pattern of the
vasopressin receptor. Social species exhibit a compound SSR in the 5’ regulatory region of
the corresponding avpr1a gene, much of which is absent in asocial species. This was shown to
result in differential expression with respect to cell types, most probably through differential
binding properties to transcription factors. Moreover, after artificial selection for longer and
shorter alleles, it was shown that the length of the selected tract correlated with quantitative
differences in brain distribution of vasopressin receptor and in individual social behavior
(Hammock and Young, 2005).

d - SSRs as localized mutators

Because they enable a specific mechanism of mutation, SSRs display an increased mutation
rate. Besides, these mutations can result in adaptive phenotypic changes. SSRs thus provide
a simple strategy to increase phenotypic variability by specifically altering the expression
or function of single genes. In this respect, an important characteristic of SSRs consists in their
ability to modulate their own mutation rates within the limits of certain constraints. The mutability
of SSRs is indeed affected by the length, sequence, number, and purity of the repeated
motifs.

i. The structures of SSRs influence their mutability

Experimental studies have shown a correlation between the tract length and the intrinsic
mutability of SSRs. Higher repeat numbers increase the stability of mispaired strands arising
after a replication fork has stalled (Kunkel, 2004). Models based on E. coli propose that some
SSRs form barriers for the DNA polymerase because of their tendency to form secondary
structures, thereby promoting stalling of the replication fork (Bichara et al., 2006). This suggests
a direct effect of these SSRs in initiating the slippage responsible for their own mutability.
In addition, the mispairing frequency is highly sensitive to the degree of homology
between repeats. The accumulation of point mutations can thus stabilize SSRs, whereas active
mutational slippage tends to eliminate imperfect repeats. The purity of the repetitive structure
is thus differentially affected by the directionality of the selective pressure: stabilizing point
mutations are favored by purifying selection against facilitated variations, while diversifying
selection favors the maintenance of highly mutable SSRs. Altogether, these mechanisms provide
a direct coupling between the intrinsic mutability of SSR loci and their effects on fitness.
The system is nevertheless constrained by the function of the element in which the SSRs are
located. It seems reasonable to think that the length variations of repeat-encoded peptides are
limited. However, the nature and position of the repeated unit might allow a certain tolerance.
For instance, removal of the tetranucleotide repeats located in three H. influenzae genes involved
in LPS synthesis does not impact the protein function. The major role of the SSRs is thus to
mediate frameshift-mediated switching of expression (Moxon et al., 2006). In dogs, variations
in facial shape are best explained by the length ratio of two adjacent SSRs in the transcription
factor Runx-2 (Fondon and Garner, 2004). That the function of the protein depends on a rela-
tive ratio rather than an absolute length hints at a possible mechanism to relieve constraints.

ii. Ideal mutators

Broadly speaking, selection should act against the presence of unstable and mutagenic
SSRs within a gene. Indeed, it has recently been shown that synonymous codons are used in a
way that avoids the emergence of nucleotide repeats within coding sequence (Ackermann and
Chao, 2006; Wanner et al., 2008). This relates to the general observation that minimal muta-
tion rates are favored in the absence of favorable conditions (see The lowest… the best, p39).
Each allele of a given SSR locus encodes both a phenotypic effect – its repeat number – and a
mutation rate. These loci are thus subject to a second order selection process: natural selection
acting on the fitness effects of SSR alleles also indirectly selects their mutation rates. Because
its site for mutation is itself, an SSR locus can then be viewed as a nearly ideal mutator (Kashi
and King, 2006). This setup maximizes the genetic linkage between what mutates and what is
mutated, thereby avoiding the breakage of hitchhiking by recombination. Furthermore, this
strategy comes with a minimal genetic load because surrounding loci are maintained under
a standard mutation rate. Another important characteristic of SSR-based mutations is their facile
reversibility, because extensions and contractions are produced according to the same
mechanism. This allows alleles to fluctuate between a discrete number of states, thereby orienting
the phenotypic impact of variations.

iii. Extrinsic influences affecting SSR mutation rates

Apart from the cis effect of SSRs on their own mutation rates, some factors may be in-
volved in trans regulation. In N. meningitidis, the inactivation of the mutS or mutL genes in-
volved in MMR leads to striking phase-variation increases in genes containing
mononucleotide repeats (Richardson and Stojiljkovic, 2001). Furthermore, meningococcal
isolates containing natural variations in mutS and mutL were also associated with increased
phase variation rates (Richardson et al., 2002). In the same light, saturation of MMR upon
natural transformation also results in increased variations at SSR loci. Together, these data
provide mechanistic insight into the coupling of SSRs with processes implicated in the modulation
of global mutation rates. This effect, however, may depend on specific characteristics of
the repeat. Indeed, experiments on artificial systems in E. coli showed that MMR is more efficient
in processing shorter mispairings (Parker and Marinus, 1992). In H. influenzae, tetranucleotide
repeats are refractory to the activity of MMR while artificial dinucleotide repeats are
affected (Bayliss et al., 2002).
These results suggest that in some cases SSR mutability can be induced by stressful
conditions, such as those leading to saturation of the MMR machinery. Another hint in this
direction comes from the observation that oxidative stresses can destabilize artificial microsatellites
in E. coli (Jackson et al., 1998). Being prone to elongate mismatched termini, the SOS-induced
EP polymerase Pol IV may be expected to play a role in this process. However, this
seems not to be the case (Jacob and Eckert, 2007). In wheat, a report suggested that SSR mutation
rates are increased by fungal infections (Schmidt and Mitter, 2004).

I.3.2. Mutation by intragenomic recombination


The quantitative and qualitative adjustments of phenotypes provided by the targeted
mutation of SSRs might be one of the simplest and most straightforward means of increasing the
genetic diversity on which selection can act. The repeated motifs involved in this mechanism
are typically short. Longer repetitions are involved in a variety of recombination mechanisms
that impact the structure of the genome. These events subvert the standard recombination machinery
of the cell, or alternatively rely on specifically dedicated systems to produce potentially
adaptive variations.

a - Meiotic sex

By mediating the reassortment of alleles within a population, meiotic sex is expected to
increase the actual variance in fitness. Although this process occasionally produces fitter
genotypes, it also breaks advantageous combinations down, thereby incurring a recombinational
load (de Visser and Elena, 2007). While obligatory in some species, sexual reproduction
is a facultative trait in others. The choice of sexual, as opposed to asexual, reproductive
strategies may provide a way to increase the variation in a population during hardship
(Greig et al., 1998; Grimberg and Zeyl, 2005). In this context, individuals essentially gamble
on whether a process of genetic shuffling can result in fitter offspring in future environments.

b - Gene conversion

i. Overview

Gene conversion corresponds to the non-reciprocal transfer of information between
homologous sequences that arises during recombination of degraded DNA ends. This process
homogenizes the genetic information carried by different DNA molecules or loci. Although
gene conversion can be associated with important genomic rearrangements due to concomitant
crossing-overs, in half of the cases it is restricted to a small region and preserves the overall
genome architecture, thereby minimizing potentially deleterious effects (see Figure 15). Studies
on gene conversion were initially conducted in ascomycete fungi (Saccharomyces, Neurospora or
Sordaria), wherein all the products of a single meiosis are clearly separated from each other in an ascus.
In the absence of such structures, it is often difficult to ascertain if an event arose through
gene conversion or by double reciprocal crossovers between sister molecules. Despite these
difficulties, molecular systems have been set up to successfully accumulate evidence of gene
conversion in prokaryotes (Santoyo and Romero, 2005). The most widely accepted mechanism
for gene conversion is currently the DSB repair (DSBR) model shown in Figure 16.
Conversion events can switch the expression states of closely related sequences and create
new sequences through the combinatorial rearrangement of different segments. Representative
examples of adaptive gene conversion are presented below.

ii. Programmed rearrangement through gene conversion

The most remarkable examples of adaptive gene conversion arise from host-pathogen
relationships, wherein pathogens diversify their antigenic transmembrane proteins to escape
the host immune system and establish chronic infections. Ironically, gene conversion is also
responsible for immunoglobulin diversification, notably in chickens (Stavnezer et al., 2008).
The common theme to these mechanisms is that variant gene sequences are transferred
through gene conversion from unexpressed genes (pseudogenes or gene cassettes) into an ex-
pressed locus. Overall, this process corresponds to the duplicative transposition of the donor
sequence. The number of unexpressed cassettes and the diversity of sequences condition the
generation of variability at the expression site. As only the exposed domains of an antigen
need effective variation, gene conversion can readily occur between invariant regions flanking
the variable parts of the sequence. In some instances, successive events of segmental gene
conversion generate combinatorial variations (see Figure 17, p75). Variable systems based on
gene conversion have been evidenced in diverse organisms, providing a good example of
convergent evolution (Palmer and Brayton, 2007). Prokaryotic examples are reviewed in
(Santoyo and Romero, 2005) and (Wisniewski-Dyé and Vial, 2008). Eukaryotes are best rep-
resented by Trypanosoma brucei (Taylor and Rudenko, 2006).
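
The common theme described above, the non-reciprocal copy of a silent donor's variable region into the expression site between conserved flanks, can be sketched as a simple string operation. This is a deliberately naive illustration: the function name, the toy sequences and the requirement of exact flank matches are assumptions made for the example, and real recombination tolerates partial homology.

```python
def gene_convert(expression_site: str, donor: str, left_flank: str, right_flank: str) -> str:
    """Replace the segment between two conserved flanks with the donor's segment.

    The donor is left unchanged (non-reciprocal transfer). ASSUMPTION: both
    flanks occur as exact matches in the expression site and in the donor.
    """
    def variable_region(seq: str) -> tuple[int, int]:
        start = seq.index(left_flank) + len(left_flank)
        end = seq.index(right_flank, start)
        return start, end

    es_start, es_end = variable_region(expression_site)
    d_start, d_end = variable_region(donor)
    return expression_site[:es_start] + donor[d_start:d_end] + expression_site[es_end:]


if __name__ == "__main__":
    # Toy sequences, hypothetical: flanks GCC / GGT surround the variable part.
    expressed = "ATG" + "GCC" + "AAAAAA" + "GGT" + "TAA"
    silent_cassettes = ["GCC" + "CTCTCT" + "GGT", "GCC" + "TGTGTG" + "GGT"]
    for cassette in silent_cassettes:
        print(gene_convert(expressed, cassette, "GCC", "GGT"))
```

Applying the function successively with shorter internal flanks would give a crude picture of segmental conversion, in which several donors each contribute part of the variable region.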
Borrelia hermsii and B. burgdorferi are tick-borne spirochetes, the causal agents of relapsing
fever (transmitted by soft ticks) and Lyme disease (transmitted by hard ticks), respectively.
In B. hermsii, persistence within mammalian hosts is ensured by the sequential emergence
and replication of variants expressing unique variable membrane proteins (VMPs) expressed
from a single locus. At least 59 full-length and unexpressed vmp gene cassettes are
located on linear plasmids and serve as donors for gene conversion into the vmp expression
site. Recombination of the full-length donor involves sequences overlapping the 5’-coding
sequence on one end and external to the coding sequence on the 3’ end. This mechanism
readily provides a strategy to switch between an appreciable number of surface proteins
(Dai et al., 2006). In this system, the switching rate between different alleles is determined by
the degree of sequence identity on the 5’ side and by the distance of the 3’ homologous region
with respect to the coding sequence in non-expressed cassettes. The variability of the
central sequences was not found to affect switching frequencies (Barbour et al., 2006).
The closely related B. burgdorferi illustrates an important refinement to this mechanism.
One of its major surface-exposed lipoproteins (VlsE) is expressed from a single
locus located on a linear plasmid. The spirochete also uses gene conversion to convert the
expressed sequence from a repository of silent donor sequences. However, the genome of
B. burgdorferi contains only 15 silent vls cassettes (Zhang et al., 1997). Antigenic variants
produced during mouse infection result from combinatorial gene conversion events, so that
a single variant may arise from 6 to 11 events of segmental recombination (Zhang and Norris,
1998). This phenomenon may also occur in B. hermsii, but might be less obvious because
of the larger number of donor sequences.

In Anaplasma marginale, a tick-borne rickettsia responsible for anaplasmosis in
mammals, avoidance of the immune system is achieved by antigenic variation in genes of the
major surface protein 2 family (msp2). The genome sequence revealed only 5 to 7 donor sequences,
none of which is full-length. That donor sequences are pseudogenes ensures that effective
expression occurs only at the expression site. Variants arising early in infection are
characterized by simple gene conversion of the msp2 gene. Persistence of the pathogen, characterized
by the continuous appearance of mosaic variants through segmental gene conversion,
has been monitored over a period of two years. Simple gene conversion seems initially favored
because the native sequences are more competitive than the chimeric ones when naïve
animals are infected. The advantage of combinatorial variants then arises with training of the
host immune system. Up to four sequential changes have been detected in the expressed gene.
Based on this number, roughly 6500 (9⁴) potential variants could be generated in this seemingly
limited system (Palmer et al., 2007).
The most striking example in terms of achievable diversity comes with the etiologic
agent of sleeping sickness in humans, the flagellated unicellular protozoan Trypanosoma
brucei. Bloodstream-form cells of this extracellular parasite are coated with a unique form of
the variant surface glycoprotein (VSG). The VSG coat is an effective protection against complement-mediated
lysis, but is effectively targeted by the adaptive immune response. This selective
pressure results in sequential parasitemic cycles, whereby new VSG variants emerge
and thrive until a new effective antibody is derived by the host. Dwarfing the number of variant
antigen genes found in other organisms, T. brucei contains a repertoire of at least 1250 to
1500 silent vsg genes, as estimated from an incomplete genome sequence (Berriman et al.,
2005). The vast majority (>1250) are present in tandem arrays ranging from three to 250 copies
and located at subtelomeres, while another set is present on about a hundred stable minichromosomes
that seem to have arisen solely to increase the number of telomeric VSGs. In a
process reminiscent of B. hermsii, full-length vsg donors can be mobilized by gene conversion
using characteristic 70-bp repeats upstream of the genes and a conserved domain within the 3’
end of the coding sequence. However, the genome project revealed that >90% of silent vsg genes are
in fact pseudogenes. These can only be used through segmental gene conversion, as in A. marginale
(see Figure 17).
The mechanism of VSG variation reveals additional complications. The VSG expression
pattern depends on the phase of the parasite life cycle. T. brucei contains ca. 20 telomeric
bloodstream-form VSG expression sites and ca. 25 metacyclic VSG expression sites, which
are also telomeric but structurally different from the former. The latter are active immediately
after infection, but are quickly silenced as the trypanosome switches to the exclusive
activation of one of the bloodstream-form expression sites. Then, only a single type of
VSG is exposed at a time. Such a hierarchy presumably avoids exhaustion of the antigenic
repertoire. The process ensuring the mutually exclusive expression of a single VSG gene out
of 20 bloodstream-form expression sites is unknown, but can lead to in situ expression
switches that do not rely on recombination. Irrespective of gene conversion, the expression
pattern can also be switched through telomere exchange, whereby silent cassettes at the end of
the (mini)chromosomes are simply recombined to the currently active telomeric expression
site. The need for increased variability is not restricted to host-pathogen interactions, but is a
desirable characteristic to thrive in a variety of ecological niches. The overrepresentation of pathogens
in the reported examples of adaptive gene conversion probably results from a biased focus on
medically or economically relevant organisms. It is not far-fetched to expect the discovery of
similar systems performing varied functions in organisms with different lifestyles. The regulation
of sexual reproduction in yeast through mating type switching provides a good illustration.
Homothallic strains of S. cerevisiae grow as haploid cells of either the a or α mating
type. Only cells of opposite mating type can fuse to form a/α diploids capable of producing
meiotic spores in response to starvation. All physiological differences between a and α cells
and between haploid and diploid yeast cells are ultimately determined by the DNA sequences
present at the MAT locus on chromosome III. Haploid, but not diploid, cells undergo frequent
inter-conversion of mating type during growth, with a frequency reaching once every cell
division. Two regions located at each end of chromosome III, HMRa and HMLα, correspond
to silenced backups of the sequences specifying the a and α types, respectively. The MAT locus
actually ensures the expression of a duplicate copy of one of these regions (see Figure 18).
Owing to conserved regions between the two types, replacement of the current expressed
copy by a silent donor from the opposite type can occur by gene conversion. Importantly, the
switch is initiated by a DSB in the MAT locus specifically introduced by the HO endonuclease
(Haber, 1998).

iii. Setup of a gene conversion system

Operational gene conversion between two sequences requires a specific pattern of genetic
diversity, wherein homologies that are sufficiently conserved to allow recombination
flank a region that is sufficiently variable to mediate effective phenotypic changes. How can such
an asymmetric pattern emerge?
Gene conversion plays an important role in the evolution of multigene families. Nu-
merous phylogenetic studies evidenced a different pattern of evolution between orthologous
and paralogous genes, with paralogs evolving in a non-independent way. Gene conversion is
thought to spread identical mutations between closely related paralogs, leading to sequence
homogenization within a multigene family – a process known as concerted evolution
(Santoyo and Romero, 2005). Because high similarities favor recombination and thus gene
conversion, homogeneous families are more prone to gene conversion, which establishes a
self-sustaining dynamic that limits divergence between genes. As advantageous mutations are likely
to maximize their fitness effects when they spread to all the members of a family, directional
selection might be an important driver of concerted evolution.
The specific pattern required for adaptive gene conversion may simply emerge from
concerted evolution. After duplication, two paralogs may stochastically begin to diverge, either
during a period of relaxed selection or because one copy happens to be non-expressed.
This latter condition would even be a necessary prerequisite. As soon as the trait encoded by
the duplicated gene experiences a diversifying pressure, gene conversion events become ori-
ented by natural selection from the silent copy toward the expression site. In the meantime,
the non-expressed copy is free to accumulate mutations through neutral drift. Only mutations
affecting the exposed domains of the protein will be repeatedly selected. The interplay between
frequent gene conversion and positive selection would then lead to the homogenization of the
flanking sequences through concerted evolution, while promoting the maintenance of the
variability accumulated in the central region.

The paralogous babA and babB genes, which code for outer membrane proteins in H.
pylori, may illustrate an intermediate step in the formation of a complex gene conversion sys-
tem. A thorough phylogenetic analysis identified a pattern of concerted evolution restricted to
the 3’ region of the sequence, and it was demonstrated experimentally that gene conversion
indeed occurs between the two genes at a rate of 10⁻³ (Pride and Blaser, 2002). Upon experimental
infection of rhesus monkeys with H. pylori, most of the cells lost the ability to express
BabA. In some isolates, this phenotype corresponded to effective gene conversion with babB
replacing babA. The resulting higher BabB expression was proposed to promote chronic in-
fection by facilitating adherence to the gastric epithelium (Solnick et al., 2004). The main
drawback of the babA-babB system lies in the rapid exhaustion of the available diversity. H.
pylori being naturally transformable, this may be compensated by the acquisition of exoge-
nous bab sequences. The ability to generate advantageous variants may provide the selective
dynamic necessary for the emergence of a refined system at these loci.
Several factors are likely to favor the maintenance of diversity (Taylor and Rudenko,
2006): i) the bigger the repertoire of silent donors, the larger the sequence space explored by
drift; ii) an increased number of possible donors results in each particular gene being activated
less frequently, enabling the accumulation of more mutations and pseudogenization; iii) the
lifestyle of the organism may potentiate the diversifying action of genetic drift. This is
particularly evident in pathogens which are exposed to repeated selective sweeps through the
generation of antigenic variants; to population bottlenecks during transmission between hosts;
and sometimes to epidemic dynamics that reduce the effective population size; and iv) some
idiosyncratic phenomena may increase the diversity. For instance the VSG repertoire of T.
brucei is located near the telomeres, which are particularly recombinogenic areas and muta-
tional hotspots.

iv. Specific mechanisms of gene conversion?

The exact mechanisms leading to gene conversion are not completely understood and
may vary between species. Broadly speaking, gene conversion seems to involve the standard
HR machinery and fit into the DSB model (see Figure 16, p73). In bacteria, several studies
demonstrated the requirement for RecA, and it seems that the RecBCD, RecE and RecFOR
recombination pathways can mediate gene conversion. The RecFOR pathway in particular might
generate conversions without crossovers. Not surprisingly, impairment of the MMR
apparatus increases the frequency of conversion and allows recombination between more
divergent homologies (Santoyo and Romero, 2005). In T. brucei, Rad51 is required for VSG
switching and the conversion frequency is dependent on the lengths and similarities of the
homologies as well as on the MMR machinery (Barnes and McCulloch, 2007).
Nonetheless, rates of gene conversion are often much higher than expected for HR.
Furthermore, conversions often occur between short regions of much lower identities than is
usually considered necessary for HR. In Neisseriales, conversion events were reported with
micro-homologies as short as 11 bp. In addition, in artificial systems, efficient conversion
was frequently observed only with large homologies (ca. 40 kb), while simple crossover
predominated with shorter repeats (ca. 5 kb) (Smith, 2001). Altogether, these observations
suggest that cis-acting factors or specialized systems may be implicated on top of the regular
machinery of recombination. So far, such systems have resisted detailed characterization. The
mating type switching system of S. cerevisiae (see Figure 18, p76) exemplifies how high
conversion rates can be achieved. Here, gene conversion is initiated by introduction of a DSB
at the expressed locus. This event relies on the specific recognition of a 16 bp motif located in
the middle of the MAT locus by the site-specific HO endonuclease (Haber, 1998). This sys-
tem leads to high and controllable switching rates by providing the exact event that triggers
gene conversion. Nonetheless, this strategy is costly because DSBs also promote other DNA
rearrangements that are deleterious to the cell. Moreover, the requirement for a site-specific
recognition system would constitute a significant evolutionary constraint on the setup of a gene
conversion system.
As far as we currently understand it, gene conversion primarily depends on the stan-
dard DNA repair apparatus. The apparent independence from specialized enzymes ensures the
applicability of this diversity-generating strategy to a variety of genetic situations. One can
speculate that some refined systems have improved on the basic mechanism to include more
specific functions, including site-specific activities. The next two sections focus on mechanisms
that specifically rely on site-specific enzymes. The first is the mobilization of TEs by
transposases. The second involves very specialized and sophisticated recombination systems.

c - Transposition

TEs are selfish DNA segments that contain the genetic information necessary to their
mobilization and spread within a genome (see p28). The mobilization process relies on DDD-
or DDE-transposases and does not involve the formation of covalent bonds between these
proteins and the processed DNA molecules. Transposases specifically recognize the
sequences flanking TEs to catalyze their insertion and/or excision. In contrast, target sites for
insertion generally consist of only a few base pairs, so that most transposons can insert almost
anywhere in a genome (but see for instance Parks and Peters, 2009). The impact of trans-
posons on the phenotype depends on the position of their insertion sites. Although TEs have
been observed in all kinds of genomic compartments, they are predominant in heterochro-
matin and in regulatory regions. It is generally difficult to assess whether this prevalence re-
flects target sequence specificity, chromatin accessibility or filtering of random transposition
events through natural selection (Miller and Capy, 2004).
B. McClintock (1902-1992) suggested early on that genome restructuring mediated by TE
activity can be seen as an essential component of the hosts’ response to stress (McClintock,
1950, 1984). Several reports support the idea that environmental stresses indeed increase
transposition rates in some cases. In bacteria, the SOS response has been shown to promote
the mobilization of some ISs (see p54). Starvation stresses (Hall, 1999; Twiss et al., 2005) and
the stationary-phase alternative sigma factor RpoS (Ilves et al., 2001) have also been
implicated in increased transposition. Besides, transposition efficiency can be modulated by host
factors. Remarkably, the mobilization of Tn10 depends on the interplay between the host in-
tegration factor IHF, a DNA bending protein, and the histone-like nucleoid structuring protein
H-NS (Wardle et al., 2005). H-NS is also involved in Tn5 transposition (Whitfield et al.,
2009). Any stress affecting the expression level of these factors can potentially alter transposi-
tion rates. In the plant Nicotiana tabacum, transcription and spread of the Tnt1 retrotranspo-
son is inducible by several biotic and abiotic stress factors (Melayah et al., 2001). In
mammalian cells, telomere damage promotes the transposition of LINE-1 elements (17% of
the human genome) (Morrish et al., 2007).
Stressful conditions may also affect the TEs’ target capture. For instance, H-NS has
been shown to confine the integration of IS903 to a few integration hot spots in the chromosome
of E. coli. Thus, decreased levels of this protein may result in wider target distributions.
A striking example provided by the Ty5 retrotransposon of Saccharomyces cerevisiae de-
serves a longer presentation. Long terminal repeat (LTR) retrotransposons form a major class
of eukaryotic TEs (Class I). They are structurally similar to retroviruses and also propagate
using an RNA intermediate, though they have lost the ability to autonomously spread from
one cell to another (Kazazian, 2004). The Ty family of LTR retrotransposons in S. cerevisiae
constitutes a particularly well-studied transposition model. Elements from this family are
known to direct their integration into gene-poor regions, thereby alleviating the burden im-
posed on the host genome. The best understood mechanism is the one of Ty5, which generally
integrates into the heterochromatin of telomeres and silent mating loci (Ebina and Levin,
2007). Targeting of the heterochromatin is mediated by a direct interaction between the inte-
grase and Sir4p, an important component of heterochromatin (Zhu et al., 2003). However,
strong interaction between these two proteins relies on the phosphorylation of the C-terminal
end of the Ty5 integrase. Importantly, the level of phosphorylation and thus the association
with Sir4p has been shown to decrease in stressful conditions, such as nutrient deprivation
(Dai et al., 2007). This work strongly supports a model in which Ty5 leads a double life im-
posed by the host genome, which has hence domesticated the phenotypic impact of the TE
(Ebina and Levin, 2007). Under normal conditions, most of the integrations are directed toward
neutral sites, whereas stresses trigger the host-mediated relief of this constraint so as to
favor the generation of non-silent variations (see Figure 19).

d - Site-specific recombination

i. Recombinases

Site-specific recombination is mediated by recombinases of the tyrosine or serine
families, depending on the nature of the residue that forms a covalent bond with the DNA
molecule. Though their recombination mechanisms are distinct, members of both families are
able to catalyze DNA synapsis, cleavage, strand exchange and subsequent ligation without
requiring DNA synthesis or high-energy cofactors. In contrast to transposases,
recombinases mediate a reciprocal exchange between well-defined DNA target sites and
establish covalent links with the processed molecules. The simplest recombination sites are
short duplex DNA segments (20 to 30 bp) displaying an inverted pair of recognition sequences
that bind one dimer (or two monomers) of the recombinase. These sites contain the point of
DNA breakage and joining. The substrate specificity of recombinases provides greater
opportunity for precise genome rearrangements than that of transposases because the system is orthogonal
to the rest of the genome. Depending on the initial arrangement of the parental recombination
sites, site-specific recombination has one of three possible outcomes: integration, excision, or
inversion of DNA segments (see Figure 20). Many recombination sites are more complicated
and comprise binding sequences for additional proteins that can exert a regulatory and/or
structural role in the recombination process. Such proteins can notably modulate the efficiency
of recombination and favor one particular recombination outcome over another (for example,
excision over inversion or deletion) (Grindley et al., 2006). Site-specific recombination sys-
tems have been extensively used to engineer conditional mutants in a variety of organisms.
These biotechnological achievements reflect the natural role of these systems, which are often
selected to introduce reversible genetic diversity at defined loci within a population.
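
The way the outcome depends on the arrangement of the two sites can be sketched in a few lines of Python. This is a minimal illustration rather than a model of any particular recombinase: the function names, the toy sequence and the coordinates are hypothetical, and the recombination sites themselves are not represented.

```python
# Toy illustration of site-specific recombination outcomes (hypothetical helper names).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def recombination_outcome(same_molecule: bool, same_orientation: bool) -> str:
    """Return the expected outcome for a pair of recombination sites."""
    if not same_molecule:
        # Sites carried by separate molecules (one of them circular) are fused.
        return "integration"
    # Directly repeated sites delete the intervening segment; inverted sites flip it.
    return "excision" if same_orientation else "inversion"


def invert_segment(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] by its reverse complement, as an inversion would."""
    flipped = seq[start:end].translate(COMPLEMENT)[::-1]
    return seq[:start] + flipped + seq[end:]


def excise_segment(seq: str, start: int, end: int) -> tuple:
    """Return the remaining molecule and the excised segment (circular in vivo)."""
    return seq[:start] + seq[end:], seq[start:end]
```

Applying invert_segment twice with the same coordinates restores the original sequence, which mirrors the reversibility of inversion switches.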

ii. Modification through DNA inversion

Most examples of phenotypic variations driven by site-specific DNA inversion pertain
to the bacterial kingdom. Nevertheless, different bacterial species use very diverse strategies
to alter gene expression using this mechanism. The genome sequence of Bacteroides fragilis,
a commensal inhabitant of the human gastrointestinal tract and opportunistic pathogen,
illustrates many of these possibilities. This bacterium uses DNA inversion to control a greater
breadth of systems than any organism described to date.
Two independent teams reported the genome sequence of B. fragilis (Kuwahara et al.,
2004; Cerdeno-Tarraga et al., 2005). In both cases, particular regions were difficult to assem-
ble from the shotgun sequencing data, because the corresponding reads were highly chimeric.
It appeared that these segments were present in two alternative orientations in the starting ge-
nomic material, even though it was extracted from a seemingly pure culture grown for 24h
from a single clone. These projects illustrate the ability of whole genome sequencing to
identify loci that vary at high frequency (Medini et al., 2008). Further analysis led to the
identification of 31 invertible DNA regions. These have been classified into 6 different classes
corresponding to different recombination sites – and presumably to different associated
recombinases (Kuwahara et al., 2004). No fewer than 30 recombination enzymes including 26
tyrosine integrases, 3 serine resolvase-invertases and 1 transposase-invertase have been identi-
fied in the genome (Cerdeno-Tarraga et al., 2005). As shown in Figure 21, the inversion sys-
tems could be associated with four distinct types of modification with different functional
outcomes: i) inversion of promoter-containing segments; ii) generation of hybrid proteins
through exchange of C-terminal domains; iii) rearrangement of genes within operons; and iv)
complex and combinatorial modifications mediated by shufflon-like multiple inversions. Many
genes regulated by these mechanisms are implicated in the synthesis of surface architectures
that may be involved in immune evasion or colonization of different sites in the host, such as
capsular polysaccharides and other outer membrane proteins. Interestingly, many other func-
tions are also affected by site-specific inversions, including several transporters, signal trans-
duction systems, carbohydrate degradation systems, one restriction-modification system and
the molecular chaperone GroES/EL (see p116). This suggests that DNA inversions are used to
control adaptation to a wide range of challenges.
Only one recombinase has been validated experimentally. Fourteen invertible regions
are flanked by inverted repeats similar to those acted on by the Hin invertase, a model serine
recombinase involved in flagellar phase variation in Salmonella typhimurium (van de Putte
and Goosen, 1992). Two homologues (FinA and FinB) of this enzyme could be identified in
one genome of B. fragilis (Cerdeno-Tarraga et al., 2005). FinB being located on a plasmid, it
was absent from the other genome (Kuwahara et al., 2004). Seven of these regions were pre-
viously shown to define invertible promoters controlling the ON-OFF expression switching of
polysaccharide biosynthesis operons (see Figure 21, type 1-a; Krinos et al., 2001; Coyne et
al., 2003). The resulting structural variations in expressed capsular polysaccharides are thought
to play a major role in allowing B. fragilis to live in close association with the mucosal sur-
face of the intestine. Interestingly, expression of the finA gene itself might be determined by
the orientation of a flippable promoter segment putatively controlled by a nearby recombinase
(Kuwahara et al., 2004). This model would account for the coordinated modification of cap-
sular expression through the hierarchical chaining of several inversion events.
Apart from the polysaccharide operons, the evidence for structural inversions in these
genomes derives from computational analyses and experimental verification by PCR. The
likelihood that these inversions are indeed functional is nonetheless supported by their close
resemblance to well-described model systems. Unraveling the intricate regulatory controls that
affect site-specific recombination rates requires in-depth experimental dissection of particular
systems. To illustrate this, the next section provides an overview of the data accumulated on
the fim model system.

iii. Control of the fim promoter inversion system

The phase variation of type 1 fimbriae in E. coli K-12 constitutes one of the best-
understood models of site-specific promoter inversion. A wealth of detailed analysis allowed
researchers in the field to describe the overall functioning of the switch as a system and pro-
vide a glimpse of the complex mechanisms involved in modulating switching frequencies.
Type 1 fimbriae are the most common fimbrial adhesins in E. coli isolates and seem to be of
particular importance in mediating attachment to host tissue during colonization of the human
bladder. These fimbriae are encoded by the fim operon. The main subunit of the fimbriae is
FimA, the expression of which phase varies through the inversion of a segment containing its
promoter (Pa). The invertible element is flanked by inverted repeats recognized by two differ-
ent tyrosine-recombinases encoded by the upstream genes fimB and fimE (see Figure 22).
The involvement of two independent enzymes in catalyzing the recombination of a single
element is an uncommon feature that provides complex opportunities for controlling switch-
ing frequencies. FimB and FimE share 48% amino acid identity, but their activity and affinity
to recombination sites differ. FimB mediates inversion in both directions, whereas FimE al-
most exclusively mediates ON→OFF inversions. Two factors explain this bias: i) sequence
differences between the external parts of the recombination sites provide FimE with higher
affinity to its cognate substrate in the ON phase; ii) the activity of Pa in the OFF orientation
indirectly decreases FimE expression through a complex post-transcriptional phenomenon (see
Figure 22). In addition to the regulation of fimA, the invertible element thus modulates
switching frequencies – a phenomenon known as orientational control (Chu and Blomfield,
2007). Other interactions complicate the picture. Because the -10 part of the Pa promoter
overlaps the internal side of the inverted repeats, the mere binding of the recombinases prevents
expression of fimA. Conversely, the Pa activity prevents binding of both recombinases in the
ON phase. Recombination of the invertible segment and transcription of fimA hence appear to
be mutually exclusive processes. In the same light, the activities of both the fimE promoter
and Pa seem to specifically inhibit the FimB-mediated OFF→ON transition.
Under typical laboratory conditions, the frequency of inversion mediated by FimB is
10⁻³ to 10⁻⁴ per cell per generation in both orientations, while the FimE-mediated ON→OFF
switch reaches a frequency of 10⁻¹. In these conditions, steady-state equilibrium is reached
with ca. 97% of the population in the afimbriate phase. The net phase variation rate of type 1
fimbriae depends on the relative amount of the two recombinases. The components of the system
are linked by a complex network of interactions, including intricate feedback loops. Indeed,
variations in the expression level of fimA, fimB or fimE impact the switching frequency
and consequently affect expression of the other genes. In general, the contribution of FimB
to the ON→OFF transitions is negligible compared with that of FimE. Therefore, either a slight
decrease in the level of FimE or an increased level of FimB greatly affects the frequency of OFF→ON
inversions (Blomfield, 2001).
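
The equilibrium between the two phases can be approximated with a two-state model. The sketch below is an illustration under simplifying assumptions that are not part of the cited studies: constant switching probabilities per generation, no growth difference between fimbriate and afimbriate cells, and hypothetical function and constant names. The rates are order-of-magnitude values taken from the text.

```python
def off_fraction(k_on: float, k_off: float) -> float:
    """Steady-state fraction of OFF (afimbriate) cells for a two-state switch.

    k_on  -- probability of OFF->ON inversion per cell per generation
    k_off -- probability of ON->OFF inversion per cell per generation
    """
    return k_off / (k_on + k_off)


def simulate_off_fraction(k_on: float, k_off: float,
                          generations: int = 2000, off0: float = 0.0) -> float:
    """Iterate the OFF fraction generation by generation, starting from off0."""
    off = off0
    for _ in range(generations):
        # OFF cells that stay OFF, plus ON cells that switch OFF.
        off = off * (1.0 - k_on) + (1.0 - off) * k_off
    return off


# Assumed rates: FimB acts in both directions, FimE only in the ON->OFF direction.
FIMB_RATE = 1e-3
FIME_RATE = 1e-1
k_on = FIMB_RATE
k_off = FIMB_RATE + FIME_RATE

print(f"closed form: {off_fraction(k_on, k_off):.3f}")
print(f"iteration:   {simulate_off_fraction(k_on, k_off):.3f}")
```

With these assumed values the population is predominantly afimbriate. Because the equilibrium depends only on the ratio of the two rates, a modest change in either rate shifts it noticeably, in line with the sensitivity to FimB and FimE levels described above.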
Several cellular factors modulate recombination rates by directly affecting the expres-
sion of the recombinases or by influencing the recombination process. All these factors act in
very pleiotropic and interdependent ways that make it difficult to clearly establish their
concerted effect (global trends are indicated in Figure 22). Importantly, several environmental
signals can modulate the frequency of switching through these cellular factors. Deprivation of
isoleucine, leucine, valine or alanine decreases recombination frequencies in both directions.
This effect is at least partly mediated by Lrp. Indeed, these amino acids are allosteric
cofactors that modify the binding affinity of Lrp. The absence of these amino acids seems to
promote interaction with binding site 3 (see Figure 22), which limits the inversion reaction
(Blomfield, 2001). H-NS is a global regulator implicated in thermoregulation, with lower
temperature associated with higher expression of the protein. The OFF→ON switching fre-
quency increases with temperature over the range of 30 to 39°C. Regulation by H-NS thus
participates in the temperature responsiveness of the fim system, which may increase the
production of type 1 fimbriae upon host colonization. Lastly, fimB transcription is also repressed by
a distant cis-acting silencer, which promotes the OFF state. Sialic acid (N-acetylneuraminic
acid) and GlcNAc (N-acetylglucosamine) have been shown to suppress this silencing effect.
Both amino sugars are released during the host inflammatory response and promote the
expression of regulatory proteins capable of repressing the silencer, and ultimately increase
the OFF→ON transition (El-Labany et al., 2003; Sohanpal et al., 2004; Sohanpal et al., 2007).
Overall, several intricate mechanisms seem to modulate the switching rates and thus
the production of variants in response to cues indicative of host colonization. Further
deepening the complexity of the system, the switching frequencies of different types of fimbriae
seem to be coordinated to favor the expression of a single type at a time (van der Woude,
2006). Particularly, the PapB transcriptional regulator can specifically increase FimE-
mediated inversion while exerting the opposite influence on FimB, thereby silencing type 1
fimbriae when Pap pili are expressed (Blomfield, 2001). PapB is a key element of the Pap
fimbrial operon, which phase varies according to an epigenetic mechanism (see p95).

iv. Variation controlled by DNA insertions and excisions

DNA inversion systems form a stable device to alternate the expression state of a
handful of genes. Site-specific recombination can also mediate the insertion-excision of circular
intermediates (see Figure 20, p82). Most mobile elements, including phages, ICEs and genomic
islands, use self-encoded site-specific recombination systems. Genomic islands are
peculiar in the sense that they are not self-mobilizable – their excision from a chromosome
leading to non-replicative circular forms (Boyd et al., 2009). The insertion of such mobile
elements in the chromosome ensures their correct replication, but also tightly links their fate
to the host success. As mentioned earlier (see p54), several mobile elements carry accessory
genes that can benefit them indirectly, by proving adaptive to their host. From the host point
of view, the insertion of such mobile elements corresponds to the instantaneous acquisition of
prepackaged adaptive functions, including resistance and detoxification factors, metabolic
capabilities and virulence determinants. If it does not kill the cell, the excision of these elements
may reduce metabolic cost when the encoded accessory functions are not needed.
The excision of mobile elements can also be co-opted for other purposes. For instance,
the excision of a prophage-like remnant is involved in the developmental differentiation of
forespores in B. subtilis (Krogh et al., 1996). In the same light, the site-specific excision of
two integrated elements triggers differentiation into heterocysts, which are resistant cells spe-
cialized in nitrogen fixation in many filamentous cyanobacteria. In both cases, the excision
restores the coding regions inactivated by an original insertion back to their functional forms.
The excision event is tightly controlled to occur only in terminally differentiated cells: in
B. subtilis, the progenitor cells after they have been compartmentalized from the forespore, and
in cyanobacteria, the differentiated heterocysts, which are unable to divide. Thus, the
restructured genomes do not participate in the next generation, while either the differentiated
(spores) or the undifferentiated (vegetative) cells maintain the integrity of the system over time
(Haselkorn, 1992; Prozorov, 2001).
The integron system is a prominent diversity-generating structure that relies on both
insertions and excisions of dedicated elements. Because this system constitutes the essential
matter of this work, it will be described in detail in a dedicated chapter (see The integron ge-
netic system, p118).

I.3.3. Epigenetics

a - Definition

The word epigenetic is somewhat ambiguous because it has been used to convey different
ideas throughout the history of biological science. The concept of epigenesis was introduced
by Aristotle (ca. 384-322 BC) to refer to the embryological development of multicellular or-
ganisms. Particularly, it emphasizes development as an active process leading to the forma-
tion of a complex and organized organism through the gradual differentiation of an
amorphous zygote. The word epigenetic appeared in the 18th century to contrast this idea with
the preformationist theory, which held that the germ cells of each organism contain preformed
miniature adults that unfold passively during development without gain of complexity (Van
Speybroeck et al., 2002). Epigenetics was later used by C.H. Waddington (1905-1975) to
mean the external manifestation of genetic activity in interaction with the environment during
development (Waddington, 1942a). Etymologically, the word is based on the Greek prefix
epi- denoting on top of or in addition and genetic, meaning pertaining to or produced from
genes. In a broad sense, epigenetics is then a bridge between genotype and phenotype – a
phenomenon that changes the final outcome of a locus or chromosome without changing the
underlying DNA sequence. More specifically, epigenetics may now be defined as the study of
any mitotically and/or meiotically heritable change in gene expression or cellular phenotype
that occurs without changes in Watson-Crick base-pairing of DNA (Russo et al., 1996; Gold-
berg et al., 2007).
Epigenetic phenomena rely on diverse mechanisms that usually involve positive feed-
back loops to stabilize phenotypic states over time and rely on stochastic fluctuations to
switch between alternate phases. These mechanisms include: i) covalent histone
modifications, which maintain active and silent chromatin states (in eukaryotes); ii) DNA
methylation patterns, which alter the affinity of specific binding proteins for their cognate sites; iii)
non-coding RNA, which can heritably affect gene regulation by diverse means, including the
covalent modifications of histones and DNA; iv) multistable regulatory switches, in which
expression states are maintained after the triggering signal has disappeared; and v) prions, in
which protein structure is heritably transmitted (Casadesus and Low, 2006; Goldberg et al.,
2007; Feil, 2008; Veening et al., 2008a).
It has been argued that the requirement for heredity in the definition of epigenetics is
too restrictive, because some epigenetic mechanisms are also involved in very transient
phenomena (Bird, 2007). In such cases, the outcome of the epigenetic process is closer to
physiological regulation than genetic modification. Importantly, these changes are particularly
sensitive to environmental inputs (Bird, 2007). Epigenetics then provides a bridge between the fast
and transient adaptation afforded by physiological regulation and the slow and heritable ge-
netic adaptation. The intricate relationships between regulatory changes and genetic variations
will be further discussed in the next chapter (see Phenotypic plasticity, genetic variations and
physiological regulation, p99). In the following, selected examples of heritable epigenetic
phenotypes in microbes are presented to highlight their functional similarities with pro-
grammed genetic changes.

b - Bistable regulatory switch

i. Hysteresis in the lac regulatory system of E. coli

The first epigenetic process described in bacteria affects the lac operon of E. coli. M.
Delbrück (1906-1981) first formulated the idea that discontinuous transitions between alterna-
tive states exist in this context in 1949. The phenomenon of all-or-none enzyme induction was
subsequently verified experimentally (Novick and Weiner, 1957; Cohn and Horibata, 1959).
The genes encoding the proteins required for the uptake (lacY) and utilization (lacZ) of lactose
are induced in the presence of lactose analogues, such as isopropyl-β-D-thiogalactopyranoside
(IPTG) or thio-methylgalactoside (TMG). Being non-metabolizable, such
compounds allow disentangling the physiological effects of induction from their metabolic
consequence (Novick and Weiner, 1957). At high inducer concentrations the lac operon is
fully derepressed, cells express the LacY permease at high concentrations and thus remain ac-
tivated (ON state). In contrast, at low concentrations cells that were previously uninduced and
do not have any permease in their membranes do not respond to the low level of inducer and
remain in the OFF state. A single cell cultured at an intermediate concentration of inducer gives
rise to two phenotypically distinct subpopulations, wherein cells in the OFF
state coexist with cells in the ON state without alteration of their genotypes (see Figure 23a).
It has recently been shown that stochastic dissociation of the LacI repressor from its operator
was responsible for this phenomenon (Choi et al., 2008). When cells that were previously
induced are shifted to a medium with a lower level of inducer, the presence of permease in their
membranes ensures sufficient uptake to maintain the ON state (see Figure 23b). Importantly,
under low inducer conditions, either the ON or OFF state can be epigenetically inherited by
the offspring through multiple rounds of growth and division. Hence, the past environment
experienced by a given cell can influence the phenotype of its offspring – a phenomenon
referred to as hysteresis or cellular memory.
The LacY permease plays a pivotal role in this system: high permease levels keep the
levels of intracellular inducer high, thus inducing permease gene expression. This positive-
feedback loop drives the bistability of the system and is responsible for the sigmoidal shape of
the curve discernable in Figure 23b. The lac operon is also subject to catabolite repression via
cAMP/CRP, which results in low induction when glucose is present. This additional regula-
tory input affects the pattern of hysteresis, showing that other signals can act on top of the
feedback loop to modulate the output of the system (Ozbudak et al., 2004; and Figure 23c).
Interestingly, the switching behavior is essentially stochastic due to noise in the intracellular
concentration of the LacI repressor (Choi et al., 2008). Although only DNA mutations are
usually considered heritable, it has recently been demonstrated that an increased rate of
transcription errors alters the OFF→ON switching behavior by increasing the noise in functional
LacI production. In such noisy systems, heritable phenotypic changes can thus result from
non-heritable mutations (Gordon et al., 2009).
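
The hysteresis described above can be illustrated with a minimal numerical sketch. It is not the model used in the cited studies: the single variable x (a dimensionless permease level), the Hill-type production term, the basal rate and the inducer values are all assumptions chosen only to make the positive-feedback loop bistable at an intermediate inducer level.

```python
# Toy positive-feedback model of the lac switch (illustrative assumptions only).
# x      : dimensionless permease level (hypothetical variable)
# inducer: dimensionless external inducer concentration (hypothetical scale)
# Intracellular inducer is taken as proportional to inducer * x (uptake by permease).

def steady_state(inducer, x0, basal=0.05, dt=0.01, steps=20000):
    """Integrate dx/dt = basal + u/(1+u) - x with u = (inducer*x)**2 (forward Euler)."""
    x = x0
    for _ in range(steps):
        u = (inducer * x) ** 2
        production = basal + u / (1.0 + u)
        x += dt * (production - x)
    return x


if __name__ == "__main__":
    for inducer in (0.5, 2.0, 5.0):
        from_off = steady_state(inducer, x0=0.05)  # previously uninduced cell
        from_on = steady_state(inducer, x0=1.0)    # previously induced cell
        print(f"inducer={inducer}: from OFF -> {from_off:.3f}, from ON -> {from_on:.3f}")
```

Under these assumed parameters, the low and high inducer levels each lead to a single final state whatever the starting point, whereas at the intermediate level the final state depends on the history of the cell.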

ii. Sporulation in B. subtilis

Three instances of bistable phenotypic switches have been reported in B. subtilis (Dubnau and
Losick, 2006): i) initiation of the competence state; ii) separation of single swimming cells
from chains of physically linked siblings; and iii) developmental commitment to sporulation.
Heritability was only demonstrated for the last process.
Sporulation is driven by the master sporulation regulator Spo0A, whose activity is governed by
phosphorylation via a multicomponent phosphorelay (Burbulys et al., 1991). Although activation
is triggered by specific environmental signals, such as high cell density and nutrient
deprivation, only a fraction of cells are found in a Spo0A-ON state under adequate conditions
(Chung et al., 1994). This observation initially led to the description of sporulation as a
bistable process. This phenomenon depends on a complex and noisy positive feedback loop
involving both the transcription of the gene for Spo0A and its phosphorylation (Veening et al.,
2005). Time-lapse tracking of cell lineages showed that the decision to sporulate could often be
traced back more than two generations before the actual appearance of the phenotype (see
Figure 24), thereby demonstrating that the sporulating signal is epigenetically passed on from
cell to cell (Veening et al., 2008b). The benefit of supplementing a physiological control with a
bistable mechanism may be to prevent the whole population from embarking on an irreversible
differentiation process in case the initial trigger is erroneous. Also, Spo0A-ON cells in a
population trigger the lysis of non-sporulating Spo0A-OFF siblings, a process termed fratricide
or cannibalism (Gonzalez-Pastor et al., 2003). By doing so, the population density is decreased
and nutrients are released in the environment, which may alter the Spo0A activating feedback
loop and delay commitment to sporulation (Dubnau and Losick, 2006). The inheritance of
sporulation ability might be beneficial by priming the progeny for future conditions, which are
more likely to be challenging given the parental experience (Veening et al., 2008a).

c - DNA methylation patterns in bacteria

i. General mechanism

So far, all described cases of strict epigenetic regulation in bacteria rely on a
methylation-dependent mechanism. These systems require a DNA methylase and a regulatory protein
that binds to DNA sequences overlapping the target methylation site. Upon binding, the
regulatory protein hampers methylation by preventing access of the methylase to the target site.
In turn, methylation of the target site inhibits binding of the protein to its cognate sequence.
This reciprocal relationship results in the existence of two alternative and stable patterns –
fully methylated and nonmethylated (i.e. not methylated on either strand) – which correlate with
the binding of a regulatory protein and are thus associated with the control of a target gene’s
expression (Casadesus and Low, 2006).
Only a handful of such systems have been reported, reflecting both the difficulty of
detecting these phenotypes and the multiplicity of conditions required to build up such
mechanisms. All described systems pertain to E. coli and involve the orphan DNA adenine
methylase (Dam), which processively catalyses the methylation of adenine in the ca. 20,000
5’-GATC-3’ motifs present in E. coli’s genome. Comprehensive studies indicated that only
few target sites are consistently protected from methylation by binding proteins. Among these
proteins, few play a regulatory function and fewer are likely to be affected by methylation
patterns (Casadesus and Low, 2006). Dam apparently methylates only one DNA strand at a
time, but full methylation is promoted by the higher affinity of the enzyme for hemimethylated
sites (Lim and van Oudenaarden, 2007). After replication, DNA duplexes are transiently
hemimethylated and this state serves as an important cue for several processes, including syn-
chronization of replication, DNA repair and transposition (Casadesus and Low, 2006). While
such processes can be viewed as epigenetic phenomena, they are not heritable. In contrast, the
methylation patterns described above are heritable over several generations.
The exact determinants of this cellular memory are not understood in detail, but some key
points can nevertheless be outlined. Nonmethylated naked sites are subject to competition
between the binding regulatory protein and the methylase. The outcome of this competition
depends on the relative concentrations and affinities of the two proteins. As there is no DNA
demethylation reaction in E. coli, the passage from a fully methylated to a nonmethylated
state necessitates two successive rounds of replication. Replication thus provides an interme-
diate state that can initiate switching toward nonmethylation. Because replication forks
probably remove regulatory proteins associated with nonmethylated sites, replication also af-
fects the switch toward methylation state. Overall, a functional switch involves several inter-
mediate states that tend to revert to the stable state from which they originate. As a result, the
extreme states are strongly stabilized and the switching frequency is maintained to a level
compatible with sustained heritability (Lim and van Oudenaarden, 2007).
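
The requirement for two successive rounds of replication can be checked with a small sketch. It assumes semi-conservative replication of a single GATC site and, as a simplifying assumption not stated in the text, no Dam activity at all between rounds (for instance because the site stays protected by the regulatory protein). The names FULL, NON, replicate and rounds_until are illustrative.

```python
from collections import deque

# A GATC site is represented as (top strand methylated, bottom strand methylated).
FULL = (True, True)    # fully methylated
NON = (False, False)   # nonmethylated


def replicate(duplex):
    """Each daughter keeps one parental strand and receives a new, unmethylated strand."""
    top, bottom = duplex
    return [(top, False), (False, bottom)]


def rounds_until(target, start):
    """Minimal number of replication rounds before `target` appears among descendants.

    Assumes that no methylation occurs between rounds; returns None if the
    target state can never be reached under that assumption.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        duplex, rounds = queue.popleft()
        if duplex == target:
            return rounds
        for daughter in replicate(duplex):
            if daughter not in seen:
                seen.add(daughter)
                queue.append((daughter, rounds + 1))
    return None


if __name__ == "__main__":
    print(rounds_until(NON, FULL))  # rounds needed to lose both methyl groups
    print(rounds_until(FULL, NON))  # None: methylation is never regained without Dam
```

In this sketch the first round only yields hemimethylated duplexes, which is the intermediate state from which either outcome of the competition remains possible.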
Several parameters can influence the switching frequency of these epigenetic systems.
Obviously, the affinities of the regulatory protein for the different methylation states are of
prime importance. The steady-state switching frequency then depends on the relative
concentration of regulatory protein and Dam, and any factor impacting this ratio can displace the
dynamic equilibrium. Growth rates may be a source of particular variations: increased delays
between replication rounds would favor the occurrence of two consecutive methylations,
while faster growth would titrate Dam activity thereby favoring dilution of the methylation
pattern (Lim and van Oudenaarden, 2007). To some extent, this effect may be buffered, be-
cause Dam levels are known to specifically increase with growth rate (Lobner-Olesen et al., 1992).

ii. The agn43 system

The agn43 (flu) gene of E. coli is a straightforward example of a bacterial epigenetic
switch involving covalent DNA modification. Indeed, it is not embedded in a complex network
of feedback regulations and conforms perfectly to the general features presented above.
agn43 encodes the outer membrane protein Ag43, which is notably implicated in
autoaggregation and biofilm formation (Casadesus and Low, 2006). The expression of Ag43
undergoes phase variation with a switching rate of 10⁻³ to 10⁻⁴ per cell per generation (Lim and
van Oudenaarden, 2007). The oxidative stress response protein OxyR represses the transcription
of agn43. The
binding site is located immediately downstream of the gene promoter. This regulator exists in
two redox states, associated with distinct binding affinities to DNA. Binding of the reduced
protein represses gene expression in the OxyR regulon, a constraint that is generally relieved
by oxidation of the protein. Surprisingly, the redox state of OxyR does not affect binding to
the agn43 regulatory region. However, the binding region contains three Dam target sites.
Methylation of the target sites prevents OxyR binding, and consequently promotes Ag43
expression (Henderson and Owen, 1999). Conversely, the binding of OxyR to its cognate
sequence protects the region from methylation, thereby maintaining proper conditions for
effective repression. The affinity of OxyR for its hemimethylated substrate is six-fold lower
than for the nonmethylated one, while its affinity for the fully methylated region is too low to
be measured. As discussed in the previous section, this must be a critical feature in controlling
the switching frequency. In addition, methylation of any two of the three target sites is suffi-
cient to prevent binding (Casadesus and Low, 2006). The third site might then be involved in
decreasing the frequency of ON→OFF transitions.
Additional observations slightly complicate this simple model. First, an intermediate
agn43 expression level is measured in an oxyR dam double mutant background. This likely
results from a non-specific transcription, and indicates that methylation of the promoter region
is required for full expression. Furthermore, the region upstream of the promoter and between
the OxyR binding region and the agn43 gene are both necessary for correct regulation. It thus
seems that binding of OxyR by itself is not sufficient for repression. Instead, effective repres-
sion would rely on an interaction between the upstream and downstream regions mediated by
the DNA bending properties of the protein. This additional structural state seems to stabilize
the nonmethylated state and decrease the probability of OFF→ON switching (Lim and van
Oudenaarden, 2007).
The expression of the mom gene of bacteriophage Mu is also affected by OxyR. In this
case the binding of the protein is regulated both by the redox control and by the methylation
state of three Dam target sites (Bolker and Kahmann, 1989; Hattman and Sun, 1997).

iii. The pap system

The pap operon, which encodes the pyelonephritis-associated pili in E. coli, constitutes
the paradigmatic example of an epigenetic switch mediated by methylation patterns. Several
roles have been suggested for the switching of Pap expression: escape from the host immune
response, facilitation of a bind-release-bind series implicated in urinary tract colonization and
control of growth through contact-dependent growth inhibition (Aoki et al., 2005). Phase
variation of Pap relies on the same general principle as agn43. The system is nonetheless
more complex, because it involves a dual methylation pattern and is subject to intricate feed-
back controls.

The system consists of the main operon and a divergently encoded upstream gene
papI. Two stable expression states can be distinguished for the pap operon, depending on the
binding of Lrp (see Figure 25). In the ON state, GATCprox is methylated preventing binding of
Lrp to the Pb region, while Lrp binding protects GATCdist from methylation. In the OFF
phase, Lrp prevents expression of the pap operon and protects GATCprox from Dam activity,
while methylation of the GATCdist prevents binding to this site. Importantly, Lrp plays a dual
regulatory role in this system. In the OFF phase Lrp represses papB by preventing access to
Pba, whereas in the ON phase it is bound to GATCdist and activates papI transcription through
the formation of a complex involving PapI, PapB, and CRP. PapI plays an important role in
regulating the affinity of Lrp for its cognate sites. PapB is an activator of PapI expression,
and also exerts a negative autoregulation on the pap operon (Blomfield, 2001). The most
convenient way to appreciate the intricacies of the system is to consider the transitions
between the
ON and OFF states.
Let us first consider the OFF→ON transition, which occurs at a frequency of ca. 10⁻⁴
per cell per replication in standard conditions. Two factors stabilize the OFF state: i) the sole
binding of Lrp to the proximal region decreases binding at the distal region by 10-fold, a
phenomenon termed mutual exclusion. The underlying mechanism is unknown but requires DNA
supercoiling, potentially mediated by Lrp binding to the proximal region; ii) the methylation
state of GATCdist: a mutation preventing methylation at this site is sufficient to lock the
system in the ON phase (Braaten et al., 1994). PapI is essential to the OFF→ON switch. In the
absence of PapI, Lrp has a slightly higher affinity for the proximal region. When expressed at
physiological concentrations, PapI increases the affinity of Lrp for the distal region, even when
hemimethylated. Furthermore, PapI specifically decreases the affinity of Lrp for the
methylated proximal region. A rise in either PapI or PapB levels seems mandatory to initiate the
switch, and replication is presumably required in this process. By destabilizing Lrp bound to
GATCprox, replication probably results in a slight increase in PapB expression. This would
initiate a positive feedback loop because papI expression is promoted by PapB. Increased levels
of PapI and release of mutual exclusion would favor Lrp binding to the hemimethylated distal
region, which further promotes the expression of PapI in the presence of CRP. High levels of
PapI further decrease Lrp binding to the proximal region provided it has been methylated by
Dam, thereby stabilizing the ON phase and the production of PapB. A second round of repli-
cation is required to obtain a nonmethylated GATCdist and fully stabilize Lrp binding in this
region. In steady state, the expression of pap is limited by binding of PapB to the Pb region.
This autoregulation also indirectly controls the level of PapI, and probably permits the reverse
switch to occur.
The ON→OFF transition occurs at a 100-fold higher rate. Initiation of the switch
probably also involves replication, which destabilizes the Lrp complex at the distal site and
promotes competition with Dam. Replication also produces a hemimethylated GATCprox site for
which Lrp affinity is increased in the presence of PapI. Once Lrp is bound to the proximal
region, it would favor methylation of the distal site by mutual exclusion. Altered formation of
the Lrp enhancer and lower PapB levels decrease PapI expression, thereby stabilizing
the OFF state. A second round of replication leads to nonmethylated GATCprox (Blomfield,
2001; Casadesus and Low, 2006).
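
As a rough illustration of what these two rates imply, the following sketch treats Pap phase variation as a two-state process with an OFF→ON rate of 10⁻⁴ and an ON→OFF rate 100-fold higher, as quoted above. The absence of selection and the equal growth of ON and OFF cells are simplifying assumptions, and the function names are illustrative.

```python
# Two-state switching sketch (assumes no selection and equal growth of both states).
K_ON = 1e-4   # OFF->ON rate per cell per generation (value quoted in the text)
K_OFF = 1e-2  # ON->OFF rate, taken as 100-fold higher


def on_fraction_equilibrium(k_on, k_off):
    """Fraction of ON cells once switching in both directions balances out."""
    return k_on / (k_on + k_off)


def on_fraction_after(generations, k_on, k_off, start=0.0):
    """Iterate the ON fraction generation by generation from a given starting fraction."""
    p = start
    for _ in range(generations):
        p = p * (1.0 - k_off) + (1.0 - p) * k_on
    return p


if __name__ == "__main__":
    print(on_fraction_equilibrium(K_ON, K_OFF))       # about 1% of cells ON
    print(on_fraction_after(5000, K_ON, K_OFF, 0.0))  # starting from an all-OFF culture
    print(on_fraction_after(5000, K_ON, K_OFF, 1.0))  # starting from an all-ON culture
```

Under these assumptions a culture converges to roughly one ON cell in a hundred, whichever state it started from.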

This switching system is subjected to several environmental influences. The papI ex-
pression enhancing complex comprises the CRP protein, which links the regulation of the pap
operon with catabolite repression and availability of carbon sources. Specifically, the presence
of glucose dramatically lowers the OFF→ON switching rate. The global regulatory protein
H-NS is indirectly implicated in controlling switching rate via PapB expression. H-NS seems to
bind to the pap regulatory region and represses papB transcription in response to low tem-
perature (Goransson et al., 1990), high osmolarity and rich culture medium (White-Ziegler et
al., 2000). Both of these regulatory mechanisms may be used by cells to adapt the production
of pili in response to their immediate environment. Another environmental input into Pap
phase variation is mediated by the CpxAR two-component system. Under certain conditions
that stress the cell envelope, CpxA activates CpxR which then binds to sites overlapping all
pap Lrp binding sites, thereby shutting papB expression off. This response is not affected by
methylation patterns and thus overrides Lrp control (Hernday et al., 2004). The biological role
of this phenomenon remains unclear.
Another layer of complexity results from the presence of several paralogous fimbrial
operons in E. coli. These operons share a similar organization and crosstalk occurs between
homologs of PapB and PapI. As a result, a particular stochastic switching event can influence
the transition frequency of related operons. This regulatory network even extends to unrelated
systems. For instance, PapB and its paralogue SfaB repress phase variation of the type 1
fimbriae encoded by the fim operon, which is mediated by site-specific promoter inversion (van
der Woude, 2006; and see p85). Overall, this interdependence may account for the observed
co-ordination of pili expression in individual cells.

d - The yeast prion PSI+

Prions are proteins that can adopt at least two distinct and stable conformational states,
one of which – the prion form – can stimulate the non-prion conformation to convert into the
prion form (Uptain and Lindquist, 2002). The yeast prion [PSI+] is generated by the
aggregation of the Sup35p translation termination factor, which allows readthrough of nonsense
codons (Patino et al., 1996). The [PSI+] prion is a metastable element that is generated and lost
spontaneously at low rates. Sup35p aggregates into amyloid fibers, which leads to a range of
phenotype strengths that makes the characterization of switching rates difficult (Uptain and
Lindquist, 2002). By uncovering hidden genetic variations in 3’ untranslated regions, ca. 25%
of [PSI+] cells exhibit a survival advantage under adverse conditions (True et al., 2004).

The [PSI+] phenotype is maintained by the ability of prion fibers to convert native
Sup35p protein to the prion form (Satpute-Krishnan and Serio, 2005). Upon cell division, the
prion conformation is probably passed from the mother cell on to the daughter cell through
the cytoplasm (Uptain and Lindquist, 2002). Consequently, the [PSI+] state can be stably
maintained for approximately 10⁵ to 10⁷ generations (Lund and Cox, 1981). The protein
chaperone Hsp104 modulates the propagation of the [PSI+] state (Chernoff et al., 1995). During
heat and chemical stress it is observed that the [PSI+] phenotype is suppressed, presumably
due to increased Hsp104 activity that releases functional Sup35 from prion aggregates
(Eaglestone et al., 1999). In this case, stresses transiently decrease the readthrough phenotype
of [PSI+] cells, which diminishes the phenotypic variance instead of promoting it. In contrast,
it was recently reported that signal transducers and stress response genes are prominent factors
modulating the frequency of switching to the prion state. Particularly, stressful conditions
such as oxidative stress or high salt concentrations greatly increase the induction of [PSI+]
phenotypes (Tyedmers et al., 2008).

II. PHENOTYPIC PLASTICITY, GENETIC VARIATIONS AND PHYSIOLOGICAL REGULATION

Natural selection acts on the external manifestation of the genetic information – the
phenotype. The availability of heritable phenotypic variations determines the response of a
population to selection. Phenotypic diversity is not only the consequence of differences in the
genetic makeup of individual organisms but also results from variations in gene expression
(Bennett and Hasty, 2007). This chapter discusses the relationship between physiological
regulation and the diverse mechanisms to generate genetic diversity that have been introduced
above. Overall, the multiplicity of processes involved in phenotypic plasticity provides a more
complete view of evolution than initially conceived (see Figure 26, p113). The concepts of
evolvability and robustness are overviewed in this framework.

II.1. Genetic versus physiological changes

II.1.1. Cybernetic genomes


It has been observed for a long time that individual organisms are able to react and
adapt to their environment. C. Bernard (1813-1878) first defined the concept of homeostasis
to acknowledge the ability of higher metazoans to actively maintain the stability of certain
physiological parameters in the face of varying conditions, via the coordinated action of their
organs. Such processes guarantee the self-consistency of individual organisms and ensure a
relative independence from their immediate environment. The molecular mechanisms underlying
these physiological processes were first described by F. Jacob and J. Monod (1910-1976),
with their molecular studies on the lactose operon of E. coli (Müller-Hill, 1996).
Generally speaking, genetic information is encoded in the DNA sequences that form
the genome of living entities. These sequences constitute information because they are
interpreted by the molecular machinery present in the cell to construct the phenotype. To some
extent, proteins and nucleic acid molecules constitute the cell’s genetic hardware, while the
exact sequence of bases in DNA represents the genetic software – which is decoded by the
hardware. This computer analogy is, however, limited. In a computer the composition of the
hardware is fixed through time, and this ensures the unambiguous interpretation of the soft-
ware’s information according to a defined flow of execution. In a cell, the flow of execution is
not explicitly directed by the structure of the software. Instead, the execution of the software
determines the manufacture of hardware components which – in turn – can affect the interpre-
tation and structure of the software. In a first approximation, DNA sequences are accessible
continuously and the logical decisions directing the execution of this program depend on the
hardware available at a given time point.
In this view, the genetic program results from the interpretation of pure genetic infor-
mation by a dynamically changing molecular machinery. This machinery contains processing
elements (polymerases, ribosomes, tRNAs…) to express the information and controlling
elements (regulators, operators…) to regulate this phenotypic expression. While the processing
fraction is relatively stable, the composition of the controlling pool varies through time in re-
sponse to diverse stimuli. Regulation of gene expression can occur at different levels (i.e.
transcriptional, translational or posttranslational) through various logical mechanisms
including feedback loops (e.g. catabolite derepression during glucose limitation), signaling through
coupled sensor-transducers (e.g. histidine protein kinases) and global responses to stress (e.g.
SOS signaling, see p47). Overall, this physiological regulation allows organisms to exhibit a
prescriptive phenotypic plasticity in response to changes (see below).

II.1.2. Individual and populational adaptation


Living organisms have two major strategies for adapting to environmental changes: i)
random and heritable phenotypic changes mediated by genetic mutation; and ii) programmed,
directed and transient changes in gene expression patterns which only alter the phenotype.
The first strategy relies on the blind force of natural selection iteratively operating in a popu-
lation made of different individual variants. The second requires a set of prescriptive re-
sponses, which allow cells to maximize their fitness in response to a wide range of situations
and in a controlled manner. Because the phenotypic characteristics acquired through such
physiological responses are usually not heritable (but see Bistable regulatory switch, p90),
this form of phenotypic plasticity only affects individual survival transiently, and therefore
has no long-term evolutionary impact. Nonetheless, regulatory responses result from
sustained selection for cells to maintain their homeostasis in the face of external and internal
environmental challenges.
Any mutation affecting a regulatory protein – or one of its cognate DNA binding sites
– can be selected based on its impact on fitness in the current environment. Yet, the effective
evolution of responsive regulation patterns is limited in several ways (Moxon et al., 2006): i)
the stressing conditions must not be too harsh for organisms to survive long enough to express
the response; ii) the environment must provide indicative cues, which can be sensed by cells
to elicit the response. Such cues have to be highly specific to avoid detrimental activation un-
der inappropriate environments. For some stressing conditions, such triggers may be very
scarce; iii) the frequency at which triggering conditions occur heavily constrains the evolution
of a dedicated response. Challenging conditions that are too rare cannot sustain the selective pressure
required to orient the mounting of a sophisticated response. Ideally, the environment should
cycle rapidly to ensure proper selection of advantageous responsive variants – though slowly
enough to avoid selection of constitutive regulation patterns (see Genetic assimilation of long
lasting regulation, p107); and iv) once functional, the sensing and operator systems are likely
to impose a substantial energetic cost on the organism. The maintenance of these systems de-
pends on a trade-off between their permanent cost and the occasional fitness advantages they
confer over time. The outcome of this trade-off ultimately relies on the rate at which
appropriate environmental triggers occur, and rarely demanded systems are expected to be
eliminated by selection. Overall, the repertoire of programmed physiological responses resulting
from these constraints is necessarily limited.
Responsive regulation and natural selection are two processes with different scopes.
Mutations are the sole source of genetic innovations and can influence fitness in different
ways – including regulation – to cope with a very wide range of challenges. However, muta-
tions generally occur randomly in few individuals and are essentially destructive (Eyre-
Walker and Keightley, 2007; and see Figure 1, p24). Their occasional selection by the envi-
ronment slowly impacts the population, depending on the strength of selection. In sharp con-
trast, physiological regulation involves teleonomic systems that responsively orient
phenotypic modifications when exposed to known challenges. This process increases the
adaptedness of all individuals in the population at the same time. Selection is thus a general
and populational process operating on individuals over evolutionary time, while regulation is
a specialized and individual process that simultaneously affects the whole population. To
some extent epigenetic mechanisms form the middle way – blurring the difference between
simple regulation and heritable change. Aside from prions (see p98), epigenetic mechanisms
essentially affect regulation patterns. Owing to self-reinforcing behaviors mediated by posi-
tive-feedback loops, epigenetic phenomena stabilize particular physiological states so far as to
allow their inheritance (Casadesus and Low, 2006). This permits elements of the genetic
hardware that have been specifically expressed in response to stress to constitute historical in-
formation that can durably influence the interpretation of the genomic software. This hystere-
sis enables adaptation to occur on an intermediate timescale between regulation and selection
(Jablonka et al., 1995; Rando and Verstrepen, 2007).

II.1.3. Stochastic switches as a bet-hedging strategy

a - Contingency loci

Although most mutations are irreversible and random in time, space, and nature, several
mechanisms mediating high-frequency, stochastic, heritable, and reversible switching between
well-defined phenotypic states have been presented in detail earlier (see Programmed
generation of genetic variations, pp 60-99). In bacteria, the loci affected by these mechanisms
are often referred to as contingency loci and are involved in phase and/or antigenic variations.
These phenomena are not restricted to bacteria and have also been described in eukaryotic
microbes (yeast, trypanosomes…), as well as in higher metazoans (e.g. immune system and
SSR loci, p64). Contingency loci are hypermutable compared to the genomic background but
channel variations toward specific phenotypes. This generally involves the ON-OFF switch-
ing of a gene’s expression state. Combinatorial variations allow a clonal population to rapidly
diversify into phenotypically distinct subpopulations – which remain almost unaltered at the
genetic level. The actual switching rate and the population size determine the extent of
diversification (De Bolle et al., 2000). Accurate estimations of switching rates in the absence of
selection are often difficult to achieve experimentally. In bacteria, they are commonly found on
the order of one switch in every 10³–10⁵ generations, though rates as high as 10⁻¹ have been
reported in some systems (van der Woude and Baumler, 2004). In most cases, contingency loci
are modifiers of their own mutation rates and are thus hardly affected by recombination, as is
the case with general mutators (see p42).
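
The joint effect of switching rate, population size and number of loci can be made concrete with a short calculation. This is only a sketch: it assumes that loci switch independently between two states, that cells switch independently of each other, and that no selection acts during the generation considered. The function names and the example values are illustrative.

```python
def phenotype_combinations(n_loci):
    """Number of ON/OFF combinations available to n independent two-state loci."""
    return 2 ** n_loci


def prob_at_least_one_switch(pop_size, rate):
    """Probability that at least one cell switches at a given locus in one generation."""
    return 1.0 - (1.0 - rate) ** pop_size


if __name__ == "__main__":
    print(phenotype_combinations(7))
    # Same switching rate (one switch per 10^4 generations), two population sizes:
    print(prob_at_least_one_switch(100, 1e-4))      # small population: variants are rare
    print(prob_at_least_one_switch(10 ** 6, 1e-4))  # large population: variants almost surely present
```

With these assumed values, a small population rarely contains a switched cell in any given generation, whereas a large one almost always does.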
What are the advantages of pre-diversifying a population at such contingency loci?
Broadly speaking, evolution occurs through the selection and rise in frequency of fitter vari-
ants in a population. As most mutations are deleterious, mutation rates are kept as low as pos-
sible on a genomic scale to ensure the maintenance of genetic characteristics on the short term
(see p39). In these conditions, a population may not be variable enough to efficiently respond
to selection. Recurrent selective pressures exerted on a given trait can drive the development
of contingency mechanisms targeted to loci at which increased rate of specific mutations
prove beneficial to the organism. In this light, the subpopulations resulting from the combina-
torial variations of several contingency loci are poised to adapt to a variety of sudden – but
expectable – environmental changes. In contrast to the loss of genetic information that follows
the takeover of a population by one rare advantageous variant (selective sweep), a
subpopulation selected on the basis of contingency loci will be able to rediversify efficiently –
owing to the reversibility of these specific types of genetic mutations. This kind of genetic plasticity
underlies a phenotypic plasticity that is close to the one afforded by physiological regulation.

b - Bet-hedging

The idea that stochastic phenotypic switching can be advantageous in fluctuating envi-
ronments accords well with theoretical studies in evolutionary ecology – and is often referred
to as bet-hedging. In ecology, this strategy implies the diversification of life-history traits ex-
perienced by the progeny of an individual. Classical examples include variable maturation
rates in insects or germination events in plant seeds. Formally, bet-hedging refers to a risk-
spreading strategy which favors genotypes with lower variance in fitness at the cost of lower
mean fitness (Hopper, 1999). The lower variance in fitness reflects the fact that diversified
subpopulations are potentially able to cope with a larger panel of environments. In a given
environment however, only a subpart of the population may be well adapted, resulting in
lower mean fitness. Strictly speaking, this strategy supposes that the phenotypic variability is
expressed from a constant genotype. Indeed, phenotypic diversification can be achieved on
purely regulatory ground in multistable systems (Dubnau and Losick, 2006; Veening et al.,
2008a; and see p79). Contingency loci are close to this ideal because the mutations underlying
phenotypic changes are reversible. Interestingly, such a genetic determinism may be a benefi-
cial feature because it straightforwardly enables a heritable cellular memory of past environ-
mental conditions (Jablonka et al., 1995; Lachmann and Jablonka, 1996). At a given
generation, a population will be essentially biased toward the expression of traits that just
proved advantageous for the parental generation – instead of being randomly drawn from the
whole set of potentially adaptive variation. Such hysteresis is less reliably implemented using
regulatory networks, which are more sensitive to molecular noise (Lim and van Oudenaarden,
2007).
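
The trade-off between mean and variance in fitness can be illustrated numerically. The sketch below compares a specialist genotype with a bet-hedging genotype across two equally frequent environments; all fitness values are hypothetical. It relies on the standard observation that long-term growth over successive generations follows the geometric rather than the arithmetic mean of fitness.

```python
import math


def arithmetic_mean(values):
    return sum(values) / len(values)


def variance(values):
    m = arithmetic_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical fitness in environments E1 and E2 (equally frequent).
# The specialist expresses one phenotype, well adapted to E1 only.
specialist = [2.0, 0.1]

# The bet-hedger splits its progeny evenly between two phenotypes:
# one identical to the specialist, one adapted to E2 (fitness 0.1 / 1.5).
hedger = [0.5 * 2.0 + 0.5 * 0.1, 0.5 * 0.1 + 0.5 * 1.5]

if __name__ == "__main__":
    for name, w in (("specialist", specialist), ("bet-hedger", hedger)):
        print(name, arithmetic_mean(w), variance(w), geometric_mean(w))
```

With these assumed values the bet-hedger has the lower arithmetic mean and the lower variance, yet the higher geometric mean, which is the quantity that matters when environments alternate over time.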
What is the optimal switching frequency in these systems? Obviously, excessively high
frequencies would unduly decrease the mean fitness, while overly rare switches would
negatively impact the fitness variance. Both modeling (Lachmann and Jablonka, 1996; Kussell et
al., 2005; Wolf et al., 2005) and experimental (Acar et al., 2008) studies suggested that sto-
chastic and heritable phenotypic switches would be beneficial when the environment
fluctuates randomly over timescales that roughly match the phenotypic switching rate.
Interestingly, this optimum frequency would be the one observed at the population level if the
switch were controlled by responsive regulation. Another example corroborating these findings
will be presented in the results section (see Evolution of recombination rate in integrons,
p152).
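
The match between switching rate and environmental timescale can be explored with a toy model. The sketch below is not the model used in the cited studies. It assumes two phenotypes, a strictly periodic environment (a simplification of random fluctuations), symmetric switching, and lethal mismatch between phenotype and environment. All parameter names and values are illustrative.

```python
import math


def long_term_growth_rate(s, period, w_match=2.0, w_mismatch=0.0, cycles=50):
    """Mean log growth per generation for a symmetric switching rate s.

    The environment alternates every `period` generations between a state
    favoring phenotype A and a state favoring phenotype B.
    """
    a, b = 0.5, 0.5  # fractions of phenotypes A and B
    log_growth = 0.0
    generations = 0
    for _ in range(cycles):
        for env in (0, 1):  # env 0 favors A, env 1 favors B
            for _ in range(period):
                wa = w_match if env == 0 else w_mismatch
                wb = w_match if env == 1 else w_mismatch
                a, b = a * wa, b * wb  # selection
                a, b = a * (1 - s) + b * s, b * (1 - s) + a * s  # switching
                total = a + b
                log_growth += math.log(total)
                a, b = a / total, b / total  # renormalize to fractions
                generations += 1
    return log_growth / generations


def best_switching_rate(period):
    """Scan a coarse grid of switching rates (0.01 to 0.50)."""
    grid = [i / 100 for i in range(1, 51)]
    return max(grid, key=lambda s: long_term_growth_rate(s, period))


if __name__ == "__main__":
    for period in (10, 20):
        print(period, best_switching_rate(period))
```

Under these assumptions the best rate on the grid is the reciprocal of the environmental period, that is, about one switch per environmental change. Relaxing the lethal-mismatch assumption through `w_mismatch` shifts the optimum, so the result should be read as qualitative.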

c - Genetic switches as crude regulatory controls

Most genetic switches impact the expression of an existing trait rather than generating
functional novelty. On this ground, the phenotypic impact of these systems is comparable to
physiological regulation. Then, why are some phenotypic traits controlled through a pre-
emptive bet-hedging strategy rather than a prescriptive regulation pattern? As highlighted in
the previous section, several factors constrain the evolution of responsive regulation (see
p101). To some extent, the bet-hedging strategy provides an alternative to overcome these
constraints: i) while the time delay between sensing and phenotypic change may not be fast
enough to ensure effective response, the pre-existence of substantial variability guarantees the
immediate survival of a subpopulation upon sudden and instantly lethal challenges (Borst and
Greaves, 1987); ii) a bet-hedging strategy does not require sensing of specific cues and is
probably well suited to cope with a wide range of situations (Wolf et al., 2005); iii) contingency
mechanisms are less costly in terms of molecular machinery and some are easy to set up.
This is particularly true for SSR and gene conversion, which mostly subvert existing mechanisms
of repair. Supporting this idea, it has been noted that regulation by contingency loci is
particularly prevalent in bacteria with smaller genome sizes (Moxon et al., 2006). In addition,
some sophisticated contingency mechanisms can produce phenotypes that are difficult or im-
possible to achieve through physiological regulation – such as coordinating mutually exclu-
sive expression patterns and creating new combinatorial diversity through rearrangements.

d - Link with the lifestyle of organisms

Most examples of stochastic genetic switches are found in microbes, probably because
their biology is particularly well suited to this evolutionary strategy. Microbes tend to rapidly
establish clonal populations with large effective population sizes. In this context, subpopula-
tions differing at their contingency loci can readily differentiate – and heritable switches can
efficiently substitute for regulation. Microorganisms also undergo severe and extremely vari-
able selective pressures. On a macroscopic scale, microbial cells can be considered as sessile
organisms and are often subjected to rapid environmental change without any means of es-
cape (Andrews, 1998). They routinely experience rapid changes in nutrient levels, osmolarity
and exposure to toxic compounds. In particular, pathogens must face a continuous and dynamic
battle against the immune defenses of their hosts. Accordingly, most of the described variability
is in the cell surface and many contingency loci have been implicated in immune
evasion, or possibly in colonization of different host niches (van der Woude, 2006).
Generally, multicellular organisms form smaller populations. Their evolution is then
more influenced by genetic drift, and contingency loci cannot reliably generate a diversified
panel of phenotypes. Furthermore, long generation times rule out genetic switching as an ef-
fective response to rapidly fluctuating conditions, and physiological regulation is a more
appropriate alternative in this context. Besides, cells in multicellular organisms are less exposed
to environmental changes. Remarkably, the immune system of higher eukaryotes relies
on several diversity-generating mechanisms. Although this constitutes the other side of
the arms race with pathogens, the evolutionary implications are different because somatic
changes are not transmissible to the next generation. Most examples of stochastic genetic
switching described in multicellular organisms essentially rely on SSR, the simplest contin-
gency mechanism (see p64). The general significance of these mechanisms in multicellular
organisms may become apparent with further studies (Fondon and Garner, 2004).

II.2. Links between genetic changes and regulation

II.2.1. Impact of expression strength on sequence evolution


Genes are subject to selection when their expression affects the fitness of the organism. Genes
whose products are at the interface with the external environment are often subjected to diver-
sifying selection at specific positions. In contrast, genes that are essential to the functioning of
the organism undergo strong and constant purifying selection. Following S. Wright's (1889-1988)
metaphor of the adaptive landscape (Wright, 1931), continuous exposure to directional
selective pressures traps functional genes at local fitness maxima and heavily constrains
effective exploration of the surrounding landscape (Weinreich et al., 2006).
The evolution of essential and highly expressed genes is particularly constrained. Indeed,
expression strength seems to be the most important determinant of the protein evolution
rate, with more highly expressed genes undergoing fewer non-synonymous changes in a wide
range of organisms (Pal et al., 2001; Rocha and Danchin, 2004; Subramanian and Kumar,
2004; Lemos et al., 2005). Although essential genes tend to display higher expression levels,
the importance of expression strength may be decoupled from the functional activity of the
protein and probably involves a selective pressure to limit translation errors (Drummond and
Wilke, 2008). In contrast, accessory genes whose expression infrequently impacts the pheno-
type, as well as genes that have been subjected to recent duplication undergo relaxed selective
pressures. Even if they do not participate in fitness in a given environment, the evolution of
constitutively expressed genes is constrained because alterations of their product can interfere
with the normal functioning of the cell (negative epistasis). Genes that are silenced through
physiological repression or genetic switch are not subject to any selective pressure.
In a context of relaxed selection, mutations can neutrally accumulate in the population
and evolution is mainly driven by genetic drift. While this introduces a substantial amount of
deleterious mutation and may eventually lead to pseudogenisation, it also provides a favorable
opportunity to generate diversity, and relieve adaptive constraints. This is illustrated by the
frequently observed sub- or neo-functionalization of duplicated genes (Conant and Wolfe,
2008b). Mimicking extended periods of neutral drift recently proved to be an efficient strat-
egy to evolve proteins experimentally (Gupta and Tawfik, 2008). This issue is further dis-
cussed in my work on directed evolution of proteins (see Results – Intrinsic evolutionary
potential of genes, p212) – and its relevance to the functioning of integrons will be addressed
later (see Discussion – Increased rate of cassette evolution, p237).

II.2.2. Genetic assimilation of long lasting regulation


What might happen when sustained environmental conditions induce a physiological
response for a long time? This type of question addressing the connection between short-term
and long-term phenotypic variation was raised by J.M. Baldwin (1861-1934) to deal with be-
havioral traits (Baldwin, 1896). The term Baldwin effect is now used to refer to a scenario in
which a phenotypic change occurring in an organism as a result of its interaction with its envi-
ronment becomes gradually assimilated into its genetic or epigenetic repertoire (Simpson,
1953). This concept was further developed by I.I. Schmalhausen (1884-1963) (Levit et al.,
2006) and C.H. Waddington (1905-1975) (Waddington, 1953). Waddington provided experimental
support for this idea. He evoked phenotypic changes in Drosophila by exposure to
ether, heat, or salt treatment and obtained flies that heritably exhibited new phenotypes in the
absence of treatment after a few generations under selection. More recently, compromising the
activity of the heat-shock protein Hsp90 with environmental stresses, such as temperature,
was shown to increase the phenotypic variation in D. melanogaster (Rutherford and
Lindquist, 1998) and A. thaliana (Queitsch et al., 2002). These phenotypes could be subjected
to selection and stabilized so that their appearance was no longer dependent on the triggering
conditions, thereby demonstrating that Hsp90 is implicated in uncovering cryptic genetic – or
epigenetic (Sollars et al., 2003) – variations (see below pp 115 and 116). These results suggest
that environmentally evoked phenotypic changes can be regarded as new internal environ-
ments to which organisms can adapt genetically – provided they are maintained long enough.

II.2.3. Evolution of regulatory patterns

a - Regulatory networks as evolutionary target

The importance of regulatory variation in evolution has recently been emphasized
(Gerhart and Kirschner, 2007). Indeed, rewiring of existing genetic components can easily
promote the emergence of sophisticated and advantageous phenotypes. To some extent, bio-
logical entities at different levels of organization can be described as arrangements of largely
independent modules. For instance, the diversity of proteins seemingly arose from the combination
of a limited set of functional domains (Caetano-Anollés et al., 2009). In the context of
regulatory networks, modularity refers to a pattern of connectedness in which genes and their
products are grouped into highly connected subsets – i.e. modules – which perform integrated
functions and are more loosely connected to other such groups (Wagner et al., 2007). The
practical idea behind this abstract concept is that modules form functional bricks that – once
evolved on one occasion – can be reused by evolution in the edification of more and more
complex structures. If the notion of modular organization is now widely accepted in biology,
it remains unclear whether consistent modules are mounted by natural selection or emerge as
side effects of other processes (Wagner et al., 2007).
The rearrangement of regulatory networks is particularly flexible because the links be-
tween their constituting elements are weak. Two properties have been put forward to highlight
the impact of such weak regulatory linkages on evolution (Gerhart and Kirschner, 2007): i)
the signal input and response output interact indirectly through an intermediate agency, and
hence do not require stereochemical complementarity to each other; and ii) the output can be
much more complex than the regulatory input because it may be produced by a module that
has been previously built by natural selection – independent of the nature of the signal. The
regulatory input and functional output being decoupled, they need not coevolve. Instead, regu-
latory signals are selectable just for their regulatory value, without regard to their chemical re-
lationship to the response or to their intrinsic instructive capacity. In this light, functional
modules could be rewired and coupled with another module or environmental signals with
only few actual mutations. This process may greatly facilitate the contingent emergence of
complex phenotypic variations over evolutionary time.
The evolutionary implications of weak regulatory linkage were first perceived on very
simple systems: allosteric proteins (Monod, 1970). These proteins comprise both regulatory
and productive activities and thus constitute integrated systems by themselves. They are able to
switch between ON and OFF states of activity – the OFF state being usually preferred
intrinsically. Regulatory agents operate a state selection simply by binding preferentially to one
or the other conformation. Any regulator stabilizing the ON state is an activator, while any fa-
voring the OFF state is an inhibitor. The activity and inactivity states are built into the protein
and regulators only influence the conformation of the protein. Consequently, the actual function
of the enzyme and its regulation are partly decoupled. The segment mediating
interaction with regulators can evolve largely independently of the functional sites.
This property readily allows diversification of an ancestral allosteric protein, with variants ei-
ther deriving new activities in response to the same regulatory inputs or displaying the same
activity in response to new triggers. Other examples of weak regulatory linkage involve more
complex modules at higher levels of organization (see Gerhart and Kirschner, 2007 and references
therein). The incorporation of the integron system into the SOS regulon presented in the
Results section also provides a striking illustration of how complex phenotypes can emerge
from the association of two preexisting modules (see p171).

b - Switching regulation patterns

Although contingency loci only affect their own single locus, their variations can have
far-reaching phenotypic consequences. Indeed, the expression of several regulatory proteins
is known to phase vary (van der Woude and Baumler, 2004; van der Woude, 2006). The
expression state of all genes in the corresponding regulons is indirectly affected by such
switches. Documented examples of global regulators include Mga, a virulence-associated
regulatory protein in S. pyogenes (Bormann and Cleary, 1997); BvgS, a member of the global
two-component BvgAS regulatory system in Bordetella pertussis (Mattoo et al., 2001); and
possibly the type III restriction-modification system coded by the mod gene in H. influenzae
(Srikhanta et al., 2005). More local regulators can be involved in the coordination of specific
phenotypes – as exemplified by fimbriae, whereby crosstalk exists between fim and homologues
of the pap operon (van der Woude, 2006). Besides, the contribution of contingency
loci to the diversification of regulatory patterns can be more subtle. For instance, the binding
affinity of a transcription factor is modulated by a mononucleotide tract in the operator site of
an adhesin-coding gene in N. meningitidis (Martin et al., 2005). Thus, mechanisms facilitating
mutations can further promote the evolution of weak regulatory linkage.

II.2.4. Physiological regulation of mutagenesis

a - Stress-induced mutagenesis can be spatially confined

Several phenomena suggesting a profound connection between major stress responses
and increased mutagenesis have been presented earlier (see Stress-induced mutagenesis, p47).
Conceptually similar genome-wide instabilities induced by a variety of stressing conditions
have been experimentally reported in various bacteria, yeast and human cancer cells. For a
comprehensive review, see Galhardo et al., 2007. There is no well defined or universal mo-
lecular mechanism responsible for stress-inducible mutagenesis, but rather a collection of in-
terconnected processes and recurring themes. Global stress responses coordinate the
expression of proteins that alleviate the deleterious effects of inadequate environmental condi-
tions on the organism. Because genomic alterations heavily compromise survival, it is not
surprising that mechanisms involved in DNA repair are incorporated in these responses.
However, the processes ensuring genome maintenance are not error-free, and increased activity
in times of stress incidentally promotes the fixation of mutations. By avoiding the constant
production of deleterious mutations, such phenomena promote the evolvability of populations
when it is most needed (see below Evolvability, p111).
Stress-induced mutagenesis essentially bestows a temporal control over diversification
on a genome-wide scale. However, spatial targeting of mutations may be superimposed in
some instances. In E. coli, the coincident induction of the RpoS and SOS responses defines a
specific hypermutable state, whereby the relatively error-safe repair of DSBs is switched to a
DinB-dependent highly mutagenic process (Ponder et al., 2005). Because the repair of DSBs
is restricted to the region surrounding the lesion, such a hypermutagenic switch would actually
result in a targeted increase in mutations. The location of DSBs being random, the whole genome
would be collectively affected at the population level. But only regions near DSBs
would be mutagenized in individual cells, thereby limiting the overall mutational load
(Galhardo et al., 2007).

b - Targeted mutagenesis can be regulated

Contingency loci readily achieve spatial confinement of mutation rates – enabling a
genetic bet-hedging strategy (see p103). Nevertheless, in a number of cases environmental
signals and intercellular regulatory networks can be integrated with – or superimposed on –
some switching mechanisms (van der Woude, 2006). The most straightforward examples concern
those contingency mechanisms that are closely dependent on DNA repair processes. For
instance, mutability at SSR loci is influenced by Pol IV and the MMR machinery (see
Extrinsic influences affecting SSR mutation rates, p70) – whereas gene conversion events depend
on the recombination apparatus. Any stress affecting these repair mechanisms can impact
the variation rate at such contingency loci. Natural populations of N. meningitidis and H.
influenzae contain a significant proportion of mutL general mutators. As both bacteria encode
a large number of SSR contingency loci, there might be a synergistic effect between increased
bet-hedging and impaired MMR (Denamur and Matic, 2006; Moxon et al., 2006).
The most striking examples of complex coupling between contingency mechanisms and
physiological regulation concern the promoter inversion in the fim system and the epigenetic
switch at the pap operon. The mechanisms involved in their regulation have been detailed
previously (see pp 87 and 98, respectively). In both systems, the switching frequencies are
regulated by a battery of cellular factors that are likely to provide integrated cues with regard
to the environment of the cell. Because switching events remain essentially stochastic, the
overall strategy still corresponds to bet-hedging. Nonetheless, these systems gain information
so as to tune their bet to fit specific environments. These examples are among the most docu-
mented illustrations of genuine directed mutations resulting in inheritance of the acquired
characters. One can appreciate the extent to which the apparent teleological outcome reflects
the combined action of chance with a sophisticated apparatus teleonomically designed to produce
variability (see Appendix, p253).

II.3. Evolvability and robustness

II.3.1. Evolvability
A classical result of population genetics is that the rate of adaptation in a population is
proportional to the genetic variance in fitness in this population (Fisher, 1930). However, the
entities under selection are not genotypes but their physical embodiment – phenotypes. The
parameter that really constrains the rate of adaptation is thus the phenotypic variance – and
specifically the availability of heritable phenotypic diversity in the population. In this context,
the notion of evolvability refers to the capacity to evolve through the generation of heritable
variations in fitness. Two distinct positions concerning evolvability can be contrasted in the
literature (Sniegowski and Murphy, 2006): i) organisms evolved the capacity to evolve – i.e.
evolvability is an adaptation resulting from second-order selection; and ii) evolvability is a
by-product of other mechanisms that evolved to directly benefit the organisms (first-order
selection).
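
The classical result recalled at the start of this section can be checked in its simplest setting. The sketch below assumes an asexual haploid population with discrete generations, in which the change in mean fitness over one round of selection equals the variance in fitness divided by the mean fitness. Fisher's theorem in its general form concerns additive genetic variance; the frequencies and fitness values used here are hypothetical.

```python
def mean_fitness(freqs, fitness):
    return sum(p * w for p, w in zip(freqs, fitness))


def fitness_variance(freqs, fitness):
    m = mean_fitness(freqs, fitness)
    return sum(p * (w - m) ** 2 for p, w in zip(freqs, fitness))


def select(freqs, fitness):
    """Genotype frequencies after one generation of selection."""
    m = mean_fitness(freqs, fitness)
    return [p * w / m for p, w in zip(freqs, fitness)]


if __name__ == "__main__":
    freqs = [0.2, 0.5, 0.3]    # hypothetical genotype frequencies
    fitness = [0.8, 1.0, 1.3]  # hypothetical fitness values
    before = mean_fitness(freqs, fitness)
    after = mean_fitness(select(freqs, fitness), fitness)
    print(after - before, fitness_variance(freqs, fitness) / before)
```

A population without heritable variance in fitness therefore cannot respond to selection in this model, which is the sense in which variance constrains the rate of adaptation.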
The first point may appear teleological. Indeed, natural selection cannot adapt a popu-
lation for future contingencies any more than an effect can precede its cause (Dickinson and
Seger, 1999). As discussed previously, the generation of diversity is strongly promoted under
fluctuating selection (see p42). Even if such conditions are met, the concept of evolvability as
an adaptation implicitly involves selection among populations rather than individuals in this
particular context. Indeed, the average fitness effect of mutations strongly influences the optimal
mutation rate (Orr, 2000). As most random mutations are deleterious, a modifier driving
increased mutation rates would generally be counter-selected at the individual level, even
though it would increase the evolvability of the population. Mutator alleles impose a heavy
deleterious burden on the population over time, and continuous rise in global mutagenesis is
not a sustainable strategy (see p44). Because the average fitness effect actually depends on the
adaptedness of the population (Eyre-Walker and Keightley, 2007), increased mutation rates
are arguably more advantageous in adverse conditions. To some extent, mechanisms of stress-induced
mutagenesis limit the production of mutations to such situations. However, the mutability
they afford often arises as an opportune side effect of processes that evolved for more immediate
functions (Redfield, 2001; Matic et al., 2004). In these cases, there is no ground to
support the first hypothesis rather than the more parsimonious idea that evolvability is a by-
product of simpler functions subjected to first-order selection (Tenaillon et al., 2004; Gal-
hardo et al., 2007).
Nonetheless, the mechanisms underlying phenotypic variations are not limited to increases
in genome-wide mutagenesis. Several mechanisms capable of generating stable phenotypic
variations have been presented hitherto. Most of these variations being teleonomically
confined to restricted areas of the phenotypic space, such processes limit the production of
deleterious mutations that hampers the evolutionary success of global modifiers. While some
of these mechanisms subvert other cellular functions (e.g. SSR and gene conversion), they
cannot be considered as mere by-products – but are better described as specialized mutators.
Hence, all kinds of contingency loci rather support the “evolvability-as-adaptation” hypothe-
sis (Sniegowski and Murphy, 2006). Because independent modules can be easily reused (see
above, p108), the modularity of biological entities is often presented as a factor promoting
evolvability. In this light, evolution appears as a self-promoting and explosive process. The
concept of modularity being fairly young in biological science, it remains unclear if this property
is selected to favor evolutionary tinkering – or if it arises as a by-product of other constraints
(Wagner et al., 2007).
Overall, the combination of global processes with a range of specialized and targeted
mechanisms ensures adaptability to a wide panel of situations (see Figure 26). Although es-
sentially deleterious, genome-wide mutagenesis is essential in producing unbiased genetic in-
novations. This process is particularly inefficient and slow because it acts blindly at the
sequence level. In contrast, physiological regulation and contingency loci are programmed by
natural selection to rapidly and/or responsively create functional phenotypic variability – but
not true novelty. Besides, elements facilitating the exchange and acquisition of DNA sequences
enable the instantaneous gain or loss of whole functions. Thus, not only do different
processes reach different phenotypic scopes, but they also operate on different timescales,
thereby increasing the flexibility of the organism’s response to change (Jablonka et al., 1995;
Rando and Verstrepen, 2007).

II.3.2. Robustness
Some phenotypes are inherently less sensitive to perturbations. This property is often
referred to as robustness (de Visser et al., 2003) or canalization (Waddington, 1942b). Pheno-
typic robustness is likely to be a beneficial feature because adapted traits must be kept stable
in order to ensure the maintenance of fitness over time. Robustness appears at various levels
of biological organization, including gene expression; protein folding; metabolic flux; physio-
logical homeostasis; development and – ultimately – organism fitness. The mechanisms un-
derlying robustness are diverse – ranging from thermodynamic stability at the RNA and
protein level to behavior at the organism level. Phenotypes can be robust either against herita-
ble perturbations (e.g., mutations) or non-heritable perturbations (for instance noise or envi-
ronmental variations). These phenomena are referred to as genetic and environmental
robustness, respectively (Elena et al., 2006). Moreover, robustness can be an intrinsic property
of a phenotype or be extrinsically mediated by a dedicated mechanism.
The robustness of a trait faced with heritable perturbations depends on its so-called target
size. The more proteins are involved in the correct expression of this trait, the higher the odds
for the trait to be altered by mutation. Practically, the target size is difficult to estimate be-
cause it relies on a multiplicity of parameters such as the overall sequence size, the sequence
composition and the tolerance of each protein to mutations. The structure of the genetic code
illustrates a straightforward mechanism to modulate the intrinsic target size of a gene. Each
group of 6-fold degenerate codons (corresponding to leucine, arginine and serine) can be decomposed
into two groups of 4 and 2 codons, wherein only the last base of the codon varies.
In the smaller group, two of the three possible substitutions at the third position of a codon result in
amino-acid changes at the protein level. In contrast, any such mutation affecting a codon of
the larger group leads to a synonymous codon, and hence would not impact fitness (but see
Results – Intrinsic evolutionary potential of genes, p212). Thus, using a codon from
the larger group decreases the mutational target size of the gene – thereby supporting purely genetic
robustness. It has been claimed that this property – termed codon volatility – could be
used to detect selection based on a single sequence (Plotkin and Dushoff, 2003; Plotkin et al.,
2004). While this possibility is unlikely (Dagan and Graur, 2005; Sharp, 2005; Plotkin et al.,
2006), it remains that low volatility could be selected to minimize error during translation
(Archetti, 2006). The redundancy of the genetic code can also be exploited to avoid the occurrence of SSRs,
thereby preventing increases in local mutation rates when robustness is desirable (Wanner et al.,
2008; Ackermann and Chao, 2006). Besides, large populations of proteins subjected to intense
neutral drift under purifying selection and high mutation rates are enriched in stable variants
(Bloom et al., 2007; Bershtein et al., 2008), demonstrating that intrinsic robustness can be ef-
fectively selected experimentally. Another factor influencing the target size is functional re-
dundancy arising from gene duplication, polyploidy and alternative metabolic pathways.
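
The effect of codon choice on the mutational target size discussed above can be counted directly from the standard genetic code. The sketch below is a simplified illustration that considers only single substitutions at the third codon position and weights them equally; it is not the volatility measure of the cited studies, and the function name is an assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_position_synonymous_fraction(codon):
    """Fraction of third-position substitutions that keep the amino acid."""
    original = CODON_TABLE[codon]
    alternatives = [codon[:2] + base for base in BASES if base != codon[2]]
    synonymous = sum(1 for alt in alternatives if CODON_TABLE[alt] == original)
    return synonymous / len(alternatives)


if __name__ == "__main__":
    # The six leucine codons: four-codon group (CTN) and two-codon group (TTA, TTG)
    for codon in ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG"):
        print(codon, CODON_TABLE[codon], third_position_synonymous_fraction(codon))
```

Codons of the four-codon group return 1.0, meaning that every third-position substitution is silent, whereas codons of the two-codon group return a lower value and thus expose the encoded residue to change.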
Mechanisms mediating extrinsic robustness are best illustrated by chaperones – a spe-
cific subset of the heat-shock proteins (Hartl et al., 1994). Chaperones belong to structurally
unrelated protein families that share the ability to recognize and bind to aberrant protein con-
formations. Under normal physiological conditions, chaperones are involved in guiding pro-
tein folding during translation; assisting the entrance of newly synthesized polypeptides into
organelles; or facilitating the building of multimeric complexes. Under stressful conditions,
chaperones prevent misfolding and aggregation – or can even actively disaggregate damaged
proteins and restore their proper conformation. Some mutations that would normally affect the
folding or stability of the protein can be rescued by chaperones, which thus canalize the phe-
notypic expression of genetic mutations. In E. coli, the overproduction of the chaperone
GroEL is able to partly rescue fitness losses resulting from intense genetic drift (Fares et al.,
2002). In D. melanogaster (Rutherford and Lindquist, 1998) and A. thaliana (Queitsch et al.,
2002), the Hsp90 chaperone is also specifically involved in stabilizing several signaling pathways.
Whether extrinsic robustness is an adaptation or a contingent phenomenon will be discussed
below (see p116). Chaperone genes are upregulated in response to various stresses.
Among different species, the thresholds for expression are correlated with the levels of stress
that they naturally undergo (Feder and Hofmann, 1999). By rescuing damaged proteins, these
proteins mediate robustness to non-heritable environmental perturbations.
It is likely that environmental and genetic robustness are linked. Notably, RNA se-
quences that have been evolved in silico toward the ability of folding into a given structure
across a wide range of temperatures are also less prone to structural change as a consequence
of mutations. Interestingly, the selection for increased stability also promoted the emergence
of modularity within the RNA molecules (Ancel and Fontana, 2000). In this particular case,
genetic robustness incidentally arose as a consequence of environmental robustness. The
aforementioned selection of proteins with increased stability exemplifies the converse relationship
(Bloom et al., 2007; Bershtein et al., 2008) – wherein increased stability is selected to
favor genetic robustness to high mutation load, but also favors tolerance to a wider range of
temperatures.

II.3.3. Links between robustness and evolvability


While evolvability promotes the adaptability of populations faced with new environments,
robustness stabilizes phenotypes in the face of adverse conditions. Although these two
processes seemingly carry distinctly opposite functions, their relationships are often ambiguous.
Indeed, mechanisms that generally promote robustness can also favor evolvability in some
situations, and vice versa.
The buffering of phenotypic variations enlarges the neutral space accessible to a popu-
lation and thereby also widens the exploration of the adaptive landscape. By supporting the
diversification of a population over time, robustness can thus increase its evolvability. This
process cannot occur if the tolerance to mutation is purely grounded on genotypic robustness
– as is the case with codon volatility (see above p114). In silico evolution of RNA molecules
demonstrated that selection for genotypic robustness effectively antagonizes evolvability,
while selection for general phenotypic robustness increased evolvability (Wagner, 2008). In a
population experimentally subjected to directional selection, a mutation promoting the intrinsic
stability of the target protein was shown to increase the fraction of advantageous amino
acid changes (Bloom et al., 2006). In this case, intrinsic robustness promoted evolvability be-
cause thermodynamic stability affects the phenotype of the gene – not its genotype.
The proteins involved in extrinsic robustness are often referred to as evolutionary capacitors.
Their alteration can release phenotypic canalization – uncovering hitherto hidden
variations. In the case of Hsp90, inactivation of the protein; modulation of its expression level
in response to environmental cues; or diversion from its usual targets through stress-mediated
saturation effectively lead to phenotypic diversification in D. melanogaster (Rutherford and
Lindquist, 1998) and A. thaliana (Queitsch et al., 2002). The exact mechanisms underlying
these changes are unknown, but they can be maintained independently of Hsp90 after selection
for a few generations – thereby increasing evolvability (see Genetic assimilation of
long lasting regulation, p107). Interestingly, expression of the GroES/EL chaperone is under
the control of a DNA inversion switch in B. fragilis (Kuwahara et al., 2004; and see p83).
This mechanism may promote the stochastic or controlled release of hidden phenotypic varia-
tions.
The hide-and-release of genetic variations is probably an intrinsic property of epistatic
genetic systems subjected to environmental interactions (Hermisson and Wagner, 2004;
Zhang, 2008). Computer simulations showed that inactivating a gene in a regulatory network
that has previously been evolved for phenotypic stability increases the phenotypic variance
and speed of adaptation toward a new optimal phenotype (Bergman and Siegal, 2003). In S.
cerevisiae, many single gene deletions increase the phenotypic variance of the cells, and can
thus act as evolutionary capacitors. These genes tend to be less dispensable with respect to
growth rate and are highly connected within cellular networks (Levy and Siegal, 2008). In
these reports, robustness is inferred from the phenotypic effects induced by genetic alterations
rather than from actual tolerance to perturbations. To a large extent, the phenomenon of ca-
pacitation can be regarded as an emergent feature of complex biological systems – and not
necessarily as an adaptation for canalization nor a consequence of robustness. The perceived
role of Hsp90 in extrinsic robustness could therefore simply reflect its central position in cel-
lular networks. Nonetheless, it has been proposed that selection for robustness can affect the
topology of interaction networks. The effect of a mutation increases with the number of char-
acters it affects – its pleiotropy. By reducing the number of negative pleiotropic effects per
mutation, selection for robustness can limit the interaction to a restricted set of elements –
thereby defining a module (Cooper et al., 2007b; Wagner et al., 2007). In this light, modular-
ity would be the consequence of selection for robustness. Yet modularity probably favors
evolvability on a larger timescale.
To some extent, processes that are evolvable at a given level of organization may
guarantee robustness higher in the hierarchy. For instance, the diversification of outer-
structures using contingency loci ensures the robustness of microbial infection within their
host. In this regard, RNA viruses constitute an extreme case. These entities exhibit the largest
mutation rates monitored across different taxa, which underlies their evolvability (see Figure
2, p32). The median selection coefficient against single mutations across different RNA viruses
has been estimated at ca. 10.8% per generation – a figure that contrasts sharply with the
ca. 1.7% measured in DNA organisms (Elena et al., 2006). Thus, RNA viruses are poorly robust
at the genetic level. Now, the efficiency of natural selection increases with population
size and selection coefficient. These two parameters probably act in positive synergy within
populations of RNA viruses – which are typically very large. The efficiency of selection may
then drive the effective elimination of unfit variants to the benefit of non-mutated
genotypes, thereby promoting robustness at the clonal population level. Such a strategy has
been referred to as anti-redundant in contrast to direct robustness mechanisms (Krakauer and
Plotkin, 2002).
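
As a rough illustration of why a larger selection coefficient speeds up the purging of unfit variants, the sketch below iterates a textbook deterministic haploid selection model, p' = p(1 - s) / (1 - s p), with the two median coefficients quoted above. The model, the function names and the starting and target frequencies are assumptions made for illustration only: the sketch ignores drift, recurrent mutation and population structure, and it is not taken from Elena et al. (2006).

```python
def next_frequency(p, s):
    """One generation of haploid selection against a variant of fitness 1 - s.

    p is the current frequency of the deleterious variant (assumed model).
    """
    return p * (1.0 - s) / (1.0 - s * p)


def generations_to_purge(s, start=0.5, target=0.01):
    """Count generations until the variant frequency falls to or below target.

    start and target are arbitrary illustrative values, not measured data.
    """
    p = start
    generations = 0
    while p > target:
        p = next_frequency(p, s)
        generations += 1
    return generations


# Median selection coefficients quoted in the text (10.8% and 1.7%).
for label, s in (("RNA viruses", 0.108), ("DNA organisms", 0.017)):
    print(label, generations_to_purge(s))
```

Under these assumptions the variant is purged several times faster with the RNA virus coefficient, which is the qualitative point made in the text; the absolute generation counts depend entirely on the arbitrary start and target frequencies.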

III. THE INTEGRON GENETIC SYSTEM

III.1. Overview of the system

III.1.1. A brief historical perspective


Most antibiotics are natural compounds synthesized by microorganisms. Although
their actual role in the wild has been questioned (Davies, 2006), they exhibit significant bacte-
ricidal or bacteriostatic activities by interfering with essential cellular processes of bacteria at
high concentrations. The unprecedented therapeutic benefits of antibiotics were readily illustrated
during the Second World War, shortly after the initial discovery (Fleming, 1929) and
subsequent purification (Chain et al., 1940) of penicillin. This success inaugurated a new era
in the fight against bacterial pathogens, and initiated a massive effort toward the identification
and industrial production of new antibiotics.
The spontaneous appearance of rare resistant clones did not go unnoticed at the time. However,
mutations were then fundamentally regarded as discrete, independent and random events
(see Appendix), and an early study even confirmed this assumption in the case of antibiotic
resistance (Lederberg and Lederberg, 1952). In this context, the concomitant development of
resistance to several antibiotics was largely considered beyond the adaptive potential of bacterial
populations. As exemplified in the previous chapter, bacteria have developed numerous genetic
tools and strategies to overcome the drastic environmental changes they routinely undergo. In
hindsight, it is not surprising that clinical isolates of Shigella dysenteriae that were simultaneously
resistant to four antibiotics (streptomycin, tetracycline, chloramphenicol and sulphonamides)
were identified shortly after the introduction of these antibiotics in the 1950s
(Mitsuhashi et al., 1961). Today, multi-resistance has become a major public health issue.
Some pathogenic bacterial strains are virtually resistant to all known antibiotics. Novel classes
of antibiotics are ever more difficult to isolate and most of the innovations in the drug industry
rely on slight modifications of existing chemical scaffolds. Resistance determinants adapt so
quickly to these minor changes that drug development is hardly viable economically.
We now know that different strategies can result in resistance phenotypes: i) modification
of the antibiotic target; ii) bypass of the targeted pathway; iii) inactivation of the antibiotic;
iv) modification of membrane permeability; and v) active efflux of antibiotics
(Tenover, 2006). The genetic requirements to achieve these functions range from mere point
mutations to the acquisition of whole operons. It is clear that pre-existing genetic mechanisms
have been recruited to support the rapid evolution, acquisition and spread of multi-resistant
functions. Several genetic elements, nested within one another like Russian dolls, were
implicated in the initial 1950s outbreak of multi-resistant Shigella dysenteriae and in those
that followed. During the 1970s, it was determined that multi-resistance phenotypes are
frequently associated with transmissible plasmids and more specifically with transposable
elements located in these plasmids (Liebert et al., 1999). The genetic
system responsible for the gathering of resistance determinants on these mobile elements was
first described in the late 1980s (Martinez and de la Cruz, 1988) and termed integrons (Stokes
and Hall, 1989). It is now clear that such mobile integrons constitute the major vectors of
antibiotic multi-resistance in Gram-negative and to a lesser extent in Gram-positive bacteria.
Their importance in clinical and agricultural settings is reflected by the impressive number of
epidemiological studies monitoring their prevalence and evolution (see Figure 27).
More recently, a much bigger integron was found on a chromosome of Vibrio
cholerae (Mazel et al., 1998) and similar loci were further identified in a significant fraction
of environmental bacteria (Rowe-Magnus et al., 2001; Boucher et al., 2007). Such chromo-
somal integrons are ancient and relatively sedentary adaptive systems developed by bacteria
to face a changing world. Their discovery provided a paradigm to understand the emergence
of the multi-resistance integrons.

III.1.2. Structure of integrons


All integrons are composed of a stable platform, which contains the functional ele-
ments required for the functioning of the system, associated with a variable array of discrete
gene cassettes encoding accessory functions (see Figure 28).

a - The functional platform

The principal component of the integron functional platform is the intI gene which en-
codes a site-specific tyrosine recombinase. This enzyme catalyzes the specific excision and
integration of dedicated and discrete genetic elements, known as gene cassettes (Stokes and
Hall, 1989). The integration of cassettes essentially occurs at a specific locus lying immediately
adjacent to intI, referred to as the primary recombination site attI (Collis et al., 1993). The
expression of the genes contained in the integrated cassettes is ensured by a dedicated promoter
Pc which is generally embedded in the intI gene and oriented toward attI (Collis and Hall,
1995). In itself, the functional platform is stable and non-mobile.

b - The cassette array

Successive integration at the attI site results in the streamlined assembly of different
gene cassettes (Recchia et al., 1994). This cassette array constitutes the versatile part of the
integron. Gene cassettes are minimal functional elements intended to be mobilized by the
integrase of integrons. They are generally constituted by a single ORF immediately followed by
a recombination site termed attC specifically recognized by IntI. Cassette-borne genes are
generally promoterless and their expression is hence conditioned by the proximity of an external
promoter, essentially Pc. Accordingly, the ORFs in cassettes are usually oriented toward the
attC site. The excision of cassettes by the IntI integrase leads to non-replicative covalently
closed circular intermediates (Collis and Hall, 1992).

III.1.3. Different flavors of integron


Although they present a similar organization, mobile and chromosomal integrons display
distinctive characteristics that reflect different evolutionary histories. This section outlines
the paradigmatic characteristics of both groups. With the accumulation of genomic data,
it is becoming clear that a continuum of intermediate forms exists between these two ex-
tremes.

a - Mobile integrons

Mobile integrons correspond to functional platforms that are physically associated
with mobile DNA elements, such as TEs and conjugative plasmids. These elements are used
as natural genetic vehicles, enabling efficient transmission between bacterial individuals of the
same or different species. Mobile integrons contain few cassettes of heterogeneous origins
that are probably collected successively in different genomic backgrounds. The longest array
identified is composed of 8 cassettes (Naas et al., 2001a). The heterogeneity of the cassettes is
attested by the unusual codon usage of their associated ORFs and by the sequence and size
diversity of their attC sites. Contrasting with their heterogeneous origins, cassettes associated
with mobile integrons display a striking functional homogeneity. A pool of >130 different
cassettes harboring antibiotic resistance genes has been identified in mobile integrons (based on
a 98% nucleotide identity threshold) (Partridge et al., 2009). Together, these cassettes provide
resistance to most classes of antibiotics including β-lactams, all aminoglycosides, chloramphenicol,
trimethoprim, streptothricin, rifampin, erythromycin, fosfomycin, lincomycin, quinolones
and antiseptics of the quaternary-ammonium-compound family (Mazel, 2006;
Partridge et al., 2009). Only a few cassettes in mobile integrons from clinical isolates harbor
ORFs of unknown functions.
Five different classes of mobile integrons have been defined to date, based on the sequence
of the encoded integrases (40–58% identity). Although only the first three have
been historically involved in the spread of multi-resistance phenotypes, all five classes have
been associated with antibiotic-resistance determinants (Mazel, 2006). Class 1 integrons are
the most widespread and clinically important, as they are detected in 22 to 59% of Gram-negative
clinical isolates (Labbate et al., 2009). As such, they constitute the major experimental
model of integrons. They are associated with functional and non-functional transposons derived
from Tn402 that can be embedded in larger transposons, such as Tn21 (see Mounting of
mobile integrons, p137). Class 2 integrons are exclusively associated with Tn7 derivatives.
The integrase gene of class 2 integrons, intI2, contains a nonsense mutation that yields a
non-functional protein. Class 3 integrons are also thought to be located in a transposon and are
more prevalent than class 2. The other two classes of mobile integrons have been identified
through their involvement in the development of trimethoprim resistance in Vibrio species.
The class 4 integron is embedded in the integrative and conjugative element SXT found in Vibrio
cholerae. The class 5 integron is located in a compound transposon carried on the pRSV1 plasmid
of Aliivibrio salmonicida.
Interestingly, class 1 (Stokes et al., 2006) and class 3 (Xu et al., 2007) integrons that
are not associated with resistance genes have been recovered in environmental bacteria. In
each case, these integrons carried cassettes of unknown functions. In addition, a functional
class 2 integron isolated from beef cattle was associated with four non-antibiotic-resistance
gene cassettes (Barlow and Gobius, 2006). These data indicate that mobile integrons are not
specifically dedicated to antibiotic resistance. The prevalence of these functions likely results
from biased sampling focused on clinically relevant environments and reflects the evolutionary
success of integrons in these settings.

b - Chromosomal integrons

The integron described on the small chromosome of V. cholerae serotype O1 biotype
El Tor strain N16961 is the paradigmatic example of a chromosomal integron (Mazel et al.,
1998). Its array contains 179 cassettes and spans ca. 3% of the whole genome. Contrasting
with the largely variable attCs found in mobile integrons, 149 cassettes comprise attC sites
that differ in sequence by less than 10% over their entire length of 122 to 124 nucleotides (see
Figure 29).
Chromosomal integrons have been found in a wide panel of bacterial species. A list of
representative examples is shown in Table 4. The cassette arrays of chromosomal integrons can
contain a much larger number of cassettes (up to 217 in V. vulnificus), though some contain
only a few or even no cassettes. The homogeneity of attC sites within a single integron is not
confined to V. cholerae but has also been observed in V. fischeri, V. metschnikovii, P. alcaligenes,
P. stutzeri, X. campestris, and T. denticola (Labbate et al., 2009). Homologous attC sites are
essentially species specific and define a genetically relevant typology (Rowe-Magnus et al.,
2001; Rowe-Magnus et al., 2003; and see Figure 29). To a large extent, chromosomal integrons
are sedentary residents in their host genomes (Rowe-Magnus et al., 2001; Boucher et al.,
2007). This point will be further discussed in the light of IntI phylogeny (see Chromosomal
integrons are ancient and widespread structures, p135).
In sharp contrast with mobile integrons, chromosomal integrons maintain highly diverse
cassettes – mostly of unknown functions. The analysis of Vibrionales genomes available
in 2007 led to the identification of 1677 cassettes (Boucher et al., 2007). Among these, 65%
have no homologues in the databases; another 4% correspond to proteins with homologues of
unknown functions; while 6% can only be assigned a vague general function. Altogether, 75%
of the cassette pool corresponds to accessory genes of undefined functions – which emphasizes
the importance of integrons in gathering genetic diversity. These data parallel the observations
made in environmental mobile integrons mentioned above. The remaining 25% of
cassettes contain genes with a wide functional distribution (see Figure 30). The most prevalent
functions are phage-related proteins; toxin-antitoxin systems; acetyltransferases; DNA
modification and virulence. The few functions which have been experimentally confirmed
include restriction or methylation systems, sulfate-binding proteins, lipases, polysaccharide
biosynthesis and dNTP pyrophosphohydrolases (Rowe-Magnus et al., 2001; Smith and
Siebeling, 2003; Robinson et al., 2008). As mentioned above, most cassettes in large integrons
are silent – thus any mutation affecting these cassettes is expected to be neutral. In this light,
the overrepresentation of phage related functions in integrons probably reflects the neutral in-
cidence of mobile element insertions in silent cassettes. Large cassette arrays are enriched in
Toxin-Antitoxin (TA) systems – also known as post-segregational killing systems, which en-
code a stable toxin and its unstable cognate antitoxin. The genome of V. cholerae N16961
contains 13 TA loci, all of which are present in the integron array. Recently, these addiction
modules were shown to maintain the stability of the cassette array, by preventing excision and
loss of the surrounding silent cassettes (Rowe-Magnus et al., 2003; Szekeres et al., 2007).
A substantial part of the functions identified in chromosomal integrons are involved in
substrate modification (acetyltransferases) or interactions with biotic factors (virulence factors
and DNA modification). About 10 to 30% of the cassettes potentially encode proteins carry-
ing a signal peptide region for either membrane association or export from the cell (Koenig et
al., 2008). Besides, we found that ca. 30% of the cassette-encoded proteins display signatures
of multiple transmembrane domains (unpublished observations). Altogether, these data indicate
that chromosomal integrons carry important functions to mediate interactions with external
environments.

III.2. Functional organization of integrons

III.2.1. A unique site-specific recombination mechanism

a - Double and single stranded recombination substrates

i. Integron integrases are tyrosine recombinases

Integron integrases belong to the family of tyrosine recombinases (see p82), though
they exhibit an additional and specific functional domain compared to other closely related
recombinases (Messier and Roy, 2001). These enzymes usually perform recombination between
two DNA regions by establishing a synapse between their cognate binding sites and
subsequently resolving the resulting HJ. Their typical core recombination sites consist of a
pair of highly conserved 9-13 bp inverted binding sites separated by a 6-8 bp central region
(Grainge and Jayaram, 1999; Grindley et al., 2006). As described below, attI sites significantly
differ from this canonical organization, while attC sites are processed in a very
unconventional way.

ii. Structure of attI sites

The core recombination site of attI is composed of two binding sites termed L and R. The
recombination point is located in a conserved 5’-GTT-3’ triplet between G and TT (Hall et al.,
1991). The inverted L binding site, always degenerate with respect to R, is hardly
recognizable. In addition, the central region differs
greatly between different attI sites. In vitro experiments performed with the class 1 integrase
IntI1 on its cognate double-stranded attI1 site demonstrated that four regions are actually
bound by the enzyme. Two of these regions correspond to the core site, while the other two
regions, dubbed DR1 and DR2 form direct repeats located 5' to the core site (Gravel et al.,
1998; Collis et al., 1998). The structure of attI1 is shown in Figure 31. The attI sites from
different integrons diverge significantly, paralleling the pattern observed for integrases
(Rowe-Magnus et al., 2001). Cross recombination studies involving attI sites and IntI integrases
of heterologous origins showed that integrases preferentially recognize their cognate attI sites,
but did not rule out the possibility of cross talk between different systems (Hall et al., 1999;
Collis et al., 2002b). For instance, the inactivated integrase generally found in class 2 integrons
can be complemented in trans by IntI1.

iii. attC sites form single stranded substrates

The structure of attC sites is more complex (see Figure 32). Each extremity contains a
degenerate core site, dubbed R”-L” and L’-R’, separated by a central region which is highly
variable in sequence and size (20-104 bp) (Mazel, 2006). A comparison of attC sites shows
that sequence conservation is restricted to two triplets 5’-AAC-3’ and 5’-GTT-3’ located in
the R” and R’ boxes, respectively. Consistent with attI sites, the recombination point is
located between the G and TT of the latter motif. Thus only the L’–R’ core site is
recombinogenic. Contrasting with their sequence heterogeneity, attC sites display a strikingly
conserved palindromic organization that can form cruciform structures through the extrusion and
self-pairing of both DNA strands (Stokes et al., 1997; Rowe-Magnus et al., 2003). Upon folding,
single stranded attC sites present an almost canonical core site consisting of L”-L’ and R”-R’
duplexes separated by a bulged region.
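
The position of the recombination point relative to the conserved triplet can be illustrated with a short sketch. It locates every 5’-GTT-3’ triplet on a strand and reports the bond between the G and the TT, and it checks that 5’-AAC-3’ is the reverse complement of 5’-GTT-3’. The toy sequence and the function names are invented for illustration and do not correspond to any real attC site. As stated above, actual recognition by IntI depends on the folded structure of the site, so a primary-sequence search of this kind only lists candidate positions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def candidate_recombination_points(seq, motif="GTT"):
    """Return 0-based cut indices lying between the G and the TT of each motif.

    A cut index i means the strand is split into seq[:i] and seq[i:].
    """
    seq = seq.upper()
    points = []
    start = seq.find(motif)
    while start != -1:
        points.append(start + 1)  # the cut falls just after the G
        start = seq.find(motif, start + 1)
    return points


# Hypothetical toy strand (not a real attC site), used only to run the sketch.
toy_site = "AACTAGGCATGCCTAGTTAGGC"

for cut in candidate_recombination_points(toy_site):
    print(toy_site[:cut], "|", toy_site[cut:])

print(reverse_complement("GTT"))
```

The reverse-complement relation between the two conserved triplets is consistent with the palindromic organization described above, in which the R” and R’ boxes can pair upon folding.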
The importance of this secondary structure for proper interaction with the integrase
was initially put forward by in vitro binding experiments (Francia et al., 1999;
Johansson et al., 2004). The delivery of single stranded DNA by conjugation showed that the
bottom strand of attC sites is ca. 10³-fold more recombinogenic than the top strand in vivo
(Bouvier et al., 2005). The elucidation of the structure of VchIntIA bound to a folded attC
bottom strand further confirmed that attC sites constitute atypical recombination substrates
(MacDonald et al., 2006). Generally, upon folding, the top and bottom strands of attC sites
differ by the orientation of conserved extra-helical bases with respect to the 5’-GTT-3’
recombination motif. The structure of the IntI1-attC complex showed that these extra-helical
bases are precisely contacted by the IntI-specific additional domain (MacDonald et al., 2006).
Sequences modified in such a way that extra-helical bases are appropriately oriented on the
top strand lead to a switch of integrase strand specificity. However, this genetic manipulation
resulted in cassettes being inserted in the wrong direction with respect to Pc, preventing
expression of the associated ORF (Bouvier et al., submitted).
In contrast to canonical core recombination sites, the genetic information required for
proper recombination is not contained in the primary sequence of attC sites, but mostly in
their secondary structures. This mechanism readily explains how cassettes with diverse attC
sites can be mobilized by a single mobile integron. Nevertheless, the evolutionary advantages
arising from this atypical process are unclear, and will be discussed later in light of new
results (see Discussion – Single stranded DNA: a bridge between two systems, p241).

b - The different recombination reactions

i. attC x attC recombination

Recombination between two attC sites located in the same cassette array leads to the
excision of a circular cassette intermediate (Collis and Hall, 1992). Two single stranded and
appropriately folded attCs must be contacted at the same time by an integrase complex to es-
tablish a HJ. Resolution of the synapse leads to the excision of a covalently closed single
stranded intermediate from the bottom strand, while the top strand remains unchanged. Cas-
sette excision is thus an asymmetric and semi-conservative process. Upon replication, one of
the molecules remains unchanged while a cassette is effectively deleted from the other. Sev-
eral cassettes can be excised at the same time when the recombination does not occur between
two immediately successive sites. However, circular intermediates made of two cassettes seem
to be further resolved into single cassette circles prior to reintegration (Collis et al., 1993). In
this respect, it is worth noting that the sequence downstream of the R’ box of a given attC
generally forms a palindrome with the sequence upstream of the next cassette’s R” box. The
circular cassette resulting from the recombination between these two sites then contains an attC
that folds in a longer and probably more stable stem-loop structure. This feature may promote
a high rate of reintegration of excised cassettes.
Recombination can also occur between attC sites located on two different molecules.
This intermolecular reaction is disfavored compared to the intramolecular recombination be-
tween physically linked sites described above. In the case of mobile integrons, recombination
frequently occurs between two copies of the same plasmid, resulting in co-integration.
Co-integration of two newly replicated chromosomes may occur, but it prevents proper segregation
and leads to the formation of chromosome dimers after another round of replication. Although
such dimers can be resolved by dedicated mechanisms (Sivanathan et al., 2009), this makes
co-integration of two chromosomal integrons unlikely. When one of the partners is a previously
excised circular cassette intermediate, recombination leads to the insertion of the cassette just
after the attC site contacted on the other molecule (Collis et al., 1993). Again, the process is
semi-conservative and only one replicated strand is modified.

ii. attI x attC recombination

The integration of a circular cassette intermediate preferentially occurs at the attI site
compared to an arbitrary attC site within the array (Collis et al., 1993; Collis et al., 2001).
This feature is essential in ensuring immediate expression of integrated cassettes by the up-
stream Pc promoter. In mobile integrons, attI x attC recombinations can occur between sites
located on two different plasmids, leading to co-integration. The excision of cassettes resulting
from attI x attC recombination has never been accurately monitored.
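
The net effect of excision by attC x attC recombination followed by preferential reintegration at attI can be pictured with a toy model in which the cassette array is an ordered list whose first element is the cassette closest to attI and Pc. This is only a bookkeeping sketch under strong simplifying assumptions: the function names and cassette labels are hypothetical, and the model ignores the single stranded, semi-conservative nature of the reactions as well as the resolution of multi-cassette circles.

```python
def excise(array, first, last):
    """Remove cassettes first..last (inclusive), as in attC x attC recombination.

    Returns (remaining_array, excised_cassettes); the input list is not modified.
    """
    if not 0 <= first <= last < len(array):
        raise IndexError("cassette positions out of range")
    remaining = array[:first] + array[last + 1:]
    excised = array[first:last + 1]
    return remaining, excised


def integrate_at_attI(array, cassette):
    """Insert a cassette at the first position, next to attI and the Pc promoter."""
    return [cassette] + array


def reshuffle(array, index):
    """Excise one cassette and reintegrate it at attI (assumed two-step scenario)."""
    remaining, excised = excise(array, index, index)
    return integrate_at_attI(remaining, excised[0])


# Hypothetical four-cassette array; labels carry no biological meaning.
array = ["A", "B", "C", "D"]
print(reshuffle(array, 2))
```

In this picture, a cassette lying far from Pc is moved to the first position of the array, where the text above indicates that expression from Pc is immediate.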
The attI sites seem to be processed in the usual double stranded form. Conventional
resolution of the HJ formed between a single stranded attC and a double stranded attI would
lead to an abortive product. Proper resolution is thus dependent on unidentified host factors
and a model relying on replication has been proposed (Bouvier et al., 2005; and see Figure
33). According to this hypothesis, cassette insertion would only affect one of the daughter
DNA molecules upon replication. Supposing replication occurred between the excision of a
cassette and its subsequent reintegration, the cassette may be duplicated if integration involves
the attI site of the daughter DNA deriving from the top strand.

iii. attI x attI recombination

The recombination between two attI sites has been observed but is particularly inefficient
(Collis et al., 2001). This reaction may happen if the number of attI sites in the cell is
high – i.e. in the context of mobile integrons harbored by high copy number plasmids.

iv. Recombination at secondary sites

Insertion events at unconventional sites outside of the integrons have been occasionally
observed. This can occur at very low frequency between either an attI or an attC site and
5’-GTT-3’-containing sequences (Francia et al., 1993; Recchia et al., 1994; Recchia and Hall,
1995; Francia and Garcia Lobo, 1996; Francia et al., 1997; Hansson et al., 1997). In any case,
the expression of the inserted element is conditioned by the presence of a promoter at the in-
sertion site. The absence of surrounding recombination partners renders subsequent excision
unlikely, ensuring the stability of the inserted cassette.

c - Accessory factors

IntI-mediated recombination does not seem to rely on any absolutely required accessory
factor. The attI x attC insertion reaction was recently reconstructed in vitro with the class
1 system, which suggests that IntI1 indeed possesses all the functions required to carry out
this reaction (Dubois et al., 2007). However, this observation does not rule out the existence
of accessory factors that might increase recombination efficiency in some systems. For in-
stance, attI x attC recombination occurred at a 2,600-fold higher rate in V. cholerae than in E.
coli using a system derived from V. cholerae. In contrast, the recombination frequencies in a
system derived from the class 1 integron were identical in both species (Biskri et al., 2005).
This suggests that host factors in V. cholerae increase recombination in the resident chromosomal
integron, while the class 1 mobile integrons achieve a higher degree of independence.

III.2.2. Expression of cassettes’ genes

a - Transcription

With few exceptions (e.g. Bissonnette et al., 1991; Stokes and Hall, 1991), gene cassettes
are promoterless and their correct expression relies on their position relative to the Pc
promoter. Cassette expression has been studied in detail in class 1 integrons, which display
two potential promoters, Pc1 and Pc2 (Collis and Hall, 1995; and see Figure 34). Pc1 is
embedded in the intI1 sequence (Stokes and Hall, 1989). Four versions (one strong, two weak
and one hybrid intermediate) that differ in the sequence of the -35 and -10 boxes have been
distinguished (Levesque et al., 1994; Bunny et al., 1995). Pc2 is located in the upper part of
attI but is usually inactivated by incorrect spacing between the -35 and -10 elements. However,
a canonical spacer has been observed in some attIs, generally associated with a weak version of
Pc1 (Collis and Hall, 1995). The Pc promoter of class 2 integrons has not been accurately mapped
but seems to lie within attI2, hence resembling Pc2 (Levesque et al., 1994). In class 3
integrons (Collis et al., 2002a) and in the chromosomal integron of Pseudomonas stutzeri strain Q
(Coleman and Holmes, 2005), the Pc has been located in the 5’ part of the integrase gene,
similarly to Pc1. The location of the Pc in V. cholerae has never been explored experimentally
and is also assumed to be embedded in intI.
In class 1 integrons, the transcripts originating from both Pc promoters are of varying
lengths and can span several cassettes. Folded attC sites may determine this pattern by
functioning as transcriptional terminators (Collis and Hall, 1995). The representation of a given
cassette’s gene in the transcript pool, and hence its expression level, decreases with increasing
distance from the promoter. Consequently, only the first few cassettes are expressed at a
significant level, depending on the strength of the promoter (Collis and Hall, 1995). This model
seems applicable to all types of integrons. In large chromosomal integrons, the majority of genes
in the array are thus silent.
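
The positional gradient described above can be illustrated with a toy model. The sketch
below assumes, purely for illustration, that each attC site independently terminates a fixed
fraction of the transcripts that reach it. The function name and the termination probability
are hypothetical and are not taken from Collis and Hall (1995).

```python
def relative_expression(n_cassettes, p_term=0.5):
    """Fraction of Pc-initiated transcripts that cover each cassette.

    Assumption (not from the source): every attC site independently
    terminates a fraction p_term of the transcripts that reach it.
    """
    if not 0.0 <= p_term <= 1.0:
        raise ValueError("p_term must lie between 0 and 1")
    # The cassette at 0-based position k lies downstream of k attC sites.
    return [(1.0 - p_term) ** k for k in range(n_cassettes)]


if __name__ == "__main__":
    for position, level in enumerate(relative_expression(6), start=1):
        print(f"cassette {position}: {level:.3f}")
```

Under this assumption the transcript coverage decays geometrically with position, so a
stronger promoter raises all levels proportionally without changing the shape of the gradient.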

b - Translation

The presence of a ribosome binding site (RBS), the motif that initiates ribosome assembly, is a
major determinant of gene expression levels (Shultzaberger et al., 2001). Some cassette-borne
genes are preceded by a functional RBS, while the motif is seemingly absent from others. In the
latter case, translation can be initiated at an upstream ORF. In class 1 integrons, a small ORF
preceded by a functional RBS is located in attI1 and is thus present in all transcripts
originating from the Pc promoters. This ORF, dubbed ORF11, overlaps the recombination point, so
that its actual 3’ end depends on the first cassette inserted in the array. When the first
cassette in the array brings an appropriately located termination codon, the translation complex
responsible for ORF11 expression can proceed to the cassette-encoded ORF. This mechanism
accounts for a significant part of the expression of some cassettes (Hanau-Berçot et al., 2002).
In the same vein, the presence of small ORFs in attC sites may serve to increase expression by
sustaining the processivity of the translation complex.

III.3. Integrons and evolution

III.3.1. Chromosomal integron as the source of mobile integrons

a - Chromosomal integrons are ancient and widespread structures

Integrases of the tyrosine-recombinase family are essential in the processing of a wide
variety of mobile elements, including integrons, phages, ICEs and genomic islands. Whereas the
content and genomic location of these elements are highly plastic, their integrases provide
stable markers with which to resolve their phylogenetic relationships (Boyd et al., 2009).
From a broad perspective, the integron integrases form a well-defined clade within the
tyrosine-recombinase family (see Figure 35).
A comprehensive analysis of the 603 completely or partially sequenced genomes
available in 2007 showed that 9% of them carry an integron integrase (Boucher et al., 2007).
The phylogenetic relationship between the 56 corresponding integrases is shown in Figure
36. Three major groups can be distinguished in this tree: i) the soil-freshwater proteobacteria
group, mostly composed of proteobacteria from freshwater and soil environments; ii) the marine
γ-proteobacteria group; and iii) the inverted integrase group, characterized by the co-linear
orientation of the integrase gene with respect to the cassette array, so that the attI site is
found at the 3’ end of the integrase gene. The first two clades form ecologically relevant
taxa (Mazel, 2006), while the third groups taxonomically diverse organisms but correlates
with a structural peculiarity (Boucher et al., 2007).
Overall, the distribution of chromosomal integrons spans several bacterial phyla and
the branching pattern of integrases is in good agreement with the organismal phylogeny. This
clearly shows that integrons are ancient and generally stable genomic structures (Rowe-Magnus
et al., 2001; Rowe-Magnus et al., 2003; Mazel, 2006; Nemergut et al., 2008). Nevertheless,
several phylogenetic incongruences can be noted (Boucher et al., 2007; and see Figure 36).
Notably, alteromonadales (including Shewanella, Pseudoalteromonas and Alteromonas) and
vibrionales are normally sister taxa that are closely related to pseudomonadales, with which
they form a consistent group of marine species. In the IntI-based dendrogram, vibrionales
atypically branch within the alteromonadales to cluster with Alteromonas and Pseudoalteromonas,
thereby separating these groups from Shewanella. The vibrionales taxon mostly keeps its
monophyletic structure in the IntI tree, with the exception of V. fischeri. Consequently, the
pseudomonadales taxon is split into discrete clusters that branch at diverse locations in the
tree. One cluster is related to the soil-freshwater proteobacteria group, another branches with
V. fischeri, while others are scattered within the alteromonadales. Overall, the striking
polyphyletic structure of alteromonadales, together with the widespread distribution of
pseudomonadales in the IntI-based phylogeny, suggests that several transfers of the integron
platform occurred independently along the evolutionary history of chromosomal integrons. The
observation that the monophyletic group of integrons with inverted integrases includes
representatives from unrelated phyla (Proteobacteria, Planctomycetes, Chlorobi, Spirochaetes and
Cyanobacteria) further supports the occasional mobility of chromosomal integrons.
Importantly, the integrases of mobile integrons do not group together but are rather
scattered in the tree. IntI1 and IntI3 are closely related and belong to the soil-freshwater
proteobacteria group. IntI2 and the integrases of the two other mobile integrons reported in
Vibrio branch at various points within the marine γ-proteobacteria group. Hence,
despite their common functional contribution to antibiotic resistance phenotypes, mobile
integrons do not form an evolutionarily consistent group. Instead, they arose several times
independently, most probably from chromosomal integrons incidentally mobilized by other
mobile elements (Rowe-Magnus et al., 2001).

b - Assembly of mobile integrons

Phylogenetic evidence strongly suggests that mobile integrons originated from the
association of chromosomal integrons with mobile elements. The mobilization processes that
led to the successful radiation of class 1 integrons within clinical environments are relatively
well understood. The following paragraph provides a reconstruction of the evolutionary
history that hypothetically generated the most prevalent forms of class 1 integrons (Labbate et
al., 2009). The different steps of this scenario are outlined in Figure 37.
Class 1 integrons not associated with resistance cassettes have been reported in the
chromosomes of the environmental bacteria Azoarcus communis MUL2G9 and Acidovorax sp.
MUL2G8A (Stokes et al., 2006). These integrons are not associated with mobile elements and
may be regarded as the ancestor of mobile class 1 integrons. Such a chromosomal integron
was subsequently embedded in a functional transposon, through the addition of transposition
genes associated with two cognate inverted repeats. This structure then acquired a cassette
harboring qacE (resistance to quaternary ammonium compounds). The resulting element is
known as Tn402. This transposition system preferentially targets the resolution (res) sites of
plasmids and other transposons. In this context, the qacE cassette potentially provided an
adaptive advantage that drove the transfer of Tn402 to different plasmids and transposons. The
next important step was the recruitment of sul1 (resistance to sulfonamides). This event
followed an unconventional mechanism that resulted in the deletion of the 3’ end of qacE and its
associated attC site. The acquisition of this major resistance determinant probably boosted the
spread of the Tn402 derivative in a wide range of genetic backgrounds. The mobile integron
thereby gained access to a vast repertoire of cassettes, including potential resistance cassettes
(see next section). This spread was accompanied by diverse alterations and associations with
other ISs. A prominent event was the deletion of part of the transposition region, leading to
the inactivation of Tn402. This led to the genetic context that is now prevalent in clinical
isolates: an upstream region termed 5’-CS, consisting of the left-hand inverted repeat associated
with the intI1/attI1 functional platform, and a downstream region termed 3’-CS, consisting of the
qacE fragment associated with the sul1 gene and the partially deleted transposition region (see
Figure 37).
This structure is still mobilizable by related transposases acting in trans or through
association with another active transposon. The latter case is best illustrated by the Tn21
transposon, which results from the insertion of such a structure into a Tn501-like transposon
harboring the mer genes conferring resistance to mercury. Tn21 borne by plasmid NR1 (R100)
contributed to the initial outbreak of multi-resistance identified in the 1950s and mentioned
above (Mitsuhashi et al., 1961), and led to the historical discovery of integrons (Martinez and
de la Cruz, 1988).
Evidence that other mobile integrons followed a similar evolutionary path is
abundant. Class 3 integrons are also associated with Tn402-like elements, and versions
lacking these transposition features have been found in the chromosome of two Delftia
species (Xu et al., 2007). Class 2 integrons are generally associated with Tn7, an active
transposon that can preferentially target conjugative plasmids or a unique conserved site within
bacterial chromosomes (Parks and Peters, 2009). The integron platform harbored by the Vibrio
salmonicida plasmid pRVS1 is very closely related to that of Pseudoalteromonas
haloplanktis TAC125, providing a clear-cut example of recent chromosomal integron
mobilization (Szekeres et al., 2007).

c - Resistance genes and chromosomal integrons

The functional platforms of mobile integrons probably arose from essentially stable
chromosomal integrons. However, chromosomal integrons mostly harbor cassettes of unknown
function. In this context, where do the resistance cassettes found in many mobile integrons
come from? Two lines of evidence suggest that resistance cassettes are gathered by
mobile integrons wandering through different host genomes: i) some resistance cassettes display
an attC sequence that matches the typical sequence of chromosomal integrons; ii) some
chromosomal integrons carry identifiable resistance genes.
The first attC sites to be described were 59 bp long and were all closely related in
sequence (Cameron et al., 1986; Martinez and de la Cruz, 1988). It was initially thought that the
length and structure of these elements, termed 59-be, were characteristic of resistance cassettes
(Stokes and Hall, 1989). These 59-be were subsequently found to be closely related to the
attC sites typically harbored in the chromosomal integrons of Xanthomonas spp. (Rowe-Magnus et
al., 2001; Gillings et al., 2005). This implies that many common resistance cassettes (such as
aadA1, aadA6, aadA7, aadB, aacA and qacF) probably originated in these genomes (Rowe-Magnus
et al., 2001). Likewise, several resistance cassettes with longer attC sites probably
originated in Vibrio spp. This feature contributed to the realization that previously identified
Vibrio cholerae repeats (VCR) (Barker et al., 1994) were part of a large chromosomal
integron (Recchia and Hall, 1997; Clark et al., 1997; Mazel et al., 1998). Examples include the
CARB-4 and dfrVI cassettes, which harbor typical attC sites of V. metschnikovii and V.
parahaemolyticus respectively (Rowe-Magnus et al., 2001). Cassettes harboring a dfr gene are
also present in the mobile integrons located in SXT and plasmid pRVS1. Recently, two
cassettes comprising a qnr gene (resistance to fluoroquinolones) associated with attC sites
typical of V. parahaemolyticus and V. cholerae were found in class 1 integrons isolated from
V. cholerae (Fonseca et al., 2008). Interestingly, these genes are closely related to non-mobile
qnr genes present in vibrionales genomes (Poirel et al., 2005). It is tempting to speculate that
these structural genes were independently recruited into the chromosomal integrons present in
the genomes of these Vibrio species, and were later captured by class 1 integrons. Closely
related qnr genes are also associated with class 1 integrons in various enterobacteria (Nordmann
and Poirel, 2005). However, these genes lack an attC site and their integration relies on
independent mechanisms (Robicsek et al., 2006a). It might be that these cassettes initially drove
the spread of qnr genes from vibrionales reservoirs, and were later atypically mobilized in the
so-called complex integrons.
Several cassettes encoding resistance determinants have been found in chromosomal
integrons. An unexpressed but functional catB9 cassette specifying resistance to chloramphenicol
is present in the array of V. cholerae N16961 and is associated with an attC site
typical of this species (Mazel et al., 1998). Similarly, CARB-7 and CARB-9 cassettes
harboring typical attCs were independently found in the chromosomal array of environmental
V. cholerae isolates (Melano et al., 2002; Petroni et al., 2004). These cassettes provide
resistance to β-lactams and resemble the CARB-4 cassette found in class 1 integrons. A dfr
cassette was recently found in the array of V. splendidus LGP32 (Le Roux et al., 2009),
paralleling the presence of such cassettes in diverse mobile integrons. Also, an aacC-A7 cassette
conferring resistance to aminoglycosides was identified in Saccharophagus degradans (Elbourne
and Hall, 2006). Its attC signature is indicative of an exogenous origin related to Nitrosococcus
oceani, which provides an example of gene cassette exchange between chromosomal integrons.

d - The generation of cassettes

Some cassettes harbor ORFs that are homologous to structural genes, indicating that
chromosomal sequences can be recruited into cassettes. In several instances, the analysis of the
phylogenetic distribution of genes found both in different cassettes and outside a cassette
context clearly suggests that recruitment into cassettes occurred several times independently
(Recchia and Hall, 1997; Rowe-Magnus et al., 2003; Boucher et al., 2006). This point is
strengthened by the observation of cassettes harboring closely related genes associated with
clearly distinct attC types (Recchia and Hall, 1997; unpublished data). Together, these data
imply that cassette generation is not a rare event. Even though hypotheses have been put
forward, the mechanisms responsible for the creation of cassettes remain unknown.
The observation that most cassettes lack a promoter and contain very little non-coding
sequence led R. M. Hall and colleagues to suggest that the process of cassette formation may
involve the reverse transcription of an mRNA molecule (Hall et al., 1991; Recchia and Hall,
1997). The repeated identification of bacterial group II introns inserted behind attC sites
(Sunde, 2005) prompted P. Roy and colleagues to propose a mechanism relying on the
retrotranspositional and RNA catalytic properties of these elements (Centron and Roy, 2002; Leon
and Roy, 2003). Although this model may account for the occasional creation of cassettes, it
relies on processes that are too complicated and contingent to account for the tremendous
diversity of cassettes observed in single genomes.
The fact that some genomes harbor hundreds of cassettes with closely related attC se-
quences may indicate that they encode the machinery required for gene recruitment and addi-
tion of specific attC sites (Rowe-Magnus et al., 2003). These genomes would then stand as
genuine cassette factories. The corresponding integrons are often referred to as superintegrons
(Mazel, 2006). As will be further discussed in the next sections, this hypothesis implies that
superintegrons are the source of cassettes observed in mobile integrons.

Anecdotally, new cassettes can arise through the modification of existing cassettes.
Cases of cassette fusion have been reported (for instance, see Centron and Roy, 2002). By
favoring the genetic linkage and co-expression of genes, such events may participate in the
creation of novel operons, such as those evidenced in various genomic islands. As will be
discussed below, mobile elements are often inserted in integrons. Such elements can introduce
new genes between two attC sites, thereby providing the material for subsequent evolution of a
novel cassette.

III.3.2. A central role in horizontal gene transfer

a - Evidence for interspecies cassette exchange

The recognition that the sequence of attC sites tends to be species-specific provides a
valuable tool to trace the origins of cassettes (Rowe-Magnus et al., 2001). As illustrated
above, the origins of several antibiotic resistance cassettes have been determined this way.
Other cases involving cassettes unrelated to antibiotic resistance pervade chromosomal
integrons (unpublished data). However, a systematic and rigorous analysis based on attC
signatures remains to be carried out.
These data show that cassette exchange between different species can readily occur.
The mobilization of gene cassettes with diverse recombination sequences relies on the specific
recombination mechanisms at work in integrons. The information carried by attC sites is
mostly expressed upon folding of single-stranded molecules (see above, p127), which allows
integron integrases to recognize a wide variety of seemingly unrelated sites. The recombination
efficiency on different substrates seems to vary among integrases. For instance, IntI1
efficiently recombines a wider range of structures than vchIntIA (Biskri et al., 2005), a feature
that may partly explain the success of the class 1 integron.
Cassettes excised as circular intermediates might be acquired through transformation,
for instance. In this respect, it is noteworthy that major contributors to the cassette
pool, such as V. cholerae and most probably other Vibrio species, enter a transformable state
when forming biofilms in the presence of chitin (Meibom et al., 2005; Bartlett and Azam, 2005;
and see p60). Another route for the transmission of cassettes involves the hijacking of mobile
elements.

b - Mobile integrons and the spread of cassettes

The complex genetic structures resulting from the coupling of sedentary chromosomal
integrons to mobile elements have been described in detail above (see p137). Associations
with mobile elements that show a wide host range, such as most transposons, and that are
self-mobilizable, such as conjugative plasmids, greatly enhance the dispersion potential of
integrons. Although the prevalence of mobile integrons unrelated to antibiotic resistance has
hardly been studied in most natural environments, at least five clearly independent
cases of mobile integrons have been described so far. The mobilization of gene cassettes
by mobile integrons is probably a frequent event in natural settings, and this may play a
significant role in the dissemination of cassettes away from the factory genomes (see The
generation of cassettes, p141). Indeed, mobile integrons are likely to experience a variety of
genetic backgrounds and can readily exchange cassettes with the chromosomal integrons of their
hosts and with other mobile integrons. This process is reflected by the diverse origins of the
attC sites identified in the arrays of mobile integrons. Mobile integrons may also function as
efficient vehicles to shuttle cassettes between chromosomal integrons. The direct recruitment
of a chromosomally borne cassette by a class 1 integron was demonstrated with the catB9
cassette of V. cholerae N16961 (Rowe-Magnus et al., 2002).

c - A cassette metagenome

The contribution of integrons to genome evolution clearly extends beyond the classical
vertical transfer of genetic information, as illustrated by the common exchange of gene
cassettes between different species. HGT is an important contributor to bacterial evolution
(Ochman et al., 2000), and integrons certainly stand as a major facilitating mechanism.
Overall, the global pool of cassettes can be regarded as a shared metagenome potentially
accessible to a diverse bacterial community (Stokes et al., 2001).
Available genomic sequences provide a glimpse of the genetic diversity encoded in
gene cassettes (see Figure 30, p125). However, sequenced genomes constitute a particularly
biased sample of the bacterial biosphere and the actual diversity can only be assessed through
metagenomic approaches (Keller and Zengler, 2004). Several techniques based on PCR
amplification using degenerate primers have been developed to extract integrons from
metagenomic samples in a culture-independent manner. These techniques can be used to recover
intI sequences, using primers matching the seemingly conserved regions (Nemergut et al., 2004);
or gene cassettes, using primers targeting attC sites (Stokes et al., 2001; Holmes et al., 2003).
Several studies led to the identification of integrases and a limited number of associated
cassettes in atypical environments such as heavy-metal-contaminated mine tailings (Nemergut et
al., 2004) or hydrothermal vents (Elsaied et al., 2007). Other reports specifically focused on
the analysis of cassette diversity. Because the sequences of attC sites are very variable,
primers designed to amplify cassettes are necessarily restricted to a subset of the actual
cassette pool. Despite these limitations, a study restricted to a 50 m² soil plot suggested that
the area contained >2300 different cassettes (Michael et al., 2004). This estimate was based on
the limited information provided by the fine resolution of cassette lengths after cassette PCR.
More recently, a similar study was conducted on marine sediment samples, in which 2145 amplified
cassettes were entirely sequenced. Four different, though geographically close, sites were
sampled in Halifax harbor, Canada. Diversity analysis suggested that these locations
collectively contain ca. 3000 different cassettes. Here again, a major fraction (80%) of the
recovered cassettes harbored genes of unknown function (Koenig et al., 2008). Together, these
data highlight that the cassette metagenome constitutes an incredible source of genetic diversity.
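
To make the notion of extrapolated diversity concrete, the sketch below implements the
bias-corrected Chao1 richness estimator, one widely used nonparametric approach. The text does
not specify which estimator underlies the figures quoted above, so this is shown only as an
illustration of the principle: the number of unseen types is inferred from how many types were
sampled exactly once or twice. The cassette labels in the usage example are hypothetical.

```python
from collections import Counter


def chao1(sample):
    """Bias-corrected Chao1 estimate of the number of distinct types.

    sample: iterable of type labels, one entry per sequenced cassette.
    """
    counts = Counter(sample)
    observed = len(counts)
    # Types seen exactly once and exactly twice drive the extrapolation.
    singletons = sum(1 for c in counts.values() if c == 1)
    doubletons = sum(1 for c in counts.values() if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


if __name__ == "__main__":
    # Hypothetical cassette types, for illustration only.
    sequenced = ["cas1", "cas1", "cas2", "cas3", "cas4", "cas4"]
    print(chao1(sequenced))
```

When every type has been sampled at least twice, the estimate equals the observed number of
types; a large excess of singletons signals that much of the pool remains unsampled.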
Is the cassette metagenome globally available to all integrons? Or is it subdivided into
smaller pools restricted to specific communities? This important question remains largely
unresolved. It is likely that integron integrases developed some substrate preferences along
their evolutionary history. However, the existence of mobile integrons not only offers
facilitated access to novel cassettes but also provides genomes with integrases potentially
selected to accommodate a wide range of substrates (Biskri et al., 2005). Nonetheless, the
overall cassette diversity available in a given environment is likely to be limited and somewhat
adapted to a specific niche, leading to specialized local pools or cassette ecotypes. In the
Halifax harbor study mentioned above, the authors found that two geographically distant, but
ecologically related, sites contain more cassette types in common than expected by chance, which
supports the ecotype concept (Koenig et al., 2008). In this context, access to the global gene
pool is necessarily limited and depends on the migration pattern of individual bacteria and/or
mobile integrons between specific niches.

III.3.3. Integrons as sophisticated contingency loci

a - Working model

This section provides an overview of the essential features characterizing integrons.
These properties can be integrated into a consistent conceptual model that highlights the
adaptive role of the system (see Figure 38).
The minimal integron consists of a functional platform composed of a site-specific
integrase gene intI, tightly associated with a cognate recombination site attI and a promoter Pc
ensuring the expression of integrated cassettes. The genetic linkage between these elements is
very strong, as the Pc promoters are embedded in the intI and/or attI sequences. This tightly
packed locus is able to integrate and express an indefinite number of accessory traits
mobilized as gene cassettes that are stockpiled downstream of attI. Because gene cassettes are
promoterless, only the first cassettes are expressed from Pc. While random excisions occur
throughout the cassette array to form non-replicative circular intermediates containing one or
several cassettes, integrations preferentially occur at the attI site. These observations support
a model in which newly integrated cassettes are immediately expressed and thereby gradually
drive previously integrated cassettes away from expression. Integration events are thus
subjected to selection. Non-expressed cassettes in the array constitute a reservoir of standing
genetic variability that can be mobilized through excision and subsequent reintegration at attI.
In this light, an integron functions as a sophisticated contingency locus that is able to switch
the expression state of accessory traits owing to a dedicated site-specific recombination
machinery. This defines a system that generates genetic diversity at a targeted locus.
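
The working model can be sketched as a simple data structure. The toy class below is an
illustration only: it represents the cassette array as an ordered list whose first element is
adjacent to attI, and it assumes that a fixed number of leading cassettes (the hypothetical
parameter n_expressed) is expressed from Pc. Class and method names are invented for this
example.

```python
import random


class Integron:
    """Toy cassette array; index 0 is the position adjacent to attI."""

    def __init__(self, cassettes=(), n_expressed=1):
        self.array = list(cassettes)
        # Number of leading cassettes within reach of Pc (assumed constant).
        self.n_expressed = n_expressed

    def integrate(self, cassettes):
        """Insert one or several excised cassettes at attI (front of the array)."""
        self.array[0:0] = list(cassettes)

    def excise(self, start, length=1):
        """Remove contiguous cassettes and return them (circular intermediate)."""
        excised = self.array[start:start + length]
        del self.array[start:start + length]
        return excised

    def expressed(self):
        """Cassettes currently expressed from Pc."""
        return self.array[:self.n_expressed]

    def shuffle_once(self, rng=random):
        """Excise one randomly chosen cassette and reintegrate it at attI."""
        if self.array:
            self.integrate(self.excise(rng.randrange(len(self.array))))


if __name__ == "__main__":
    integron = Integron(["A", "B", "C"])
    integron.integrate(["D"])                 # new cassette becomes expressed
    integron.integrate(integron.excise(3))    # distal cassette C is remobilized
    print(integron.array, integron.expressed())
```

In this representation each integration pushes older cassettes one step away from the
promoter, while excision followed by reintegration brings a silent cassette back into the
expressed position.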
While most contingency loci can only access a narrow range of phenotypes, integrons
can potentially access a huge metagenomic repertoire of gene cassettes that encode a wide
range of different functions. In at least some species, an extended version of the system must
also be able to generate new cassettes to enrich the existing pool. In addition, the stepwise
attenuation of expression resulting from successive integration events leads to the progressive
relaxation of selective pressure. This particular regime may drive accelerated diversification
of cassette-encoded proteins (see Impact of expression strength on sequence evolution, p106).
Overall, the functions carried in the cassettes might provide ready-made opportunities
to adapt to a vast panel of different environments. The integration of a new cassette can
either provide an advantageous phenotypic trait in the current conditions, in which case it will
be selected, or unproductively take previously integrated advantageous traits away from
expression, an event that must be counter-selected. In this light, the integron system, by
stockpiling elements that previously proved adaptive, provides a form of molecular memory.
Drastic reductive evolution leading to the loss of unexpressed gene cassettes is limited by the
presence of addiction modules (TA systems) in large arrays. As discussed previously, the rate of
phenotypic switching is an essential parameter in such systems (see Stochastic switches as a
bet-hedging strategy, p102 and Evolution of recombination rate in integrons, p152). However,
the effective recombination dynamic in integrons was essentially unknown until recently.

b - An unknown recombination dynamic

The functional activity of integron integrases has been demonstrated experimentally
over a wide range of substrates and model systems. However, in all these studies the integrase
was overexpressed, generally from a plasmid vector. In fact, spontaneous changes in cassette
arrays within a single strain have never been observed in controlled conditions.
Nevertheless, evidence of modifications in integron cassette arrays is abundant. This
is manifest in the continuum of slightly different arrays identified in mobile integrons
recovered from clinical isolates. An overview of these different structures can be found on the
INTEGRALL website (http://integrall.bio.ua.pt/; Moura et al., 2009). Valuable information
concerning large chromosomal integrons comes from the comparison of completely sequenced arrays.
The large chromosomal arrays reported in several vibrionaceae are highly variable and most
cassettes are only found in one genome (Boucher et al., 2006; and see Table 5). Even the arrays
of the two closely related V. vulnificus strains CMCP6 and YJ016, which respectively harbor 211
and 188 cassettes, show little similarity in composition or order (Chen et al., 2003). In fact,
integrons stand as one of the most variable genomic loci in these organisms. This feature has
been used as a typing system to finely resolve phylogenetic relationships between otherwise
identical strains (Labbate et al., 2007). However, this method relies on the length resolution of
cassette PCR amplicons and does not allow clear identification of the recombination events.
A fine analysis of the recombination dynamic would require sequencing the arrays of
very closely related isolates. Such a situation is unlikely to arise from the random sequencing
of environmental bacteria. Nevertheless, the comparison of three complete genomes of
pathogenic V. cholerae strains isolated from related pandemic outbreaks recently provided a
better picture (Feng et al., 2008). Overall, 207 different cassettes, of which only 36 are
unique, have been identified. As illustrated in Figure 39, there is substantial conservation of
cassette order between genomes. More precisely, the pattern of synteny is organized in
cassette blocks. This strongly supports the idea that several cassettes can be excised and
reintegrated together. As expected from the model discussed above, the distal part of the array
is less variable than the proximal one. Several observations are consistent with the
mobilization of distal cassettes and subsequent reintegration at attI (see arrows in Figure 39B).
Interestingly, many cassettes seem to have been duplicated, most probably through the copying of
pre-existing cassettes in the same integron. For instance, of the 40 cassettes which were
apparently integrated at the attI site of M66-2 since divergence from N16961 (block A, Figure
39), 21 were copied from downstream cassettes. Among these, one is located in block A, 12 are
scattered in downstream blocks and 8 correspond to cassettes present in N16961 but absent in
M66-2. These last cassettes may have been duplicated and lost or may correspond to effective
mobilization from the distal to the proximal part of the array. These data are in accordance
with the mobilization of cassettes as single-stranded substrates (see Figure 33). Based on a
sophisticated whole-genome analysis, the authors estimated that the M66-2 and N16961 strains
(which were isolated in 1937 and 1971 respectively) diverged around 1923, while strain O395
(isolated in 1965) diverged from the former clade around 1880. Hence, substantial variation in
the cassette arrays accumulated in a few decades.
A better understanding of the recombination dynamics of integrons would benefit from
experimental evolution studies. Nevertheless, the apparent absence of recombination under
laboratory conditions has seriously hampered this approach. What is the source of the discrepancy
between the overwhelming variability observed in natura and the stability in controlled conditions?
Obviously, the natural selection of newly expressed traits upon integration plays a significant
role in the cassette dynamics. As no adaptive recombination was observed when a
resistance cassette capable of overcoming an antibiotic selection was provided in experimental
systems, the absence of recombination cannot stem from a mismatch between the selective
pressure and the adaptive potential carried in the array. Instead, the array stability results from
limiting rates of recombination. The integrase is indeed expressed at very low levels in standard
laboratory cultures. Despite extensive efforts, the intI1 promoter has proved impossible to
map in these conditions (M.C. Ploy, unpublished observation). Interestingly, overexpression
of integron integrases is deleterious, as attested by growth and survival rates (Mazel lab,
unpublished observations). In addition, a recent study reported that about one third of the intI
genes identified in public databases are pseudogenes, inactivated by either an internal stop
codon or a frameshift mutation (Nemergut et al., 2008).
We found that integrase expression is actually controlled by a wide physiological
stress response in a majority of integrons. This feature endows the integron system with the
ability to adapt responsively to environmental changes. This is presented in the first section
of the results (see Recombination in integrons is controlled by the SOS response to stress,
p151).

RESULTS

Results – Evolution of recombination rate in integrons

I. EVOLUTION OF RECOMBINATION RATE IN INTEGRONS

BACKGROUND
Integrons can be regarded as sophisticated contingency loci (see p144) capable of
switching the expression state of independent gene cassettes harboring diverse functions.
These systems thus exploit a bet-hedging strategy wherein the population diversifies by site-
specific recombination into subpopulations expressing different cassettes. The extent of this
diversification is constrained by the actual recombination rate. Although this parameter is essential
to understanding the adaptive properties of integrons, the dynamics of cassette recombination
remain largely unknown (see p146). Theoretical studies on phenotypic switches
suggest that the recombination rate should display a strong dependency on the rate at which
challenging environmental changes occur (a challenging environment being one that can be
overcome through cassette swapping) (see p103). In this work, we developed a model to track
the optimal recombination rate of an “idealized” integron under fluctuating selection.

METHODS
We used C-implemented Monte-Carlo simulations to model the evolution of the site-
specific recombination rate in an integron subjected to a randomly fluctuating environment.
The fitness conferred by each cassette in the integron array is given by a stabilizing fitness
function. The optimum of this function is randomly shifted during environmental changes,
which occur stochastically at a predefined rate. Only the first cassette in the array is expressed
and hence contributes to fitness. Deleterious mutations in non-expressed cassettes do not af-
fect the phenotype. The integrase is treated as a modifier locus which is indirectly selected
through its effect on cassette expression. Mutation between integrase alleles associated with
different recombination rates is allowed. The mean recombination rate of the population, its
mean fitness and the proportion of deleterious mutations are recorded at each generation.

RESULTS AND DISCUSSION


Low rates of cassette recombination ensure the stability of the system in steady conditions.
However, too few recombination events would limit the adaptability of integrons facing variable
environments. Conversely, recombination rates that are too high excessively decrease the mean
fitness and are thus counter-selected. We found that the optimal recombination rate selected in
fluctuating environments is linearly proportional to the rate of environmental changes.
Though it has no detectable effect on recombination rate, the accumulation of deleterious mutations
in non-expressed cassettes is inversely proportional to the rate of environmental variations.
This suggests that cassette-borne functions may evolve at a higher rate than
continuously expressed genes. The optimal recombination rate can be achieved with a distinct
strategy, wherein recombination events occur punctually in the whole population as a stress
response to environmental changes. This possibility is addressed in articles 2 and 3 (see
Recombination in integrons is controlled by the SOS response to stress, p171).

ARTICLE I
This article is in preparation, to be submitted to PLoS One or to the Journal of
Evolutionary Biology (pp 154-170).

Article I

Title: Evolution of site-specific recombination rates in integrons under fluctuating selection.

Guillaume Cambray (1)*, Luis-Miguel Chevin (2)* and Didier Mazel (1)

(1) Institut Pasteur, Unité de Plasticité du Génome Bactérien, CNRS URA 2171, 75015 Paris,
France

(2) Ecologie, Systématique & Evolution, Université Paris-Sud XI, 91400 Orsay, France

*: These authors contributed equally to this work

Abstract
Integrons are complex genetic structures, which are notably responsible for most of the
antibiotic multi-resistance phenotypes that threaten our control over pathogenic bacteria. The
system maintains an array of unexpressed gene cassettes that stands as a reservoir of potential
genetic variation. Silent cassettes can be randomly recombined at an expression site by a
dedicated integrase. This establishes a switching mechanism that allows instantaneous expression
of potentially adaptive functions. Here, we model the evolution of the integrase-mediated
recombination rate in a stochastically fluctuating environment. The integrase gene is a modifier
locus at which alleles can change the turnover rate of the expressed cassettes. Simulations
show that the mean recombination rate of a population tends to match the rate of environmental
change. Cassette-borne genes are under relaxed selective constraints when not expressed.
We show that deleterious mutations tend to accumulate in the unexpressed part of the cassette
array. While this process does not affect the selected recombination rate in large populations, it
may promote the functional diversification of cassette-encoded functions. We suggest that a
stress-responsive control of recombination rate may be an efficient alternative to a constitutively
determined bet-hedging strategy. This work highlights the importance of integrons as a major
bacterial adaptive system through their effect on evolvability.

Introduction
Bacteria are one of the most successful life forms (Gould, 1996). They have been isolated
from a wide range of natural environments, some of them quite extreme. However, direct
observation of natural microbial communities is difficult and little is known about the
actual ecology of bacterial populations. Indeed, a great majority of bacterial species
(>99%) cannot be cultivated, so that their detection and study are only possible through
metagenomic approaches (Streit and Schmitz, 2004; Rusch et al., 2007). Despite the inherent
difficulties in characterizing the ecological specificities of individual bacterial species, it is safe
to generally regard those organisms as essentially sessile on a macroscopic scale. As a consequence,
they cannot track their environment in space when it changes, and hence experience a
wide variety of environmental variations, be they physical, chemical or biotic (Andrews,
1998).
The ubiquity of bacteria underlies the remarkable diversity of metabolisms developed
over evolutionary times. The ability of bacterial populations to adapt rapidly to new and ever-
changing environments has been documented in both experimental and natural conditions.
The long-term evolution experiment initiated in 1988 by R. Lenski and colleagues monitored
the adaptation of twelve replicate populations of Escherichia coli to a regime of exponential
growth in a nutrient-limited environment (Lenski et al., 1991; Lenski and Travisano, 1994).
Since then, numerous fitness-enhancing phenotypic variations have occurred more or less
repeatedly in the different populations. These variations include morphological diversification
(Lenski and Mongold, 2000; Philippe et al., 2009), topological modification of DNA (Crozat
et al., 2005), global changes in gene expression (Cooper et al., 2003; Pelosi et al., 2006;
Philippe et al., 2007; Cooper et al., 2008), and specialization and diversification of metabolic
abilities (Cooper and Lenski, 2000; Cooper et al., 2001; Blount et al., 2008). Adaptive changes
occurring in natural conditions are more difficult to identify and most examples involve medically
or economically relevant pathogenic bacteria. The most striking illustration is certainly the
ever more rapid development of resistance phenotypes following the introduction of new
antibiotics (Hawkey, 2008).
According to Fisher’s fundamental theorem of natural selection (Fisher, 1930), the rate
of adaptation in a population equals the heritable variance in fitness in this population. The
speed of adaptive evolution can thus theoretically be increased by increasing the variance in
fitness in the population – its evolvability. The most straightforward way to achieve this is
through a genome-wide increase in mutation rate. Mutator strains are indeed found at
non-negligible frequencies (0.1% to >60%) in pathogenic bacterial isolates (Denamur and Matic,
2006). Besides, a mutator phenotype reached fixation in 3 out of the 12 replicate lines
propagated in Lenski’s long-term evolution experiment (Sniegowski et al., 1997). These strains
are generally affected in their ability to perform mismatch repair, a major system of DNA repair
also involved in recombination. In particular, mutations in the mutS and mutL genes result
in up to a 100-fold increase in mutation rates (Denamur and Matic, 2006).
Genes whose activity affects the mutation rate can be abstracted as modifier
loci, which are subjected to indirect selection by hitchhiking with the mutations they contributed
to generate (Kondrashov, 1995). Because most mutations are deleterious (Eyre-Walker
and Keightley, 2007), a general mutator is usually counter-selected. Nevertheless, the production
rate of advantageous mutations increases with the initial maladaptation of the population
(Silander et al., 2007; Martin and Lenormand, 2006). Hence, increased mutation rates are
more likely to be beneficial in fitness-compromising conditions. Indeed, mutators readily provide
a short-term adaptive advantage in new, changing or heterogeneous environments (Giraud
et al., 2001). The rise in frequency of a mutator genetically linked to a beneficial mutation is
only the consequence of adaptation, and does not constitute an adaptation in itself (Sniegowski
et al., 2000). Instead, the accumulation of deleterious mutations over time hampers the
long-term success of mutator populations (Funchain et al., 2000; Cooper and Lenski, 2000; Zeyl et
al., 2001). After adaptation has occurred, a modifier locus experiences a strong selective pressure
toward lower mutation rate if the environment remains constant (De Visser, 2002). As a result,
the spontaneous mutation rates observed in various microbes are surprisingly steady and
are thought to be actively maintained at the lowest level afforded by the cost of fidelity (Drake
et al., 1998).
The evolution of increased evolvability through an increase in the genome-wide mutation
rate is thus heavily constrained by the prevalence of deleterious mutations. To circumvent
this limitation, some loci are subjected to frequent, stochastic and heritable modifications
mediated by dedicated genetic or epigenetic mechanisms (van der Woude and Baumler, 2004;
and see Discussion). These specific mutations are easily reversible, which promotes constitutive
wavering between well-defined phenotypic states. Such a localized increase in mutation rate
allows the combinatorial diversification of target functions while limiting the potential
deleterious effects of mutations at loci that do not need to evolve.
Integrons constitute a particularly sophisticated example of such systems. A typical integron
consists of a stable platform associated with a variable array of dedicated gene
cassettes. The functional platform constitutes a tightly packed locus comprising an intI gene,
coding for a site-specific recombinase, a primary recombination site attI and a promoter Pc
oriented toward attI (figure 1). The gene cassettes integrated in integron arrays are generally
composed of a single, promoterless ORF flanked by two attC recombination sites (Mazel,
2006). The integrase catalyzes the recombination of cassettes through a cut-and-paste mechanism
whereby cassettes are randomly excised from the cassette array (attC × attC recombination)
to be preferentially integrated at attI, downstream of the Pc promoter (Collis et al., 1993;
Collis et al., 2001). This oriented process ensures instantaneous expression of the mobilized
cassettes, while previously integrated cassettes are progressively moved away from Pc
(Collis and Hall, 1995). Overall, only the first few cassettes in the array are expressed and
hence subjected to selection, while the others constitute a silent reservoir of potential genetic
variation (figure 1). Exogenous cassettes taken up from the environment or brought in by
mobile elements can be incorporated in the array, thereby enriching the repertoire of available
functions (Holmes et al., 2003; Biskri et al., 2005).
Two distinct forms of integrons are generally distinguished in the literature. Mobile
integrons (MI) were the first to be identified through their involvement in antibiotic
multi-resistance phenotypes (Martinez and de la Cruz, 1988; Stokes and Hall, 1989). They are
located on mobile genetic elements such as ICEs, plasmids and transposons, which permit their
dissemination and potentially make them efficient shuttles for the transfer of cassettes between
genomes (Biskri et al., 2005). They comprise only a few cassettes (up to 8; Naas et al., 2001b),
which typically encode antibiotic resistance proteins (Fluit and Schmitz, 2004). Chromosomal
integrons (CI), in contrast, are essentially sedentary. They have been identified in around 10%
of bacterial genomes sequenced to date (Boucher et al., 2007). A subset of these CIs, often
referred to as superintegrons, comprise large arrays that can span hundreds of cassettes and are
hypothesized to play a major role in the generation of cassettes (Mazel, 2006), a process
which otherwise remains to be unraveled. Most CI cassettes harbor genes of unknown function.
Nevertheless, the functions that can be predicted are very diverse and a substantial part of them is
involved in substrate modification (acetyltransferases) or interactions with biotic factors
(virulence factors and DNA modification) (Boucher et al., 2007). Besides, 10 to 30% of the
genes potentially encode proteins carrying a signal peptide region for either membrane
association or export from the cell (Koenig et al., 2008). Altogether, these data suggest that
cassette-encoded genes can mediate adaptation to a wide range of environmental conditions. Both the
functional platform and the cassettes of MIs are thought to derive from CIs (Mazel, 2006;
Labbate et al., 2009). In this light, the impressive ability of bacteria to rapidly overcome such
drastic environmental changes as those imposed by the human use of antibiotics relies heavily
on the recruitment of pre-existing integrons. This illustrates the capacity of the system to cope
with ever-changing environments.
Obviously, the shuffling of gene cassettes introduces variability in the integron regarding
which traits are expressed or not, and this process is directly dependent on the system’s
recombination rate. The functionality of integron integrases has been demonstrated
experimentally over a wide range of substrates and model systems. However, the integrase was
always artificially overexpressed in these studies and spontaneous recombination events have
never been observed in controlled conditions. Thorough epidemiological studies designed to
monitor the spread of multi-resistant MIs revealed a continuum of cassette arrangements,
demonstrating their effective diversification in natura (Moura et al., 2009). Considerable
variability has been observed in CIs, even between closely related bacterial species and strains
(Boucher et al., 2006). The integron locus is actually one of the most variable genomic loci, a
feature that has been used to finely resolve phylogenetic relationships between otherwise
identical isolates (Labbate et al., 2007). One of the most precise examples to date identified
numerous rearrangements between three pandemic V. cholerae strains over a period of about one
century. More accurate estimations of recombination dynamics would rely on the comparison of
very closely related arrays, which is difficult to achieve in practice through the sequencing of
random natural isolates. Hence, although the evolvability bestowed by integrons relies on
cassette rearrangement, the recombination rate in these systems remains enigmatic.
To shed light on this question, we model the evolution of site-specific recombination
in an integron subjected to a fluctuating environment, which entails the need to constantly
evolve. By nature, environmental changes are stochastic. Their effects on natural
selection are difficult to capture in a purely analytical model without strong simplifying assumptions.
To avoid oversimplification and to accurately describe the integron system, we developed a
Monte-Carlo simulation scheme incorporating a quantitative-genetics-based model of fitness.
We show that the selected recombination rate tends to match the rate of environmental shifts.
Moreover, we highlight that mutations tend to accumulate in non-expressed cassettes,
particularly under slowly fluctuating conditions.

Methods
We used Monte-Carlo simulations to model the evolution of the site-specific recombination
rate in an integron, in a randomly fluctuating environment. An integron with C cassettes
was modeled such that, for each cassette $c_i$, its genetic value $z_i$ was drawn from a uniform
distribution of variance $\sigma_c^2$. The fitness of each cassette relative to the best possible genotype
was then assigned by applying a stabilizing fitness function, such that for a genotype of value
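z, its relative fitness is:

$$W(z) = W_0 + (1 - W_0)\, e^{-\frac{(z - \theta(t))^2}{2\sigma^2}}.$$

The Gaussian term in this function, $e^{-(z-\theta(t))^2/(2\sigma^2)}$, describes selection for an optimum
genotype whose value $\theta(t)$ can change in time. The width $\sigma$ of this term sets the specificity of
cassettes: the wider the curve, the lower the specificity. The term $W_0$ ($0<W_0<1$) is a constant representing the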
basal fitness of the organism irrespective of which cassette is currently expressed. This latter
term can be viewed as the dispensability of the integron: when $W_0$ is equal to 1, the fitness of
the individual is maximal whatever the expressed cassette in the integron, such that the integron
does not improve the fitness of the individuals; in contrast, when it equals 0, the fitness of
individuals varies widely according to which cassette is expressed by the integron. For simplicity,
a single cassette in the array – the one integrated at the expression site attI – is considered to
be expressed. The remaining C-1 cassettes are silent and constitute the reservoir of genetic
variation. Besides the expressed and unexpressed cassettes, integrons bear an integrase locus,
the product of which is responsible for cassette excision and subsequent integration. This locus
determines the site-specific recombination rate, and hence the turnover pace of expressed
cassettes. Mutations at this locus can potentially impact the mean cassette turnover rate, making
it a modifier of the system, just as in models of modifiers of homologous recombination,
segregation or mutation for instance (Kondrashov, 1995). Polymorphism in the recombination
rate trait was allowed at this locus, such that there could be up to I different alleles, the
recombination rate of each allele $int_i$ being $r_i$. At the beginning of each run of the simulation,
we set $r_1=0$ (no recombination), while the recombination rate $r_i$ of each of the other alleles
was drawn randomly. Specifically, we used $r_i=10^{-x_i}$, where $x_i$ was drawn uniformly between 1
and 4.5, so that a wide range of recombination values was explored. During the course of the
simulation, mutations were then allowed to occur at the integrase locus at equal rate µ among
all pairs of alleles. To model the effect of the integrase locus on the recombination rate, a
proportion $r_i$ of the expressed cassettes of individuals carrying allele $int_i$ at the integrase locus was
replaced by other cassettes from the unexpressed pool of the same individuals at each generation.
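
For illustration only, the components described above can be sketched in a few lines of Python. This is not the code used for the simulations reported here. All function and parameter names are hypothetical; theta and sigma stand for the optimum and the width of the Gaussian fitness term, and drawing the replacement cassette uniformly among the C-1 silent cassettes is an assumption that the text does not state.

```python
import math
import random


def relative_fitness(z, theta, sigma, w0):
    """Stabilizing fitness: W(z) = W0 + (1 - W0) * exp(-(z - theta)^2 / (2 * sigma^2))."""
    gaussian = math.exp(-((z - theta) ** 2) / (2.0 * sigma ** 2))
    return w0 + (1.0 - w0) * gaussian


def draw_integrase_alleles(n_alleles, rng):
    """Recombination rates of the alleles: r_1 = 0, the others 10**(-x) with x ~ U(1, 4.5)."""
    return [0.0] + [10.0 ** (-rng.uniform(1.0, 4.5)) for _ in range(n_alleles - 1)]


def recombine(expressed_freqs, r):
    """Deterministic recombination step among carriers of one integrase allele.

    expressed_freqs[k] is the frequency of carriers expressing cassette k.
    A proportion r of each class swaps its expressed cassette for one of the
    C - 1 silent cassettes, chosen uniformly (assumption, see lead-in).
    """
    c = len(expressed_freqs)
    moved = [r * f for f in expressed_freqs]
    total_moved = sum(moved)
    return [f - m + (total_moved - m) / (c - 1)
            for f, m in zip(expressed_freqs, moved)]


if __name__ == "__main__":
    rng = random.Random(1)
    rates = draw_integrase_alleles(6, rng)           # I = 6 alleles
    print("recombination rates:", rates)
    print("fitness at the optimum:", relative_fitness(0.0, 0.0, 0.2, 0.5))
    print("after one step:", recombine([1.0, 0.0, 0.0, 0.0, 0.0], rates[1]))
```

Working on class frequencies rather than on individuals mirrors the genotype-based framework: a genotype is fully described by its integrase allele and the cassette it currently expresses.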

Fluctuations in the environment were modeled as changes in the optimal genetic value
$\theta(t)$. These changes happened randomly in time at rate e. A new optimal value was drawn from a
uniform distribution of variance $\sigma_e^2=\sigma_c^2$, such that the potential values of the optimum and
those of the actual genetic values of cassettes fully overlapped; we assumed no autocorrelation,
meaning that $\theta(t)$ was independent of $\theta(t-1)$. To improve computer-time efficiency in
cases where e<1/20, environmental shifts were modeled as a Poisson process, such that the time
in generations between two changes was drawn from an exponential distribution of parameter e.
When mentioned, mutation was also allowed inside the cassettes at a fixed rate. We assumed
those mutations had deleterious effects, as a consequence of either pleiotropic effects on traits
not considered in the model, or of a general decrease in the efficiency of the protein encoded
by the affected cassette. The Gaussian term in the fitness of a cassette affected by m mutations
was then multiplied by $(1-s)^m$ if $m < m_{max}$, and by 0 if $m \geq m_{max}$. In practice, $m_{max}$
was set to 0 or 1 in this work to model, respectively, the absence of mutations and the occurrence
of drastically deleterious mutations.
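
The environmental process and the mutational penalty described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the simulation code itself. Centring the uniform distribution of the optimum on 0 is an assumption; the half-width follows from the variance of a uniform distribution on [-h, h], which is h²/3.

```python
import math
import random


def draw_optimum(sigma_e, rng):
    """Draw a new optimum from a uniform distribution of variance sigma_e**2.

    Assumption: the distribution is centred on 0, i.e. U(-h, h) with
    h = sigma_e * sqrt(3), since Var[U(-h, h)] = h**2 / 3.
    """
    half_width = sigma_e * math.sqrt(3.0)
    return rng.uniform(-half_width, half_width)


def next_shift_delay(e, rng):
    """Waiting time (in generations) to the next environmental change.

    For a Poisson process of rate e, waiting times are exponentially
    distributed with mean 1 / e.
    """
    return rng.expovariate(e)


def mutation_factor(m, s, m_max):
    """Multiplier of the Gaussian fitness term for a cassette carrying m mutations."""
    return (1.0 - s) ** m if m < m_max else 0.0


if __name__ == "__main__":
    rng = random.Random(2)
    t = 0.0
    for _ in range(3):
        t += next_shift_delay(10 ** -2.5, rng)
        print("shift at generation %.0f, new optimum %.3f" % (t, draw_optimum(0.5, rng)))
```

With m_max = 1, the setting used for drastic mutations, a single mutation brings the Gaussian term to 0, so that an inactivated cassette only retains the basal fitness W0.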
To model genetic drift, we used a genotype-based framework with multinomial sampling
adapted from that of Tenaillon et al. (1999). This framework has the
caveat that it requires prior knowledge of all possible genotypes, but it is economical in terms of
computer time and well suited to modeling very large populations for many generations.
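
A genotype-based generation with multinomial sampling can be sketched as below. This is a generic Wright-Fisher step written for illustration, with hypothetical names; it is not the framework of Tenaillon et al. The sampling relies on random.choices from the Python standard library, which draws individuals one by one and is therefore only practical for small populations. For very large populations, a dedicated multinomial sampler (for example numpy.random.Generator.multinomial) would be needed.

```python
import random
from collections import Counter


def next_generation(freqs, fitnesses, pop_size, rng):
    """Selection followed by multinomial sampling of pop_size individuals.

    freqs[g] and fitnesses[g] are the frequency and relative fitness of
    genotype g. Each offspring is drawn with probability proportional to
    freqs[g] * fitnesses[g]; the returned list holds the new frequencies.
    """
    weights = [f * w for f, w in zip(freqs, fitnesses)]
    if sum(weights) <= 0.0:
        raise ValueError("population has zero mean fitness")
    draws = rng.choices(range(len(freqs)), weights=weights, k=pop_size)
    counts = Counter(draws)
    return [counts[g] / pop_size for g in range(len(freqs))]


if __name__ == "__main__":
    rng = random.Random(3)
    freqs = [0.5, 0.5]
    for generation in range(5):
        freqs = next_generation(freqs, [1.0, 0.9], 1000, rng)
        print(generation, freqs)
```

Because only genotype counts are tracked, the cost of one generation depends on the number of genotypes rather than on the number of individuals once a true multinomial sampler is used.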
We aimed at understanding how the genetic properties of the integron, such as the turnover
rate of expressed cassettes, change as an adaptation to environmental fluctuations. In
each simulation run, a burn-in period of 10 environmental changes was allowed to elapse in order
to let the system reach its dynamical equilibrium, and the population was then left to
evolve for another 100 environmental changes, during which the mean recombination rate r
and the mean fitness of the population were recorded at each generation, and averaged over
the 100 environmental changes.
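
The recording protocol can be summarized by the small helper below, given as a sketch with hypothetical names. The exact window boundaries are an assumption: the average is taken from the generation of the last burn-in change (inclusive) to the generation of the last measured change (exclusive).

```python
def average_after_burn_in(values, shift_generations, burn_in=10, measured=100):
    """Average a per-generation record over `measured` environmental changes.

    values[g] is the quantity recorded at generation g (for instance the mean
    recombination rate); shift_generations is the increasing list of
    generations at which the environment changed. The first `burn_in`
    changes are discarded (window boundaries are an assumption).
    """
    if len(shift_generations) < burn_in + measured:
        raise ValueError("not enough environmental changes were simulated")
    start = shift_generations[burn_in - 1] if burn_in > 0 else 0
    stop = shift_generations[burn_in + measured - 1]
    window = values[start:stop]
    if not window:
        raise ValueError("no generation recorded in the measurement window")
    return sum(window) / len(window)


if __name__ == "__main__":
    record = [0.0] * 20 + [2.0] * 30 + [4.0] * 10
    shifts = [10, 20, 35, 50, 60]
    print(average_after_burn_in(record, shifts, burn_in=2, measured=3))
```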

Results
We developed a simulation framework to study the evolution of recombination rate in
a stochastically fluctuating environment. To provide an overview of the model behavior, we
plotted the frequency of integrase alleles in a population of integrons over a period
of time spanning 75 environmental changes in one run of simulation (figure 2A). The
population consisted of $N=10^8$ integrons and comprised I=6 different integrase alleles
associated with an array of C=5 different gene cassettes, of which only one is expressed in each
individual. All 30 possible genotypes were initially introduced at equal frequency in the
population. Environmental shifts occurred stochastically according to a predefined rate. To highlight the
dynamics of integrase alleles in diverse contexts, the rate of environmental change was initially
set to $10^{-2.5}$ and was multiplied by a factor of $10^{-0.5}$ every 25 shifts, resulting in 3 successive
regimes of selection.
Before any environmental shift occurs, the genotype expressing the fittest cassette
associated with the non-functional integrase (no recombination) is transiently favored. This allele
is strongly counter-selected by the first shift, because it does not allow the generation of the
diversity necessary to adapt to new conditions. The two alleles with the highest recombination rates
were also rapidly counter-selected, for the opposite reason: their associated array is not
stable enough to sustain selection in steady environments. In contrast, the three integrase
alleles characterized by intermediate recombination rates rose in frequency. A burst of successive
environmental variations quickly led to the drop of the allele with the lowest of these rates. Then,
one allele predominated in this regime, with a few occasional takeovers by the allele with the
next lower rate, coinciding with periods of relative environmental stasis. After a lag, the
passage to the next regime of environmental change indirectly drove the fixation of this latter
allele. Similarly, the slowest fluctuating regime led to the rise of the allele with the lowest
recombination rate, which was previously counter-selected. We also calculated the mean
recombination rate in the population at each generation. As illustrated in figure 2B, the mean
recombination rate selected indirectly via the fitness effect of gene cassettes tends to stabilize
in each regime to fit the imposed fluctuation rate. Overall, each regime promotes either the
fixation of one specific allele, or the maintenance of a polymorphism at the integrase locus,
resulting in a relatively stable recombination rate over the long term; this steady-state
recombination rate changes with the speed of environmental fluctuations.
Prompted by this observation, we undertook a more systematic approach to monitor
the evolution of the mean recombination rate r over a range of different environmental change
rates e, for different values of specificity and dispensability. Overall, the results confirmed the
existence of a linear relationship between r and e (figure 3). Each point represents
the average of 100 independent runs. In each run, different cassette values and recombination
alleles were randomly sampled and the mean recombination rate over 100 stochastic
environmental shifts was calculated. Despite the high level of noise imposed by this method, the
mean recombination rate conforms remarkably well to the fluctuation rate corrected by the expected
number of switches necessary to recombine the best cassette (figure 3). The dispensability
of the integron, i.e. its contribution to the global fitness, does not seem to influence
this result on its own. In contrast, low cassette specificity – which is modeled by a wider
Gaussian fitness curve – results in lower recombination rates, especially at high rates of
environmental change. This can be understood as a consequence of clonal interference (Gerrish
and Lenski, 1998; De Visser et al., 1999). The simultaneous occurrence of cassettes with similar
effects on fitness decreases their relative selection coefficients and results in slower
evolutionary dynamics. Under rapid environmental change, the frequency of cassettes does not
change fast enough to allow adaptation, even when the best cassette has been reached by
recombination. In this context, there is no advantage in increasing the mutation rate. Moreover,
it has been suggested that under very rapid environmental change, it may be advantageous to
decrease evolvability, since a genetic response in one generation often decreases adaptation in
the next generation (Kawecki, 2000). The combination of these two factors may explain why
the recombination rate decreases at high rates of environmental change and weak selection, a
result that has not been described in previous models of mutators. Note also that the
dispensability of cassettes does reduce the mean recombination rate under high specificity of cassettes.
In a given integron, only the cassettes proximal to the Pc promoter are expressed and
thus subjected to directional selection. The remaining cassettes experience relaxed selective
pressure and may accumulate deleterious mutations, because those cannot be efficiently
purged by natural selection. This should produce a decrease in mean fitness – a genetic load –
that is different from the one directly caused by mutation itself, and more similar to the drift
load (Hartl and Taubes, 1998; Poon and Otto, 2000). We term it the silencing load. To assess
the importance of this silencing load, we incorporated a rate of deleterious mutations in the
previous framework. We considered a drastic mutational model wherein a single mutation
leads to gene inactivation. We carried out the same simulations as described previously, and
monitored the mean frequency L of inactivated genes in the cassette reservoir.
We found that L is inversely proportional to e (figure 4). Under rapidly cycling
environments ($e=10^{-4}$), the non-expressed compartment had time to accumulate up to 8% of
inactivated cassettes, irrespective of the set of parameters used. As the frequency of
environmental shifts decreases, simulations run with a low cassette specificity ($\sigma=0.8$) progressively
accumulate more deleterious mutations than their counterparts. Similarly, simulations run with
higher dispensability ($W_0=0.5$) display an increased silencing load. Because both of these
parameters decrease the intensity of selection, these data strongly suggest that natural selection
is involved in purging the silencing load. In rapidly fluctuating environments, cassettes
experience environments in which they prove adaptive at a higher rate and thus tend to spend
less time in the non-expressed compartment. The silencing load clearly reflects the frequency
at which deleterious mutations are purged by selection when cassettes are put under expression
in a favorable environment. Overall, the silencing load does not affect the population
fitness. Although heavy loads are essentially accumulated in slowly fluctuating environments,
these conditions also provide the sustained periods of stasis required for efficient purifying
selection. In these conditions, the dynamics of natural selection are fast enough to mediate
efficient adaptation. In rapidly changing environments, the efficiency of selection is reduced, but
favorable environments occur often enough to limit the impact of deleterious mutations on
fitness. The introduction of deleterious mutations had no impact on the selected recombination
rate (figure 3).

Discussion
To be written.
Essential points that will be raised include:
• Comparison of the integron system with other loci subjected to diversity-generating
mechanisms (e.g. SSRs, gene conversion, epigenetic switches and other systems relying on
site-specific recombination). Highlight the general scope of integrons with respect to these
systems.
• Similar relationships between the optimal rate of phenotypic switches and the rate of
environmental variations have been reported in theoretical (Kimura, 1967; Lachmann and
Jablonka, 1996; Kussell et al., 2005) and experimental (Acar et al., 2008) studies. However,
these models only consider two phenotypic states in binary environments that change with a
constant period. Discuss the advantage of this stochastic model to address the complex case of
integrons without such simplification.
• Discuss the control of phenotypic plasticity by recombination-mediated expression.
Contrast it with classical physiological regulation.
• Discuss the bet-hedging strategy.
• Discuss the benefit of stress-responsive regulation of integrase expression with respect
to constitutive regulation.
• Highlight the impact of the system (silencing load) on cassette diversification.

References
Acar M, Mettetal JT, van Oudenaarden A (2008) Stochastic switching as a survival
strategy in fluctuating environments. Nat Genet 40(4): 471-475. Epub 2008 Mar 2023.
Andrews JH (1998) Bacteria as modular organisms. Annual review of microbiology
52: 126.
Biskri L, Bouvier M, Guerout AM, Boisnard S, Mazel D (2005) Comparative study of
class 1 integron and Vibrio cholerae superintegron integrase activities. J Bacteriol 187(5):
1740-1750.
Blount Z, Borland C, Lenski R (2008) Historical contingency and the evolution of a
key innovation in an experimental population of Escherichia coli. Proceedings of the National
Academy of Sciences: 0803151105.
Boucher Y, Labbate M, Koenig JE, Stokes HW (2007) Integrons: mobilizable plat-
forms that promote genetic diversity in bacteria. Trends in microbiology 15(7): 309.
Boucher Y, Nesbo C, Joss M, Robinson A, Mabbutt B et al. (2006) Recovery and evo-
lutionary analysis of complete integron gene cassette arrays from Vibrio. BMC Evol Biol
6(1).
Collis CM, Grammaticopoulos G, Briton J, Stokes HW, Hall RM (1993) Site-specific
insertion of gene cassettes into integrons. Molecular microbiology 9(1): 52.
Collis CM, Hall RM (1995) Expression of antibiotic resistance genes in the integrated
cassettes of integrons. Antimicrob Agents Chemother 39(1): 162.
Collis CM, Recchia GD, Kim MJ, Stokes HW, Hall RM (2001) Efficiency of recom-
bination reactions catalyzed by class 1 integron integrase IntI1. Journal of bacteriology
183(8): 2542.
Cooper TF, Remold SK, Lenski RE, Schneider D (2008) Expression profiles reveal
parallel evolution of epistatic interactions involving the CRP regulon in Escherichia coli.
PLoS Genet 4(2): e35.
Cooper TF, Rozen DE, Lenski RE (2003) Parallel changes in gene expression after
20,000 generations of evolution in Escherichia coli. Proc Natl Acad Sci U S A 100(3):
1072-1077.
Cooper VS, Lenski RE (2000) The population genetics of ecological specialization in
evolving Escherichia coli populations. Nature 407(6805): 736-739.
Cooper VS, Schneider D, Blot M, Lenski RE (2001) Mechanisms causing rapid and
parallel losses of ribose catabolism in evolving populations of Escherichia coli B. J Bacteriol
183(9): 2834-2841.
Crozat E, Philippe N, Lenski RE, Geiselmann J, Schneider D (2005) Long-term ex-
perimental evolution in Escherichia coli. XII. DNA topology as a key target of selection. Ge-
netics 169(2): 523-532.
De Visser JAGM (2002) The fate of microbial mutators. Microbiology (Reading, Eng-
land) 148(Pt 5): 1252.
De Visser JAGM, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE (1999) Diminishing
returns from mutation supply rate in asexual populations. Science 283(5400): 404-406.
Denamur E, Matic I (2006) Evolution of mutation rates in bacteria. Molecular Micro-
biology 60(4): 827.
Drake J, Charlesworth B, Charlesworth D, Crow J (1998) Rates of Spontaneous Muta-
tion. Genetics 148(4): 1686.
Eyre-Walker A, Keightley PD (2007) The distribution of fitness effects of new muta-
tions. Nat Rev Genet 8(8): 610-618.
Fisher RA (1930) The Genetical Theory of Natural Selection: Oxford University
Press.

Fluit AC, Schmitz FJ (2004) Resistance integrons and super-integrons. Clinical micro-
biology and infection: the official publication of the European Society of Clinical Microbiol-
ogy and Infectious Diseases 10(4): 288.
Funchain P, Yeung A, Stewart JL, Lin R, Slupska MM et al. (2000) The consequences
of growth of a mutator strain of Escherichia coli as measured by loss of function among mul-
tiple gene targets and loss of fitness. Genetics 154(3): 970.
Gerrish PJ, Lenski RE (1998) The fate of competing beneficial mutations in an asexual
population. Genetica 102-103(1-6): 127-144.
Giraud A, Matic I, Tenaillon O, Clara A, Radman M et al. (2001) Costs and Benefits
of High Mutation Rates: Adaptive Evolution of Bacteria in the Mouse Gut. Science
291(5513): 2608.
Gould SJ (1996) Full House: The Spread of Excellence from Plato to Darwin: Three
Rivers Press.
Hartl DL, Taubes CH (1998) Towards a theory of evolutionary adaptation. Genetica
102-103(1-6): 525-533.
Hawkey PM (2008) The growing burden of antimicrobial resistance. J Antimicrob
Chemother 62 Suppl 1: i1-9.
Holmes AJ, Gillings MR, Nield BS, Mabbutt BC, Nevalainen KM et al. (2003) The
gene cassette metagenome is a basic resource for bacterial genome evolution. Environ Micro-
biol 5(5): 383-394.
Kawecki TJ (2000) The evolution of genetic canalization under fluctuating selection.
Evolution 54(1): 1-12.
Kimura M (1967) On the evolutionary adjustment of spontaneous mutation rates.
Genet Res 9: 23-34.
Koenig JE, Boucher Y, Charlebois RL, Nesbo C, Zhaxybayeva O et al. (2008) Inte-
gron-associated gene cassettes in Halifax Harbour: assessment of a mobile gene pool in ma-
rine sediments. Environ Microbiol 10(4): 1024-1038.
Kondrashov AS (1995) Modifiers Of Mutation-Selection Balance - General-Approach
And The Evolution Of Mutation-Rates. Genet Res 66(1): 53-69.
Kussell E, Kishony R, Balaban NQ, Leibler S (2005) Bacterial persistence: a model of
survival in changing environments. Genetics 169(4): 1807-1814. Epub 2005 Jan 1831.
Labbate M, Boucher Y, Joss MJ, Michael CA, Gillings MR et al. (2007) Use of chro-
mosomal integron arrays as a phylogenetic typing system for Vibrio cholerae pandemic
strains. Microbiology (Reading, England) 153(Pt 5): 1498.
Labbate M, Case RJ, Stokes HW (2009) The integron/gene cassette system: an active
player in bacterial adaptation. Methods in molecular biology (Clifton, NJ) 532: 125.
Lachmann M, Jablonka E (1996) The inheritance of phenotypes: an adaptation to fluc-
tuating environments. J Theor Biol 181(1): 1-9.
Lenski RE, Mongold JA (2000) Cell size, shape, and fitness in evolving populations of
bacteria. Scaling in biology: Oxford University Press. pp. 221-235.
Lenski RE, Rose MR, Simpson SC, Tadler SC (1991) Long-Term Experimental Evo-
lution In Escherichia-Coli.1. Adaptation And Divergence During 2,000 Generations. Am Nat
138(6): 1315-1341.
Lenski RE, Travisano M (1994) Dynamics of adaptation and diversification: a 10,000-
generation experiment with bacterial populations. Proc Natl Acad Sci U S A 91(15): 6808-
6814.
Martin G, Lenormand T (2006) A general multivariate extension of Fisher's geometri-
cal model and the distribution of mutation fitness effects across species. Evolution 60(5): 893-
907.
Martinez E, de la Cruz F (1988) Transposon Tn21 encodes a RecA-independent
site-specific integration system. Molecular & general genetics: MGG 211(2): 325.
Mazel D (2006) Integrons: agents of bacterial evolution. Nature Reviews Microbiol-
ogy 4(8): 620.
Moura A, Soares Mr, Pereira C, Leitão N, Henriques I et al. (2009) INTEGRALL: a
database and search engine for integrons, integrases and gene cassettes. Bioinformatics (Ox-
ford, England).
Naas T, Mikami Y, Imai T, Poirel L, Nordmann P (2001) Characterization of In53, a
class 1 plasmid- and composite transposon-located integron of Escherichia coli which carries
an unusual array of gene cassettes. J Bacteriol 183(1): 235-249.
Pelosi L, Kuhn L, Guetta D, Garin J, Geiselmann J et al. (2006) Parallel changes in
global protein profiles during long-term experimental evolution in Escherichia coli. Genetics
173(4): 1851-1869.
Philippe N, Crozat E, Lenski RE, Schneider D (2007) Evolution of global regulatory
networks during a long-term experiment with Escherichia coli. Bioessays 29(9): 846-860.
Philippe N, Pelosi L, Lenski RE, Schneider D (2009) Evolution of penicillin-binding
protein 2 concentration and cell shape during a long-term experiment with Escherichia coli. J
Bacteriol 191(3): 909-921.
Poon A, Otto SP (2000) Compensating for our load of mutations: freezing the melt-
down of small populations. Evolution 54(5): 1467-1479.
Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S et al. (2007) The
Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical
Pacific. PLoS Biol 5(3): e77.
Silander OK, Tenaillon O, Chao L (2007) Understanding the evolutionary fate of finite
populations: the dynamics of mutational effects. PLoS Biol 5(4): e94.
Sniegowski PD, Gerrish PJ, Johnson T, Shaver A (2000) The evolution of mutation
rates: separating causes from consequences. BioEssays: news and reviews in molecular, cellu-
lar and developmental biology 22(12): 1066.
Sniegowski PD, Gerrish PJ, Lenski RE (1997) Evolution of high mutation rates in ex-
perimental populations of E. coli. Nature 387(6634): 703-705.
Stokes HW, Hall RM (1989) A novel family of potentially mobile DNA elements en-
coding site-specific gene-integration functions: integrons. Molecular microbiology 3(12):
1683.
Streit WR, Schmitz RA (2004) Metagenomics--the key to the uncultured microbes.
Curr Opin Microbiol 7(5): 492-498.
Tenaillon O, Toupance B, Le Nagard H, Taddei F, Godelle B (1999) Mutators, popu-
lation size, adaptive landscape and the adaptation of asexual populations of bacteria. Genetics
152(2): 485-493.
van der Woude M, Baumler A (2004) Phase and Antigenic Variation in Bacteria. Clin
Microbiol Rev 17(3): 611.
Zeyl C, Mizesko M, de Visser JA (2001) Mutational meltdown in laboratory yeast
populations. Evolution 55(5): 909-917.


Figures

Figure 1 - Schematic organization of the integron locus

Integrons form integrated genetic systems. The intI gene encodes a site-specific tyrosine
recombinase capable of mobilizing dedicated gene cassettes. Most gene cassettes in the array
are unexpressed. Excision of non-replicative cassette intermediates occurs through random
attC × attC recombination mediated by IntI. Such intermediates are preferentially recombined
at attI through IntI-mediated attC × attI recombination. Newly integrated cassettes are thus
directly put under expression by the Pc promoter. These specific properties enable a versatile
switching mechanism whereby recombination affects the expression of potentially adaptive
traits.


Figure 2 - Representative dynamic of recombination rate in a fluctuating environment

A population initially containing 30 different integron genotypes in equal frequencies (5
different cassettes × 6 integrase alleles) was evolved in three consecutive regimes of
stochastic environmental variation (environmental change rates equal to 10^-2.5, 10^-3 and
10^-3.5 per generation). The optimal value of 75 successive environments is indicated in the
upper panel, with white-filled dots highlighting environmental shifts. Panel A shows the log
frequencies of the 6 integrase alleles over time. Panel B displays the corresponding log of the
mean recombination rate in the whole population. The selected recombination rate is strongly
influenced by the rate of environmental changes. See the main text for details.

Figure 3 - Impact of the environmental change rate on the selection of recombination rate

Each data point corresponds to the recombination rate averaged over 100 simulation runs. In
each run, the mean rate selected over 100 environmental changes is recorded. Environmental
changes occur stochastically according to a predefined rate. Filled triangles and filled
squares indicate simulations with and without deleterious mutations, respectively. Colors
distinguish different combinations of dispensability ω0 and specificity σ as follows: dark
blue, ω0=0 and σ=0.2; light blue, ω0=0 and σ=0.8; violet, ω0=0.5 and σ=0.2; and red, ω0=0.5
and σ=0.8. The black line corresponds to y=(C-1)·x, where C-1 is the number of unexpressed
cassettes in the array.


Figure 4 - Accumulation of deleterious mutations in unexpressed gene cassettes (silencing load)

The silencing load is defined as the frequency of unexpressed cassettes inactivated by
deleterious mutations. Each data point corresponds to the silencing load averaged over 100
simulations. In each run, the mean value over 100 environmental changes is recorded.
Environmental changes occur stochastically according to a predefined rate. Deleterious
mutations occurred at a rate of 10^-6 per generation. Colors distinguish different combinations
of dispensability ω0 and specificity σ as follows: dark blue, ω0=0 and σ=0.2; light blue, ω0=0
and σ=0.8; violet, ω0=0.5 and σ=0.2; and red, ω0=0.5 and σ=0.8.

Results – Recombination in integrons is controlled by the SOS response to stress

II. RECOMBINATION IN INTEGRONS IS CONTROLLED BY THE SOS RESPONSE TO STRESS

BACKGROUND
Theoretical considerations strongly suggest that the optimal recombination rate in integrons
must match the average rate of environmental changes (see Article 1, p153). Two distinct
strategies can be envisaged to implement such a relationship: i) the recombination rate could
be constitutively encoded in the integrase, in which case it can be slowly fine-tuned through
mutations affecting the protein activity and/or its expression level; and ii) the expression of
the integrase could be responsively regulated by environmental changes. So far, the expression
pattern of integrases has resisted experimental analysis (see p146), which is consistent with
the second hypothesis. We thus undertook to identify stress responses capable of modulating the
expression of the integrases.

METHODS
The promoter regions of all intI genes deposited in GenBank were recovered and analyzed for
the presence of specific sequence motifs using custom scripts, leading to the identification of
LexA binding sites. To investigate the involvement of the SOS response, electrophoretic
mobility shift assays (EMSA) were performed with material from V. cholerae N16961. The
expression of the integrase in different genetic backgrounds and stress conditions was
monitored using LacZ reporters in V. cholerae and a class 1 integron in E. coli. We devised a
positively selectable reporter of recombination in order to further examine the link between
integrase induction and recombination rate.

RESULTS AND DISCUSSION


We identified a LexA binding motif in the promoter region of most integron integrases. The
site is indeed bound by LexA in V. cholerae. The expression of the integrase is induced by
classical triggers of the SOS response, including widely used antibiotics, in both V. cholerae
and E. coli. In contrast, no induction was measured when the SOS response was impaired. The
induction of the integrase has a clear functional impact and strongly increases the
recombination rate. By ensuring diversification in response to a wide range of environmental
challenges, the regulation of recombination rate promotes the evolvability of the organism.
This complex adaptive phenotype arises from the coupling of two simpler genetic modules and
involves only a few mutations. Mapping of the LexA binding sites identified in silico to the
IntI phylogenetic tree revealed that the SOS control of recombination pervades marine species.
In contrast, this trait appears very sporadically in soil and freshwater species, suggesting
different selective pressures in these niches. Strikingly, all clinically relevant
multi-resistance integrons are subject to SOS control, irrespective of their phylogenetic
relationships. This observation is particularly meaningful in the light of antibiotic-mediated
induction of the integrase.

ARTICLE II
This article has been published as a Brevia in Science and concisely reports the SOS
control of recombination rate (pp 154-186).

ARTICLE III
This manuscript is in preparation for submission to Nucleic Acids Research. It further
discusses the implications of the coupling between the SOS and integron systems and its
phylogenetic distribution (pp 186-211).

Article III

A manuscript to NAR

SOS control of recombination in integrons is a primeval feature

Guillaume Cambray1*, Neus Sanchez-Alberola2*, Ivan Erill3*, Susana Campoy3, Émilie Guerin4,
Sandra Da Re4, Bruno Gonzalez-Zorn5, Marie-Cécile Ploy4, Jordi Barbé3, Didier Mazel1

1 Institut Pasteur, Unité de Plasticité du Génome Bactérien, CNRS URA 2171, 75015 Paris, France
2 Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Barcelona, Spain
3 Department of Biological Sciences, University of Maryland Baltimore County, Baltimore 21228, USA
4 Université de Limoges, Faculté de Médecine, EA3175, INSERM, Equipe Avenir, Limoges 87000, France
5 Departamento de Sanidad Animal, Facultad de Veterinaria, Universidad Complutense de Madrid, 28040 Madrid, Spain

*: equal contribution


SUMMARY
Integrons are found in the genome of hundreds of environmental bacterial species, but are
mainly known as the genetic agents responsible for the capture and spread of antibiotic
resistance determinants among Gram-negative pathogens. The SOS response is a regulatory network
under control of the repressor protein LexA and is targeted at repairing and bypassing DNA
damage, thus promoting genetic variation in times of stress. We recently reported a direct link
between the SOS response and the expression of integron integrases in Vibrio cholerae and a
plasmid-borne class 1 mobile integron. Here we conduct a systematic study of all integron
integrase promoter regions available in genomic databases and we show that LexA controls the
expression of most integron integrases. We also provide experimental validation of integrase
LexA control for another Vibrio chromosomal integron and a multi-resistance plasmid harbouring
two integrons. By mapping the distribution of predicted LexA-binding sites onto an IntI
phylogeny, we propose that SOS control arose early and was probably the ancestral state in
integron evolution. Importantly, these data indicate that SOS regulation has been positively
selected for in mobile integrons. The coupling of both genetic systems enhances the potential
for cassette swapping and capture in cells undergoing stress and changing conditions, while
freezing the cassette arrangement in steady environments. In agreement with this, we find a
strong correlation between the lack of LexA control and integrase inactivation by mutation,
which suggests that unregulated integrase activity may be deleterious. This discovery
highlights the role of integrons and the SOS response as integrated adaptive systems and will
likely have important implications for antibiotic treatment policies.

Integrons are bacterial genetic elements capable of incorporating exogenous and promoter-less
open reading frames (ORF), referred to as gene cassettes, by site-specific recombination
(Figure 1). First described in the late 1980s in connection with the emergence of antibiotic
resistance (Stokes and Hall, 1989), integrons always contain three functional components: an
integrase gene (intI), which mediates recombination, a primary recombination site (attI) and an
outward-orientated promoter (Pc) (Mazel, 2006a). Cassette integrations mainly occur at the attI
site (Collis et al., 2002), ensuring the correct expression of mobilized cassettes’ genes by
placing them under the control of Pc (Levesque et al., 1994). To date, two main subsets of
integrons have been described. On the one hand, mobile integrons, also referred to as
resistance integrons, contain relatively few (2-8) cassettes and encode resistance to a broad
spectrum of antibiotics (Rowe-Magnus and Mazel, 2002; Fluit and Schmitz, 2004; Partridge et
al., 2009). They have been conventionally divided into five different classes according to
their intI gene sequence (Mazel, 2006a). These are typically associated with mobile elements,
such as transposons and conjugative plasmids, ensuring their dissemination across bacterial
species. They are present mostly in the Proteobacteria, but have also been reported in other
bacterial phyla, such as Gram-positive bacteria (Mazel, 2006a). On the other hand, chromosomal
integrons have been identified in the genomes of many bacterial species (Boucher et al., 2007).
Although many chromosomal integrons comprise a limited number of cassettes (ref ACID), a subset
of them, termed superintegrons (SI), exhibits large arrays spanning hundreds of cassettes
(Mazel, 2006a). SIs have been specifically identified in the Vibrionaceae and, to some extent,
in the Xanthomonadaceae and Pseudomonadaceae (Mazel et al., 1998; Rowe-Magnus et al., 1999;
Rowe-Magnus et al., 2001; Vaisvila et al., 2001; Rowe-Magnus et al., 2003; Gillings et al.,
2005) and seem to be ancient residents of the host genome (Rowe-Magnus et al., 2001). In
contrast with mobile integrons, most cassettes in a given SI display recombination sites (attC)
that are typical of the species, suggesting that SI-harbouring bacteria are implicated in
cassette genesis (Rowe-Magnus et al., 2003). Most cassette-borne genes in chromosomal integrons
are of unknown function (Boucher et al., 2007), though some of them are related to existing
resistance cassettes (Rowe-Magnus et al., 2002; Melano et al., 2002; Petroni et al., 2004).
While stable under laboratory conditions, superintegrons have been reported to be the most
variable loci among V. cholerae natural isolates (Rowe-Magnus et al., 1999; Labbate et al.,
2007).

Despite the importance of integrons in the acquisition and spread of antibiotic resistance
determinants and, from a broader perspective, in bacterial adaptation, little was known about
the dynamics of cassette recombination. Integron integrases mediate recombination by
interacting with single-stranded (ss) attC sites present in all reported cassettes, employing a
unique site-specific recombination process (MacDonald et al., 2006; Bouvier et al., 2005;
Bouvier, submitted). However, the level and control of integrase expression, which are central
to this process, remained enigmatic until recently, when we reported that expression of the
integrases of the V. cholerae superintegron and of a class 1 mobile integron was controlled by
the SOS response (Guerin et al., 2009).

The SOS response is a global regulatory network governed by a repressor protein (LexA) and
principally targeted at addressing DNA damage (Walker, 1984; Erill et al., 2007). LexA
represses SOS genes by binding to highly specific binding sites present in their promoter
regions. In E. coli and most β- and γ-Proteobacteria these sites consist of a 16 bp long
palindromic motif (5’-CTGTatatatatACAG-3’), commonly known as the LexA box (Walker, 1984). The
SOS response is typically induced by the presence of single-stranded DNA fragments (ssDNA),
which can arise from a number of environmental stresses (Aertsen and Michiels, 2006), but is
normally linked to replication-fork stalling due to DNA lesions. These ssDNA fragments bind
non-specifically to the universal RecA protein (Sassanfar and Roberts, 1990), enabling it to
promote LexA inactivation by autocatalytic cleavage (Little, 1991) and thus inducing the SOS
response. Up to 40 genes have been shown to be directly regulated by LexA in E. coli (Fernandez
De Henestrosa et al., 2000; Courcelle et al., 2001), encoding proteins to stabilize the
replication fork, repair DNA, promote translesion synthesis and arrest cell division. Following
its initial description in E. coli (Walker, 1984), the SOS response has been characterized in
many other bacterial classes and phyla, and LexA has been shown to bind very different motifs
in different phyla (Erill et al., 2007).
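As an illustration of what the E. coli-like LexA box looks like to a sequence scanner, the sketch below searches a DNA string for the pattern CTGT-N8-ACAG, that is, the conserved half-sites of the 16 bp palindrome separated by eight unconstrained positions. This is a deliberately crude simplification and an assumption on our part: actual LexA-site searches generally rely on scoring matrices that tolerate mismatches, and the function name and test sequence here are hypothetical.

```python
import re
from typing import List, Tuple

# Conserved half-sites of the E. coli-like LexA box with an 8 bp free spacer.
# The lookahead makes the scan report overlapping candidates as well.
LEXA_BOX = re.compile(r"(?=(CTGT[ACGT]{8}ACAG))")


def find_lexa_boxes(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched 16-mer) for every exact
    CTGT-N8-ACAG occurrence on the given strand of `seq`."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in LEXA_BOX.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment carrying the consensus box from the text.
    fragment = "ggaCTGTatatatatACAGttt"
    for start, site in find_lexa_boxes(fragment):
        print(start, site)
```

Only the strand given is scanned; because the consensus is palindromic, a perfect site reads the same on both strands, but degenerate sites would require scanning the reverse complement too.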

In recent years, the SOS response has been linked to clinically relevant phenotypes, such as
the activation and dissemination of virulence factors carried in bacteriophages (Kimmitt et
al., 1999; Waldor and Friedman, 2005), transposons, pathogenicity islands and integrating
conjugative elements encoding antibiotic resistance genes (Erill et al., 2007; Kelley, 2006).
Moreover, it has recently become established that some widely used antibiotics, such as
fluoroquinolones, trimethoprim and β-lactams, are able to trigger SOS induction and are thus
able to promote the dissemination of antibiotic resistance genes (Erill et al., 2007; Kelley,
2006) or the generation of resistant alleles (Cirz et al., 2005). This suggests a positive
feedback loop that has been postulated to have important consequences for the emergence and
dissemination of antibiotic resistance (Avison, 2005). Our recent work demonstrating a direct
link between the SOS response and integrase-mediated recombination further reinforces this line
of reasoning, as it provides bacteria with an antibiotic-induced mechanism for gene
acquisition, functional expression and dispersal (Guerin et al., 2009). Here we expand on this
recent connection by means of a systematic study of integron integrase promoter regions.
Putative LexA-binding sites are found in the majority of integron integrase promoters,
suggesting that the SOS response controls recombination in most integrons. We provide further
experimental validation of this control in the Vibrio parahaemolyticus chromosomal integron and
in the integrons of the E. coli multi-resistance plasmid pMUR050 (Gonzalez-Zorn et al., 2005).
The phylogenetic distribution of the LexA-controlled integrases suggests that SOS control
evolved in the ancestor of chromosomal integrons, and that it has been positively maintained in
mobile integrons. We also find a correlation between the loss of LexA control and integrase
inactivation by mutation, indicating that unregulated recombination may be deleterious in these
genetic elements. The only exceptions to this rule appear to be multi-resistance mobile
integrons, in which SOS deregulation leads to the creation of a secondary cassette promoter. We
discuss the relevance of these findings for the adaptive dynamics of integrons and their
implications for the acquisition and dissemination of antibiotic resistance genes.

Results

Identification of LexA-binding sites in intI promoters


We recently identified Escherichia coli-like LexA binding sites overlapping the promoter of
the integrase genes intI from all clinically relevant mobile integrons and intIA from the V.
cholerae SI (Figure 2A). We have shown that intI expression was indeed controlled by the SOS
response, eventually resulting in heightened rates of integrase-mediated recombination upon SOS
induction (Guerin et al., 2009).

To gain insight into the general relevance of this observation, we undertook an exhaus-

tive in silico study. Using BLASTP, we identified 296 homologues of intIA in the GenBank

database and systematically searched the nucleotide sequences corresponding to their coding

region plus 501 bp upstream. We conducted independent searches for each of the described

LexA-binding motifs (Erill et al., 2007). Putative LexA-binding sites were detected in 66%

(196) of the 296 sequences (Table S1) All the identified LexA-binding sites corresponded to

the motif found in E. coli and most β/γ-Proteobacteria,. This suggests that the putative LexA

regulation of intI genes probably originated after the split of the α- and β/γ -Proteobacteria

subclasses, since the LexA-binding motif of α-Proteobacteria is markedly divergent from the

E. coli one (Tapias and Barbe, 1998). When we examined the core 16 bp of the identified E.

191
Results – Recombination in integrons is controled by the SOS response to stress
Article III
coli-like LexA-binding sites, 54 distinct sequences were identified (Table S2). Nonetheless,

the LexA-binding sites exhibit a high level of conservation, as reflected in their joint informa-

tion content logo (Figure 2C), which contrast with the immediately surrounding sequences.

This strongly support the functionality of these motifs. Importantly, E. coli-like LexA sites

were detected in all but one of the mobile integron classes and in almost all Vibrionaceae su-

per-integrons (Figure 2B), evidencing that putative LexA regulation of intI genes is a wide-

spread phenomenon pervading all integron divisions.
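
The conservation summarized by a sequence logo can be quantified column by column. The following is a minimal Python sketch, assuming the usual definition of information content for DNA (2 bits minus the Shannon entropy of the column, with no small-sample correction). The function name is hypothetical, and the four aligned 16-bp sites are invented for illustration; they are not taken from Table S2.

```python
import math
from collections import Counter


def column_information(sites):
    """Per-column information content (bits) for equal-length DNA sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    info = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # A column can carry at most log2(4) = 2 bits.
        info.append(2.0 - entropy)
    return info


# Hypothetical aligned sites (illustration only, not data from this study).
toy_sites = [
    "CTGTATATATATACAG",
    "CTGTACATAAATACAG",
    "CTGTTTTTTTGTACAG",
    "CTGGATAAACATACAG",
]
profile = column_information(toy_sites)
for position, bits in enumerate(profile, start=1):
    print(position, round(bits, 2))
```

In such a profile, fully conserved columns (here the CTG and CAG half-sites) reach 2 bits, whereas variable spacer columns score lower, which is the contrast a logo displays graphically.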

Predicted LexA-binding sites correspond to functional transcriptional control elements.


We have previously shown that LexA regulates the expression of intI in V. cholerae,

and our in silico search identified LexA-binding sites in the promoter region of intI for all sequenced

Vibrio species but V. fischeri (Table S1). To further assess the overall functionality

of the in silico predicted LexA-binding sites, we evaluated integrase LexA regulation in V.

parahaemolyticus ATCC 17802, which harbours a LexA-binding site upstream of its intIA

gene in a genetic context that is substantially different from the one of V. cholerae (Figure

2A). Using RT-PCR, we determined the intIA expression level in both the wild-type strain

and its lexA(Def) derivative. We found a lexA(Def)-to-wild-type expression ratio of 6.18, revealing strong LexA

regulation of intIA expression (Figure 3A).

In several class 1 integrons, heightened expression of the cassette genes has been

shown to rely on a secondary cassette promoter called P2, located just upstream of the intI1

gene (Figure S1). P2 is enabled by a CCC insertion that increases the distance between a -35

box sequence and a sequence resembling the -10 box consensus from 13 to 16 bp, thereby

generating a functional σ70 promoter (Kim et al., 2007; Collis and Hall, 1995). In all its reported

instances, this CCC insertion takes place in what appears to be a disrupted LexA-

binding site. Therefore, the CCC insertion that enables P2 should simultaneously abolish inte-

grase regulation by LexA. Here we tested this hypothesis using the E. coli multi-resistant

plasmid pMUR050 (Gonzalez-Zorn et al., 2005). This plasmid provides ideal material to

address this issue because it harbours two integrons with inactivated copies of the intI1 gene

(Figure S1). However, only one of these intI genes contains a functional LexA-binding site in

its promoter, while the other presents a CCC insertion disrupting the LexA-binding site (Fig-

ure S1). Using EMSA, we found that the CCC insertion effectively prevents LexA-binding

(Figure 3B). Furthermore, RT-PCR in WT and lexA(Def) backgrounds confirmed that LexA

regulation was only observed in the integron carrying an intact LexA-binding site, with a

strong deregulation (6.55 ratio) in the lexA(Def) background (Figure 3A). Thus, the CCC insert

not only enables the secondary cassette promoter P2 but concomitantly disrupts the

LexA-binding site of the integrase promoter. Evidence of increased cassette expression due to

the CCC insert was obtained by comparing RT-PCR expression profiles for the first cassette

gene of both pMUR integrons, and this increase was found to be independent of the lexA(Def)

background (data not shown). In silico searches for disrupted LexA-binding sites revealed 31

such instances in integrons from a wide variety of species (Table 1). Furthermore, all the

identified CCC insertions corresponded to multi-resistance mobile integrons. Together, these

results suggest that LexA regulation may eventually be lost under strong selection for

higher basal levels of the antibiotic resistance transcript.
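
The effect of the insertion on the LexA box can be checked directly on sequence. The sketch below is only an illustration, not the search procedure used in this study: it assumes that a site is recognized by the canonical CTG-N10-CAG pattern alone, whereas the searches reported here used full motif profiles. The two sequences are the promoter regions printed with Figure S1, where the insertion appears as GGG, presumably because the opposite strand is shown. Function names are hypothetical. Because the reverse complement of CTG-N10-CAG has the same form, scanning one strand suffices for this simplified pattern.

```python
import re

# Canonical E. coli-like LexA box as written in this article: CTG-N10-CAG.
# The lookahead allows overlapping candidates to be reported as well.
LEXA_BOX = re.compile(r"(?=(CTG[ACGT]{10}CAG))")


def find_lexa_boxes(sequence):
    """Return (0-based start, site) for every CTG-N10-CAG match."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in LEXA_BOX.finditer(sequence)]


def half_site_spacing(sequence):
    """Spacer length between the first CTG and the next CAG, or None."""
    m = re.search(r"CTG([ACGT]*?)CAG", sequence.upper())
    return len(m.group(1)) if m else None


# Promoter regions as printed with Figure S1 (pMUR050).
PINT_INTACT = "GACTGTTTTTTTGTACAGTCTATGCCTCGGGCATCCAAGCAGCAAGC"
PINT_DISRUPTED = "GCTTGTTATGACTGTTTTTTTGGGGTACAGTCTATGCCTCGGGCATCCAAGCAGCAAGC"

for name, seq in (("intact", PINT_INTACT), ("disrupted", PINT_DISRUPTED)):
    print(name, find_lexa_boxes(seq), half_site_spacing(seq))
```

Under this simplified pattern, the three extra nucleotides lengthen the spacer between the two half-sites, so the second sequence no longer matches, in line with the loss of LexA binding observed by EMSA.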

Analysis of LexA-binding sites distribution


The presence of confirmed LexA regulation in V. cholerae and V. parahaemolyticus

SIs suggested that SOS induction of intI genes probably originated very early in the evolu-

tionary history of integrons. To further explore this hypothesis, we mapped the in silico iden-

tified LexA-binding sites onto a phylogenetic tree of IntI protein sequences. The tree shown

in Figure 4 is in overall agreement with previously published IntI phylogenies (Diaz-Mejia et

al., 2008; Mazel, 2006a; Nemergut et al., 2008; Boucher et al., 2007). It distinguishes two

major ecological groups. Integrons borne by marine species form a monophyletic clade, with

the Vibrionaceae super-integrons sitting at the root of the tree. Integrons from soil and fresh-

water bacteria, on the other hand, seem to constitute a more recent branch. As has been noted

previously, the tree also suggests that multiresistance mobile integrons probably arose several

times in both ecological groups (classes 2, 4 and 5 [green panel] and classes 1 and 3 [orange

panel] in Figure 4).

The distribution of identified LexA boxes in Figure 4 shows that LexA regulation of

intI genes is prevalent among marine chromosomal integrons and their cognate mobile rela-

tives. Conversely, no functional LexA-binding sites can be identified in the chromosomal in-

tegrons from soil and freshwater species. Nonetheless, the mobile integrons branching off

from soil and freshwater bacteria do contain functional LexA-binding sites, hinting that LexA

regulation could have been lost in most non-marine chromosomal integrons but has been pre-

served in their related mobile counterparts.

DISCUSSION
Coupling of integrons with the SOS response
We have recently demonstrated that the SOS response regulates the expression of two

integron integrase genes, leading to heightened recombination rates upon SOS induction, both

in a class 1 mobile integron and in the V. cholerae SI (Guerin et al., 2009). The extensive in

silico search reported here shows that about two thirds of the available integron integrase se-

quences are putatively regulated by LexA, and this regulation has been confirmed here for ad-

ditional integrase genes. In hindsight, the coupling of genetic elements capable of cassette

integration with a global response to stress emerges as an elegant and powerful pairing. As

illustrated in Figure 1, integrons can be seen as stockpiling agents of genetic diversity, which,

in addition, can tap into a huge and variable pool of cassettes through horizontal gene transfer

from the surrounding bacterial communities (Boucher et al., 2007; Michael et al., 2004;

Koenig et al., 2008). Nonetheless, the efficient expression of these acquired traits is highly

dependent on integrase-mediated recombination. Newly integrated cassettes sitting in the

proximal region of the integron are highly expressed by the Pc promoter, but they can be

moved to distal parts of the integron and thus progressively withdrawn from expression by

consecutive recombination events (Figure 1), which may also restore formerly acquired

cassettes to full expression.

The SOS response thus appears as an obvious choice for regulation of integrase activity,

as it is already a key component of adaptive mutagenesis in bacteria, triggering both translesion

synthesis and activation of transposable elements (Bjedov et al., 2003; Ubeda et al.,

2007). Furthermore, SOS induction is carefully timed to those periods of stress in which adap-

tive mutagenesis can be particularly advantageous. In the early chromosomal integrons, where

SOS regulation apparently arose, LexA repression of the intI gene may have contributed to

integron stability by minimizing the basal expression levels of intI and thus decreasing the

rates of integrase-mediated recombination. Then again, SOS regulation would have ensured

that both the occasional cassette reordering and the acquisition of exogenous cassettes took

place at a time of need for innovation, such as in reaction to antibiotic exposure. Therefore,

regulation of integrase activity by the SOS response appears as a natural way to optimize integron-mediated

adaptation without incurring excessive integron destabilization or the

possible toxic effects of sustained integrase expression.

Loss and persistence of integrase LexA-regulation


The phylogenetic distribution of LexA-binding sites reveals an apparent loss of LexA

regulation in several instances. LexA regulation of intI genes is clearly prevalent among ma-

rine species. Loss of LexA regulation is only observed in the SXT integrating-conjugative

element, for which SOS-dependent transfer has been reported (Kelley, 2006), and in Vibrio

fischeri. Curiously enough, in V. fischeri LexA shares its binding motif with the LuxR quo-

rum sensing regulator (Shadel et al., 1990), suggesting that LexA regulation of intI may have

been lost in this species to prevent interference with the lux regulon.

Conversely, loss of LexA regulation seems to be the norm among soil and freshwater

species harbouring chromosomal integrons. In some cases, this loss of regulation has an obvi-

ous explanation. Some families, like the Nitrosomonadaceae and the Chromatiaceae, simply

do not possess any LexA homologues, thus explaining the absence of this motif upstream of

their intI genes (Erill et al., 2007). A similar, yet less powerful argument can be made for the

Xanthomonadaceae, in which neither of the two identified LexA proteins recognizes the β/γ-

Proteobacteria LexA-binding site (Yang et al., 2002). However, the main mechanism associated

with the loss of LexA regulation appears to be the inactivation of the integrase gene. The

majority of Xanthomonadaceae chromosomal integrases, for instance, are inactivated by di-

verse types of mutations and deletions (Gillings et al., 2005). There is also evidence that

frame-shift mutations may have inactivated most of the remaining intI genes lacking apparent

LexA regulation (Nemergut et al., 2008). Thus, it seems that many species may have opted

for inactivating their intI gene upon loss of LexA regulation, or that accidental inactivation of

intI has made LexA regulation superfluous. Both lines of reasoning strongly suggest that un-

regulated intI expression must be somehow detrimental to the cell, thereby introducing an ad-

ditional selective pressure towards the initial emergence of LexA regulation of integron

integrase genes. Further strengthening this conclusion, it is a well-known informal observation

among workers in the integron field that experimental overexpression of the integrase is deleterious

to the cell.

In contrast to their soil and freshwater chromosomal relatives, most class 1 and class 3

integrases are both LexA regulated and functional. This indicates that the capability of cas-

sette uptake and shuffling is a useful trait in mobile integrons, since it would allow their hosts

to express novel phenotypes in selective environments. This parallels the persistent regulation

by LexA of functional integrase genes in marine species, in which reorganization of the su-

perintegrons has been evidenced by comparative genomics (Labbate et al., 2007)(+ref feng

2008). In any case, the preservation of LexA regulation in integrons harbouring functional in-

tegrases suggests again that, if not mandatory, LexA regulation of intI genes must be overtly

beneficial for integron hosts when the product of the intI gene is a functional protein. The

only exception to this general rule appears to be deregulation mediated by a CCC insertion

that disrupts the LexA-binding site. This same CCC insert, however, enables a secondary cassette

promoter (P2) that enhances cassette expression, and in silico search results indicate

that this insert is only found in multi-resistance plasmids. This suggests that the detrimental

effects of unregulated intI expression are traded off against the selective pressure towards increased

expression of multi-resistance phenotypes. Nonetheless, as the pMUR050 case illustrates, an enhanced

cassette promoter may be maintained alongside subsequent inactivation of the integrase,

in this case by an IS26 insertion.

Clinical implications of SOS-induced integrase activity

Both integrons and the SOS response have been previously singled out as elements of

clinical importance and have therefore been the focus of abundant research on the fight against

antibiotic resistance and on antibacterial drug development (Weldhagen, 2004; Nijssen et al.,

2005; Cirz et al., 2005). Beyond its fundamental relevance to bacterial adaptation, serious

clinical implications emerge from the discovery of a direct link between SOS induction and

integrase activity. Since most multi-resistant Gram-negative bacteria carry mobile integrons,

this establishes a generic system for genetic interchange under control of a general stress re-

sponse shared by a large group of human and animal pathogens. In this setting, it is important

to note that integron cassettes encoding resistance to several antibiotics known to induce the

SOS response, such as trimethoprim, quinolones and β-lactams, are common today (Rowe-

Magnus and Mazel, 2002; Fonseca et al., 2008). This suggests that the indirect triggering ef-

fect of these antibiotics on the capture of resistance cassettes has been very efficient.


A less obvious consequence of integrase SOS regulation is its repercussion on antibi-

otic resistance policies. Current policies in the fight against antibiotic resistance rely largely

on the detrimental effects most resistance mechanisms inflict on bacteria, which eventually

lead to loss of resistance genes in the absence of antibiotic exposure (Andersson, 2006). Since

most cassettes are promoter-less, the most ancient cassettes (located at the distal part of the

integron) are subject to severe polar effects, leading to rare or non-existent protein products

(see Figure 1) (Collis and Hall, 1995). In this context, the incorporation of SOS regulation in

integrons provides a mechanism by which antibiotic-resistance genes and other useful adaptations

can be silently set aside, while current adaptive traits are steadily kept under expression.

In times of stress, such as exposure to antibiotics, the relevant resistance cassette can be

called upon by integrase-mediated translocation, and thus selected for only when its expression

is required. Furthermore, cassette genes temporarily relegated to distal positions in integrons

may also sustain increased evolution rates, generating a substantial pool of variability

from which to draw when the appropriate selective pressures resurface. Therefore, SOS-mediated

regulation of integron integrases should be taken into account regarding the time

spans currently being considered for spontaneous loss of antibiotic resistance in restrictive use

policies and, ultimately, concerning the future development and assessment of antibiotic

guidelines.


MATERIALS AND METHODS


In silico searches and phylogenetic analyses were made on sequences deposited in
GenBank as described previously (Abella et al., 2007; Mazel, 2006b). The electro-mobility
shift assays were performed as described before (Abella et al., 2007). The different lacZ reporter
constructs were made by fusion at the initiation codon of the tested genes, and β-galactosidase
activities were measured in Miller units. The full list of strains and plasmids is
available as Table S7, and oligonucleotide primers are listed in Table S8. Full methods and
associated references are described in the supplementary text.
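
For readers unfamiliar with the unit, the sketch below shows how a Miller unit is conventionally computed from absorbance readings. It assumes the standard Miller formula; the exact protocol followed here is given in the supplementary text and may differ in detail. Function names and all readings are hypothetical and serve only as an illustration.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Beta-galactosidase activity in Miller units (standard formula).

    od420: absorbance of the reaction (o-nitrophenol plus cell debris)
    od550: light scattering by cell debris
    od600: culture density at sampling
    minutes: reaction time; volume_ml: culture volume assayed
    """
    if od600 <= 0 or minutes <= 0 or volume_ml <= 0:
        raise ValueError("od600, minutes and volume_ml must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def induction_ratio(induced, uninduced):
    """Fold change in reporter activity between two conditions."""
    if uninduced <= 0:
        raise ValueError("uninduced activity must be positive")
    return induced / uninduced


# Invented readings, for illustration only (not data from this study).
basal = miller_units(od420=0.30, od550=0.02, od600=0.50, minutes=20, volume_ml=0.1)
induced = miller_units(od420=0.90, od550=0.02, od600=0.50, minutes=10, volume_ml=0.1)
print(round(basal, 1), round(induced, 1), round(induction_ratio(induced, basal), 2))
```

Normalizing by culture density, reaction time and volume makes activities comparable between strains and conditions, so that induction can be expressed as a simple ratio.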

ACKNOWLEDGEMENTS
We thank Mike C. O’Neill for his careful reading and comments on the different ver-
sions of this manuscript. This work was supported by grants from the Ministère de la Recher-
che et de l’Enseignement supérieur, the Conseil Régional du Limousin, the Fondation pour la
Recherche médicale (FRM) and from the Institut National de la Santé et de la Recherche
Médicale (Inserm) for the Ploy lab; by the Institut Pasteur, the Centre National de la Recher-
che Scientifique (CNRS-URA 2171), the FRM and the EU (STREP CRAB, LSHM-CT-2005-
019023, and NoE EuroPathoGenomics, LSHB-CT-2005-512061), for the Mazel lab; and by
grants BFU2008-01078/BMC from the Ministerio de Ciencia e Innovación de España and
2005SGR-533 from the Generalitat de Catalunya, for the Jordi lab.


REFERENCES

Abella M, Campoy S, Erill I, Rojo F, Barbe J (2007) Cohabitation of two different lexA regu-
lons in Pseudomonas putida. J Bacteriol 189(24): 8855-8862.
Aertsen A, Michiels CW (2006) Upstream of the SOS response: figure out the trigger. Trends
In Microbiology 14(10): 421-423.
Andersson DI (2006) The biological cost of mutational antibiotic resistance: any practical
conclusions? Curr Opin Microbiol 9(5): 461-465.
Avison MB (2005) New approaches to combating antimicrobial drug resistance. Genome
Biol 6(13): 243.
Bjedov I, Tenaillon O, Gerard B, Souza V, Denamur E et al. (2003) Stress-induced
mutagenesis in bacteria. Science 300(5624): 1404-1409.
Boucher Y, Labbate M, Koenig JE, Stokes HW (2007) Integrons: mobilizable platforms that
promote genetic diversity in bacteria. Trends Microbiol 15(7): 301-309.
Bouvier M, Demarre G, Mazel D (2005) Integron cassette insertion: a recombination process
involving a folded single strand substrate. Embo J 24(24): 4356-4367.
Cirz RT, Chin JK, Andes DR, de Crecy-Lagard V, Craig WA et al. (2005) Inhibition of muta-
tion and combating the evolution of antibiotic resistance. PLoS Biol 3(6): e176.
Collis CM, Hall RM (1995) Expression of antibiotic resistance genes in the integrated cas-
settes of integrons. Antimicrob Agents Chemother 39(1): 155-162.
Collis CM, Kim MJ, Stokes HW, Hall RM (2002) Integron-encoded IntI integrases preferen-
tially recognize the adjacent cognate attI site in recombination with a 59-be site. Mol
Microbiol 46(5): 1415-1427.
Courcelle J, Khodursky A, Peter B, Brown PO, Hanawalt PC (2001) Comparative gene ex-
pression profiles following UV exposure in wild-type and SOS-deficient Escherichia
coli. Genetics 158(1): 64.
Crooks GE, Hon G, Chandonia JM, Brenner SE (2004) WebLogo: a sequence logo generator.
Genome Res 14(6): 1188-1190.
Diaz-Mejia JJ, Amabile-Cuevas CF, Rosas I, Souza V (2008) An analysis of the evolutionary
relationships of integron integrases, with emphasis on the prevalence of class 1 inte-
grons in Escherichia coli isolates from clinical and environmental origins. Microbiol-
ogy 154(Pt 1): 94-102.
Erill I, Campoy S, Barbe J (2007) Aeons of distress: an evolutionary perspective on the bacte-
rial SOS response. FEMS Microbiol Rev 31(6): 637-656.
Fernandez De Henestrosa AR, Ogi T, Aoyagi S, Chafin D, Hayes JJ et al. (2000) Identifica-
tion of additional genes belonging to the LexA regulon in Escherichia coli. Mol Mi-
crobiol 35(6): 1560-1572.
Fluit AC, Schmitz FJ (2004) Resistance integrons and super-integrons. Clin Microbiol Infect
10(4): 272-288.
Fonseca EL, Dos Santos Freitas F, Vieira VV, Vicente AC (2008) New qnr Gene Cassettes
Associated with Superintegron Repeats in Vibrio cholerae O1. Emerg Infect Dis
14(7): 1129-1131.
Gillings MR, Holley MP, Stokes HW, Holmes AJ (2005) Integrons in Xanthomonas: a source
of species genome diversity. Proc Natl Acad Sci U S A 102(12): 4419-4424.
Gonzalez-Zorn B, Catalan A, Escudero JA, Dominguez L, Teshager T et al. (2005) Genetic
basis for dissemination of armA. J Antimicrob Chemother 56(3): 583-585.
Guerin E, Cambray G, Sanchez-Alberola N, Campoy S, Erill I et al. (2009) The SOS response

controls integron recombination. Science 324(5930): 1034.
Kelley WL (2006) Lex marks the spot: the virulent side of SOS and a closer look at the LexA
regulon. Mol Microbiol 62(5): 1228-1238.
Kim T-E, Kwon H-J, Cho S-H, Kim S, Lee B-K et al. (2007) Molecular differentiation of
common promoters in Salmonella class 1 integrons. Journal of Microbiological Meth-
ods 68(3): 453-457.
Kimmitt PT, Harwood CR, Barer MR (1999) Induction of type 2 Shiga toxin synthesis in Es-
cherichia coli O157 by 4-quinolones. Lancet 353(9164): 1588-1589.
Koenig JE, Boucher Y, Charlebois RL, Nesbo C, Zhaxybayeva O et al. (2008) Integron-
associated gene cassettes in Halifax Harbour: assessment of a mobile gene pool in
marine sediments. Environ Microbiol 10(4): 1024-1038.
Labbate M, Boucher Y, Joss MJ, Michael CA, Gillings MR et al. (2007) Use of chromosomal
integron arrays as a phylogenetic typing system for Vibrio cholerae pandemic strains.
Microbiology 153(Pt 5): 1488-1498.
Levesque C, Brassard S, Lapointe J, Roy PH (1994) Diversity and relative strength of tandem
promoters for the antibiotic-resistance genes of several integrons. Gene 142(1): 49-
54.
Little JW (1991) Mechanism of specific LexA cleavage: autodigestion and the role of RecA
coprotease. Biochimie 73(4): 411-421.
MacDonald D, Demarre G, Bouvier M, Mazel D, Gopaul DN (2006) Structural basis for
broad DNA-specificity in integron recombination. Nature 440(7088): 1157-1162.
Mazel D (2006a) Integrons: agents of bacterial evolution. Nat Rev Microbiol 4(8): 608-620.
Mazel D (2006b) Integrons: agents of bacterial evolution. Nature Reviews Microbiology
4(8): 620.
Mazel D, Dychinco B, Webb VA, Davies J (1998) A distinctive class of integron in the Vibrio
cholerae genome. Science 280(5363): 605-608.
Melano R, Petroni A, Garutti A, Saka HA, Mange L et al. (2002) New Carbenicillin-
Hydrolyzing beta-Lactamase (CARB-7) from Vibrio cholerae Non-O1, Non-O139
Strains Encoded by the VCR Region of the V. cholerae Genome. Antimicrob Agents
Chemother 46(7): 2162-2168.
Michael CA, Gillings MR, Holmes AJ, Hughes L, Andrew NR et al. (2004) Mobile gene cas-
settes: a fundamental resource for bacterial evolution. Am Nat 164(1): 1-12. Epub
2004 May 2003.
Nemergut DR, Robeson MS, Kysela RF, Martin AP, Schmidt SK et al. (2008) Insights and
inferences about integron evolution from genomic data. BMC Genomics 9: 261.
Nijssen S, Florijn A, Top J, Willems R, Fluit A et al. (2005) Unnoticed spread of integron-
carrying Enterobacteriaceae in intensive care units. Clin Infect Dis 41(1): 1-9.
Partridge SR, Tsafnat G, Coiera E, Iredell JR (2009) Gene cassettes and cassette arrays in
mobile resistance integrons. FEMS Microbiol Rev 13: 13.
Petroni A, Melano RG, Saka HA, Garutti A, Mange L et al. (2004) CARB-9, a carbenicilli-
nase encoded in the VCR region of Vibrio cholerae non-O1, non-O139 belongs to a
family of cassette-encoded beta-lactamases. Antimicrob Agents Chemother 48(10):
4042-4046.
Rowe-Magnus DA, Guerout AM, Biskri L, Bouige P, Mazel D (2003) Comparative analysis
of superintegrons: engineering extensive genetic diversity in the vibrionaceae. Ge-
nome Res 13(3): 428-442.
Rowe-Magnus DA, Guerout AM, Mazel D (1999) Super-integrons. Res Microbiol 150(9-10):
641-651.
Rowe-Magnus DA, Guerout AM, Mazel D (2002) Bacterial resistance evolution by recruit-
ment of super-integron gene cassettes. Mol Microbiol 43(6): 1657-1669.

Rowe-Magnus DA, Guerout AM, Ploncard P, Dychinco B, Davies J et al. (2001) The evolu-
tionary history of chromosomal super-integrons provides an ancestry for multiresis-
tant integrons. Proc Natl Acad Sci U S A 98(2): 652-657.
Rowe-Magnus DA, Mazel D (2002) The role of integrons in antibiotic resistance gene cap-
ture. Int J Med Microbiol 292(2): 115-125.
Sassanfar M, Roberts JW (1990) Nature of the SOS-inducing signal in Escherichia coli. The
involvement of DNA replication. J Mol Biol 212(1): 79-96.
Shadel GS, Devine JH, Baldwin TO (1990) Control of the lux regulon of Vibrio fischeri. J
Biolumin Chemilumin 5(2): 99-106.
Stokes HW, Hall RM (1989) A novel family of potentially mobile DNA elements encoding
site-specific gene-integration functions: integrons. Mol Microbiol 3(12): 1669-1683.
Tapias A, Barbe J (1998) Mutational analysis of the Rhizobium etli recA operator. J Bacteriol
180(23): 6325-6331.
Ubeda C, Maiques E, Tormo MA, Campoy S, Lasa I et al. (2007) SaPI operon I is required
for SaPI packaging and is controlled by LexA. Mol Microbiol 65(1): 41-50.
Vaisvila R, Morgan RD, Posfai J, Raleigh EA (2001) Discovery and distribution of super-
integrons among Pseudomonads. Molecular Microbiology 42: 587-601.
Waldor MK, Friedman DI (2005) Phage regulatory circuits and virulence gene expression.
Curr Opin Microbiol 8(4): 459-465.
Walker GC (1984) Mutagenesis and inducible responses to deoxyribonucleic acid damage in
Escherichia coli. Microbiol Rev 48(1): 60-93.
Weldhagen GF (2004) Integrons and beta-lactamases--a novel perspective on resistance. Int J
Antimicrob Agents 23(6): 556-562.
Yang MK, Yang YC, Hsu CH (2002) Characterization of Xanthomonas axonopodis pv. citri
LexA: recognition of the LexA binding site. Mol Genet Genomics 268(4): 477-487.


Figures

Figure 1 – Schematic organization of integrons

The functional platform of integrons consists of an intI gene encoding an integrase, a
cassette promoter Pc and a primary recombination site attI. The system maintains an array
that can consist of more than 200 cassettes in chromosomal superintegrons. Only the first few
cassettes are expressed from Pc, a feature represented by the fading fill color. The rest of
the array can be seen as a reservoir of standing genetic variation. A cassette generally consists
of a promoterless ORF flanked by two recombination sites termed attC sites. Cassettes
can be excised from any position in the array through attC x attC recombination mediated by
the integrase. The resulting circular intermediate can then be integrated by the integrase, preferentially
at attI, bringing the cassette under control of Pc. Note that exogenous circular intermediates
can be integrated owing to the low specificity of the integrase activity, rendering the
system prone to horizontal transfer. In the present study, the integrase promoter Pint is shown
to be under the control of the SOS system, thus restricting recombination to periods of
SOS-inducing stress.


Figure 2 – In silico analysis of integrase promoters

(A) Alignment of representative promoter regions of Vibrionaceae intIA homologues. Putative
LexA-binding sequences are boxed, while putative σ70 promoter elements (-35 and -10)
are underlined and the translation start site of intIA is boxed and highlighted in bold type. (B)
Representative examples of LexA-binding sites identified upstream of different integrase
genes. MI stands for mobile integron, while SI stands for superintegron and the subsequent
number (1-5) denotes integrase class. The accession numbers provided correspond to IntI proteins
from: E. coli pSa (AAA92752), Providencia stuartii ABR23a (ABG21674), Serratia marcescens
AK9373 (BAA08929), V. cholerae 569B (AAC38424) and Vibrio salmonicida VS224
pRVS1 (CAC35342). (C) Sequence logos (Crooks et al., 2004) of the profile used to search
for β/γ-Proteobacteria LexA-binding sites (top) and the profile emerging from the 54 distinct
binding sites located (bottom). Abbreviations for panel A are as follows: Lan, Listonella anguillarum;
Lpe_CIP, L. pelagia CIP 102762T; Val_12G01, Vibrio alginolyticus 12G01;
Vch_N16961, V. cholerae O1 biovar Eltor str. N16961; Vha_ATCC, V. harveyi ATCC BAA-
1116; Vha_HY01, V. harveyi HY01; Vme, V. metschnikovii; Vmi, V. mimicus; Vna_CIP, V.
natriegens strain CIP 10319; Vpa, V. parahaemolyticus; Vpa_RIMD, V. parahaemolyticus
RIMD 2210633; Vsh_AK1, V. shilonii AK1; Vsp_DAT722, Vibrio sp. DAT722; Vsp_Ex25,
Vibrio sp. Ex25; Vvu_CIP754, V. vulnificus CIP 75.4; Vvu_YJ016, V. vulnificus YJ016.
Corresponding accession numbers can be found in Table S1.


Figure 3 – Electro-mobility shift assays on different intIA promoter mutants.

(A) Sequence of the V. cholerae intIA promoter region and of the different LexA box mutants
tested. The putative LexA-binding sequence is boxed in red, the putative σ70 promoter elements
(-35 and -10) are framed in green and the intIA 5’ region is indicated by a grey open
frame. (B) EMSA of the different LexA box mutants in the presence of purified V. cholerae LexA
protein. F, free DNA; R, retarded complex.


Figure 5 – Phylogenetic tree of intI genes.


The tree illustrates the distribution of identified and experimentally verified LexA-binding
sites. This is the best-distance neighbour-joining tree obtained using MEGA3 and was rooted
using E. coli and Thiobacillus denitrificans XerCD protein sequences as outgroup (Mazel,
2006). Bootstrap values are based on 1,000 pseudo-replicates and the scale bar indicates
the number of substitutions per site. Broken LexA-binding sites denote disrupted sites located
through in silico searches. Abbreviations are as follows: Azo, Azoarcus sp. EbN1; Dar,
Dechloromonas aromatica; Eco, E. coli; Gme, Geobacter metallireducens; Lan, Listonella
anguillarum; Lpe, Listonella pelagia; Mfl, Methylobacillus flagellatus; Neu, Nitrosomonas
europaea; Nmo, Nitrococcus mobilis; Pal, Pseudomonas alcaligenes; Pme, Pseudomonas
mendocina; Ppr, Photobacterium profundum; PstuBA, Pseudomonas stutzeri BAM; PstuQ,
Pseudomonas stutzeri Q; Rei, Reinekea sp.; Rba, Rhodopirellula baltica; Rge, Rubrivivax ge-
latinosus; Sde, Saccharophagus degradans; Sam, Shewanella amazonensis; Ssp, Shewanella
sp. MR-7; Son, Shewanella oneidensis; Spu, Shewanella putrefaciens; Tden, Treponema den-
ticola; Tde, Thiobacillus denitrificans; Vch, Vibrio cholerae; Vfi, Vibrio fischeri; Vme, Vi-
brio metschnikovii; Vmi, Vibrio mimicus; Vpa, Vibrio parahaemolyticus; Vsp, Vibrio
splendidus; Vvu, Vibrio vulnificus; Xca, Xanthomonas campestris; Xor, Xanthomonas oryzae;
Xsp, Xanthomonas sp.


Figure 4 – Impact of SOS induction on the integron system

The increase in expression levels of V. cholerae’s intIA (A) and of pAT674’s intI1 in E. coli
(B) genes upon induction of the SOS response was monitored using a β-galactosidase re-
porter. Several antibiotics were used to induce the SOS response in a WT background, as in-
dicated by the letter below the histogram’s bars (M, mitomycin; C, ciprofloxacin; T,
trimethoprim; A, ampicillin). To validate the involvement of the SOS system, induction of intI
expression was further assessed in several SOS defective mutant backgrounds. Mutants are
specified in the insets. The LexA box mutants correspond to substitution of the canonical site
CTG-N10-CAG by TAA-N10-CAG (#1) and CTC-N10-GAG (#2), in the promoter of intIA;
or by TAA-N10-CAG (#1) and CTG-N10-ACT (#2), in the promoter of intI1. The functional
impact of SOS mediated induction of intI expression on cassette recombination was measured
using a dedicated reporter. In V. cholerae recombination rate was monitored upon mitomycin
C treatment (C), while in E. coli, induction was genetically mimicked by comparison of recombination
rates between wild type and LexA box mutant #2 (see Materials and Methods).


Supplementary materials

Figure S1 – Organization of the plasmid pMUR050

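[Figure S1 panels not reproduced. Labels recovered from the figure: promoters Pint1, Pint2, Pc, Pc and Pc2; genes intI1, ant3'9, linF, tnp, intI1, ant3'9 and qacEdelta1; promoter-region sequences (5’ to 3’) shown in panel B:]

GACTGTTTTTTTGTACAGTCTATGCCTCGGGCATCCAAGCAGCAAGC

GCTTGTTATGACTGTTTTTTTGGGGTACAGTCTATGCCTCGGGCATCCAAGCAGCAAGC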

(A) Schematic diagram of the pMUR050 plasmid (AY522431) showing both copies of the integron
integrase (red arrows). (B) Detail of both integron integrase promoter regions. LexA-binding
sites are denoted by red rectangles. Black circles indicate the integron integrase Pint
promoter elements (-10 and -35). Green rectangles define the Pc2 promoter elements (-10 and
-35). Pc2 is a secondary cassette gene promoter activated by the disruption of the
LexA-binding site.


Figure S2 – CCC insertions in the intI’s LexA-box prevent LexA-binding.

Electrophoretic mobility shift assay using purified E. coli LexA protein and the two
pMUR050 integrase promoters PintI1D1 (Pint1) and PintI1D2 (Pint2), the latter carrying a CCC
insertion in the LexA box (see Material and Methods). F: free DNA; R: retarded DNA.

210
Results – Intrinsic evolutionary potential of genes
Article III

Figure S3 – Schematic representation of the positively selectable IntI-mediated excision
reporter assay

(A) A cassette bearing cat (CmR) associated with a VCR is inserted between an attCaadA7 site
and the aac(6’)-Ib gene, whose 3’ part is identical to its wild-type counterpart except for
the start codon, thereby preventing expression of the gene aac(6’)-Ib*. This construction
allows resistance to chloramphenicol only. (B) IntI-mediated excision of the cat-VCR cassette
allows expression of a functional AAC(6’)-Ib fused in its N-terminal part with the translation
of a small attC site (in red) and the peptide from pSU38. Homologous recombination between the
two attC sites is impossible because they differ in sequence, ensuring that proper expression
of the functional gene relies only on site-specific recombination. Expression of the hybrid
aac(6’)-Ib* provides selectable resistance to tobramycin, while deletion of the cat cassette
leads to the loss of Cm resistance.


III. INTRINSIC EVOLUTIONARY POTENTIAL OF GENES

BACKGROUND
The standard genetic code is redundant and its structure is not random. Irrespective of
its function, the sequence of a gene heavily constrains the genotypic and thus phenotypic
space accessible by point mutations. In particular, synonymous codons can access different
sets of amino acids. A given protein would then reach different areas of the phenotype
landscape depending on its actual nucleotide sequence. Over evolutionary time, the
essentially neutral diversification of a coding sequence – corresponding to the exploration of
its synonymous space – would then grant access to new phenotypes. We developed a strategy to
take advantage of this observation in the framework of directed evolution. Because directed
and natural evolution are based on the same principles, this approach also sheds light on the
constraints experienced by natural sequences.

METHODS
We implemented an algorithm (ELP) to output synonymous sequences with evolutionary
perspectives as different as possible from each other and from an input sequence. ELP
was used to design a synonymous version of the aac(6’)-Ib gene, which is naturally borne by
an integron cassette and encodes resistance to aminoglycoside antibiotics. The synthetic gene
was then constructed. Both versions of the gene were mutagenized by error-prone PCR. The
resulting libraries were concurrently screened for increased resistance on a variety of
antibiotics. Monte-Carlo simulations were performed to assess the impact of selection on the
exploration of adaptive landscapes.

RESULTS AND DISCUSSION


The conceptual translation of alleles randomly picked from the mutant libraries before
selection demonstrated that the two gene versions effectively explore different areas of the
phenotypic space. Accordingly, distinct advantageous variants were isolated from these
versions. Considering the fast development of DNA synthesis services, the incorporation of
ELP-designed synonymous sequences can thus greatly enhance the efficiency of directed evolution
experiments at little cost.
In nature, the diversification of sequences is possibly constrained by rugged adaptive
landscapes. Our simulations show that even slightly deleterious intermediates can
significantly constrain the evolutionary routes that are followed. In this context, selective
regimes characterized by the alternation of relaxed and intense selection periods – such as
the one experienced by gene cassettes – are likely to promote the exploration of neutral
spaces. This effect would favor protein diversification and evolvability at the population
level.

ARTICLE IV
This article has been published in PLoS Genetics (pp 214-231).

213
Results – Intrinsic evolutionary potential of genes
Article IV


Figure S1: Relative Evolutionary Potentials of the different synonymous codons.

Rows give codon XXX, columns give codon YYY, and entries give REP(XXX/YYY).

LEU   TTA  TTG  CTT  CTC  CTA  CTG
TTA    -    1    1    1    2    3
TTG    2    -    3    3    4    3
CTT    3    4    -    0    2    3
CTC    3    4    0    -    2    3
CTA    3    4    1    1    -    1
CTG    4    3    2    2    1    -

SER   TCT  TCC  TCA  TCG  AGT  AGC
TCT    -    0    3    3    4    4
TCC    0    -    3    3    4    4
TCA    1    1    -    0    3    3
TCG    2    2    1    -    4    4
AGT    4    5    5    5    -    0
AGC    4    5    5    5    0    -

ARG   CGT  CGC  CGA  CGG  AGA  AGG
CGT    -    0    3    3    4    4
CGC    0    -    3    3    4    4
CGA    1    1    -    0    3    3
CGG    2    2    1    -    4    3
AGA    3    3    4    4    -    1
AGG    4    4    5    4    2    -

PRO   CCT  CCC  CCA  CCG
CCT    -    0    1    1
CCC    0    -    1    1
CCA    1    1    -    0
CCG    1    1    0    -

THR   ACT  ACC  ACA  ACG
ACT    -    0    2    3
ACC    0    -    2    3
ACA    2    2    -    1
ACG    3    3    1    -

VAL   GTT  GTC  GTA  GTG
GTT    -    0    2    3
GTC    0    -    2    3
GTA    2    2    -    1
GTG    3    3    1    -

ALA   GCT  GCC  GCA  GCG
GCT    -    0    1    1
GCC    0    -    1    1
GCA    1    1    -    0
GCG    1    1    0    -

GLY   GGT  GGC  GGA  GGG
GGT    -    0    2    2
GGC    0    -    2    2
GGA    2    2    -    0
GGG    3    3    1    -

ILE   ATT  ATC  ATA
ATT    -    0    3
ATC    0    -    3
ATA    3    3    -

LYS   AAA  AAG
AAA    -    1
AAG    1    -

Ten amino acids display synonymous codons whose evolutionary potentials differ from each
other. These include all 6-fold degenerate amino acids (leucine, serine and arginine), the
five 4-fold degenerate amino acids (proline, threonine, valine, alanine and glycine), and
finally isoleucine and lysine. These tables show the REP for every pair of synonymous codons
corresponding to these amino acids. We define the Relative Evolutionary Potential of codon
XXX relative to its synonymous counterpart YYY (REP XXX/YYY) as the number of different
amino acids reachable from XXX but not from YYY through a single mutation. Note that the REP
is not a symmetrical index.
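
The definition of the REP given above can be turned into a few lines of code. The following is a minimal sketch, not the ELP implementation: the names `reachable` and `rep` are hypothetical, and it assumes that stop codons and the amino acid already encoded are not counted among the reachable amino acids.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reachable(codon):
    """Amino acids reachable from `codon` by one point mutation.

    Assumption of this sketch: stop codons and the amino acid
    already encoded by `codon` are ignored.
    """
    own = CODON_TABLE[codon]
    found = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if aa not in ("*", own):
                    found.add(aa)
    return found


def rep(xxx, yyy):
    """REP of codon xxx relative to its synonymous counterpart yyy."""
    if CODON_TABLE[xxx] != CODON_TABLE[yyy]:
        raise ValueError("codons are not synonymous")
    return len(reachable(xxx) - reachable(yyy))


# The index is not symmetrical: TTA reaches Ile but TTG does not,
# while TTG reaches Met and Trp but TTA does not.
print(rep("TTA", "TTG"), rep("TTG", "TTA"))  # prints "1 2"
```

Using set differences keeps the asymmetry of the index explicit: swapping the two arguments swaps the sets being subtracted.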


Figure S2: Alignment of aacWT and aacELP sequences

aacWT   ATGACCAACAGCAACGATTCCGTCACACTGCGCCTCATGACTGAGCATGACCTTGCG
aacELP  ATGACGAACTCGAATGACAGCGTGACCCTCAGATTGATGACGGAACACGATTTGGCC

aacWT   ATGCTCTATGAGTGGCTAAATCGATCTCATATCGTCGAGTGGTGGGGCGGAGAAGAA
aacELP  ATGTTGTACGAATGGTTGAACAGAAGTCACATTGTGGAATGGTGGGGGGGTGAGGAG

aacWT   GCACGCCCGACACTTGCTGACGTACAGGAACAGTACTTGCCAAGCGTTTTAGCGCAA
aacELP  GCTAGACCCACTTTGGCAGATGTCCAAGAGCAATATCTTCCCTCGGTGCTGGCCCAG

aacWT   GAGTCCGTCACTCCATACATTGCAATGCTGAATGGAGAGCCGATTGGGTATGCCCAG
aacELP  GAAAGTGTGACGCCCTATATCGCTATGCTTAACGGTGAACCCATCGGTTACGCACAA

aacWT   TCGTACGTTGCTCTTGGAAGCGGGGACGGATGGTGGGAAGAAGAAACCGATCCAGGA
aacELP  AGTTATGTGGCATTGGGTTCGGGTGATGGTTGGTGGGAGGAGGAGACGGACCCCGGT

aacWT   GTACGCGGAATAGACCAGTTACTGGCGAATGCATCACAACTGGGCAAAGGCTTGGGA
aacELP  GTCAGAGGTATTGATCAACTGCTTGCCAACGCTAGTCAGCTTGGGAAGGGGCTTGGT

aacWT   ACCAAGCTGGTTCGAGCTCTGGTTGAGTTGCTGTTCAATGATCCCGAGGTCACCAAG
aacELP  ACGAAATTAGTGAGAGCATTAGTGGAACTTCTTTTTAACGACCCAGAAGTGACGAAA

aacWT   ATCCAAACGGACCCGTCGCCGAGCAACTTGCGAGCGATCCGATGCTACGAGAAAGCG
aacELP  ATTCAGACTGATCCCAGTCCCTCGAATCTTAGAGCCATTAGATGTTATGAAAAGGCC

aacWT   GGGTTTGAGAGGCAAGGTACCGTAACCACCCCAGATGGTCCAGCCGTGTACATGGTT
aacELP  GGTTTCGAACGTCAGGGGACGGTCACGACGCCCGACGGGCCCGCAGTTTATATGGTG

aacWT   CAAACACGCCAGGCATTCGAGCGAACACGCAGTGATGCCTAA
aacELP  CAGACTAGACAAGCTTTTGAAAGAACTAGATCGGACGCATGA

While encoding identical proteins, aacWT and aacELP only share 61% identity. In this figure,
different bases are highlighted in red. Overall, 119 codons out of 184 are different between
the two sequences.
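
Figures such as the percentage of identity and the number of differing codons can be recomputed from any pair of aligned coding sequences. The sketch below is only illustrative: the function names are hypothetical and, for brevity, it is applied to the first 57 nucleotides (19 codons) of each sequence rather than to the full genes.

```python
def codons(seq):
    """Split an in-frame coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def compare(seq_a, seq_b):
    """Return (identical nucleotides, differing codons) for two aligned sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must have the same in-frame length")
    identical_nt = sum(a == b for a, b in zip(seq_a, seq_b))
    differing_codons = sum(a != b for a, b in zip(codons(seq_a), codons(seq_b)))
    return identical_nt, differing_codons


# First 19 codons of each version, as they appear in the alignment.
WT_START = "ATGACCAACAGCAACGATTCCGTCACACTGCGCCTCATGACTGAGCATGACCTTGCG"
ELP_START = "ATGACGAACTCGAATGACAGCGTGACCCTCAGATTGATGACGGAACACGATTTGGCC"

identical_nt, differing_codons = compare(WT_START, ELP_START)
print(f"{identical_nt}/{len(WT_START)} identical nucleotides, "
      f"{differing_codons}/{len(codons(WT_START))} differing codons")
```

Applying `compare` to the two complete sequences would give the gene-wide values.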

Figure S3: Number of mutations and protein space exploration

Average % of aa accessible, by minimum number of mutations

Min. number of mutations         1     2     3
From single codon               30    51    19
From all synonymous codons      40    53     7

No amino acid shows more than four codons with different REP. Thus, at any position, a set
of four ELP-designed sequences accesses the same evolutionary landscape as do all the
synonymous codons corresponding to the position considered. We compared the minimum number
of mutations necessary to reach the other 19 aa from either a single codon or all
synonymous codons. This figure summarizes the percentage of amino acids accessible in 1, 2
or 3 mutations, averaged over the 61 single codons or the 20 sets of synonymous codons. The
use of four ELP-designed sequences achieves a shift toward a lower number of mutations. It
drastically decreases the number of substitutions requiring three mutations per codon.
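
The comparison described in this legend can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the script used for the figure: the helper names (`min_mutations`, `profile`) are hypothetical, the minimum number of mutations is taken as the smallest number of differing positions between a source codon and any codon of the target amino acid, and stop codons are ignored.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SENSE = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def min_mutations(sources, target_aa):
    """Fewest point mutations turning any codon of `sources` into a codon for target_aa."""
    return min(hamming(s, t) for s in sources for t, aa in SENSE.items() if aa == target_aa)


def profile(source_sets):
    """Percentage of the 19 other amino acids reached in 1, 2 or 3 mutations,
    pooled over all the given sets of synonymous source codons."""
    counts = Counter()
    for sources in source_sets:
        own = SENSE[sources[0]]
        for aa in set(SENSE.values()) - {own}:
            counts[min_mutations(sources, aa)] += 1
    total = sum(counts.values())
    return {n: 100 * counts[n] / total for n in (1, 2, 3)}


single = [[codon] for codon in SENSE]  # the 61 sense codons taken one at a time
by_aa = [[c for c in SENSE if SENSE[c] == aa] for aa in sorted(set(SENSE.values()))]

print("single codon:         ", profile(single))
print("all synonymous codons:", profile(by_aa))
```

Pooling the synonymous codons can only lower the minimum distance to a given amino acid, which is why the second profile is shifted toward fewer mutations.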


Table S1: Oligonucleotides used in this study

Name          Sequence (5' → 3')³

AACelp_t1¹    pAATTCATATGACGAACTCGAATGACAGCGTGACCCTCAGATTGATGACGGAACACGATTTGGCCATGTTGTAC
AACelp_t2¹    pGAATGGTTGAACAGAAGTCACATTGTGGAATGGTGGGGGGGTGAGGAGGCTAGACCCACTTTGGCAGATG
AACelp_t3¹    pTCCAAGAGCAATATCTTCCCTCGGTGCTGGCCCAGGAAAGTGTGACGCCCTATATCGCTATGCTTAACGG
AACelp_t4¹    pTGAACCCATCGGTTACGCACAAAGTTATGTGGCATTGGGTTCGGGTGATGGTTGGTGGGAGGAGGAGACG
AACelp_t5¹    pGACCCCGGTGTCAGAGGTATTGATCAACTGCTTGCCAGGTTCGGGTGATGGTTGGTGGGAGGAGGAGACG
AACelp_t6¹    pGACCCCGGTGTCAGAGGTATTGATCAACTGCTTGCCACCCAGAAGTGACGAAAATTCAGACTGATCCCAG
AACelp_t7¹    pTCCCTCGAATCTTAGAGCCATTAGATGTTATGAAAAGGCCGGTTTCGAACGTCAGGGGACGGTCACGACG
AACelp_t8¹    pCCCGACGGGCCCGCAGTTTATATGGTGCAGACTAGACAAGCTTTTGAAAGAACTAGATCGGACGCATGAG
AACelp_b1¹    pCAATCTGAGGGTCACGCTGTCATTCGAGTTCGTCATATG
AACelp_b2¹    pCCCACCATTCCACAATGTGACTTCTGTTCAACCATTCGTACAACATGGCCAAATCGTGTTCCGTCATATG
AACelp_b3¹    pTCCTGGGCCAGCACCGAGGGAAGATATTGCTCTTGGACATCTGCCAAAGTGGGTCTAGCCTCCTCACCCC
AACelp_b4¹    pCAATGCCACATAACTTTGTGCGTAACCGATGGGTTCACCGTTAAGCATAGCGATATAGGGCGTCACACTT
AACelp_b5¹    pTGGCAAGCAGTTGATCAATACCTCTGACACCGGGGTCCGTCTCCTCCTCCCACCAACCATCACCCGAACC
AACelp_b6¹    pTGGCAAGCAGTTGATCAATACCTCTGACACCGGGGTCCGTCTCCTCCTCCCACCAACCATCACCCGAACC
AACelp_b7¹    pCTTTTCATAACATCTAATGGCTCTAAGATTCGAGGGACTGGGATCAGTCTGAATTTTCGTCACTTCTGGG
AACelp_b8¹    pGTCTAGTCTGCACCATATAAACTGCGGGCCCGTCGGGCGTCGTGACCGTCCCCTGACGTTCGAAACCGGC
AACelp_b9¹    pGATCCTCATGCGTCCGATCTAGTTCTTTCAAAAGCTT
AACmut_Rw²    TGTGAGCGGATAACAATTTCACACAGgaattcAT
AACmut_Fw²    GCATGCCTGCAGGTCGACTCTAGAggatcc

¹ AACelps were designed to construct the synthetic gene aacELP
² AACmuts were used to amplify cloned genes
³ The letter p indicates phosphorylation


Table S2: Properties of the mutant libraries

Gene¹     #    Size²     Mut. rate³ (mut./kb)
aacWT     1    >10^6     0.5
          2    >10^6     1.3
          3    >10^6     3.1
          4    >10^6     5.2
aacELP    1    >10^6     1.3
          2    >10^6     0.9
          3    >10^6     2.5
          4    >10^6     3.2

¹ Libraries were generated by error-prone PCR from each version of the gene aac(6’)-Ib.
² Each pool contained approximately the same number of clones, as estimated on plates before
selection.
³ The mean mutation rate is 2.5 mut./kb for aacWT and 2 mut./kb for aacELP


Text S1: Modeling of the relationship between protein space exploration and library size

Let us assume that the sequence is L nucleotides long and that any modification in a fraction f
of its positions is not lethal (i.e. leads to a properly folded protein [1]).
The probability that a sequence codes for a properly folded protein after m independent
mutations is:

$$P_f(m) = \frac{C_{Lf}^{m}}{C_{L}^{m}} = \frac{(Lf)!\,(L-m)!}{(Lf-m)!\,L!} \tag{1}$$

The denominator is the total number of mutants bearing m mutations, while the numerator is
the number of combinations in which these mutations do not adversely affect protein
function. Assuming the sequence length L is much larger than the number of introduced
mutations m (L >> m), this equation simplifies into:

$$P_f(m) \approx f^{m} \tag{2}$$

which is consistent with several studies [1, 2].
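
The quality of this approximation can be checked numerically. The short sketch below compares the exact ratio of equation (1) with the approximation of equation (2); the function names are hypothetical, and the values L=1000 and f=3/4 are simply those used later in this text.

```python
from math import comb


def p_folded_exact(L, f, m):
    """Equation (1): share of the m-mutation combinations that fall
    only on positions where a change is tolerated."""
    tolerant = round(L * f)  # number of tolerant positions, L*f
    return comb(tolerant, m) / comb(L, m)


def p_folded_approx(f, m):
    """Equation (2): approximation valid when L >> m."""
    return f ** m


L, f = 1000, 0.75
for m in (1, 2, 4, 8):
    exact, approx = p_folded_exact(L, f, m), p_folded_approx(f, m)
    print(f"m={m}: exact={exact:.5f}  approx={approx:.5f}")
```

For a gene of this length, the two expressions stay very close for the handful of mutations typically introduced by error-prone PCR.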

Let us now consider a given target optimal genotype that is k mutations away from the
reference one. Among sequences with m mutations, the probability that the k desired muta-
tions are present is:

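$$P(\text{solution} \mid m \text{ mutations}) = \frac{C_{L-k}^{m-k}}{C_{L}^{m}} = \frac{C_{m}^{k}}{C_{L}^{k}} \approx \frac{m!}{(m-k)!\,L^{k}}, \quad \text{if } m \geq k \text{ and 0 otherwise} \tag{3}$$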

The probability that a sequence with m mutations encodes a properly folded protein and con-
tains the k desired mutations directly stems from equation (2) and (3):

$$P(\text{solution and folded} \mid m \text{ mutations}) \approx f^{m}\,\frac{m!}{(m-k)!\,L^{k}}, \quad \text{if } m \geq k \text{ and 0 otherwise} \tag{4}$$


If we assume, as usual, that a library is composed of sequences with a Poisson distributed
number of mutations with mean X, then the probability to find the target sequence coding for
a properly folded protein is:

$$P(\text{solution and folded} \mid X \text{ mutations on average}) \approx \sum_{m \geq k} \frac{e^{-X} X^{m}}{m!}\, f^{m}\, \frac{m!}{(m-k)!\,L^{k}} \tag{5}$$

which simplifies into:

$$P(\text{solution and folded} \mid X \text{ mutations on average}) \approx e^{-X(1-f)} \left(\frac{fX}{L}\right)^{k} \tag{6}$$

The inverse of (6) is the mean library size required to generate one target clone.
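
For readers who want the intermediate step, the passage from equation (5) to equation (6) can be spelled out by cancelling m! and re-indexing the sum with j = m - k (j is an index introduced here for illustration only); the remaining series is the exponential series:

```latex
\begin{aligned}
\sum_{m \geq k} \frac{e^{-X} X^{m}}{m!}\, f^{m}\, \frac{m!}{(m-k)!\,L^{k}}
  &= \frac{e^{-X} (fX)^{k}}{L^{k}} \sum_{j \geq 0} \frac{(fX)^{j}}{j!} \\
  &= e^{-X}\, e^{fX} \left(\frac{fX}{L}\right)^{k}
   = e^{-X(1-f)} \left(\frac{fX}{L}\right)^{k}
\end{aligned}
```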
Differentiating equation (6) with respect to X and setting the derivative to zero gives the
optimal mean mutation rate for targets k mutations away:

$$X_{opt} = \frac{k}{1-f} \tag{7}$$

The graph below displays the inverses of equation (6) for target variants at k=1 (red), k=2
(orange) and k=3 (yellow) mutations away from the template. We assumed a standard bacte-
rial gene length (L=1000) and a conservative proportion of non-deleterious mutations at the
DNA level (f=3/4, corresponding to 1/3 of lethal aa substitutions (Dawid et al.)). Numbers on
the left side scale are obtained by calculating the inverse of equation (6) for X equal to Xopt
from equation (7).
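
The numbers behind such a graph can be recomputed directly from equations (6) and (7). The following is a minimal sketch with hypothetical function names, using the parameter values stated above (L=1000, f=3/4) as defaults.

```python
from math import exp


def p_target(X, k, L=1000, f=0.75):
    """Equation (6): probability that a clone carries the k desired mutations
    and still encodes a folded protein, for a mean of X mutations per gene."""
    return exp(-X * (1 - f)) * (f * X / L) ** k


def x_opt(k, f=0.75):
    """Equation (7): mean mutation rate maximizing equation (6)."""
    return k / (1 - f)


for k in (1, 2, 3):
    X = x_opt(k)
    size = 1 / p_target(X, k)  # mean library size needed for one target clone
    print(f"k={k}: X_opt={X:.0f} mutations/gene, library size ~ {size:.3g}")
```

The inverse of equation (6) evaluated at the optimum gives the smallest mean library size that yields one target clone under this model.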


The increase in required size between a library covering a mutational distance k+i and one
targeting k mutations is:

$$\frac{P(\text{solution}_{k} \text{ and folded} \mid X \text{ mutations on average})}{P(\text{solution}_{k+i} \text{ and folded} \mid X \text{ mutations on average})} = \left(\frac{L}{X f}\right)^{i} \tag{8}$$

The larger the mean number of mutations, the higher the chance to recover a target further
away. However, optimal mutation rates for libraries derived from error-prone PCR are predicted
to be rather low, even when subtle advantages of high mutation rates are taken into account [2].
The following graph displays equation (8) for i=1 (red) and i=2 (orange). A substantial
increase in library size is required to fully explore possibilities, even with a somewhat high
mutation rate of 4 mutations on average per gene (dotted line). As the occurrence of several
mutations in the same codon is very rare using error-prone PCR, these curves can be
interpreted as lower bounds to the increase in library size necessary to obtain 2 or 3
mutations in the same codon instead of 1.
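
As a worked example of equation (8), the sketch below computes the fold increase in library size at 4 mutations per gene and checks it against the ratio obtained directly from equation (6). Function names are hypothetical; L=1000 and f=3/4 are the values assumed earlier in this text.

```python
from math import exp, isclose


def p_target(X, k, L=1000, f=0.75):
    """Equation (6)."""
    return exp(-X * (1 - f)) * (f * X / L) ** k


def fold_increase(i, X, L=1000, f=0.75):
    """Equation (8): factor by which the library must grow to reach
    targets i mutations further away, at a mean of X mutations per gene."""
    return (L / (X * f)) ** i


X = 4.0  # the "somewhat high" mean mutation rate discussed above
for i in (1, 2):
    direct = p_target(X, 1) / p_target(X, 1 + i)
    print(f"i={i}: fold increase ~ {fold_increase(i, X):.3g} "
          f"(matches equation (6): {isclose(direct, fold_increase(i, X))})")
```

With these values each additional mutation costs a factor of roughly 333 in library size, since L/(Xf) = 1000/3.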


The overall picture could have been worse if we had assumed a cumulative effect of muta-
tions: due to negative epistasis neutral mutations may become deleterious when they accumu-
late [4].

1. Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH (2005) Thermody-
namic prediction of protein neutrality. PNAS :606–611.
2. Drummond DA, Iverson BL, Georgiou G, Arnold FH (2005) Why High-error-rate Random
Mutagenesis Libraries are Enriched in Functional and Improved Proteins. J. Mol. Biology
350: 806-816.
3. Guo HH, Choe J, Loeb LA (2004) Protein tolerance to random amino acid change. PNAS
101: 9205-9210.
4. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS (2006) Robustness-epistasis
link shapes the fitness landscape of a randomly drifting protein. Nature 444: 929.


Data S1: Alignment of the aac(6')-Ib homologs identified by BlastP

The protein sequence of AAC(6')-Ib was blasted against the NCBI nr protein database as of
2007/08/26. Corresponding nucleotide sequences were fetched, sorted and aligned using a
dedicated BioPerl script.

The file can be downloaded from the PLoS Genetics server
(http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1000256#s5)

DISCUSSION

233
Discussion – Integrons are powerful adaptive systems

Integrons are mostly known for their involvement in the emergence of multi-resistance
to antibiotics in gram-negative and – to a lesser extent – gram-positive bacteria. However,
chromosomally borne integrons associated with arrays containing 0 to >200 cassettes have
been identified in ca. 10% of all sequenced bacteria, and have been detected in a variety of
environments. Although most of the genes present in these cassettes are of unknown func-
tions, their mere existence underscores their functional value as accessory factors. The rapid
success of multi-resistance integrons in overcoming the human-imposed antibiotic selective
pressure reflects the co-option of ancient chromosomal systems by mobile elements, and pro-
vides the best illustration of their adaptive potential. Nevertheless, the propensity of integrons
to recombine cassettes – a very specific type of mutation – was unknown. The discovery that
recombination is controlled by the SOS response in most integrons sheds new light on this
problem and provides the opportunity to highlight the idiosyncrasies of the system with re-
spect to other diversity-generating mechanisms. Interestingly, ssDNA is a central component
of both the SOS and integron systems. Beyond its impact on evolvability, the coupling of
these systems may thus have a profound mechanistic significance. Overall, these observations
may have important medical implications. The two themes developed in this work address the
evolvability of biological systems. From a biotechnological perspective, the results presented
above can be used to increase the generation of diversity at different levels of organization.

I. INTEGRONS ARE POWERFUL ADAPTIVE SYSTEMS

I.1. The expression of gene cassettes

I.1.1. Coupling between recombination and expression


Two characteristics are essential to our understanding of the integron system: i) the Pc
promoter located upstream of the attI recombination point is the only element responsible for
the consistent expression of cassette-borne genes (Stokes and Hall, 1989; Levesque et al.,
1994; Bunny et al., 1995; Collis and Hall, 1995); and ii) the excision of cassettes through attC
x attC recombinations occurs randomly within the cassette array, while subsequent reintegra-
tions preferentially involve attI x attC recombination events (Collis et al., 1993; Collis et al.,
2001). Taken together, these two features straightforwardly couple recombination with variation
in cassette expression. Indeed, only the few proximal cassettes of a given array are subjected
to expression, while the remaining cassettes are kept silent (Collis and Hall, 1995).
The mobilization of a gene cassette can lead to three different outcomes: i) the expression
of a fitness-enhancing trait, in which case the recombination is positively selected; or the
expression of a trait irrelevant to the current environment, which can prove either ii) nearly
neutral, if none of the previously expressed cassettes is adaptive; or iii) deleterious, if the
event moves previously integrated adaptive cassettes away from expression, in which case it
would be counter-selected. As evidenced by the model presented in Article I (see Evolution of
recombination rate in integrons, p154), this behavior establishes a strong link between
integron-mediated fitness and recombination rate, thereby allowing second-order selection to
operate on this latter modifier trait. To some extent, this model would still hold if cassettes
were reintegrated randomly in the array. However, higher recombination rates would then be
required – with all the drawbacks this implies (see below). In contrast, the conditional
expression of cassettes according to their position in the array is strictly required in this
framework.
A genome-wide expression profiling study reported a slight pattern of differential expression
for about a hundred cassette-borne genes in hapR, rpoS and rpoN mutants with respect to the
wild type (Yildiz et al., 2004). Decreased expression ratios averaging 0.74 and 0.66 were
observed in the hapR and rpoS backgrounds respectively, while increased ratios averaging
1.68 were measured in an rpoN mutant. Although these data suggest that the typical attC sites
of V. cholerae (VCR) can act as cryptic promoters, they must be interpreted with caution.
Indeed, these experiments were not specifically undertaken to monitor cassette expression and
the microarray used in this work was designed from the publicly available annotation of the
V. cholerae genome (Heidelberg et al., 2000). Now, the nucleotide composition of the genes
found in cassettes is generally at odds with the rest of the genome – a phenomenon that
underpins their exogenous origin. Consequently, gene identification algorithms perform
particularly badly in integrons, and several misannotated ORFs overlap a VCR in the released
annotation. The probes designed for these genes may thus hybridize with the VCR-containing
transcripts that originate from the Pc, blurring the signal. Besides, only log-ratios are
presented in this publication, while absolute expression values are not available. Faint
expression increases may result from the diversification of the integron through induction of
IntI-mediated recombination in mutant backgrounds. Despite the weaknesses of these
observations, their biological implications are interesting. HapR is part of the
quorum-sensing system of V. cholerae and is induced at high cell densities, while RpoS
orchestrates a general stress response. Thus, cassette expression might be slightly activated
in crowded and stressed bacterial communities that contain non-replicating organisms. The
design of accurate experiments using a dedicated microarray would be an interesting project
to confirm these results, and further assay the effect of various conditions on cassette
expression.
Some cassettes exceptionally carry their own promoter (Bissonnette et al., 1991;
Stokes and Hall, 1991; Naas et al., 2001b; Biskri and Mazel, 2003). Such promoters may also
control the expression of downstream cassettes – thereby providing an alternative to the Pc
promoter. The Toxin-Antitoxin cassettes that are frequently found in large chromosomal
integrons also harbor a functional promoter (Szekeres et al., 2007). However, these genes are
generally oriented divergently with respect to canonical cassettes so that they cannot directly
influence the expression of neighboring cassettes. Although no data support this hypothesis,
small cassettes that are unlikely to encode proteins might constitute floating promoters.
Because they would not benefit from being recombined in attI, such cassettes do not really fit
within our model: attC x attC integrations being rare events, the generation of variability
downstream of floating promoters would then rely on cassette excision.
Clearly, these unusual cassettes short-circuit the normal functioning of the system.
However, they seem anecdotal enough not to challenge our working model. If all cassettes
were to harbor their own promoter, the role of integrons would be restricted to the facilitated
collection of constitutively expressed traits. As highlighted below, the maintenance of an
unexpressed reservoir of cassettes endows the system with valuable properties – especially
when associated with the SOS response.

I.1.2. The integron system mimics inducible promoters


As discussed previously (see Genetic versus physiological changes, p100), physiological
regulation allows responsive, targeted and non-heritable modification of the phenotype.
This process relies on the molecular sensing of environmental cues (either internal or
external) and their subsequent transduction into functional effects. The evolution and
maintenance of such fine-tuned machineries is slow and costly – and may not even be possible
in most cases. Nevertheless, the modification of expression patterns is a major source of
phenotypic novelty (Gerhart and Kirschner, 2007). Several mechanisms, including slipped strand
mispairing (p61), gene conversion (p71), site-specific recombination (p71) and epigenetic
modifications (p89), have evolved to facilitate the constitutive alteration of gene expression.


Most of these systems are limited in the range of phenotypes they can confer. In the
simplest cases, they implement a stochastic switch between two genotypes – and hence two
phenotypes – which enables a bet-hedging strategy (see p102). In a few instances, the
switching frequency has been shown to be modulated by the environment (see p110). This last
situation tends to resemble physiological regulation. Although it can generate a huge
number of slightly different proteins and ensure their mutually exclusive expression, the
mechanism of segmental gene conversion is focused on a single trait (generally surface
antigens or the immune system, see p71). Some genetic shufflons can access about ten
interdependent phenotypes using site-specific recombination (see p83).
In contrast, integrons provide a standardized system to switch the expression of an
indefinite number of functionally independent traits. To some extent, the coupling with the SOS
response enables a complex behavior that mimics the function of a multitude of inducible
promoters. Instead of controlling each cassette-borne gene with a specialized stress-responsive
promoter, the system maintains a single constitutive promoter, and a global stress signal –
DNA damage – is used to switch expression through a site-specific recombination mechanism.
This setup is simply mediated by the connection to an existing regulatory network, and
is thus very economical for the cell. Unlike with inducible promoters, changes in cassette
expression rely on modifications of the genome structure, and are thus heritable. Besides,
while switching a cassette ON is fairly straightforward, driving an expressed cassette to the
OFF state is difficult and requires the excision – and probable loss – of the cassette, or the
successive occurrence of several cassette insertions.

I.1.3. Increased rate of cassette evolution


Overall, the time spent under selection by a gene cassette depends on the occurrence of
environments in which it confers an adaptive phenotype. This is generally true for any induc-
ible gene – and to a lesser extent for any accessory gene. Continuous cycles of effective and
relaxed selection are known to increase the rate of protein evolution. As shown in Article I
(see p154), non-expressed cassettes tend to accumulate mutations. As exemplified by the
common insertion of ISs in cassette arrays, many of these mutations are deleterious. However,
occasional selective episodes can purge this deleterious load and only maintain those variants
that retained function. Overall, this process widens the exploration of neutral phenotypic
spaces, which would increase gene evolvability according to a principle similar to the one
developed in Article IV (see p214). Because cassettes are exposed to varying expression
strength depending on their proximity to the Pc promoter, this effect is probably more
efficient than in a binary ON-OFF switch (Bershtein and Tawfik, 2008). Besides, the
recombination mechanism characteristic of integrons is likely to result in frequent duplication
of cassettes (see pp 129 and 146). Such cassettes may experience a pattern of concerted
evolution comparable to the one driving diversification through segmental gene conversion (see
p72).

I.2. Responsive and oriented mutagenesis

I.2.1. Responsive versus constant mutation rates


The regulatory switch afforded by the integron system essentially enables a bet-hedging
strategy (see Bet-hedging, p103). Accordingly, the optimal recombination rate is expected to
equal the mean rate of environmental change (see Article I, p154). Such a relationship may be
achieved by fine-tuning the constitutive activity of integrase alleles. Under this strategy,
inadequate variants are produced at a loss in steady environments. This represents a
variability cost at the level of a clonal population. The second figure of Article I shows
that longer periods of environmental stasis entail a rise in frequency of the intI null allele,
abolishing recombination (see p168). This probably accounts for the inactivation of roughly
one third of the integrase alleles in natural integrons (Nemergut et al., 2008). In addition,
environmental shifts are not strictly periodic, and episodes with frequent disturbances may
alternate with sustained stasis. Rapid modification of the recombination rate would be required
to meet the needs imposed by such variance. It might be that specific processes have been
selected to facilitate the ON-OFF switching of the integrase gene. In class 2 integrons, the
intI2 gene is inactivated but different array arrangements have been isolated, demonstrating
that recombination can occur. While this probably involves occasional cross-talk with other
IntIs (Hansson et al., 2002), it is noteworthy that functional intI2 alleles have been
reported. The gene might then be subjected to a mechanism facilitating its reversible
inactivation.
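
The expectation that the optimal switching rate matches the rate of environmental change can be illustrated with a deliberately crude toy model, which is not the model of Article I. Assume that the environment flips every T generations, that only individuals expressing the matching cassette reproduce, and that each offspring switches expression with probability s. Over one cycle, a lineage then keeps a fraction (1 - s) of its offspring during each of the T - 1 stable generations and a fraction s at the flip. All names below are hypothetical.

```python
from math import log


def log_growth_per_cycle(s, T):
    """Log of the fraction of a lineage retained over one environmental cycle
    in the toy model: (1 - s)^(T - 1) * s."""
    return (T - 1) * log(1 - s) + log(s)


def best_switching_rate(T, steps=100_000):
    """Grid search for the switching probability maximizing long-term growth."""
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda s: log_growth_per_cycle(s, T))


for T in (5, 20, 100):
    print(f"T={T}: best s ~ {best_switching_rate(T):.4f}, 1/T = {1 / T:.4f}")
```

In this toy model the optimum can also be obtained analytically: setting the derivative of (T - 1) log(1 - s) + log(s) to zero gives s = 1/T, that is, one switch per environmental period on average.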
As discussed in the introduction, the mechanisms mediating stress-induced mutagenesis
can be viewed as refinements to constitutive mutator phenotypes (see p47). In the same
light, the connection of integrons to the SOS response permits a faster and more appropriate
response, whereby the fitness costs of untimely recombination are decreased. Interestingly, if
all individuals in the population recombined precisely at environmental shifts, the
recombination rate integrated over a long period would still equal the rate of environmental
change.
An unknown parameter in the integron system is the probability of cassette reinsertion
following excision. Significant loss of excised cassettes would impoverish the available reservoir
of standing genetic variation, thereby severely limiting the adaptive potential of the system.
Excised cassettes exhibit more stable attC sites, which may ensure higher
recombinogenic potential (see p129). Carefully designed experiments based on the excision
assay developed in Article II (see p173) would shed light on this property. In any
case, the regulation of integrase potentially limits the loss of cassettes. In support of this point,
it seems that integrons lacking SOS regulation tend to have shorter arrays (unpublished ob-
servation).
Globally, bet-hedging strategies correspond to the pre-emptive generation of pheno-
typic variations – which may permit a fraction of the population to survive sudden and harsh
changes. In integrons, the resolution of recombination intermediates may require a round of
genomic replication (see Figure 33, p130). This mechanism may not be fast enough upon exposure
to severe bactericidal or bacteriostatic stresses. Besides, as general as the SOS response
may be, the range of inducing conditions is necessarily limited (but see Single
stranded DNA: a bridge between two systems, p241). Some challenging environments may
not be able to trigger cassette shuffling, limiting the advantages of responsive mutability. In
contrast, the continuous diversification driven by constitutive recombination rates ensures that
the population is poised to adapt to any situation that can be dealt with using the available cassettes.
The SOS response is induced by a factor >20-fold in ca. 0.3% of exponentially growing
E. coli cells – a phenomenon that probably reflects the occurrence of spontaneous DNA
damage (McCool et al., 2004). This subpopulation probably experiences increased recombination
rates following derepression of the integrase. To some extent, by allowing the coexistence
of pre-emptive and responsive strategies, the SOS regulation thus combines the best of
both worlds.
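As a rough order-of-magnitude illustration of the figures quoted from McCool et al. (2004), the sketch below computes the population-average SOS induction when 0.3% of cells are induced 20-fold and all others stay at baseline. The function name is hypothetical. The 20-fold value is taken as a lower bound (the text states >20-fold), and the calculation says nothing about how integrase expression or recombination rate scale with induction.

```python
def mean_fold_induction(induced_fraction, fold_in_induced, baseline=1.0):
    """Population-average expression, in units of the uninduced baseline."""
    if not 0.0 <= induced_fraction <= 1.0:
        raise ValueError("induced_fraction must lie between 0 and 1")
    uninduced = (1.0 - induced_fraction) * baseline
    induced = induced_fraction * fold_in_induced * baseline
    return uninduced + induced

induced_fraction = 0.003   # ca. 0.3% of exponentially growing cells
fold = 20.0                # lower bound: induction is >20-fold

mean_level = mean_fold_induction(induced_fraction, fold)
# Share of total expression contributed by the induced subpopulation.
induced_share = induced_fraction * fold / mean_level

print(round(mean_level, 3), round(induced_share, 3))
```

With these numbers the population mean rises by only a few percent above baseline, while the small induced subpopulation accounts for a disproportionate share of total expression.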

I.2.2. Integron and adaptive mutagenesis


As compared to other systems enabling programmed genetic variations, integrons
bring together a unique set of features that turn them into a potent adaptive system.
(i) The functional platform consists of a tightly packed locus – which ensures maximal genetic
linkage between recombinational mutations and integrase alleles. This feature is essential
for efficient second-order selection (Tenaillon et al., 2000). However, the genetic link
with the silent part of the cassette array is inherently loose. In some genomes, the cassette ar-
ray is even split between several locations, e.g. Saccharophagus degradans 2-40T (Weiner et
al., 2008) and Vibrio splendidus LGP32 (Le Roux et al., 2009). Cassettes are prone to IntI-
mediated excision and may not be reintegrated. Furthermore, the homogeneity of attC sites
may also favor cassette loss by homologous recombination or replication slippage in superintegrons.
Nonetheless, the TA-harboring cassettes exert a stabilizing influence on the surrounding
cassettes and may play a significant role in limiting the loss of silent cassettes
(Szekeres et al., 2007).
(ii) Gene cassettes essentially correspond to modular and independent units of genetic infor-
mation. Cassette recombination allows instantaneous expression of single and well-defined
traits. Although the recombination process is random, the type of genetic mutation afforded by
integrons is definitely oriented in the phenotype space. In this way, integrons shift the stochasticity
of the mutational process from the level of DNA sequence to that of functional genes.
(iii) Silent gene cassettes constitute a source of standing genetic diversity that can be mobilized
through site-specific recombination, just like meiotic sex can bring about new adaptive
alleles from preexisting mutations by homologous recombination (see pp 29 and 71).
Contrasting with genome-wide mutagenesis, integrons rely on a preexisting set of possible
variations. Nevertheless, gene cassettes constitute an extremely vast and diverse metagenome
into which integrons can tap through HGT (see p143). At least some species must encode the
machinery to manufacture new gene cassettes in their respective genomes. Overall, integrons
can thus access an extensive and potentially limitless amount of genetic diversity.
(iv) Most of the cassettes that are present in a given array must have been initially recombined
at attI, expressed and selected accordingly. Integrons thus provide a kind of long-lasting genetic
memory, whereby cassettes that previously proved adaptive are stockpiled and can efficiently
be mobilized when past selective conditions are renewed.
(v) The coupling of integrons with the SOS response enables a temporal regulation of recom-
bination rate. The system thus combines both spatial and structural confinement of mutagene-
sis, which profoundly limits the cost of increased mutagenesis. Overall, integrons constitute
the perfect embodiment of a Lamarckian process (see Appendix, p260).

I.2.3. A clear case of stress-induced mutagenesis


All documented examples of stress-induced mutagenesis arise as side-effects of DNA
repair mechanisms (Redfield, 2001; Matic et al., 2004; Tenaillon et al., 2004; Roth et al.,
2006; Galhardo et al., 2007). Proximal benefits of increased survival and distal effects on
evolvability are thus intricately linked – and it is very difficult to disentangle which of these
two phenomena is actually selected for. While evolvability is a complex trait that arises from
second-order selection, immediate survival is the matter of first-order selection. Now, the sci-
entific method generally favors the most parsimonious and straightforward explanation – a
rule known as Ockham’s razor. Stress-induced mutagenesis should be regarded as a by-product
of a desperate effort to survive that incidentally increases evolvability (for further discussion
see Evolvability, p111). A few of the systems that are specialized in the generation of
targeted variability seem to be controlled by environmental cues (see Targeted mutagenesis
can be regulated, p110). Although these constitute cases of stress-induced mutagenesis, they
are clearly confined to a very limited repertoire of phenotypes. Hence, their general contribution
to adaptation remains largely anecdotal.
In this context, the incorporation of IntI into the SOS regulon provides a clear-cut example
of stress-induced mutagenesis. As discussed above, the broad evolutionary potential of integrons
is guaranteed by the wide repertoire of genetic diversity that is potentially available in
the cassette metagenome. To the best of our knowledge, there is no short-term advantage in
increasing the recombination rate in times of SOS-inducing stress, aside from improved capacity
for adaptation. Therefore, we are led to conclude that the coupling with the SOS response
is driven by increased evolvability. The integron case thus constitutes a strong line of
evidence advocating the advantageous evolutionary repercussions of stress-induced
mutagenesis.

I.3. A deep connection with SOS triggers

I.3.1. Single stranded DNA: a bridge between two systems


The recombination mechanism exhibited by the integron integrases is atypical with respect
to other tyrosine recombinases (Grainge and Jayaram, 1999). Indeed, attC sites are only
recognized and processed as folded ssDNA by the integrase (Bouvier et al., 2005; MacDonald
et al., 2006). This interaction is mediated by a characteristic functional domain (Messier and
Roy, 2001; MacDonald et al., 2006). Cassettes are consequently mobilized in a single
stranded form, and their insertion at the double stranded attI site may involve accessory host
factors – such as the replication machinery (Bouvier et al., 2005; and see Figure 33, p130).
The exact pressure that drove the evolution of such an idiosyncratic process remains elusive.
Interestingly, ssDNA is also the central trigger of the SOS response. The combination of inte-
grons with this particular stress-response puts forward a system whereby ssDNA is both the
trigger and the substrate of recombination. The phylogenetic mapping of SOS-controlled inte-
grases suggests that the association between the two systems is an ancient trait – if not ances-
tral (see Article III, p186). Although the dual role of ssDNA might be purely contingent, it
may also offer a mechanistic coupling that benefits the system and has been selected accordingly.
Stalling of a replication fork in an integron array potentially provides a RecA-coated
nucleofilament, leading to a local depletion of LexA and subsequent integrase expression. The
integrase may then readily access the structured attC sites located on the available ssDNA.
Besides, restart of the replication fork may help resolve recombination intermediates. In this
perspective, an interesting experiment would consist in monitoring the recombination of a reporter
cassette upon generation of DNA damage in the integron of V. cholerae N16961. The
introduction of homing endonuclease restriction sites may be used to induce DSBs at various
locations (Ponder et al., 2005) – e.g. near intI, in the middle of the cassette array and at an unrelated
locus in the genome. Alternatively, a targeted lesion (e.g. G-AAF adduct) can be introduced
on either strand of a plasmid to uncouple synthesis of the leading and lagging
strands, which results in the accumulation of ssDNA (Pagès and Fuchs, 2003). With this
method, each strand can be monitored separately, which may be convenient because only
structured attC sites from the bottom strand are recombinogenic (Bouvier et al., 2005). In large
chromosomal arrays, the high density of secondary structures due to the extrusion of attC
sites may promote stalling of replication forks (Bichara et al., 2006), thereby favoring the
production of ssDNA. This process may also lead to gene conversion between similar attC
sites or duplicated cassettes.
As exemplified by the gathering of antibiotic resistance cassettes from various chro-
mosomal integrons by mobile integrons, the horizontal transfer of gene cassettes is an essen-
tial source of genetic diversity in the integron system (see p142). Both conjugation and
transformation involve the entry of ssDNA in the cell. Several lines of evidence suggest that
conjugation can indeed trigger the SOS response. Mating between S. typhimurium Hfr and E.
coli F- is known to strongly induce the response. However, most of this effect is probably due
to sequence divergence between the two genomes, because intraspecific mating only slightly
induces the response (Matic et al., 1995). Another hint comes from the presence of the psiB
anti-SOS gene in some conjugative plasmids (Bagdasarian et al., 1986; Golub et al., 1988;
Bailone et al., 1988). This gene is specifically expressed upon entry in the recipient cell – a
feature that involves the folding of the incoming ssDNA molecule to form a promoter
(Jones et al., 1992; Bates et al., 1999). Ongoing work in the lab indicates that conjugation is
indeed able to significantly induce the SOS system (Z. Baharoglu, unpublished results). In
this context, the conjugation of an integron-containing plasmid would trigger the expression
of its own and/or a resident IntI, thereby promoting cassette swapping between mobile and
chromosomal integrons. Likewise, massive transformation may similarly favor the acquisition
of new cassettes by inducing the expression of intI genes. This may be particularly significant
in Vibrionales, which become competent in the presence of chitin in crowded environments
(Meibom et al., 2005; Bartlett and Azam, 2005; Miller et al., 2007b; and see p60).
Preliminary experiments carried out in the lab show that natural transformation of competent
V. cholerae cells indeed induces the expression of the integrase (Z. Baharoglu, unpublished
results).

I.3.2. Potential SOS triggers relevant to integrons


The functional and mechanistic link between SOS induction and integron recombination
sheds new light onto essential aspects of integron biology. As mentioned above, access
to the cassette metagenome is probably potentiated by the integrated role of ssDNA as both a
trigger and substrate of recombination. Aside from this far-reaching example, several other
conditions relevant to integrons may favor the induction of the SOS response.
As reported in Article II, several antibiotics trigger integrase expression by eliciting
the SOS response (see p172). This effect is particularly interesting because cassettes harboring
resistance determinants against these antibiotics have been isolated. Indeed, gene cassettes
providing resistance to fluoroquinolones (Robicsek et al., 2006b; Fonseca et al., 2008) and
trimethoprim (Le Roux et al., 2009) have been repeatedly reported. This mechanism is
highly relevant to the spread of antibiotic resistance via mobile integrons (see below
Implications for health, p246).
By leaving gaps in the donor site, the excision of some transposons may induce the
SOS response (Roberts and Kleckner, 1988; Lane et al., 1994). Specifically, this phenomenon
has been demonstrated for Tn7, in which class 2 integrons are embedded (Stellwagen and
Craig, 1997). Moreover, as the mobility of some transposons is also SOS-promoted (see p80),
the SOS response might coordinate the exchange of cassettes with the mobilization of genetic
shuttles to spread integrons between individuals.
The chromosomal integron of V. cholerae contains numerous TA-harboring cassettes.
TA modules are commonly found in plasmids – where they are thought to ensure correct segregation
of the replicons. Likewise, the addictive properties of TA cassettes have been shown
to stabilize the cassette arrays (Szekeres et al., 2007). The observation of TA systems at stable
genomic locations in E. coli led to the suggestion that these may be implicated in programmed
cell death (Aizenman et al., 1996; Hazan et al., 2004) or cell growth arrest (Gerdes
et al., 2005) – thereby mediating cooperation in times of stress. Although the validity of these
results has been questioned (Tsilibaris et al., 2007), TA cassettes may also serve to relay
stress signals to the integron system in order to promote recombination. In this respect, the
ccdBA module found in several integron-containing bacteria (Rowe-Magnus et al., 2003; un-
published observations) is particularly interesting. Indeed, the stable CcdB toxin induces the
SOS response by poisoning the DNA gyrase – just as quinolone antibiotics do (Karoui et al.,
1983; Aertsen and Michiels, 2006). Under standard growth conditions, this effect is thwarted
by the unstable CcdA antitoxin. Stresses impacting the steady production of antitoxin may
thus indirectly elicit the SOS response, widening the range of triggering conditions.

I.3.3. SOS-controlled accessory factors?


As mentioned in the introduction, it is possible that additional host-factors are required
in some integrons to reach high recombination frequency (Biskri et al., 2005; and see p131).
Furthermore, the generation of gene cassettes must rely on unidentified determinants (see
p141). Given the deep association between the SOS and integron systems, the incorporation
of such host factors into the LexA regulon would be biologically relevant. Various DNA processing
proteins have been tested in the lab for their impact on recombination in E. coli using a
standardized assay, whereby an attC-containing suicide plasmid is delivered by conjugation to
a recipient cell harboring attI. Several of these proteins are encoded by genes belonging to the
SOS regulon. No effect was observed in recA, uvrD, ruvB, ruvC and ssb backgrounds while a
slight decrease of recombination was observed when ruvA was inactivated (C. Loot, unpub-
lished results). Apparently, known members of the SOS response thus play little role in the
recombination process. By crossing the results of a microarray experiment carried out after exposure
to UV (M. Waldor, unpublished) with in silico data from a whole-genome search for
LexA binding sites, we identified potential members of the V. cholerae N16961 SOS regulon
that have no counterpart in E. coli. These genes constitute prominent candidates, but their influence
on recombination has not been tested yet. Because the mechanism of cassette generation
is totally unknown, no assay has been developed to assess this phenotype – rendering
any effort to unravel this process particularly difficult to set up.
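The candidate selection described above, crossing UV-induced genes with predicted LexA binding sites and discarding genes that have an E. coli SOS counterpart, amounts to simple set operations. The sketch below is a hypothetical illustration of that logic only: the function name and the gene identifiers are placeholders, not data from the microarray or from the in silico search.

```python
def candidate_sos_genes(uv_induced, lexa_site_hits, has_ecoli_counterpart):
    """Genes induced after UV exposure AND carrying a predicted LexA
    binding site, excluding those with a known E. coli SOS counterpart."""
    supported = set(uv_induced) & set(lexa_site_hits)
    return sorted(supported - set(has_ecoli_counterpart))

# Placeholder identifiers, for illustration only (not real results).
uv_induced = ["gene_A", "gene_B", "gene_C", "gene_D"]
lexa_site_hits = ["gene_B", "gene_C", "gene_D", "gene_E"]
has_ecoli_counterpart = ["gene_B"]

candidates = candidate_sos_genes(uv_induced, lexa_site_hits, has_ecoli_counterpart)
print(candidates)
```

Requiring both lines of evidence is a conservative choice: a gene supported by only one of the two datasets is left out of the candidate list.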

I.4. Are integrons really successful?

The data presented so far strongly advocate the advantages of the integron adaptive
system. If integrons are so powerful, however, why are they not a basic genetic component
common to all microorganisms? While integrons are present in ca. 10% of all sequenced bac-
terial genomes (Boucher et al., 2007), their phylogenetic distribution is globally restricted to
characteristic taxa. In addition, the mapping of IntI is very spotty in some clades, such as the
Shewanella (Nemergut et al., 2008). Integrons may then be regarded as a declining system
that was revived a few decades ago owing to the antibiotic selective pressure. In this context,
what kind of forces would drive the apparent loss of integrons?
The success of an integron principally lies in the availability of diverse gene cassettes.
Integrons with large cassette arrays harboring a specific attC signature seem to be responsible
for the primary assembly of gene cassettes – and may largely prove self-sufficient in this re-
spect (see p141). The IntI phylogeny is congruent with the organismal phylogeny in most
clades, indicating that integrons are maintained for a long time. However, some species ap-
pear to have acquired an integron more recently through HGT (see p135). As the genetic
neighborhood of chromosomal integrons is not conserved, the factors required for cassette
generation are unlikely to be transferred with the functional platform. Then, some chromosomal
integrons would essentially depend on exogenous sources of cassettes into which they can
tap through transformation or via mobile integrons. Such sources may be unavailable or limited
in particular ecological niches (see p143), thereby leading to the decay and eventual loss
of the useless integron system.
Integrons may also prove directly deleterious in some cases. While the occasional incorporation
of toxic cassettes can clearly impose a cost on the population, it only affects a subset
of individuals at any one time – which is most probably not sufficient to drive the
systematic counter-selection of the whole system. In contrast, overexpression of the integrase
greatly impacts the growth rate of a population by causing cell death (Mazel lab, unpublished
observations). Although the exact mechanism underlying this phenomenon is unknown, one
may venture that integrases lose their site-specificity at high concentration and cause irrecoverable
DNA damage. In this perspective, even weak expression of the integrase may exert
a deleterious pressure if constitutive, eventually leading to the loss of the integron in the ab-
sence of counterbalancing advantageous effects. This putative effect provides an additional
explanation for the frequent inactivation of integrases and the emergence of SOS regulation.

II. IMPLICATIONS FOR HEALTH

Host-pathogen interactions typically favor an explosive mechanism of evolution
whereby the constant generation of genetic novelty is required to keep up with competitors
(Van Valen, 1973; Dawkins, 1986). Hence, it is not surprising that most traits affected by
programmed mechanisms of genetic variation are involved in biotic interactions. While mi-
crobes use various processes of phase and antigenic variation to modify their surface antigens
and evade the host immune system, eukaryotic hosts use the very same kinds of mechanisms to
diversify and refine their antibody repertoire. Even prokaryotes developed the complex
CRISPR system that allows them to acquire exogenous sequences to fight phage infections
(Sorek et al., 2008). A substantial fraction of integron cassettes seem to code for functions
involved in the interaction with biotic factors (see p123). However, integrons are implicated
in the expression of a wide variety of accessory factors rather than the stealth evasion of the
immune system. Although some integron cassettes have been implicated in virulence (Labbate
et al., 2009), the clinical importance of integrons rather lies in their involvement in multi-
resistance to antibiotics (Fluit and Schmitz, 2004; Partridge et al., 2009).
The chromosomal origin of mobile integrons and their associated cassettes is now well
established (see Chromosomal integron as the source of mobile integrons, p135). However,
the evolutionary dynamics of these elements were largely unknown. The discovery that integrons
are governed by an almost ubiquitous stress response carries serious clinical implications.
Indeed, the integrases of the three clinically relevant multi-resistant integrons are
controlled by the SOS response. Yet, intI1 and intI3 branch within a clade in which SOS
regulation is mostly absent. This pattern strongly suggests that the SOS control is a particularly
advantageous feature in the context of antibiotic resistance (see Article III, p186).
Several studies recently highlighted the profound involvement of the SOS response in
the evolution of pathogenic bacteria (see pp 50 and 54). Particularly, a functional SOS system
was shown to be required for the rise of clones resistant to fluoroquinolones and rifamycin in
a mouse model (Cirz et al., 2005). Besides, SOS is implicated in the mobilization of transposons
(Aleshkin et al., 1998), prophages (Bunny et al., 2002; Quinones et al., 2005),
pathogenicity islands (Ubeda et al., 2005) and ICEs (Beaber et al., 2004) – all of which carry
virulence and/or resistance determinants. In this context, the SOS-mediated regulation of re-
combination rates not only allows multi-resistance integrons to swap cassette expression in
times of stress, but also coordinates these events with other potentiating mechanisms.
Multi-resistance integrons use mobile elements as shuttles between different genomic
backgrounds. As illustrated above (see p243), the mobilization of these shuttles is dependent
on the SOS response to some extent. Besides, some of these elements such as TEs and conju-
gative plasmids are able to trigger the SOS response. Altogether, the SOS system thus con-
trols a set of processes promoting both cassette exchanges and the spread of integrons in the
population. Furthermore, the use of many antibiotics triggers the SOS response with the unin-
tended consequence of promoting the spread of both bacterial virulence factors and antibiotic
resistance (see p54). Both trimethoprim and fluoroquinolones induce the expression of the
integrase (see Article II, p173). As pointed out earlier, cassettes encoding resistance to these
antibiotics have been isolated. In this perspective, some antibiotics would directly promote the
recruitment and dissemination of appropriate resistance determinants. Antibiotic guidelines
should take these considerations into account, and limit the use of these drugs accordingly.
Current policies in the fight against antibiotic resistance mostly rely on the cost presumably
imposed by the very resistance mechanisms on fitness. In the absence of antibiotic
selective pressures, such deleterious effects are expected to be counter-selected in the popula-
tion. Because integrons tightly couple recombination and expression, potentially deleterious
resistance traits can be put away from expression and kept silent in the cassette array without
impacting fitness until they are needed again. The relaxed selective pressure experienced by
non-expressed cassettes may favor the diversification of the resistance gene and promote its
efficient adaptation to successive generations of antibiotics – as exemplified by the evolutionary
success of extended spectrum β-lactamases (Gniadkowski, 2008).
Several authors have noted the advantages of developing drugs targeting the SOS system
in order to combat the adaptive properties of bacteria (Avison, 2005; Cirz et al., 2005;
Kelley and William, 2006; Potts et al., 2008). The integron case further illustrates the potential
benefit of this endeavor, but drastically restricts the range of suitable targets. Indeed, since
integrons are a fully independent system that is simply plugged into the SOS regulon, the RecA
protein stands as the sole target that can alter their functioning.

III. BIOTECHNOLOGICAL CONSIDERATIONS

As put forward in Article IV (see p214), directed evolution is a powerful tool to engi-
neer elaborate behaviors that are too complex to be fully predictable in biological systems.
The generation of large amounts of diversity is a general prerequisite to successful selection,
be it natural or artificial. The two themes developed in this work – the integron system and the
ELP principle – deal with the evolvability of biological systems. In the framework of directed
evolution, these properties can be used to increase the generation of diversity at different lev-
els of organization.

III.1. The ELP principle

The benefits of incorporating several ELP-designed synonymous sequences in directed
evolution protocols are presented in Article IV (see p214). This principle and its embodiment
as software (see http://mobyle.pasteur.fr/cgi-bin/portal.py?form=elp) have been patented with
the support of the Pasteur Institute under the name Modulating mutational frequency to opti-
mize protein evolution. This technology would benefit biotechnological companies involved
in directed evolution and/or can enrich the range of services proposed by companies special-
ized in de novo gene synthesis.
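The interest of starting from several synonymous sequences can be illustrated at the level of a single codon: under the standard genetic code, synonymous codons do not reach the same amino acids by a single nucleotide substitution. The sketch below only illustrates this general property. It is not the ELP algorithm of Article IV, and the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def one_step_amino_acids(codon):
    """Amino acids reachable from `codon` by one nucleotide substitution,
    excluding stop codons and the amino acid already encoded."""
    original = CODON_TABLE[codon]
    reachable = set()
    for position, base in product(range(3), BASES):
        if base == codon[position]:
            continue
        mutant = codon[:position] + base + codon[position + 1:]
        amino_acid = CODON_TABLE[mutant]
        if amino_acid not in ("*", original):
            reachable.add(amino_acid)
    return reachable

# Two synonymous arginine codons explore different neighborhoods.
from_cgt = one_step_amino_acids("CGT")
from_aga = one_step_amino_acids("AGA")
print(sorted(from_cgt), sorted(from_aga), sorted(from_cgt | from_aga))
```

In this example, mutagenizing both codons gives access to more amino acid replacements than either codon alone, which is the intuition behind combining synonymous templates.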
So far, we applied this strategy to identify variants of the Aac(6’)-Ib enzyme, which
confer increased levels of resistance to aminoglycosides in E. coli. Despite significant widening
of the explored protein space before selection, we only identified a few advantageous mutants.
These mixed results probably reflect the narrow evolutionary perspectives of the
enzyme. We are now using two ELP-designed synonymous genes to isolate IntI1 mutants
achieving increased attI x attI recombination frequency. Such variants may exhibit increased
affinity for attI and would eventually allow IntI1-attI complexes to be crystallized, which is
not afforded by the low affinity of the wild-type enzyme with this substrate. The screening
procedure is not straightforward, but has already been successfully used to isolate variants
with increased attI x attC recombination rate (Demarre et al., 2007). Nevertheless, a definitive
demonstration of the advantages of the ELP approach would rely on well known model genes,
for which efficient screening strategies and extensive datasets obtained with other directed
evolution assays are available. The TEM β-lactamases (Zaccolo and Gherardi, 1999; Bershtein
et al., 2006; Weinreich et al., 2006) and the fluorescent proteins – such as GFP
(Crameri et al., 1996; Sacchetti et al., 2000; Miyawaki et al., 2005; Shaner et al., 2007) –
would be privileged candidates in this respect.

III.2. Synthetic integrons

While the ELP principle can speed up protein evolution through point mutations, the
integron system may prove useful to engineer whole metabolic pathways. Indeed, integrons
basically perform combinatorial rearrangement of silent gene cassettes under the control of a
single promoter, thereby providing the opportunity to select for the best arrangement of expressed
cassettes under appropriate conditions.
We developed a directed evolution protocol whereby a library of synthetic cassettes
harboring genes of interest is introduced in an E. coli strain containing an inducible intI1 gene
and a chromosomal attI site associated with a strong inducible promoter. Upon induction of
the integrase, cassettes from the library are randomly recombined at the attI site – so that ex-
tensive variability is generated at the population level. Expression of the array can then be
turned on to screen individual cells for desired properties. Successive rounds of recombination-selection
may be chained until no further improvement can be detected. As a proof of
principle, we introduced the five genes of the E. coli tryptophan operon into separate cas-
settes. These functional cassettes – interspersed with three inappropriate ones (one containing
lacZ and two harboring a transcription terminator) – were cloned in disarray into a library
plasmid. Cells transformed by this plasmid were selected for growth in minimal medium after
they have been subjected to IntI1-mediated shuffling. Preliminary data indicate that several
functional arrangements can readily be selected (D. Bikard, unpublished results). Although
the involvement of the first few cassettes of a natural integron in the same functional pathway
has never been reported (but see Elsaied et al., 2007), these data suggest that integrons
may facilitate the emergence of operons.
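The recombination-selection logic of this proof of principle can be sketched as a toy simulation. The model below is a deliberate simplification and not the laboratory protocol: cassettes are drawn with replacement from the eight-cassette library, each insertion lands at attI (promoter-proximal), transcription is assumed to stop at the first terminator cassette, and a cell is scored as growing in minimal medium when all five trp genes are transcribed. The cassette labels `term1` and `term2`, the function names, the number of insertions and the population size are hypothetical.

```python
import random

TRP_GENES = {"trpE", "trpD", "trpC", "trpB", "trpA"}
# Five functional cassettes plus three inappropriate ones (lacZ, two terminators).
LIBRARY = sorted(TRP_GENES) + ["lacZ", "term1", "term2"]

def shuffle_array(rng, n_insertions):
    """Build an array by successive insertions at attI; the most recent
    cassette sits closest to the promoter (index 0)."""
    array = []
    for _ in range(n_insertions):
        array.insert(0, rng.choice(LIBRARY))
    return array

def grows_without_tryptophan(array):
    """True if all five trp genes lie upstream of the first terminator."""
    expressed = set()
    for cassette in array:
        if cassette.startswith("term"):
            break  # assumption: nothing downstream is transcribed
        expressed.add(cassette)
    return TRP_GENES <= expressed

rng = random.Random(0)  # fixed seed so the run is reproducible
population = [shuffle_array(rng, n_insertions=8) for _ in range(10000)]
survivors = [array for array in population if grows_without_tryptophan(array)]

print(len(survivors), "of", len(population), "arrays pass the selection")
distinct = {tuple(array) for array in survivors}
print(len(distinct), "distinct functional arrangements")
```

In such a model several different cassette orders can pass the selection, since the only constraints are the presence of the five genes and their position relative to the first terminator.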
The synthetic cassettes have been constructed according to a standardized, fast, and ef-
ficient procedure inspired by the rise of synthetic biology standards. This integron-based
combinatorial strategy would be easily applicable to the optimization of artificial biochemical
pathways, when competing candidate genetic elements are available. Alternatively, the design
of attC sites with primary sequences coding for flexible polypeptides would allow this
system to be used to shuffle protein domains. Such a tool may prove particularly valuable in the generation
of the multi-modular enzymes which are responsible for the synthesis of polyketides
(Menzella and Reeves, 2007). In this perspective, the very system that initially led to the rise
of antibiotic resistance would ironically be subverted to produce brand new antibiotics.

APPENDIX

EPISTEMOLOGICAL CONSIDERATIONS ON THE ROLE OF VARIATIONS IN BIOLOGY

Maintenance versus variability: a major evolutionary trade-off

Cats do not make dogs and children tend to resemble their parents. These simple ob-
servations are accessible to everyone’s immediate experience. They nonetheless underlie two
related, essential and long mysterious biological processes: the maintenance of species char-
acteristics and the inheritance of individual traits over time. Another compelling observation
is that, beyond the astonishing diversity of living forms on Earth, some species exhibit patent
similarities to one another. This allowed generations of naturalists since Aristotle to undertake
systematic classifications of animals and plants. These two concepts are seemingly
difficult to reconcile: on the one hand, heredity entails the transfer of unchanged information
from parent to offspring within species, while on the other hand the study of diversity unveils
the profound link between species.
For ages, at least in the West, the most successful explanation of this paradox was one
coupling the Platonic idea of essential types to the existence of an intelligent and omnipotent
agent responsible for their creation and embodiment. Essentialism holds that, for any specific
kind of entity, there is a set of permanent, unalterable, and eternal characteristics, all of which
any entity of that kind must possess. Real entities then stand as imperfect manifestations of
their ideal essence. This view goes along well with the concept of biological species. Indeed,
if there are some accidental differences between individuals, none had ever observed modifi-
cation of a species’ representative traits over a human lifetime. The belief that an intelligent
force, a demiurge, created the essences reaches back to Plato (ca. 428-348 BC) and may
somehow account for the observed relationship between species. Suffice it to say that, along
its initiative, the demiurge was inspired by its former creations and accordingly developed a
range of resembling forms. Some pre-Socratic philosophers, e.g. Empedocles (ca. 490–430
BC) and Democritus (ca. 460-370 BC) and their followers, such as Epicurus (ca. 341-270 BC)
and Lucretius (ca. 99-55 BC) rejected these deterministic ideas and let much space to chance
and contingency in their world views. Such metaphysical edifices are not meant to satisfy the
scientific principle of objectivity, but rather appear as a posteriori constructions justifying
ethical and political beliefs. The essentialist view established itself in Eastern and
Middle-Eastern thought because it fitted well with the precepts of the Abrahamic religions
(essentially Judaism, Christianity and Islam). These religions had, and still have, a profound
impact on human societies, ethics and sciences. A famous illustration, to which we will return
later, is the natural theology and fixism of Carl Linnaeus. The Swedish taxonomist believed his
classification scheme to reveal the divine order of God's creation. In his own words: “There
are as many species as the number of different forms created by the Infinite Being in the be-
ginning. These forms have then according to the inherent laws of creation always produced
offspring like themselves, so that we do not now find more species than have previously ex-
isted. Thus, there are as many species as there are different forms or structures if we exclude
the non-essential deviations (varieties) that are conditioned by the habitat or by fortuities”
(as cited in Gustafsson, 1979). Beyond the three main monotheisms and to the best of my
knowledge, all civilizations developed cosmogonies to account for the biosphere by the ex
nihilo appearance of demiurge-like entities and the subsequent transformations, emanations or
creations of living forms.
Because of complex socio-cultural factors, many people still believe in these types of
metaphysical explanations. However, the last 150 years have seen the birth and development
of a consistent and powerful body of knowledge, i.e. a theory, the modern evolutionary
synthesis, which allows biologists to account for these natural facts in an objective,
scientific, and much more satisfactory manner. We now know the nature and mechanisms of
transmission of the genetic material responsible for heredity. We also know that all diverse
living forms, beyond their perceptible macroscopic similarities, share a profound mechanistic
unity and relate to each other by common descent.
Given the aforementioned fixist grasp of life, the explanation of the relationship between
species by way of common descent was difficult to admit. Indeed, the concept of common
descent presupposes the appearance and perpetuation of modifications within species. When
considered separately, the ideas of modification and perpetuation were not so problematic. For
instance, the appearance of viable variations in cultivated plants is so common that they
cannot go unnoticed. Regarding this issue, Linnaeus wrote: “Let a garden be sown with a
thousand different seeds, let to these be given the incessant care of the Gardener in producing
abnormal forms, and in a few years it will contain six thousand varieties, which the common
herd of Botanists calls species. And so I distinguish the species of the almighty Creator which
are true from the abnormal varieties of the gardener: the former I reckon of the highest
importance because of their author, the latter I reject because of their authors. The former
persist and have persisted from the beginning of the world, the latter, being monstrosities, can
boast of but a brief life” (as cited in Gouyon et al., 2002). This way, modifications were
usually seen as fortuitous anomalies that could only be perpetuated by artificial selection, but
would be quickly eliminated otherwise. The real trouble arose when naturally occurring and
heritably stable variants were discovered. Linnaeus was once confronted with a mutant of the
otherwise well described Linaria vulgaris species (see Figure 40), in which the fundamental
symmetry of the flower is changed from bilateral to radial. The specimen was in complete
contradiction with the botanist’s classifying system, which is based on flower morphology.
This naturally led him to name the plant Linaria peloria, after the ancient Greek word for
monster (Gustafsson, 1979). The case troubled the taxonomist’s faith, and eventually led him to
embrace the possibility that “all species belonging to the same genus originally formed a
single species which diversified by hybridization” (as cited in Gouyon et al., 2002). In other
words, the constancy of divine creation might hold only down to the level of genera. We now
know that the flower’s altered phenotype is the consequence of an epimutation (Cubas et al.,
1999), i.e. of an epigenetic phenomenon (see Epigenetics, p89). As we will see throughout this
work, bacteria are particularly prone to genetic modifications. It is interesting to muse that,
had basic techniques of microbiology been available before L. Pasteur (1822-1895) and
colleagues set them up, many spontaneous and self-perpetuating variations could have been
observed. In hindsight and despite its falseness, the fixist paradigm initially underpinned the
construction of classifications and may thus be considered as a necessary epistemological
intermediate. Indeed, the development of systematic classifications drove a fantastic
accumulation of specimens, such as peloria, which in turn reinforced the idea of evolution.
Aside from these paradigmatic and religious considerations, the question of genetic
modification remains intricate and is somewhat at odds with the concept of heredity. While the
existence of variations is necessary for evolution to occur, their introduction into the
hereditary equation raises tricky questions concerning the control of their generation.
Maintenance of genetic integrity is indeed essential for the continuation of important traits
over time. Because most alterations are deleterious, too much variation is likely to hinder the
stability of the organism. However, too little variation might not allow sufficient adaptation
to changing living conditions. In this respect, successful adaptation obviously requires an
exquisite balance between stability and variability. How such a balance can be established is
the central theme of this work. Before addressing this issue, it is worth wondering why
evolution is necessary, what exactly is meant by adaptation and what constitutes its
fundamental mechanisms.

The purpose of evolution

The title of this section is deliberately provocative. Evolution is often presented as a
blind and contingent process, and I will certainly not argue against that. Nevertheless, this
description alone might be misleading and I would like to highlight the reasons for that. At
the same time, this will bring out the necessity for evolutionary processes and thus the
necessity of variations.

Form, function and the watchmaker


When I was an undergraduate student, I was taught not to say that eyes are made to
see. In the same vein, I was also told that an animal has sight because it has eyes, and was
forbidden to think that it has eyes because it needs sight. Without further explanations
(and there were none) these assertions are absolute nonsense in the light of modern biology. In
his essay Chance and Necessity, J. Monod (1910-1976) highlights “how much arbitrary and
pointless it would be to deny that the natural organ, the eye, represents the materialization of
a ‘project’ (the one of capturing image)”, and that “one of the fundamental properties
common to all living beings without exception [is] that of being objects endowed with a purpose,
which at the same time they exhibit in their structure and carry out in their performance”
(Monod, 1970). The seemingly perfect match between forms and functions pervades all
levels of biological organization, from whole organs to nanoscopic molecular machines.
Whether the function followed the form or the contrary (i.e. do we see because we
have eyes or do we have eyes in order to see?) is an age-old question, again tracing back to
Greek philosophers such as Plato (ca. 428-348 BC), Democritus (ca. 460-370 BC) and Aristotle
(ca. 384-322 BC). The implications of this debate extend beyond natural sciences and reach
metaphysical concepts. The true question behind the alternative is to determine whether the
existence of purposeful biological structures is merely accidental or whether a force drives
the development of their intrinsic projects. That the former point must be false stands as an
obvious fact today. The functions, i.e. the adaptations of structures toward defined ends, that
are apparent in biological entities set them at odds with other physical manifestations. It is
a statistical impossibility that structures as complex and refined as an eye, a bacterium or
even a functional enzyme can suddenly emerge with fully functional features (Salisbury, 1969;
Dawkins, 1986). Besides, the existence of spontaneous generation has been definitively ruled
out since L. Pasteur (1822-1895) (Pasteur, 1861). A rational mind feels compelled to admit the
existence of a creative force to account for the projects expressed in biological functions.
What, however, is the nature of this creative force? An immediate explanation would be
a theological one: the purpose apparent in living entities is the reflection of the will of the
creator. This issue is best illustrated by the so-called watchmaker analogy. This argument was
famously put forward by W. Paley (1743-1805) (Paley, 1809), but similar ideas were formerly
evoked by numerous thinkers. Let us imagine one happens to find a watch in the middle of a
virgin natural landscape. In contrast to simpler natural objects, such as stones, the obvious
complexity of the artifact and the fine and purposeful arrangement of perfectly suited
mechanics irresistibly argue for the existence of an intelligent watchmaker who designed and
crafted it. Similarly, Paley argues, the complexity of living forms and their exquisite
adaptations to specific functions definitely prove the existence of an intelligent designer.
There is, however, no logical demonstration in this reasoning: the long watch preamble is not a
sound premise to an argument, but merely serves to establish the plausibility of the general
premise that one can tell, simply by looking at something, whether or not it was the product of
intelligent design, which ultimately remains unproven. This rhetorical slippage is known as the
design inference and is frequently used as an argument for the existence of God.
For a long time there was no satisfying alternative to those kinds of argument.
Nevertheless, the methods of natural sciences are grounded on objectivity, not projectivity,
and hence do not leave room for supernatural explanations. A heuristic scientific principle,
known as Ockham's razor after the logician and Franciscan friar William of Ockham (ca.
1288-1348), holds that the explanation of any phenomenon should make as few assumptions as
possible. In this respect, the assumption of an omnipotent and omnipresent being transcending
the laws of the universe is a particularly unparsimonious explanation. Before Darwin
(1809-1882), people who did not accept divine intervention as an explanation of life were
somehow compelled to admit its spontaneous appearance.


Adaptation, teleonomy and blindness


Living organisms are able to reproduce themselves in an almost identical manner, with
the exception of a few variations. Because the resources of a given environment are limited, a
population of organisms cannot grow indefinitely but soon becomes restricted. This creates
extrinsic selective conditions: any genetic variant that has higher chances to reproduce is
stabilized and increases in frequency in the population. In this light, the propagation of self
appears as the ultimate end of a living organism. It relies on a proper capacity to survive,
exploit the environment and reproduce. These performances are carried out by specialized
devices that an organism produces as part of its own self, which together constitute its
phenotype. The phenotype is defined as the expression of an organism’s genetic information in a
given environment. Any random phenotypic variation allowing a particular function to be
performed more efficiently is selected, provided it ultimately results in higher fecundity. If
the variation is the result of a heritable genetic mutation, the sustained selection on the
phenotype leads to a rise in frequency of the mutation in the population. When repeated
iteratively, the short-sighted selection of small-effect mutations can progressively lead to
the appearance of sophisticated structures. This cumulative selection is a creative process
that is not driven by final ends but by the instantaneous action of the environment on the
available variability. The trade-off between the production of phenotypic variations and
genetic stability mentioned earlier is essential in this process. Sustained selection of a
genetically encoded trait relies on the relative invariance of the global phenotype. In
contrast, the existence of variations is mandatory for adaptation.
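
The logic of cumulative selection can be illustrated with a minimal simulation. The
following Python sketch is not taken from any of the cited works: the bit-string genome, the
fitness function (a count of 1-bits standing in for reproductive success), the population size
and the mutation rate are all hypothetical choices made for illustration. Mutation is blind to
fitness, and selection only compares the variants available in the current generation. Keeping
the parent among the candidates is a simplifying assumption.

```python
import random


def fitness(genome):
    # Hypothetical fitness: number of 1-bits, standing in for reproductive success.
    return sum(genome)


def mutate(genome, rate, rng):
    # Each position flips independently with probability `rate`;
    # the mutation process has no access to fitness.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def cumulative_selection(length=40, pop_size=50, rate=0.02, generations=200, seed=1):
    """Return the fitness of the selected genome at each generation."""
    rng = random.Random(seed)
    parent = [0] * length          # start from the least fit genome
    history = [fitness(parent)]
    for _ in range(generations):
        offspring = [mutate(parent, rate, rng) for _ in range(pop_size)]
        # Short-sighted selection: keep whichever current variant reproduces best.
        # The parent is kept as a candidate (simplifying assumption), so fitness
        # never decreases from one generation to the next.
        parent = max(offspring + [parent], key=fitness)
        history.append(fitness(parent))
    return history


if __name__ == "__main__":
    trajectory = cumulative_selection()
    print("fitness at start:", trajectory[0])
    print("fitness at end:  ", trajectory[-1])
```

Under these assumptions, fitness climbs step by step through the retention of small
improvements, whereas a single random draw would produce the all-ones genome with probability
2^-40 only. No final end is encoded anywhere in the procedure: only the immediate comparison
between available variants.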
The seemingly intrinsic projects that single out the phenotypes of organisms reflect
their adaptations to the environment. Adaptation is not only the static state that we can
observe, and which unavoidably appears as purposeful design. Above all, it is a continuous and
dynamic process driven by the environment and resulting from the cumulative selection of
genetic determinants through their impact on the phenotype.
Unambiguously, eyes are made to see, wings to fly and, at another scale, DNA
polymerases to replicate DNA. The ambiguity does not lie in the fact of adaptation but rather in
the process of adaptation. The verb to make inevitably alludes to the existence of an almighty
watchmaker responsible for crafting the universe and its inhabitants. The modern synthesis of
evolution provides a robust and scientific framework to explain the apparent finality of
biological entities. The process of cumulative selection, initially described by C. Darwin
(1809-1882) and A. R. Wallace (1823-1913) (Darwin and Wallace, 1858), fulfills the role of the
creative force that accounts for the finality of biological artifacts. As eloquently written by
R. Dawkins (b. 1941): “All appearances to the contrary, the only watchmaker in nature is the
blind forces of physics, albeit deployed in a very special way. A true watchmaker has foresight:
he designs his cogs and springs, and plans their interconnections, with a future purpose in his
mind's eye. Natural selection, the blind, unconscious, automatic process which Darwin
discovered, and which we now know is the explanation for the existence and apparently
purposeful form of all life, has no purpose in mind. It has no mind and no mind's eye. It does
not plan for the future. It has no vision, no foresight, no sight at all. If it can be said to
play the role of watchmaker in nature, it is the blind watchmaker” (Dawkins, 1986).
The decisive subtlety differentiating the artifacts produced by an intelligent
watchmaker from those generated by Dawkins’ blind watchmaker is caught between the words
teleology and teleonomy, which both refer to the issue of finality. On the one hand, teleology
refers to purposeful systems that are able to elaborate their own ends. Such systems are
characterized by intentionality and foresight. As described above, the concept of teleology is
closely linked to that of theology in the context of biological systems. On the other hand,
teleonomy is a property of goal-seeking systems. Such systems are not internally driven toward
a defined end, but result from an exploratory process composed of several rounds of variation
and subsequent stabilization supervised by extrinsic conditions. Biological evolution is a
perfect illustration of a goal-seeking process. In this case, the goal that is sought is to
maximize the reproduction of an organism. The adaptations of this organism are both the
results and the consequences of this process. In this light, the debate concerning the
respective primacy of forms over functions is pointless. Forms and functions interact in a
dialectic manner to result in adaptation through time. Neither has primacy over the other: a
given function is the consequence of the form, but the form has been selected through the
function it confers. Besides, the simplest cell is a complex network of interdependent
processes that evolved on top of each other. As a result, the expression of a given form and
function relies on the preexistence of other forms and associated functions.

Impact of the environment

What is the environment?


From the standpoint of an organism, the environment represents all that is outside of
the self and comprises abiotic and biotic components. The abiotic factors correspond to the
physicochemical conditions experienced by the organism. They are subjected to both random
variations and regular fluctuations over a wide range of timescales. Random fluctuations
typically result from meteorological phenomena: variation in temperature, drought, rain and
the subsequent influx of chemicals, among others. Examples of regular fluctuations include the
day-night and seasonal cycles. The biotic factors comprise all other living forms, from the
same or different species. The essential difference between the biotic and abiotic factors lies
in the ability of the former to evolve. The co-evolution of different species results in the
establishment of diverse and interdependent relationships in the ecosystem, such as
competition, symbiosis or commensalism. The antagonistic interactions resulting from
competition, predation and parasitism are particularly interesting. They determine situations
in which the survival of one species is threatened by the existence of another species, while
the survival of the latter is strictly dependent on these harassments. In such contexts, every
innovation developed by one camp must be counteracted by the other, resulting in an explosive
mechanism of evolution, which is often referred to as an arms race (Dawkins, 1986). The broad
ecological significance of such interactions is captured in the Red Queen analogy (Van Valen,
1973), which refers to a chapter in L. Carroll's novel Through the Looking-Glass in which Alice
and the Red Queen are running in one direction, while the entire world is moving in the
opposite direction. As the net movement of the protagonists is null, the Red Queen explains:
“It takes all the running you can do, to keep in the same place”. This highlights the fact that
continuous adaptation is mandatory to simply persist in an evolving biotic world.
Exposure to environmental variations depends on the actual biology of organisms.
Mobile organisms can actively forage for food and, to some extent, can escape or avoid
challenging environments, including biotic and abiotic factors. The ability to sample a larger
set of conditions renders them less dependent on local variations. In contrast, sessile
organisms are constrained to one location, condemned to endure the vicissitudes of the weather
and of food availability, and cannot escape from other organisms. Although they are generally
able to move actively on a microscopic scale, microorganisms can be considered as mostly
sessile organisms on a macroscopic scale (Andrews, 1998).
An organism delimits a physical separation between its self and the surrounding
environment, thereby establishing an internal compartment. Apart from the particular case of
niche construction (see below), the external environment is uncontrolled by the organism. In
contrast, the internal environment is part of the individual phenotype and is subjected to
sophisticated mechanisms to adjust its composition. This maintains a certain physicochemical
homeostasis which is fundamental to the occurrence of metabolic reactions. The first
replicating molecules to undergo evolution had to cope with direct exposure to the environment.
The evolution of cells allowed emancipation from the hazards of the external environment.
However, this process required the coordinated association of different replicators and their
subsequent functional specialization driven by increased collective survival over evolutionary
time (Szathmary and Maynard Smith, 1997). Individuality was eventually shifted from single
independent replicators to a consortium of interdependent molecules replicating in a
coordinated fashion. Similarly, individuality was shifted from single cells to several related
cells during the evolution of multicellular organisms. The progressive liberation from the
external conditions relies on the construction of a phenotype resulting from the coordinated
action of several entities and is accompanied by modification of the selection units. In this
view, an organism constitutes a microenvironment constructed by an assembly of genes to
cooperatively increase their survival. The coordinated action of the genes is the result of
ongoing evolution. Any genetic variation in one of these genes can be selected if it increases
the survival of the cell. However, a net survival increase may hide possibly inadequate
interactions between some components of the cellular environment. Hence, the phenotypic
variation of one gene product may constitute a change in the internal environment for another
gene, resulting in complex epistatic relationships. The emancipation from the environment
relieved some constraints, while creating others.
Beyond the construction of an internal environment, an organism’s phenotype can also
significantly impact its external environment. Modified environments can affect the progeny
of the organism, resulting in ecological inheritance and influencing biological evolution. This
often overlooked biotic factor is referred to as niche construction (Laland et al., 2000).

The inheritance of acquired characteristics


The first formal theory of evolution was proposed by Jean-Baptiste de Lamarck
(Gould, 2002). Lamarck recognized the transformation of species by way of progressive
modifications and highlighted the theoretical necessity for evolution (Lamarck, 1809).
Following naturalists since Aristotle, he favored the ordering of living forms along a
complexity scale. He proposed that an inner complexifying force drives evolution from the
simplest living entities to the most complex. In his time, the absence of spontaneous
generation was admitted for higher animals, but the issue was not yet settled concerning
microorganisms. Lamarck proposed that complex organisms arose by progressive transformation of
simpler ones, all the way down to microorganisms, which were conceived as simple enough to
appear spontaneously (see Figure 41). In this view, evolution is necessary to explain the
existence of complex creatures that cannot appear by chance alone. In Lamarck’s thought, the
evolution of complexity underlies an idea of progress that is an inherent feature of life. This
process is driven by an elusive force referred to as “Le pouvoir de la vie”. The creative
component of Lamarckian evolution is thus teleological. Nevertheless, Lamarck clearly outlined
the importance of the environment in evolution. By determining the use and disuse of phenotypic
characteristics, the environment drives the modification necessary to evolutionary change. The
frequent and continuous use of a function is expected to promote the development of the
structure carrying this function. In contrast, a characteristic that is not used in an
environment will progressively shrink, until eventual loss. These mechanisms are grounded on
the observation of diverse forms of phenotypic plasticity, for instance the development of the
musculature upon physical exercise or the deformation of certain organs subject to constant
physical constraints.
Following the predominant idea of his time, Lamarck assumed that the changes
affecting the phenotype of the parental organisms are transmitted to their offspring. In
Lamarck's own words: “All the acquisitions or losses wrought by nature on individuals, through
the influence of the environment in which their race has long been placed, and hence through the
influence of the predominant use or permanent disuse of any organ; all these are preserved
by reproduction to the new individuals which arise, provided that the acquired modifications
are common to both sexes, or at least to the individuals which produce the young” (Lamarck,
1809). At the time, the mechanisms underlying heredity were completely unknown and the
inheritance of acquired characters was a common belief. This idea was notably refuted by A.
Weismann (1834-1914), who established the distinction between germ line and soma in
metazoans. Only a subset of cells is transmitted to the next generation, while the vast
majority of cells participate in the elaboration of the phenotype and only serve the
individual. In this context, the transmission of acquired characters requires that genetic
information supposedly received by the soma be communicated to the germ line. The
establishment of the central dogma of molecular biology, which can be summarized as
DNA ↔ RNA → Protein, was the ultimate proof that no information modifying the phenotype
(proteins) can flow back to the genetic information (DNA) (Monod, 1970). The reciprocal
DNA ↔ RNA relationship reflects the existence of reverse transcriptases encoded by
retroelements.
Lamarck had a remarkable intuition concerning the role of the environment in
directing adaptation of individual organisms. Nevertheless, he failed to identify the actual
mechanisms driving this evolution. Half a century later, Darwin proposed that evolution is
driven by natural selection. In his time, the laws of heredity and the nature of the genetic
information were still unknown and his ideas about the generation of variability were extremely
vague. In On the Origin of Species, he wrote: “I have hitherto sometimes spoken as if the
variations… were due to chance. This, of course, is a wholly incorrect expression, but it serves
to acknowledge plainly our ignorance of the cause of each particular variation. [The facts] lead
to the conclusion that variability is generally related to the conditions of life to which each
species has been exposed during several successive generations” (Darwin, 1859). Ignorant of the
source of mutations, Darwin did not reject the idea that organisms may respond to environmental
conditions and furnish the gametes with information enhancing the next generation’s response.
He even suggested that stress might generate the variability upon which natural selection
operates.
The idea that the environment can directly influence heritable variation is appealing
because it straightforwardly couples the rate of evolution to its immediate necessity. The whole
concept was however firmly rejected by the neo-Darwinian synthesis, which established the
unilateral primacy of selection. Any mechanism that somehow suggested a coupling between
environment and mutation was discredited and dubbed Lamarckian.


The Neo-Darwinian focus on selection


A fundamental tenet of the synthetic theory of evolution is that mutations occur
randomly in time and genomic space. Mutations are conceived as accidental errors altering the
integrity of the genetic information transmitted from one generation to the other. Apart from
exposure to mutagenic conditions, the environment is considered to play no role in this
process. In contrast, selection by the environment is conceived as the sole driving force in
evolution, an ordering process that sorts the preexisting random variations generated
spontaneously. No teleological force is any longer required to account for the orientation of
evolution; the process is blindly directed by the selective action of the natural environment.
Evolution is essentially a random process. The shifting balance theory developed by S. Wright
contributed to showing that, theoretically, evolution is short-sighted and favors immediate
adaptation irrespective of its long-term consequences (Johnson, 2008). S. Luria and M. Delbrück
provided the first experimental demonstration of the precedence of mutations over selection.
They exposed bacterial populations to phage infections and carefully analyzed the distribution
of resistant variants selected in independent experiments. They showed that this distribution
was in agreement with the random accumulation of resistance mutations prior to exposure to
the phage (Luria and Delbrück, 1943). E. and J. Lederberg reached similar conclusions by
monitoring the appearance of penicillin-resistant clones (Lederberg and Lederberg, 1952).
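
The reasoning behind the fluctuation analysis can be sketched numerically. In the simplified
Python simulation below (an illustration, not the original analysis of Luria and Delbrück),
parallel cultures are each grown from a single cell. Under the spontaneous-mutation hypothesis,
resistant mutants arise at random during growth and are inherited by their descendants, so that
early mutations yield "jackpot" cultures. Under the induced-mutation hypothesis, every cell has
the same small chance of becoming resistant only upon exposure. The mutation probabilities, the
number of generations and the number of cultures are arbitrary assumptions chosen to keep the
simulation small.

```python
import random
import statistics


def spontaneous_culture(generations, mu, rng):
    """Mutants arise at random during growth, before any selection."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        resistant *= 2                     # existing mutants divide and stay resistant
        daughters = 2 * sensitive          # sensitive cells divide
        new_mutants = sum(rng.random() < mu for _ in range(daughters))
        sensitive = daughters - new_mutants
        resistant += new_mutants
    return resistant


def induced_culture(generations, p, rng):
    """Resistance is acquired only on exposure, independently in each cell."""
    cells = 2 ** generations
    return sum(rng.random() < p for _ in range(cells))


def fano_factor(counts):
    """Variance-to-mean ratio; close to 1 for a Poisson-like distribution."""
    return statistics.variance(counts) / statistics.mean(counts)


def fluctuation_test(cultures=200, generations=12, mu=1e-3, seed=0):
    rng = random.Random(seed)
    spontaneous = [spontaneous_culture(generations, mu, rng) for _ in range(cultures)]
    # Hypothetical choice: give the induced model roughly the same expected
    # number of resistant cells per culture as the spontaneous model.
    p_induced = generations * mu
    induced = [induced_culture(generations, p_induced, rng) for _ in range(cultures)]
    return fano_factor(spontaneous), fano_factor(induced)


if __name__ == "__main__":
    f_spontaneous, f_induced = fluctuation_test()
    print("variance/mean, spontaneous mutations:", round(f_spontaneous, 1))
    print("variance/mean, induced mutations:    ", round(f_induced, 2))
```

With comparable mean numbers of resistant cells, the induced model gives a variance close to
the mean, whereas the spontaneous model gives a much larger variance across cultures. It is
this excess of fluctuation between independent cultures that argues for mutations arising
before exposure to the selective agent.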

Anticipating and responding to environmental changes


The teleological idea that a mysterious force directs evolution is known as
orthogenesis. This concept has ideological implications and has been used to legitimize the
incorporation of evolution into various doctrines. In the USSR under Stalin, Lysenko
emphasized the capacity of the environment to direct heritable variation. Geneticists who did
not agree with that position were harassed, incarcerated or expelled. The rising synthetic
theory of evolution was seen as a bourgeois science that denied the aspirations of the regime
(Fisher, 1948). Religions rather tend to hold that the mysterious force is the hand of God. The
Jesuit P. Teilhard de Chardin famously tried to reconcile the Christian faith with the idea of
evolution (Teilhard de Chardin, 1955). Presently, an institution such as the Catholic Church
officially accepts the existence of theories of evolution, but practically favors an
interpretation whereby important variations are dictated by God rather than by contingent
mutations. Other congregations are less subtle and the belief in creationism is widespread in
some countries (Miller et al., 2006; Berkman et al., 2008). The most moderate proponents of the
Intelligent Design movement still argue that molecular machines are too complex to have evolved
by cumulative selection. Reviving Paley’s design inference, they present this argument as a
proof of the existence of a superior intelligence (Behe, 1996).
Continuous pseudoscientific interpretations of evolution somehow prompted the proponents
of the mainstream synthetic theory to strengthen their position concerning the primacy
of selection. The idea that the environment may influence a step other than selection in
the evolutionary algorithm fell outside the paradigm and was difficult to defend in the
scientific community. A classic illustration concerns the observation of stress-induced
chromosomal rearrangements in maize by B. McClintock (McClintock, 1950), which showed that
high-order genetic changes can be elicited by the environment. It received little credit until
the loci implicated were identified as transposons, providing a mechanistic basis for the
phenomenon (McClintock, 1984).
However, the idea that mutations can be induced by the environment does not contradict
the existing evolutionary theory, but rather appears as a sound consequence of it. A huge
controversy that fuelled extensive research in this field was initiated by the publication of a
paper by John Cairns and colleagues in 1988 (Cairns et al., 1988). In this paper and several
subsequent works, the authors established a genetic system to follow the appearance of
mutations in E. coli. In this setting, bacteria carrying a lacZ gene inactivated by a reversible
frameshift are plated on lactose agar, so that only revertants can grow. The plates
were incubated for six days. During the first two days of incubation, spontaneous revertants
appeared with the fluctuations seen in the Luria-Delbrück experiments. But revertants
continued to appear during the following days. The late mutants were not slow
growers and their number exceeded expectations under the Luria-Delbrück model. Instead,
their distribution was consistent with appearance under selection. Overall, the observed
reversion rate in the selective environment exceeded by 100-fold the rate measured in a
non-selective one. It was initially reported that the increased mutation rate was specifically
directed to lacZ. This process, referred to as directed mutation, supposes that a kind of
molecular cognitive system is able to predict the consequences of mutations, so that only
adaptive loci are targeted… Or it can easily be interpreted as evidence of orthogenesis.
Subsequent studies showed that the increased mutation rate under selection was not restricted
to the lacZ gene, but distributed over the whole genome, though the region surrounding lacZ
was more variable. Furthermore, the mutational signature observed under selection was found
to be different from the one observed in the absence of selection. It thus appeared that a
distinct mechanism is responsible for increased mutagenesis. These results fostered a
comprehensive research effort and the experimental system was fully dissected (Roth et al.,
2006; Galhardo et al., 2007).
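The statistical argument that separates the two hypotheses can be made concrete with a small
simulation. The Python sketch below is only an illustration and not the analysis used by Luria
and Delbrück or by Cairns and colleagues. All parameter values (number of cultures, number of
generations, mutation probability) and all function names are assumptions chosen for the
example. If mutations arise at random during growth, rare early events produce large clones and
the variance of mutant counts across cultures greatly exceeds the mean. If mutations arise
independently in each cell once under selection, the counts are close to a Poisson distribution
and the variance is close to the mean.

```python
import random
from statistics import mean, pvariance


def binomial(n, p, rng):
    """Number of successes in n independent trials of probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def mutants_before_selection(generations, mu, rng):
    """Mutations arise at random during growth, before any selection."""
    wild, mutant = 1, 0
    for _ in range(generations):
        daughters = 2 * wild
        new = binomial(daughters, mu, rng)  # daughters that mutate at this division
        wild = daughters - new
        mutant = 2 * mutant + new           # existing mutant clones double
    return mutant


def mutants_under_selection(generations, mu, rng):
    """Each cell of the grown culture mutates independently once plated."""
    n_cells = 2 ** generations
    # Probability chosen (illustrative assumption) so that both models
    # yield the same expected number of mutants per culture.
    return binomial(n_cells, generations * mu, rng)


def variance_to_mean(counts):
    return pvariance(counts) / mean(counts)


def compare(cultures=300, generations=10, mu=2e-3, seed=1):
    rng = random.Random(seed)
    before = [mutants_before_selection(generations, mu, rng) for _ in range(cultures)]
    under = [mutants_under_selection(generations, mu, rng) for _ in range(cultures)]
    return variance_to_mean(before), variance_to_mean(under)


if __name__ == "__main__":
    ratio_before, ratio_under = compare()
    print(f"variance/mean, mutations before selection: {ratio_before:.1f}")
    print(f"variance/mean, mutations under selection:  {ratio_under:.1f}")
```

A variance-to-mean ratio far above one is the signature of mutations that predate selection,
whereas a ratio near one is what would be expected of the late revertants described above.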
These studies and others highlighted that the role of the environment is not restricted
to selection alone. Clearly, the results of the Luria-Delbrück (Luria and Delbrück, 1943) and
Lederberg-Lederberg (Lederberg and Lederberg, 1952) experiments rapidly gained general
acceptance because they fitted paradigmatic expectations, in spite of their restricted biological
significance (harsh selective pressure, specific type of mutations). As will be illustrated
below, cells evolved mechanisms that sense stressful conditions and happen to convert the
collected information into mutations in response (see Stress-induced mutagenesis, pp 47-60).
Thus, the generation of variability is not necessarily constant in time but can be informed by
the environment. Moreover, mutation rates are variable not only through time but also
in genomic space. Indeed, descriptions of genetic mechanisms dedicated to or favoring
localized and oriented mutations have accumulated over the last decades. The detailed
presentation of these mechanisms is covered in the introduction of this thesis (see Programmed
generation of genetic variations, pp 60-99).
Collectively, these processes participate in a kind of molecular intelligence that allows
cells to anticipate changes or to direct genetic changes according to the environment, and they
are often perceived as Lamarckian. However, these discoveries do not contradict but extend the
classical synthetic theory of evolution (Thaler, 1994). In the final analysis, mutations always
appear in a random fashion. Genomes simply evolved the capacity to control this randomness,
so that the production of variation can be tuned to the demands of the environment. To borrow
a well-put phrase, this reflects the fact that chance favors the prepared genome
(Caporale, 1999).


REFERENCES

Acar M, Mettetal JT, van Oudenaarden A (2008) Stochastic switching as a survival strategy
in fluctuating environments. Nat Genet 40(4): 471-475.
Achaz G, Coissac E, Netter P, Rocha EP (2003) Associations between inverted repeats and
the structural evolution of bacterial genomes. Genetics 164(4): 1279-1289.
Ackermann M, Chao L (2006) DNA sequences shaped by selection for stability. PLoS genet-
ics 2(2).
Aertsen A, Michiels CW (2005) Mrr instigates the SOS response after high pressure stress in
Escherichia coli. Mol Microbiol 58(5): 1381-1391.
Aertsen A, Michiels CW (2006) Upstream of the SOS response: figure out the trigger. Trends
in microbiology 14(10): 423.
Aizenman E, Engelberg-Kulka H, Glaser G (1996) An Escherichia coli chromosomal "addic-
tion module" regulated by guanosine [corrected] 3',5'-bispyrophosphate: a model for
programmed bacterial cell death. Proc Natl Acad Sci U S A 93(12): 6059-6063.
Aleshkin GI, Kadzhaev KV, Markov AP (1998) High and low UV-dose responses in SOS-
induction of the precise excision of transposons Tn1, Tn5 and Tn10 in Escherichia coli.
Mutat Res 401(1-2): 179-191.
Alon U (2007) Network motifs: theory and experimental approaches. Nature Reviews Genet-
ics 8(6): 461.
Ancel LW, Fontana W (2000) Plasticity, evolvability, and modularity in RNA. J Exp Zool
288(3): 242-283.
Andrews JH (1998) Bacteria as modular organisms. Annual review of microbiology 52: 126.
Aoki SK, Pamma R, Hernday AD, Bickham JE, Braaten BA et al. (2005) Contact-dependent
inhibition of growth in Escherichia coli. Science 309(5738): 1245-1248.
Archetti M (2006) Genetic robustness and selection at the protein level for synonymous
codons. J Evol Biol 19(2): 353-365.
Au N, Kuester-Schoeck E, Mandava V, Bothwell LE, Canny SP et al. (2005) Genetic compo-
sition of the Bacillus subtilis SOS system. Journal of bacteriology 187(22): 7666.
Avison MB (2005) New approaches to combating antimicrobial drug resistance. Genome bi-
ology 6(13).

Bagdasarian M, Bailone A, Bagdasarian MM, Manning PA, Lurz R et al. (1986) An inhibitor
of SOS induction, specified by a plasmid locus in Escherichia coli. Proc Natl Acad Sci
U S A 83(15): 5723-5726.
Bailone A, Backman A, Sommer S, Celerier J, Bagdasarian MM et al. (1988) PsiB polypep-
tide prevents activation of RecA protein in Escherichia coli. Mol Gen Genet 214(3):
389-395.
Baldwin JM (1896) 'A New Factor In Evolution.' Science 4(83): 139.
Barbour AG, Dai Q, Restrepo BI, Stoenner HG, Frank SA (2006) Pathogen escape from host
immunity by a genome program for antigenic variation. Proc Natl Acad Sci U S A
103(48): 18290-18295.
Barker A, Clark CA, Manning PA (1994) Identification of VCR, a repeated sequence associ-
ated with a locus encoding a hemagglutinin in Vibrio cholerae O1. J Bacteriol
176(17): 5450-5458.
Barlow RS, Gobius KS (2006) Diverse class 2 integrons in bacteria from beef cattle sources.
J Antimicrob Chemother 58(6): 1133-1138.
Barnes RL, McCulloch R (2007) Trypanosoma brucei homologous recombination is depend-
ent on substrate length and homology, though displays a differential dependence on
mismatch repair as substrate length decreases. Nucleic Acids Res 35(10): 3478-3493.
Bartlett DH, Azam F (2005) Microbiology. Chitin, cholera, and competence. Science
310(5755): 1775-1777.
Bates S, Roscoe RA, Althorpe NJ, Brammar WJ, Wilkins BM (1999) Expression of leading
region genes on IncI1 plasmid ColIb-P9: genetic evidence for single-stranded DNA
transcription. Microbiology 145(Pt 10): 2655-2662.
Baute J, Depicker A (2008) Base excision repair and its role in maintaining genome stability.
Critical reviews in biochemistry and molecular biology 43(4): 276.
Bayles KW (2007) The biological role of death and lysis in biofilm development. Nat Rev
Microbiol 5(9): 721-726.
Bayliss CD, van de Ven T, Moxon ER (2002) Mutations in polI but not mutSLH destabilize
Haemophilus influenzae tetranucleotide repeats. Embo J 21(6): 1465-1476.
Beaber JW, Hochhut B, Waldor MK (2004) SOS response promotes horizontal dissemination
of antibiotic resistance genes. Nature 427(6969): 74.
Behe MJ (1996) Darwin's Black Box: The Biochemical Challenge to Evolution: Free Press.
Benavente R, Volff J-N, editors (2009) Meiosis. Basel: Karger.
Bennett M, Hasty J (2007) A DNA methylation based switch generates bistable gene expres-
sion. Nature Genetics 39(2): 147.
Bergman A, Siegal ML (2003) Evolutionary capacitance as a general feature of complex
gene networks. Nature 424(6948): 552.
Bergman MP, Engering A, Smits HH, van Vliet SJ, van Bodegraven AA et al. (2004) Helico-
bacter pylori modulates the T helper cell 1/T helper cell 2 balance through phase-
variable interaction between lipopolysaccharide and DC-SIGN. The Journal of ex-
perimental medicine 200(8): 990.
Berkman MB, Pacheco JS, Plutzer E (2008) Evolution and creationism in America's class-
rooms: a national portrait. PLoS Biol 6(5): e124.
Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H et al. (2005) The genome of
the African trypanosome Trypanosoma brucei. Science 309(5733): 416-422.
Bershtein S, Goldin K, Tawfik DS (2008) Intense neutral drifts yield robust and evolvable
consensus proteins. J Mol Biol 379(5): 1029-1044.
Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS (2006) Robustness-epistasis link
shapes the fitness landscape of a randomly drifting protein. Nature 444(7121): 929-932.
Bershtein S, Tawfik DS (2008) Ohno's model revisited: measuring the frequency of potentially
adaptive mutations under various mutational drifts. Mol Biol Evol 25(11): 2311-2318.
Bichara M, Wagner J, Lambert IB (2006) Mechanisms of tandem repeat instability in bacteria.
Mutat Res 598(1-2): 144-163.
Bird A (2007) Perceptions of epigenetics. Nature 447(7143): 396-398.
Biskri L, Bouvier M, Guerout AM, Boisnard S, Mazel D (2005) Comparative study of class 1
integron and Vibrio cholerae superintegron integrase activities. J Bacteriol 187(5):
1740-1750.
Biskri L, Mazel D (2003) Erythromycin esterase gene ere(A) is located in a functional gene
cassette in an unusual class 2 integron. Antimicrob Agents Chemother 47(10): 3326-
3331.
Bissonnette L, Champetier S, Buisson JP, Roy PH (1991) Characterization of the nonenzy-
matic chloramphenicol resistance (cmlA) gene of the In4 integron of Tn1696: similar-
ity of the product to transmembrane transport proteins. J Bacteriol 173(14): 4493-
4502.
Bjedov I, Tenaillon O, Gérard B, Souza V, Denamur E et al. (2003) Stress-induced
mutagenesis in bacteria. Science 300(5624): 1409.
Blomfield IC (2001) The regulation of pap and type 1 fimbriation in Escherichia coli. Adv
Microb Physiol 45: 1-49.
Bloom J, Lu Z, Chen D, Raval A, Venturelli O et al. (2007) Evolution favors protein muta-
tional robustness in sufficiently large populations. BMC Biology 5(1).
Bloom JD, Labthavikul ST, Otey CR, Arnold FH (2006) Protein stability promotes evolvabil-
ity. Proc Natl Acad Sci U S A 103(15): 5869 - 5874.
Blount Z, Borland C, Lenski R (2008) Historical contingency and the evolution of a key inno-
vation in an experimental population of Escherichia coli. Proceedings of the National
Academy of Sciences: 0803151105.
Boe L, Danielsen M, Knudsen S, Petersen JB, Maymann J et al. (2000) The frequency of mu-
tators in populations of Escherichia coli. Mutat Res 448(1): 47-55.
Bolker M, Kahmann R (1989) The Escherichia coli regulatory protein OxyR discriminates
between methylated and unmethylated states of the phage Mu mom promoter. Embo J
8(8): 2403-2410.
Bormann NE, Cleary PP (1997) Transcriptional analysis of mga, a regulatory gene in Strep-
tococcus pyogenes: identification of monocistronic and bicistronic transcripts that
phase vary. Gene 200(1-2): 125-134.
Borst P, Greaves DR (1987) Programmed gene rearrangements altering gene expression.
Science 235(4789): 667.
Boucher Y, Labbate M, Koenig JE, Stokes HW (2007) Integrons: mobilizable platforms that
promote genetic diversity in bacteria. Trends in microbiology 15(7): 309.
Boucher Y, Nesbo C, Joss M, Robinson A, Mabbutt B et al. (2006) Recovery and evolution-
ary analysis of complete integron gene cassette arrays from Vibrio. BMC Evol Biol
6(1).
Bouvier M, Demarre G, Mazel D (2005) Integron cassette insertion: a recombination process
involving a folded single strand substrate. Embo J 24(24): 4356-4367.
Boyd F, Almagro-Moreno S, Parent M (2009) Genomic islands are dynamic, ancient integra-
tive elements in bacterial evolution. Trends in Microbiology 17(2): 53.
Braaten BA, Nou X, Kaltenbach LS, Low DA (1994) Methylation patterns in pap regulatory
DNA control pyelonephritis-associated pili phase variation in E. coli. Cell 76(3): 577-
588.
Bunny K, Liu J, Roth J (2002) Phenotypes of lexA mutations in Salmonella enterica: evidence
for a lethal lexA null phenotype due to the Fels-2 prophage. J Bacteriol 184(22):
6235-6249.
Bunny KL, Hall RM, Stokes HW (1995) New mobile gene cassettes containing an aminogly-
coside resistance gene, aacA7, and a chloramphenicol resistance gene, catB3, in an
integron in pBWH301. Antimicrob Agents Chemother 39(3): 686-693.
Burbulys D, Trach KA, Hoch JA (1991) Initiation of sporulation in B. subtilis is controlled by
a multicomponent phosphorelay. Cell 64(3): 545-552.
Burma S, Chen BP, Chen DJ (2006) Role of non-homologous end joining (NHEJ) in main-
taining genomic integrity. DNA repair 5(9-10): 1048.
Bushman F (2004) Gene regulation: selfish elements make a mark. Nature 429(6989): 253-
255.

Cadet J, Douki T, Gasparutto D, Ravanat JL (2003) Oxidative damage to DNA: formation,
measurement and biochemical features. Mutat Res 531(1-2): 5-23.
Caetano-Anollés G, Wang M, Caetano-Anollés D, Mittenthal JE (2009) The origin, evolution
and structure of the protein world. Biochem J 417(3): 621-637.
Cairns J, Overbaugh J, Miller S (1988) The origin of mutants. Nature 335(6186): 145.
Cameron FH, Groot Obbink DJ, Ackerman VP, Hall RM (1986) Nucleotide sequence of the
AAD(2'') aminoglycoside adenylyltransferase determinant aadB. Evolutionary rela-
tionship of this region with those surrounding aadA in R538-1 and dhfrII in R388.
Nucleic Acids Res 14(21): 8625-8635.
Caporale LH (1999) Chance favors the prepared genome. Ann N Y Acad Sci 870: 1-21.
Casadesus J, Low D (2006) Epigenetic gene regulation in the bacterial world. Microbiol Mol
Biol Rev 70(3): 856.
Cavalier-Smith T (2002) Origins of the machinery of recombination and sex. Heredity 88(2):
125-141.
Centron D, Roy PH (2002) Presence of a group II intron in a multiresistant Serratia marces-
cens strain that harbors three integrons and a novel gene fusion. Antimicrob Agents
Chemother 46(5): 1402-1409.
Cerdeno-Tarraga AM, Patrick S, Crossman LC, Blakely G, Abratt V et al. (2005) Extensive
DNA inversions in the B. fragilis genome control variable gene expression. Science
307(5714): 1465.
Chain E, Florey H, Gardner A, Heatley N, Jennings M et al. (1940) Penicillin as a chemo-
therapeutic agent. The Lancet: 226-228.
Chang DK, Metzgar D, Wills C, Boland CR (2001) Microsatellites in the eukaryotic DNA
mismatch repair genes as modulators of evolutionary mutation rate. Genome research
11(7): 1146.
Chao L, Vargas C, Spear BB, Cox EC (1983) Transposable elements as mutator genes in evo-
lution. Nature 303(5918): 635.
Chen CY, Wu KM, Chang YC, Chang CH, Tsai HC et al. (2003) Comparative genome analy-
sis of Vibrio vulnificus, a marine pathogen. Genome research 13(12): 2587.
Chen I, Dubnau D (2004) DNA uptake during bacterial transformation. Nat Rev Microbiol
2(3): 241-249.
Chernoff YO, Lindquist SL, Ono B, Inge-Vechtomov SG, Liebman SW (1995) Role of the
chaperone protein Hsp104 in propagation of the yeast prion-like factor [psi+]. Sci-
ence 268(5212): 880-884.
Choi PJ, Cai L, Frieda K, Xie XS (2008) A stochastic single-molecule event triggers pheno-
type switching of a bacterial cell. Science 322(5900): 442-446.
Chu D, Blomfield IC (2007) Orientational control is an efficient control mechanism for phase
switching in the E. coli fim system. Journal of theoretical biology 244(3): 551.
Chung JD, Stephanopoulos G, Ireton K, Grossman AD (1994) Gene expression in single cells
of Bacillus subtilis: evidence that a threshold mechanism controls the initiation of
sporulation. J Bacteriol 176(7): 1977-1984.
Cirz RT, Chin JK, Andes DR, de Crécy-Lagard V, Craig WA et al. (2005) Inhibition of mutation
and combating the evolution of antibiotic resistance. PLoS Biol 3(6).
Clancy S (2008a) DNA damage & repair: mechanisms for maintaining DNA integrity. Nature
Education 1(1).
Clancy S (2008b) Genetic mutation. Nature Education 1(1).
Clark CA, Purins L, Kaewrakon P, Manning PA (1997) VCR repetitive sequence elements in
the Vibrio cholerae chromosome constitute a mega-integron. Mol Microbiol 26(5):
1137-1138.
Claverys JP, Martin B (2003) Bacterial "competence" genes: signatures of active transforma-
tion, or only remnants? Trends Microbiol 11(4): 161-165.
Claverys JP, Prudhomme M, Martin B (2006) Induction of competence regulons as a general
response to stress in gram-positive bacteria. Annu Rev Microbiol 60: 451-475.
Cohn M, Horibata K (1959) Inhibition by glucose of the induced synthesis of the beta-
galactoside-enzyme system of Escherichia coli. Analysis of maintenance. J Bacteriol
78: 601-612.
Coleman NV, Holmes AJ (2005) The native Pseudomonas stutzeri strain Q chromosomal in-
tegron can capture and express cassette-associated genes. Microbiology (Reading,
England) 151(Pt 6): 1864.
Collis CM, Grammaticopoulos G, Briton J, Stokes HW, Hall RM (1993) Site-specific inser-
tion of gene cassettes into integrons. Molecular microbiology 9(1): 52.
Collis CM, Hall RM (1992) Gene cassettes from the insert region of integrons are excised as
covalently closed circles. Molecular microbiology 6(19): 2885.
Collis CM, Hall RM (1995) Expression of antibiotic resistance genes in the integrated cas-
settes of integrons. Antimicrob Agents Chemother 39(1): 162.
Collis CM, Kim MJ, Partridge SR, Stokes HW, Hall RM (2002a) Characterization of the
class 3 integron and the site-specific recombination system it determines. J Bacteriol
184(11): 3017-3026.
Collis CM, Kim MJ, Stokes HW, Hall RM (1998) Binding of the purified integron DNA integrase
IntI1 to integron- and cassette-associated recombination sites. Mol Microbiol
29(2): 477-490.
Collis CM, Kim MJ, Stokes HW, Hall RM (2002b) Integron-encoded IntI integrases prefer-
entially recognize the adjacent cognate attI site in recombination with a 59-be site.
Molecular microbiology 46(5): 1427.
Collis CM, Recchia GD, Kim MJ, Stokes HW, Hall RM (2001) Efficiency of recombination
reactions catalyzed by class 1 integron integrase IntI1. Journal of bacteriology 183(8):
2542.
Conant GC, Wolfe KH (2008a) Turning a hobby into a job: How duplicated genes find new
functions. Nat Rev Genet 9(12): 938.
Conant GC, Wolfe KH (2008b) Turning a hobby into a job: how duplicated genes find new
functions. Nat Rev Genet 9(12): 938-950.
Cooper GM, Nickerson DA, Eichler EE (2007a) Mutational and selective effects on copy-
number variants in the human genome. Nat Genet 39(7 Suppl): S22-29.
Cooper TF, Ostrowski EA, Travisano M (2007b) A negative relationship between mutation
pleiotropy and fitness effect in yeast. Evolution 61(6): 1495-1499.
Cooper TF, Remold SK, Lenski RE, Schneider D (2008) Expression profiles reveal parallel
evolution of epistatic interactions involving the CRP regulon in Escherichia coli.
PLoS Genet 4(2): e35.
Cooper TF, Rozen DE, Lenski RE (2003) Parallel changes in gene expression after 20,000
generations of evolution in Escherichia coli. Proc Natl Acad Sci U S A 100(3): 1072-1077.
Cooper VS, Lenski RE (2000) The population genetics of ecological specialization in evolving
Escherichia coli populations. Nature 407(6805): 736-739.
Cooper VS, Schneider D, Blot M, Lenski RE (2001) Mechanisms causing rapid and parallel
losses of ribose catabolism in evolving populations of Escherichia coli B. J Bacteriol
183(9): 2834-2841.
Courcelle J, Khodursky A, Peter B, Brown PO, Hanawalt PC (2001) Comparative gene ex-
pression profiles following UV exposure in wild-type and SOS-deficient Escherichia
coli. Genetics 158(1): 64.
Coyne MJ, Weinacht KG, Krinos CM, Comstock LE (2003) Mpi recombinase globally modu-
lates the surface architecture of a human commensal bacterium. Proc Natl Acad Sci U
S A 100(18): 10446-10451.
Crameri A, Whitehorn EA, Tate E, Stemmer WP (1996) Improved green fluorescent protein
by molecular evolution using DNA shuffling. Nat Biotechnol 14(3): 315-319.
Crozat E, Philippe N, Lenski RE, Geiselmann J, Schneider D (2005) Long-term experimental
evolution in Escherichia coli. XII. DNA topology as a key target of selection. Genetics
169(2): 523-532.
Cubas P, Vincent C, Coen E (1999) An epigenetic mutation responsible for natural variation
in floral symmetry. Nature 401(6749): 161.

Dagan T, Graur D (2005) The comparative method rules! Codon volatility cannot detect positive
Darwinian selection using a single genome sequence. Mol Biol Evol 22(3): 496-500.
Dai J, Xie W, Brady TL, Gao J, Voytas DF (2007) Phosphorylation regulates integration of
the yeast Ty5 retrotransposon into heterochromatin. Mol Cell 27(2): 289-299.
Dai Q, Restrepo BI, Porcella SF, Raffel SJ, Schwan TG et al. (2006) Antigenic variation by
Borrelia hermsii occurs through recombination between extragenic repetitive ele-
ments on linear plasmids. Mol Microbiol 60(6): 1329-1343.
Darwin C (1859) On the Origin of Species By Means of Natural Selection: Gramercy Books.
Darwin C, Wallace A (1858) On the Tendency of Species to form Varieties; and on the Per-
petuation of Varieties and Species by Natural Means of Selection. Journal of the Pro-
ceedings of the Linnean Society of London: 46-50.
Davies J (2006) Are antibiotics naturally antibiotics? J Ind Microbiol Biotechnol 33(7): 496-
499.
Dawid S, Barenkamp SJ, St Geme JW, 3rd (1999) Variation in expression of the Haemophi-
lus influenzae HMW adhesins: a prokaryotic system reminiscent of eukaryotes. Proc
Natl Acad Sci U S A 96(3): 1077-1082.
Dawkins R (1976) The Selfish Gene: Oxford University Press.
Dawkins R (1986) The Blind Watchmaker: W. W. Norton & Company, Inc.
De Bolle X, Bayliss CD, Field D, van de Ven T, Saunders NJ et al. (2000) The length of a
tetranucleotide repeat tract in Haemophilus influenzae determines the phase variation
rate of a gene with homology to type III DNA methyltransferases. Molecular microbi-
ology 35(1): 222.
de Visser JA, Hermisson J, Wagner GP, Ancel Meyers L, Bagheri-Chaichian H et al. (2003)
Perspective: Evolution and detection of genetic robustness. Evolution 57(9): 1959-
1972.
De Visser JAGM (2002) The fate of microbial mutators. Microbiology (Reading, England)
148(Pt 5): 1252.
de Visser JAGM, Elena SF (2007) The evolution of sex: empirical insights into the roles of
epistasis and drift. Nat Rev Genet 8(2): 139.
De Visser JAGM, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE (1999) Diminishing returns
from mutation supply rate in asexual populations. Science 283(5400): 404-406.
Deaconescu AM, Savery N, Darst SA (2007) The bacterial transcription repair coupling fac-
tor. Current opinion in structural biology 17(1): 102.
Demarre G, Frumerie C, Gopaul DN, Mazel D (2007) Identification of key structural determinants
of the IntI1 integron integrase that influence attC x attI1 recombination efficiency.
Nucleic Acids Res 35(19): 6475-6489.
Denamur E, Lecointre G, Darlu P, Tenaillon O, Acquaviva C et al. (2000) Evolutionary im-
plications of the frequent horizontal transfer of mismatch repair genes. Cell 103(5):
711-721.
Denamur E, Matic I (2006) Evolution of mutation rates in bacteria. Molecular Microbiology
60(4): 827.
Dickinson WJ, Seger J (1999) Cause and effect in evolution. Nature 399(6731): 30.
Dillingham M, Kowalczykowski S (2008) RecBCD Enzyme and the Repair of Double-
Stranded DNA Breaks. Microbiol Mol Biol Rev 72(4): 671.
Drake J, Charlesworth B, Charlesworth D, Crow J (1998) Rates of Spontaneous Mutation.
Genetics 148(4): 1686.
Drake JW (1991) A constant rate of spontaneous mutation in DNA-based microbes. Proceed-
ings of the National Academy of Sciences of the United States of America 88(16):
7164.
Drummond DA, Wilke CO (2008) Mistranslation-induced protein misfolding as a dominant
constraint on coding-sequence evolution. Cell 134(2): 341-352.
Dubnau D (1999) DNA uptake in bacteria. Annu Rev Microbiol 53: 217-244.
Dubnau D, Losick R (2006) Bistability in bacteria. Molecular Microbiology 61(3): 564-572.
Dubois V, Debreyer C, Litvak S, Quentin C, Parissi V (2007) A new in vitro strand transfer
assay for monitoring bacterial class 1 integron recombinase IntI1 activity. PLoS ONE
2(12).
Dzidic S, Petranovic M (2003) Mismatch repair in the antimutator Escherichia coli mud. Mu-
tat Res 522(1-2): 27-32.

Eaglestone SS, Cox BS, Tuite MF (1999) Translation termination efficiency can be regulated
in Saccharomyces cerevisiae by environmental stress through a prion-mediated
mechanism. Embo J 18(7): 1974-1981.
Ebina H, Levin H (2007) Stress Management: How Cells Take Control of Their Transposons.
Molecular Cell 27(2): 181.
Eichenbaum Z, Livneh Z (1998) UV light induces IS10 transposition in Escherichia coli. Ge-
netics 149(3): 1173-1181.
El-Labany S, Sohanpal BK, Lahooti M, Akerman R, Blomfield IC (2003) Distant cis-active
sequences and sialic acid control the expression of fimB in Escherichia coli K-12. Mol
Microbiol 49(4): 1109-1118.
Elbourne LD, Hall RM (2006) Gene cassette encoding a 3-N-aminoglycoside acetyltrans-
ferase in a chromosomal integron. Antimicrob Agents Chemother 50(6): 2270-2271.
Elena SF, Carrasco P, Daros JA, Sanjuan R (2006) Mechanisms of genetic robustness in RNA
viruses. EMBO Rep 7(2): 168-173.
Elsaied H, Stokes HW, Nakamura T et al. (2007) Novel and diverse integron integrase
genes and integron-like gene cassettes are prevalent in deep-sea hydrothermal
vents. Environmental Microbiology 9(9): 2312.
Erill I, Campoy S, Mazon G, Barbé J (2006) Dispersal and regulation of an adaptive
mutagenesis cassette in the bacteria domain. Nucleic Acids Res 34(1): 66-77.
Erill I, Campoy S, Barbé J (2007) Aeons of distress: an evolutionary perspective on the bacte-
rial SOS response. FEMS microbiology reviews 31(6): 656.
Eyre-Walker A, Keightley PD (2007) The distribution of fitness effects of new mutations. Nat
Rev Genet 8(8): 610-618.

Fares MA, Ruiz-Gonzalez MX, Moya A, Elena SF, Barrio E (2002) Endosymbiotic bacteria:
GroEL buffers against deleterious mutations. Nature 417(6887): 398.
Feder ME, Hofmann GE (1999) Heat-shock proteins, molecular chaperones, and the stress
response: evolutionary and ecological physiology. Annual Review of Physiology
61(1): 243-282.
Feil R (2008) Epigenetics, an emerging discipline with broad implications. C R Biol 331(11):
837-843.
Feng L, Reeves P, Lan R, Ren Y, Gao C et al. (2008) A Recalibrated Molecular Clock and
Independent Origins for the Cholera Pandemic Clones. PLoS ONE 3(12): e4053.
Fernandez De Henestrosa AR, Ogi T, Aoyagi S, Chafin D, Hayes JJ et al. (2000) Identifica-
tion of additional genes belonging to the LexA regulon in Escherichia coli. Mol Mi-
crobiol 35(6): 1560-1572.
Fisher RA (1930) The Genetical Theory of Natural Selection: Oxford University Press.
Fisher RA (1948) What Sort of Man is Lysenko? Listener 40: 875.
Fleming A (1929) On the antibacterial action of cultures of a penicillium, with special refer-
ence to their use in the isolation of B. influenzae. British Journal of Experimental Pa-
thology 10: 226-236.
Fluit AC, Schmitz FJ (2004) Resistance integrons and super-integrons. Clinical microbiology
and infection: the official publication of the European Society of Clinical Microbiol-
ogy and Infectious Diseases 10(4): 288.
Fondon J, Garner H (2004) Molecular origins of rapid and continuous morphological evolu-
tion. PNAS 101(52): 18063.
Fonseca E, dos Santos Freitas F, Vieira V, Vicente A (2008) New qnr Gene Cassettes Associ-
ated with Superintegron Repeats in Vibrio cholerae O1. Emerg Infect Dis.
Francia MV, de la Cruz F, Garcia Lobo JM (1993) Secondary-sites for integration mediated
by the Tn21 integrase. Mol Microbiol 10(4): 823-828.
Francia MV, Garcia Lobo JM (1996) Gene integration in the Escherichia coli chromosome
mediated by Tn21 integrase (Int21). J Bacteriol 178(3): 894-898.
Francia MV, Avila P, de la Cruz F, Garcia Lobo JM (1997) A hot spot in plasmid F for site-
specific recombination mediated by Tn21 integron integrase. J Bacteriol 179(13):
4419-4425.
Francia MV, Zabala JC, de la Cruz F, Garcia Lobo JM (1999) The IntI1 integron integrase
preferentially binds single-stranded DNA of the attC site. J Bacteriol 181(21): 6844-
6849.
Friedberg E, Wood R, Walker G, Siede W, Schultz R et al. (2005a) The SOS response of pro-
karyotes to DNA damage. DNA Repair and Mutagenesis. 2 ed: ASM Press.
Friedberg EC, Lehmann AR, Fuchs RP (2005b) Trading places: how do DNA polymerases
switch during translesion DNA synthesis? Molecular cell 18(5): 505.
Friedman N, Vardi S, Ronen M, Alon U, Stavans J (2005) Precise temporal modulation in the
response of the SOS DNA repair network in individual bacteria. PLoS Biol 3(7).
Fuchs RP, Fujii S, Wagner J (2004) Properties and functions of Escherichia coli: Pol IV and
Pol V. Advances in protein chemistry 69: 264.
Fujii S, Isogawa A, Fuchs RP (2006) RecFOR proteins are essential for Pol V-mediated
translesion synthesis and mutagenesis. Embo J 25(24): 5754-5763.
Funchain P, Yeung A, Stewart JL, Lin R, Slupska MM et al. (2000) The consequences of
growth of a mutator strain of Escherichia coli as measured by loss of function among
multiple gene targets and loss of fitness. Genetics 154(3): 970.

Galhardo RS, Hastings PJ, Rosenberg SM (2007) Mutation as a stress response and the regu-
lation of evolvability. Crit Rev Biochem Mol Biol 42(5): 435.
Gerdes K, Christensen SK, Lobner-Olesen A (2005) Prokaryotic toxin-antitoxin stress re-
sponse loci. Nat Rev Microbiol 3(5): 371-382.
Gerhart J, Kirschner M (2007) The theory of facilitated variation. Proc Natl Acad Sci U S A.
Gerrish PJ, Lenski RE (1998) The fate of competing beneficial mutations in an asexual popu-
lation. Genetica 102-103(1-6): 127-144.
Gillings MR, Holley MP, Stokes HW, Holmes AJ (2005) Integrons in Xanthomonas: A
source of species genome diversity. Proc Natl Acad Sci U S A.
Gillor O, Vriezen JA, Riley MA (2008) The role of SOS boxes in enteric bacteriocin regula-
tion. Microbiology 154(Pt 6): 1783-1792.
Giraud A, Matic I, Tenaillon O, Clara A, Radman M et al. (2001) Costs and Benefits of High
Mutation Rates: Adaptive Evolution of Bacteria in the Mouse Gut. Science 291(5513):
2608.
Gniadkowski M (2008) Evolution of extended-spectrum beta-lactamases by mutation. Clin
Microbiol Infect 14(Suppl 1): 11-32.
Goldberg AD, Allis CD, Bernstein E (2007) Epigenetics: A Landscape Takes Shape. Cell
128(4): 635.
Golub E, Bailone A, Devoret R (1988) A gene encoding an SOS inhibitor is present in differ-
ent conjugative plasmids. J Bacteriol 170(9): 4392-4394.
Gonzalez-Pastor JE, Hobbs EC, Losick R (2003) Cannibalism by sporulating bacteria. Sci-
ence 301(5632): 510-513. Epub 2003 Jun 2019.
Goodman M (2002) Error-prone repair DNA polymerases in prokaryotes and eukaryotes.
Annual Review of Biochemistry 71(1): 50.
Goransson M, Sonden B, Nilsson P, Dagberg B, Forsman K et al. (1990) Transcriptional si-
lencing and thermoregulation of gene expression in Escherichia coli. Nature
344(6267): 682-685.
Gordon AJE, Halliday JA, Blankschien MD, Burns PA, Yatagai F et al. (2009) Transcrip-
tional Infidelity Promotes Heritable Phenotypic Change in a Bistable Gene Network.
PLoS Biology 7(2): e44.
Gould SJ (1996) Full House: The Spread of Excellence from Plato to Darwin: Three Rivers
Press.
Gould SJ (2002) The Structure of Evolutionary Theory: Belknap Press.
Gouyon P, Henry J, Arnould J (2002) Gene Avatars: The Neo-Darwinian Theory of Evolu-
tion: Springer.
Grainge I, Jayaram M (1999) The integrase family of recombinase: organization and function
of the active site. Mol Microbiol 33(3): 449-456.
Gravel A, Fournier B, Roy PH (1998) DNA complexes obtained with the integron integrase
IntI1 at the attI1 site. Nucleic Acids Res 26(19): 4347-4355.
Greig D, Borts RH, Louis EJ (1998) The effect of sex on adaptation to high temperature in
heterozygous and homozygous yeast. Proc Biol Sci 265(1400): 1017-1023.
Grimberg B, Zeyl C (2005) The effects of sex and mutation rate on adaptation in test tubes
and to mouse hosts by Saccharomyces cerevisiae. Evolution 59(2): 431-438.
Grindley ND, Whiteson KL, Rice PA (2006) Mechanisms of Site-Specific Recombination.
Annu Rev Biochem.
Groban ES, Johnson MB, Banky P, Burnett PG, Calderon GL et al. (2005) Binding of the Ba-
cillus subtilis LexA protein to the SOS operator. Nucleic acids research 33(19): 6295.
Guiral S, Mitchell TJ, Martin B, Claverys JP (2005) Competence-programmed predation of
noncompetent cells in the human pathogen Streptococcus pneumoniae: genetic re-
quirements. Proc Natl Acad Sci U S A 102(24): 8710-8715. Epub 2005 May 8731.
Gupta R, Tawfik D (2008) Directed enzyme evolution via small and effective neutral drift li-
braries. Nature Methods 5(11): 942.
Gustafsson (1979) Linnaeus' Peloria: The history of a monster. TAG Theoretical and Applied
Genetics 54(6): 248.

Haber JE (1998) Mating-type gene switching in Saccharomyces cerevisiae. Annual Review of
Genetics 32(1): 561-599.
Hall B (1999) Transposable elements as activators of cryptic genes in E. coli. Genetica
107(1): 181.
Hall RM, Brookes DE, Stokes HW (1991) Site-specific insertion of genes into integrons: role
of the 59-base element and determination of the recombination cross-over point. Mo-
lecular microbiology 5(8): 1959.
Hall RM, Collis CM, Kim MJ, Partridge SR, Recchia GD et al. (1999) Mobile gene cassettes
and integrons in evolution. Annals of the New York Academy of Sciences 870: 80.
Hammock EA, Young LJ (2005) Microsatellite instability generates diversity in brain and
sociobehavioral traits. Science 308(5728): 1634.
Hamoen LW, Haijema B, Bijlsma JJ, Venema G, Lovett CM (2001) The Bacillus subtilis
competence transcription factor, ComK, overrides LexA-imposed transcriptional inhibition
without physically displacing LexA. J Biol Chem 276(46): 42901-42907. Epub
2001 Sep 12.
Hamoen LW, Smits WK, Jong Ad, Holsappel S, Kuipers OP (2002) Improving the predictive
value of the competence transcription factor (ComK) binding site in Bacillus subtilis
using a genomic approach. Nucl Acids Res 30(24): 5517-5528.
Hanau-Berçot B, Podglajen I, Casin I, Collatz E (2002) An intrinsic control element for trans-
lational initiation in class 1 integrons. Molecular microbiology 44(1): 130.
Hansson K, Skold O, Sundstrom L (1997) Non-palindromic attI sites of integrons are capable
of site-specific recombination with one another and with secondary targets. Mol Mi-
crobiol 26(3): 441-453.
Hansson K, Sundstrom L, Pelletier A, Roy PH (2002) IntI2 integron integrase in Tn7. J Bac-
teriol 184(6): 1712-1721.
Hartl DL, Taubes CH (1998) Towards a theory of evolutionary adaptation. Genetica 102-
103(1-6): 525-533.
Hartl FU, Hlodan R, Langer T (1994) Molecular chaperones in protein folding: the art of
avoiding sticky situations. Trends Biochem Sci 19(1): 20-25.
Haselkorn R (1992) Developmentally regulated gene rearrangements in prokaryotes. Annu
Rev Genet 26: 113-130.
Hastings PJ (2007) Adaptive amplification. Crit Rev Biochem Mol Biol 42(4): 283.
Hattman S, Sun W (1997) Escherichia coli OxyR modulation of bacteriophage Mu mom expression
in dam+ cells can be attributed to its ability to bind hemimethylated Pmom
promoter DNA. Nucleic Acids Res 25(21): 4385-4388.
Hawkey PM (2008) The growing burden of antimicrobial resistance. J Antimicrob Chemo-
ther 62 Suppl 1: i1-9.
Hazan R, Sat B, Engelberg-Kulka H (2004) Escherichia coli mazEF-mediated cell death is
triggered by various stressful conditions. J Bacteriol 186(11): 3663-3669.
Heidelberg JF, Eisen JA, Nelson WC, Clayton RA, Gwinn ML et al. (2000) DNA sequence of
both chromosomes of the cholera pathogen Vibrio cholerae. Nature 406(6795): 477-
483.
Henderson IR, Owen P (1999) The major phase-variable outer membrane protein of Es-
cherichia coli structurally resembles the immunoglobulin A1 protease class of ex-
ported protein and is regulated by a novel mechanism involving Dam and oxyR. J
Bacteriol 181(7): 2132-2141.
Hengge-Aronis R (2002) Recent insights into the general stress response regulatory network
in Escherichia coli. J Mol Microbiol Biotechnol 4(3): 341-346.
Hermisson J, Wagner GP (2004) The population genetic theory of hidden variation and ge-
netic robustness. Genetics 168(4): 2271-2284.
Hernday AD, Braaten BA, Broitman-Maduro G, Engelberts P, Low DA (2004) Regulation of
the pap epigenetic switch by CpxAR: phosphorylated CpxR inhibits transition to the
phase ON state by competition with Lrp. Mol Cell 16(4): 537-547.
Holmes AJ, Gillings MR, Nield BS, Mabbutt BC, Nevalainen KM et al. (2003) The gene cas-
sette metagenome is a basic resource for bacterial genome evolution. Environmental
microbiology 5(5): 394.
Hood DW, Deadman ME, Allen T, Masoud H, Martin A et al. (1996) Use of the complete ge-
nome sequence information of Haemophilus influenzae strain Rd to investigate
lipopolysaccharide biosynthesis. Molecular microbiology 22(5): 965.
Hopper K (1999) Risk-spreading and bet-hedging in insect population biology. Annual Re-
view of Entomology 44(1): 560.
Horst JP, Wu TH, Marinus MG (1999) Escherichia coli mutator genes. Trends Microbiol
7(1): 29-36.

Ilves H, Horak R, Kivisaar M (2001) Involvement of sigma(S) in starvation-induced transposition
of Pseudomonas putida transposon Tn4652. J Bacteriol 183(18): 5445-5448.
Imlay JA, Linn S (1987) Mutagenesis and stress responses induced in Escherichia coli by hy-
drogen peroxide. J Bacteriol 169(7): 2967-2976.
Ishino Y, Nishino T, Morikawa K (2006) Mechanisms of Maintaining Genetic Stability by
Homologous Recombination. Chemical Reviews 106(2): 339.

Jablonka E, Oborny B, Molnar I, Kisdi E, Hofbauer J et al. (1995) The adaptive advantage of
phenotypic memory in changing environments. Philos Trans R Soc Lond B Biol Sci
350(1332): 133-141.
Jackson AL, Chen R, Loeb LA (1998) Induction of microsatellite instability by oxidative
DNA damage. Proceedings of the National Academy of Sciences of the United States
of America 95(21): 12473.
Jacob KD, Eckert KA (2007) Escherichia coli DNA polymerase IV contributes to spontaneous
mutagenesis at coding sequences but not microsatellite alleles. Mutat Res 619(1-2):
93-103. Epub 2007 Mar 2002.
Johansson C, Kamali-Moghaddam M, Sundstrom L (2004) Integron integrase binds to bulged
hairpin DNA. Nucleic Acids Res 32(13): 4033-4043.
Johnson N (2008) Sewall Wright and the development of shifting balance theory. Nature Edu-
cation 1(1).
Jones AL, Barth PT, Wilkins BM (1992) Zygotic induction of plasmid ssb and psiB genes following
conjugative transfer of IncI1 plasmid ColIb-P9. Mol Microbiol 6(5): 605-613.

Karoui H, Bex F, Dreze P, Couturier M (1983) Ham22, a mini-F mutation which is lethal to
host cell and promotes recA-dependent induction of lambdoid prophage. Embo J
2(11): 1863-1868.
Kashi Y, King D (2006) Simple sequence repeats as advantageous mutators in evolution.
Trends in Genetics 22(5): 259.
Kashi Y, King D, Soller M (1997) Simple sequence repeats as a source of quantitative ge-
netic variation. Trends Genet 13(2): 74-78.
Kawecki TJ (2000) The evolution of genetic canalization under fluctuating selection. Evolu-
tion 54(1): 1-12.
Kazazian HH (2004) Mobile elements: drivers of genome evolution. Science 303(5664): 1632.
Keller M, Zengler K (2004) Tapping into microbial diversity. Nat Rev Microbiol 2(2): 141-
150.
Kelley WL (2006) Lex marks the spot: the virulent side of SOS and a closer look at the
LexA regulon. Molecular Microbiology 62(5): 1238.
Kimura M (1967) On the evolutionary adjustment of spontaneous mutation rates. Genet Res
9: 23-34.
Kimura M (1983) The neutral theory of molecular evolution: Cambridge University Press.
Koch AL (2003) Bacterial wall as target for attack: past, present, and future research. Clin
Microbiol Rev 16(4): 673-687.
Koenig JE, Boucher Y, Charlebois RL, Nesbo C, Zhaxybayeva O et al. (2008) Integron-
associated gene cassettes in Halifax Harbour: assessment of a mobile gene pool in
marine sediments. Environ Microbiol 10(4): 1024-1038.
Kondrashov AS (1995) Modifiers of mutation-selection balance: general approach and
the evolution of mutation rates. Genetical Research 66(1): 53-69.
Koonin EV, Wolf YI (2008) Genomics of bacteria and archaea: the emerging dynamic view
of the prokaryotic world. Nucleic Acids Res 36(21): 6688-6719. Epub 2008 Oct 6623.
Krakauer DC, Plotkin JB (2002) Redundancy, antiredundancy, and the robustness of ge-
nomes. Proc Natl Acad Sci U S A 99(3): 1405-1409. Epub 2002 Jan 1429.
Krinos CM, Coyne MJ, Weinacht KG, Tzianabos AO, Kasper DL et al. (2001) Extensive sur-
face diversity of a commensal microorganism by multiple DNA inversions. Nature
414(6863): 555-558.
Krogh S, O'Reilly M, Nolan N, Devine KM (1996) The phage-like element PBSX and part of
the skin element, which are resident at different locations on the Bacillus subtilis
chromosome, are highly homologous. Microbiology 142(Pt 8): 2031-2040.
Kuan CT, Tessman I (1991) LexA protein of Escherichia coli represses expression of the Tn5
transposase gene. J Bacteriol 173(20): 6406-6410.
Kuan CT, Tessman I (1992) Further evidence that transposition of Tn5 in Escherichia coli is
strongly enhanced by constitutively activated RecA proteins. J Bacteriol 174(21):
6872-6877.
Kunkel T (2004) DNA Replication Fidelity. J Biol Chem 279(17): 16898.
Kussell E, Kishony R, Balaban NQ, Leibler S (2005) Bacterial persistence: a model of survival
in changing environments. Genetics 169(4): 1807-1814. Epub 2005 Jan 1831.
Kuwahara T, Yamashita A, Hirakawa H, Nakayama H, Toh H et al. (2004) Genomic analysis
of Bacteroides fragilis reveals extensive DNA inversions regulating cell surface adap-
tation. Proceedings of the National Academy of Sciences of the United States of
America 101(41): 14924.

Labbate M, Boucher Y, Joss MJ, Michael CA, Gillings MR et al. (2007) Use of chromosomal
integron arrays as a phylogenetic typing system for Vibrio cholerae pandemic strains.
Microbiology (Reading, England) 153(Pt 5): 1498.
Labbate M, Case RJ, Stokes HW (2009) The integron/gene cassette system: an active player
in bacterial adaptation. Methods in molecular biology (Clifton, NJ) 532: 125.
Lachmann M, Jablonka E (1996) The inheritance of phenotypes: an adaptation to fluctuating
environments. J Theor Biol 181(1): 1-9.
Laland KN, Odling-Smee J, Feldman MW (2000) Niche construction, biological evolution,
and cultural change. Behav Brain Sci 23(1): 131-146; discussion 146-175.
Lamarck JB (1809) Zoological Philosophy. An Exposition with Regard to the Natural History
of Animals: University of Chicago Press.
Lane D, Cavaille J, Chandler M (1994) Induction of the SOS response by IS1 transposase. J
Mol Biol 242(4): 339-350.
Le Roux F, Zouine M, Chakroun N, Binesse J, Saulnier D et al. (2009) Genome sequence of
Vibrio splendidus: an abundant planctonic marine species with a large genotypic di-
versity. Environ Microbiol 1: 1.
LeClerc JE, Payne WL, Kupchella E, Cebula TA (1998) Detection of mutator subpopulations
in Salmonella typhimurium LT2 by reversion of his alleles. Mutat Res 400(1-2): 89-97.
Lederberg J, Lederberg EM (1952) Replica plating and indirect selection of bacterial mu-
tants. J Bacteriol 63(3): 399-406.
Lemos B, Bettencourt BR, Meiklejohn CD, Hartl DL (2005) Evolution of proteins and gene
expression levels are coupled in Drosophila and are independently associated with
mRNA abundance, protein length, and number of protein-protein interactions. Mol
Biol Evol 22(5): 1345-1354. Epub 2005 Mar 1342.
Lenski RE, Mongold JA (2000) Cell size, shape, and fitness in evolving populations of bacte-
ria. Scaling in biology: Oxford University Press. pp. 221-235.
Lenski RE, Rose MR, Simpson SC, Tadler SC (1991) Long-term experimental evolution in
Escherichia coli. I. Adaptation and divergence during 2,000 generations. Am Nat
138(6): 1315-1341.
Lenski RE, Travisano M (1994) Dynamics of adaptation and diversification: a 10,000-
generation experiment with bacterial populations. Proc Natl Acad Sci U S A 91(15):
6808-6814.
Leon G, Roy PH (2003) Excision and integration of cassettes by an integron integrase of Ni-
trosomonas europaea. J Bacteriol 185(6): 2036-2041.
Levesque C, Brassard S, Lapointe J, Roy PH (1994) Diversity and relative strength of tandem
promoters for the antibiotic-resistance genes of several integrons. Gene 142(1): 49-
54.
Levit GS, Hossfeld U, Olsson L (2006) From the "Modern Synthesis" to cybernetics: Ivan
Ivanovich Schmalhausen (1884-1963) and his research program for a synthesis of
evolutionary and developmental biology. J Exp Zoolog B Mol Dev Evol 306(2): 89-
106.
Levy SF, Siegal ML (2008) Network Hubs Buffer Environmental Variation in Saccharomyces
cerevisiae. PLoS Biology 6(11): e264.
Lewis K (2000) Programmed Death in Bacteria. Microbiol Mol Biol Rev 64(3): 503-514.
Li GM (2008) Mechanisms and functions of DNA mismatch repair. Cell research 18(1): 98.
Liebert CA, Hall RM, Summers AO (1999) Transposon Tn21, flagship of the floating ge-
nome. Microbiology and molecular biology reviews: MMBR 63(3): 522.
Lim HN, van Oudenaarden A (2007) A multistep epigenetic switch enables the stable inheri-
tance of DNA methylation states. Nat Genet.
Lobner-Olesen A, Boye E, Marinus MG (1992) Expression of the Escherichia coli dam gene.
Mol Microbiol 6(13): 1841-1851.
Lobry JR (1996) Asymmetric substitution patterns in the two DNA strands of bacteria. Mol
Biol Evol 13(5): 665.
Lobry JR, Sueoka N (2002) Asymmetric directional mutation pressures in bacteria. Genome
Biol 3(10): RESEARCH0058. Epub 2002 Sep 0026.
Lorenz MG, Wackernagel W (1994) Bacterial gene transfer by natural genetic transforma-
tion in the environment. Microbiol Rev 58(3): 563-602.
Love PE, Lyle MJ, Yasbin RE (1985) DNA-damage-inducible (din) loci are transcriptionally
activated in competent Bacillus subtilis. Proc Natl Acad Sci U S A 82(18): 6201-6205.
Lund PM, Cox BS (1981) Reversion analysis of [psi-] mutations in Saccharomyces cere-
visiae. Genet Res 37(2): 173-182.
Luria SE, Delbruck M (1943) Mutations of Bacteria from Virus Sensitivity to Virus Resis-
tance. Genetics 28(6): 511.

MacDonald D, Demarre G, Bouvier M, Mazel D, Gopaul DN (2006) Structural basis for
broad DNA-specificity in integron recombination. Nature 440(7088): 1157-1162.
Maiques E, Ubeda C, Campoy S, Salvador N, Lasa I et al. (2006) beta-lactam antibiotics in-
duce the SOS response and horizontal transfer of virulence factors in Staphylococcus
aureus. J Bacteriol 188(7): 2726-2729.
Martin G, Lenormand T (2006) A general multivariate extension of Fisher's geometrical
model and the distribution of mutation fitness effects across species. Evolution 60(5):
893-907.
Martin P, Makepeace K, Hill SA, Hood DW, Moxon ER (2005) Microsatellite instability
regulates transcription factor binding and gene expression. Proc Natl Acad Sci U S A
102(10): 3800-3804. Epub 2005 Feb 3822.
Martinez E, de la Cruz F (1988) Transposon Tn21 encodes a RecA-independent site-specific
integration system. Molecular & general genetics: MGG 211(2): 325.
Matic I, Rayssiguier C, Radman M (1995) Interspecies gene exchange in bacteria: the role of
SOS and mismatch repair systems in evolution of species. Cell 80(3): 507-515.
Matic I, Taddei F, Radman M (2004) Survival versus maintenance of genetic stability: a con-
flict of priorities during stress. Research in microbiology 155(5): 341.
Mattoo S, Foreman-Wykert AK, Cotter PA, Miller JF (2001) Mechanisms of Bordetella
pathogenesis. Front Biosci 6: E168-186.
Maynard-Smith J, Haigh J (1974) The hitch-hiking effect of a favourable gene. Genet Res
23(1): 23-35.
Mazel D, Dychinco B, Webb VA, Davies J (1998) A distinctive class of integron in the Vibrio
cholerae genome. Science 280(5363): 608.
Mazel D (2006) Integrons: agents of bacterial evolution. Nature Reviews Microbiology 4(8):
620.
McClintock B (1950) The origin and behavior of mutable loci in maize. Proceedings of the
National Academy of Sciences of the United States of America 36(6): 355.
McClintock B (1984) The significance of responses of the genome to challenge. Science (New
York, NY) 226(4676): 801.
McCool JD, Long E, Petrosino JF, Sandler HA, Rosenberg SM et al. (2004) Measurement of
SOS expression in individual Escherichia coli K-12 cells using fluorescence micros-
copy. Molecular microbiology 53(5): 1357.
Medini D, Serruto D, Parkhill J, Relman D, Donati C et al. (2008) Microbiology in the post-
genomic era. Nature Reviews Microbiology 6(6): 430.
Meibom KL, Blokesch M, Dolganov NA, Wu CY, Schoolnik GK (2005) Chitin induces natu-
ral competence in Vibrio cholerae. Science 310(5755): 1824-1827.
Melano R, Petroni A, Garutti A, Saka HA, Mange L et al. (2002) New carbenicillin-
hydrolyzing beta-lactamase (CARB-7) from Vibrio cholerae non-O1, non-O139
strains encoded by the VCR region of the V. cholerae genome. Antimicrobial agents
and chemotherapy 46(7): 2168.
Melayah D, Bonnivard E, Chalhoub B, Audeon C, Grandbastien MA (2001) The mobility of
the tobacco Tnt1 retrotransposon correlates with its transcriptional activation by fun-
gal factors. Plant J 28(2): 159-168.
Menzella HG, Reeves CD (2007) Combinatorial biosynthesis for drug development. Curr
Opin Microbiol 10(3): 238-245. Epub 2007 Jun 2005.
Messier N, Roy PH (2001) Integron integrases possess a unique additional domain necessary
for activity. J Bacteriol 183(22): 6699-6706.
Michael CA, Gillings MR, Holmes AJ, Hughes L, Andrew NR et al. (2004) Mobile gene cas-
settes: a fundamental resource for bacterial evolution. The American naturalist
164(1): 12.
Miller C, Thomsen LE, Gaggero C, Mosseri R, Ingmer H et al. (2004) SOS response induc-
tion by beta-lactams and bacterial defense against antibiotic lethality. Science
305(5690): 1629-1631. Epub 2004 Aug 1612.
Miller JD, Scott EC, Okamoto S (2006) Science communication. Public acceptance of evolu-
tion. Science 313(5788): 765-766.
Miller MC, Keymer DP, Avelar A, Boehm AB, Schoolnik GK (2007a) Detection and Trans-
formation of Genome Segments That Differ within a Coastal Population of Vibrio
cholerae Strains. Appl Environ Microbiol 73(11): 3695-3704.
Miller MC, Keymer DP, Avelar A, Boehm AB, Schoolnik GK (2007b) Detection and trans-
formation of genome segments that differ within a coastal population of Vibrio chol-
erae strains. Appl Environ Microbiol 73(11): 3695-3704. Epub 2007 Apr 3620.
Miller WJ, Capy P (2004) Mobile genetic elements as natural tools for genome evolution.
Methods in molecular biology (Clifton, NJ) 260: 20.
Mirkin EV, Mirkin SM (2007) Replication fork stalling at natural impediments. Microbiology
and molecular biology reviews: MMBR 71(1): 35.
Mishina Y, Duguid EM, He C (2006) Direct reversal of DNA alkylation damage. Chemical
reviews 106(2): 232.
Mitsuhashi S, Harada K, Hashimoto H, Egawa R (1961) On the drug-resistance of enteric
bacteria. 4. Drug-resistance of Shigella prevalent in Japan. Jpn J Exp Med 31: 47-52.
Miyawaki A, Nagai T, Mizuno H (2005) Engineering fluorescent proteins. Adv Biochem Eng
Biotechnol 95: 1-15.
Monod JY (1970) Chance and Necessity. An Essay on the Natural Philosophy of Modern Biology.
Morrish TA, Garcia-Perez JL, Stamato TD, Taccioli GE, Sekiguchi J et al. (2007) Endonu-
clease-independent LINE-1 retrotransposition at mammalian telomeres. Nature
446(7132): 208-212.
Morrison DA, Lee MS (2000) Regulation of competence for genetic transformation in Strep-
tococcus pneumoniae: a link between quorum sensing and DNA processing genes. Res
Microbiol 151(6): 445-451.
Moura A, Soares M, Pereira C, Leitão N, Henriques I et al. (2009) INTEGRALL: a database
and search engine for integrons, integrases and gene cassettes. Bioinformatics
(Oxford, England).
Moxon R, Bayliss C, Hood D (2006) Bacterial contingency loci: the role of simple sequence
DNA repeats in bacterial adaptation. Annual review of genetics 40: 333.
Müller-Hill B (1996) The lac Operon: a short history of a genetic paradigm: Walter de
Gruyter & Co.
Muller HJ (1964) The relation of recombination to mutational advance. Mutat Res 106: 2-9.

Naas T, Mikami Y, Imai T, Poirel L, Nordmann P (2001a) Characterization of In53, a class 1
plasmid- and composite transposon-located integron of Escherichia coli which carries
an unusual array of gene cassettes. J Bacteriol 183(1): 235-249.
Naas T, Mikami Y, Imai T, Poirel L, Nordmann P (2001b) Characterization of In53, a class 1
plasmid- and composite transposon-located integron of Escherichia coli which carries
an unusual array of gene cassettes. J Bacteriol 183(1): 235-249.
Napolitano R, Janel-Bintz R, Wagner J, Fuchs RP (2000) All three SOS-inducible DNA poly-
merases (Pol II, Pol IV and Pol V) are involved in induced mutagenesis. The EMBO
journal 19(22): 6265.
Nemergut DR, Martin AP, Schmidt SK (2004) Integron diversity in heavy-metal-
contaminated mine tailings and inferences about integron evolution. Applied and en-
vironmental microbiology 70(2): 1168.
Nemergut D, Robeson M, Kysela R, Martin A, Schmidt S et al. (2008) Insights and infer-
ences about integron evolution from genomic data. BMC Genomics 9: 261.
Nordmann P, Poirel L (2005) Emergence of plasmid-mediated resistance to quinolones in En-
terobacteriaceae. J Antimicrob Chemother 56(3): 469.
Novick A, Weiner M (1957) Enzyme induction as an all-or-none phenomenon. Proc Natl
Acad Sci U S A 43(7): 553-566.

Ochman H, Lawrence JG, Groisman EA (2000) Lateral gene transfer and the nature of bacte-
rial innovation. Nature 405(6784): 304.
Opperman T, Murli S, Smith BT, Walker GC (1999) A model for a umuDC-dependent pro-
karyotic DNA damage checkpoint. Proceedings of the National Academy of Sciences
of the United States of America 96(16): 9218-9223.
Orr HA (2000) The rate of adaptation in asexuals. Genetics 155(2): 961-968.
Otto SP, Lenormand T (2002) Resolving the paradox of sex and recombination. Nat Rev
Genet 3(4): 252-261.
Ozbudak EM, Thattai M, Lim HN, Shraiman BI, van Oudenaarden A (2004) Multistability in
the lactose utilization network of Escherichia coli. Nature 427(6976): 737.

Pagès V, Fuchs RP (2003) Uncoupling of leading- and lagging-strand DNA replication dur-
ing lesion bypass in vivo. Science (New York, NY) 300(5623): 1303.
Pal C, Papp B, Hurst LD (2001) Highly expressed genes in yeast evolve slowly. Genetics
158(2): 927-931.
Paley W (1809) Natural Theology. Evidences of the Existence and Attributes of the Deity.
Palmer GH, Brayton KA (2007) Gene conversion is a convergent strategy for pathogen anti-
genic variation. Trends in parasitology 23(9): 413.
Palmer GH, Futse JE, Leverich CK, Knowles DP, Rurangirwa FR et al. (2007) Selection for
simple major surface protein 2 variants during Anaplasma marginale transmission to
immunologically naïve animals. Infection and immunity 75(3): 1506.
Parker BO, Marinus MG (1992) Repair of DNA heteroduplexes containing small heterolo-
gous sequences in Escherichia coli. Proc Natl Acad Sci U S A 89(5): 1730-1734.
Parks AR, Peters JE (2009) Tn7 elements: engendering diversity from chromosomes to epi-
somes. Plasmid 61(1): 1-14. Epub 2008 Nov 2001.
Partridge SR, Tsafnat G, Coiera E, Iredell JR (2009) Gene cassettes and cassette arrays in
mobile resistance integrons. FEMS Microbiol Rev 13: 13.
Pasteur L (1861) Mémoire sur les corpuscules organisés qui existent en suspension dans
l'atmosphère. Examen de la doctrine des générations spontanées. Compte Rendu de
l'Academie des Sciences 2: 1142-1143.
Patino MM, Liu JJ, Glover JR, Lindquist S (1996) Support for the prion hypothesis for inheri-
tance of a phenotypic trait in yeast. Science 273(5275): 622-626.
Pelosi L, Kuhn L, Guetta D, Garin J, Geiselmann J et al. (2006) Parallel changes in global
protein profiles during long-term experimental evolution in Escherichia coli. Genetics
173(4): 1851-1869.
Petroni A, Melano RG, Saka HA, Garutti A, Mange L et al. (2004) CARB-9, a carbenicilli-
nase encoded in the VCR region of Vibrio cholerae non-O1, non-O139 belongs to a
family of cassette-encoded beta-lactamases. Antimicrobial agents and chemotherapy
48(10): 4046.
Philippe N, Crozat E, Lenski RE, Schneider D (2007) Evolution of global regulatory networks
during a long-term experiment with Escherichia coli. Bioessays 29(9): 846-860.
Philippe N, Pelosi L, Lenski RE, Schneider D (2009) Evolution of penicillin-binding protein 2
concentration and cell shape during a long-term experiment with Escherichia coli. J
Bacteriol 191(3): 909-921.
Plotkin JB, Dushoff J (2003) Codon bias and frequency-dependent selection on the hemag-
glutinin epitopes of influenza A virus. Proc Natl Acad Sci U S A 100(12): 7152-7157.
Epub 2003 May 7114.
Plotkin JB, Dushoff J, Fraser HB (2004) Detecting selection using a single genome sequence
of M. tuberculosis and P. falciparum. Nature 428(6986): 942-945.
Plotkin JB, Dushoff J, Desai MM, Fraser HB (2006) Codon usage and selection on proteins. J
Mol Evol 63(5): 635-653. Epub 2006 Oct 2014.
Poirel L, Liard A, Rodriguez-Martinez J-M, Nordmann P (2005) Vibrionaceae as a possible
source of Qnr-like quinolone resistance determinants. Journal of Antimicrobial Che-
motherapy 56(6): 1121.
Ponder RG, Fonville NC, Rosenberg SM (2005) A switch from high-fidelity to error-prone
DNA double-strand break repair underlies stress-induced mutation. Mol Cell 19(6):
791-804.
Poon A, Otto SP (2000) Compensating for our load of mutations: freezing the meltdown of
small populations. Evolution 54(5): 1467-1479.
Potts RG, Lujan SA, Redinbo MR (2008) Winning the asymmetric war: new strategies for
combating antibacterial resistance. Future microbiology 3: 123.
Pride DT, Blaser MJ (2002) Concerted evolution between duplicated genetic elements in
Helicobacter pylori. J Mol Biol 316(3): 629-642.
Prozorov AA (2001) Recombinational rearrangements in bacterial genome and bacterial ad-
aptation to the environment. Microbiology 70(5): 501-511.
Prudhomme M, Attaiech L, Sanchez G, Martin B, Claverys JP (2006) Antibiotic stress in-
duces genetic transformability in the human pathogen Streptococcus pneumoniae.
Science 313(5783): 89-92.

Queitsch C, Sangster TA, Lindquist S (2002) Hsp90 as a capacitor of phenotypic variation.
Nature 417(6889): 618.
Quinones M, Kimsey HH, Waldor MK (2005) LexA cleavage is required for CTX prophage
induction. Mol Cell 17(2): 300.

Radman M (1975) SOS repair hypothesis: phenomenology of an inducible DNA repair which
is accompanied by mutagenesis. Basic Life Sci 5A: 355-367.
Rando O, Verstrepen K (2007) Timescales of Genetic and Epigenetic Inheritance. Cell
128(4): 668.
Rangarajan S, Woodgate R, Goodman MF (2002) Replication restart in UV-irradiated Es-
cherichia coli involving pols II, III, V, PriA, RecA and RecFOR proteins. Mol Micro-
biol 43(3): 617-628.
Recchia GD, Stokes HW, Hall RM (1994) Characterisation of specific and secondary recom-
bination sites recognised by the integron DNA integrase. Nucleic Acids Res 22(11):
2071-2078.
Recchia GD, Hall RM (1995) Plasmid evolution by acquisition of mobile gene cassettes:
plasmid pIE723 contains the aadB gene cassette precisely inserted at a secondary site
in the incQ plasmid RSF1010. Mol Microbiol 15(1): 179-187.
Recchia GD, Hall RM (1997) Origins of the mobile gene cassettes found in integrons. Trends
Microbiol 5(10): 389-394.
Redfield RJ (1993) Evolution of natural transformation: testing the DNA repair hypothesis in
Bacillus subtilis and Haemophilus influenzae. Genetics 133(4): 755-761.
Redfield RJ (2001) Do bacteria have sex? Nat Rev Genet 2(8): 634.
Redfield RJ, Cameron AD, Qian Q, Hinds J, Ali TR et al. (2005) A novel CRP-dependent
regulon controls expression of competence genes in Haemophilus influenzae. J Mol
Biol 347(4): 735-747.
Reymond A, Henrichsen CN, Harewood L, Merla G (2007) Side effects of genome structural
changes. Curr Opin Genet Dev 17(5): 381-386. Epub 2007 Oct 2024.
Rice KC, Bayles KW (2008) Molecular control of bacterial death and lysis. Microbiol Mol
Biol Rev 72(1): 85-109.
Richardson AR, Stojiljkovic I (2001) Mismatch repair and the regulation of phase variation
in Neisseria meningitidis. Mol Microbiol 40(3): 645-655.
Richardson AR, Yu Z, Popovic T, Stojiljkovic I (2002) Mutator clones of Neisseria meningi-
tidis in epidemic serogroup A disease. Proc Natl Acad Sci U S A 99(9): 6103-6107.
Ritz D, Lim J, Reynolds CM, Poole LB, Beckwith J (2001) Conversion of a peroxiredoxin
into a disulfide reductase by a triplet repeat expansion. Science (New York, NY)
294(5540): 160.
Roberts D, Kleckner N (1988) Tn10 transposition promotes RecA-dependent induction of a
lambda prophage. Proc Natl Acad Sci U S A 85(16): 6037-6041.
Robicsek A, Jacoby GA, Hooper DC (2006a) The worldwide emergence of plasmid-mediated
quinolone resistance. The Lancet infectious diseases 6(10): 640.
Robicsek A, Strahilevitz J, Jacoby GA, Macielag M, Abbanat D et al. (2006b) Fluoroqui-
nolone-modifying enzyme: a new adaptation of a common aminoglycoside acetyltrans-
ferase. Nat Med 12(1): 83-88. Epub 2005 Dec 2020.
Robinson A, Guilfoyle AP, Sureshan V, Howell M, Harrop SJ et al. (2008) Structural genom-
ics of the bacterial mobile metagenome: an overview. Methods Mol Biol 426: 589-
595.
Robleto EA, Yasbin R, Ross C, Pedraza-Reyes M (2007) Stationary phase mutagenesis in B.
subtilis: a paradigm to study genetic diversity programs in cells under stress. Crit Rev
Biochem Mol Biol 42(5): 327-339.
Rocha EP (2003) DNA repeats lead to the accelerated loss of gene order in bacteria. Trends
Genet 19(11): 600-603.
Rocha EP, Matic I, Taddei F (2002) Over-representation of repeats in stress response genes:
a strategy to increase versatility under stressful conditions? Nucleic Acids Res 30(9):
1886-1894.
Rocha EP (2004) Order and disorder in bacterial genomes. Curr Opin Microbiol 7(5): 527.
Rocha EP, Danchin A (2004) An analysis of determinants of amino acids substitution rates in
bacterial proteins. Mol Biol Evol 21(1): 108-116. Epub 2003 Oct 2031.
Ronen M, Rosenberg R, Shraiman BI, Alon U (2002) Assigning numbers to the arrows: pa-
rameterizing a gene regulation network by using accurate expression kinetics. Proc
Natl Acad Sci U S A 99(16): 10560.
Rosenberg SM (2001) Evolving responsively: adaptive mutation. Nat Rev Genet 2(7): 515.
Roth JR, Kugelberg E, Reams AB, Kofoid E, Andersson DI (2006) Origin of mutations under
selection: the adaptive mutation controversy. Annu Rev Microbiol 60: 501.
Rowe-Magnus DA, Guerout AM, Ploncard P, Dychinco B, Davies J et al. (2001) The evolu-
tionary history of chromosomal super-integrons provides an ancestry for multiresis-
tant integrons. Proceedings of the National Academy of Sciences of the United States
of America 98(2): 657.
Rowe-Magnus DA, Guerout AM, Mazel D (2002) Bacterial resistance evolution by recruit-
ment of super-integron gene cassettes. Mol Microbiol 43(6): 1669.
Rowe-Magnus DA, Guerout AM, Biskri L, Bouige P, Mazel D (2003) Comparative analysis
of superintegrons: engineering extensive genetic diversity in the Vibrionaceae. Ge-
nome Res 13(3): 428-442.
Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S et al. (2007) The Sorcerer II
Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pa-
cific. PLoS Biol 5(3): e77.
Ruse M (2003) Perceptions in science. Is evolution a secular religion? Science 299(5612):
1523-1524.
Russo VEA, Martienssen RA, Riggs AD (1996) Epigenetic Mechanisms of Gene Regulation.
Woodbury: Cold Spring Harbor Laboratory Press.
Rutherford SL, Lindquist S (1998) Hsp90 as a capacitor for morphological evolution. Nature
396(6709): 336.

Sacchetti A, Ciccocioppo R, Alberti S (2000) The molecular determinants of the efficiency of
green fluorescent protein mutants. Histol Histopathol 15(1): 101-107.
Salisbury FB (1969) Natural selection and the complexity of the gene. Nature 224(5217): 342-
343.
Sancar A (2008) Structure and Function of Photolyase and in Vivo Enzymology: 50th Anni-
versary. J Biol Chem 283(47): 32157.
Santoyo G, Romero D (2005) Gene conversion and concerted evolution in bacterial genomes.
FEMS Microbiol Rev 29(2): 183.
Satpute-Krishnan P, Serio TR (2005) Prion protein remodelling confers an immediate pheno-
typic switch. Nature 437(7056): 262-265.
Sauer RT, Ross MJ, Ptashne M (1982) Cleavage of the lambda and P22 repressors by recA
protein. J Biol Chem 257(8): 4458-4462.
Sawyer LA, Hennessy JM, Peixoto AA, Rosato E, Parkinson H et al. (1997) Natural varia-
tion in a Drosophila clock gene and temperature compensation. Science 278(5346):
2117-2120.
Schaaper RM (1998) Antimutator mutants in bacteriophage T4 and Escherichia coli. Genetics
148(4): 1585.
Schaaper RM, Dunn RL (2001) The antimutator phenotype of E. coli mud is only apparent
and results from delayed appearance of mutants. Mutation research 480-481: 75.
Schmidt AL, Mitter V (2004) Microsatellite mutation directed by an external stimulus. Mutat
Res 568(2): 233-243.
Shaner NC, Patterson GH, Davidson MW (2007) Advances in fluorescent protein technology.
J Cell Sci 120(Pt 24): 4247-4260.
Sharp PM (2005) Gene "volatility" is most unlikely to reveal adaptation. Mol Biol Evol 22(4):
807-809.
Shaver AC, Sniegowski PD (2003) Spontaneously arising mutL mutators in evolving Es-
cherichia coli populations are the result of changes in repeat length. Journal of bacte-
riology 185(20): 6082.
Shearwin KE, Brumby AM, Egan JB (1998) The Tum protein of coliphage 186 is an antirep-
ressor. J Biol Chem 273(10): 5708-5715.
Shultzaberger RK, Bucheimer RE, Rudd KE, Schneider TD (2001) Anatomy of Escherichia
coli ribosome binding sites. J Mol Biol 313(1): 215-228.
Shuman S, Glickman M (2007) Bacterial DNA repair by non-homologous end joining. Nature
Reviews Microbiology 5(11): 861.
Silander OK, Tenaillon O, Chao L (2007) Understanding the evolutionary fate of finite popu-
lations: the dynamics of mutational effects. PLoS Biol 5(4): e94.
Simpson GG (1953) The Baldwin effect. Evolution 7: 110-117.
Sivanathan V, Emerson JE, Pages C, Cornet F, Sherratt DJ et al. (2009) KOPS-guided DNA
translocation by FtsK safeguards Escherichia coli chromosome segregation. Mol Mi-
crobiol 71(4): 1031-1042.
Smith AB, Siebeling RJ (2003) Identification of genetic loci required for capsular expression
in Vibrio vulnificus. Infect Immun 71(3): 1091-1097.
Smith GR (2001) Homologous recombination near and far from DNA breaks: alternative
roles and contrasting views. Annu Rev Genet 35: 243-274.
Smith HO, Gwinn ML, Salzberg SL (1999) DNA uptake signal sequences in naturally trans-
formable bacteria. Res Microbiol 150(9-10): 603-616.
Sniegowski PD, Gerrish PJ, Lenski RE (1997) Evolution of high mutation rates in experimen-
tal populations of E. coli. Nature 387(6634): 703-705.
Sniegowski PD, Gerrish PJ, Johnson T, Shaver A (2000) The evolution of mutation rates:
separating causes from consequences. BioEssays: news and reviews in molecular, cel-
lular and developmental biology 22(12): 1066.
Sniegowski PD, Murphy H (2006) Evolvability. Current Biology 16(19): R834.
Sohanpal BK, El-Labany S, Lahooti M, Plumbridge JA, Blomfield IC (2004) Integrated regu-
latory responses of fimB to N-acetylneuraminic (sialic) acid and GlcNAc in Es-
cherichia coli K-12. Proceedings of the National Academy of Sciences of the United
States of America 101(46): 16327.
Sohanpal BK, Friar S, Roobol J, Plumbridge JA, Blomfield IC (2007) Multiple co-regulatory
elements and IHF are necessary for the control of fimB expression in response to
sialic acid and N-acetylglucosamine in Escherichia coli K-12. Mol Microbiol 63(4):
1223-1236.
Sollars V, Lu X, Xiao L, Wang X, Garfinkel MD et al. (2003) Evidence for an epigenetic
mechanism by which Hsp90 acts as a capacitor for morphological evolution. Nat
Genet 33(1): 70.
Solnick JV, Hansen LM, Salama NR, Boonjakuakul JK, Syvanen M (2004) Modification of
Helicobacter pylori outer membrane protein expression during experimental infection
of rhesus macaques. Proc Natl Acad Sci U S A 101(7): 2106-2111.
Sorek R, Kunin V, Hugenholtz P (2008) CRISPR - a widespread system that provides ac-
quired resistance against phages in bacteria and archaea. Nature Reviews Microbiol-
ogy 6(3): 186.
Srikhanta YN, Maguire TL, Stacey KJ, Grimmond SM, Jennings MP (2005) The
phasevarion: a genetic system controlling coordinated, random switching of expres-
sion of multiple genes. Proc Natl Acad Sci U S A 102(15): 5547-5551.
Stavnezer J, Guikema JE, Schrader CE (2008) Mechanism and regulation of class switch re-
combination. Annu Rev Immunol 26: 261-292.
Steinmoen H, Teigen A, Havarstein LS (2003) Competence-induced cells of Streptococcus
pneumoniae lyse competence-deficient cells of the same strain during cocultivation. J
Bacteriol 185(24): 7176-7183.
Stellwagen AE, Craig NL (1997) Gain-of-function mutations in TnsC, an ATP-dependent
transposition protein that activates the bacterial transposon Tn7. Genetics 145(3):
573-585.
Stokes HW, Hall RM (1989) A novel family of potentially mobile DNA elements encoding
site-specific gene-integration functions: integrons. Molecular microbiology 3(12):
1683.
Stokes HW, Hall RM (1991) Sequence analysis of the inducible chloramphenicol resistance
determinant in the Tn1696 integron suggests regulation by translational attenuation.
Plasmid 26(1): 10-19.
Stokes HW, O'Gorman DB, Recchia GD, Parsekhian M, Hall RM (1997) Structure and func-
tion of 59-base element recombination sites associated with mobile gene cassettes.
Mol Microbiol 26(4): 731-745.
Stokes HW, Holmes AJ, Nield BS, Holley MP, Nevalainen KM et al. (2001) Gene cassette
PCR: sequence-independent recovery of entire genes from environmental DNA. Ap-
plied and environmental microbiology 67(11): 5246.
Stokes HW, Nesbo CL, Holley M, Bahl MI, Gillings MR et al. (2006) Class 1 integrons po-
tentially predating the association with tn402-like transposition genes are present in a
sediment microbial community. Journal of bacteriology 188(16): 5730.
Streisinger G, Okada Y, Emrich J, Newton J, Tsugita A et al. (1966) Frameshift mutations
and the genetic code. Cold Spring Harb Symp Quant Biol 31: 77-84.
Streit WR, Schmitz RA (2004) Metagenomics--the key to the uncultured microbes. Curr Opin
Microbiol 7(5): 492-498.
Sturtevant AH (1937) On the effects of selection on the mutation rate. Q Rev Biol 12:
464-476.
Subramanian S, Kumar S (2004) Gene expression intensity shapes evolutionary rates of the
proteins encoded by the vertebrate genome. Genetics 168(1): 373-381.
Sunde M (2005) Class I integron with a group II intron detected in an Escherichia coli strain
from a free-range reindeer. Antimicrob Agents Chemother 49(6): 2512-2514.
Szathmary E, Maynard Smith J (1997) From replicators to reproducers: the first major tran-
sitions leading to life. J Theor Biol 187(4): 555-571.
Szekeres S, Dauti M, Wilde C et al. (2007) Chromosomal toxin-antitoxin loci can diminish
large-scale genome reductions in the absence of selection. Molecular Microbiology
63(6): 1605.

Taddei F, Matic I, Radman M (1995) cAMP-dependent SOS induction and mutagenesis in
resting bacterial populations. Proceedings of the National Academy of Sciences of the
United States of America 92(25): 11740.
Taddei F, Halliday JA, Matic I, Radman M (1997a) Genetic analysis of mutagenesis in aging
Escherichia coli colonies. Mol Gen Genet 256(3): 277-281.
Taddei F, Radman M, Maynard-Smith J, Toupance B, Gouyon PH et al. (1997b) Role of mu-
tator alleles in adaptive evolution. Nature 387(6634): 702.
Tautz D, Trick M, Dover GA (1986) Cryptic simplicity in DNA is a major source of genetic
variation. Nature 322(6080): 656.
Taylor JS, Raes J (2004) Duplication and divergence: the evolution of new genes and old
ideas. Annu Rev Genet 38: 615-643.
Taylor J, Rudenko G (2006) Switching trypanosome coats: what's in the wardrobe? Trends in
Genetics 22(11): 620.
Teilhard de Chardin P (1955) The Phenomenon of Man: Harper Perennial.
Tenaillon O, Toupance B, Le Nagard H, Taddei F, Godelle B (1999) Mutators, population
size, adaptive landscape and the adaptation of asexual populations of bacteria. Genet-
ics 152(2): 485-493.
Tenaillon O, Le Nagard H, Godelle B, Taddei F (2000) Mutators and sex in bacteria: conflict
between adaptive strategies. Proceedings of the National Academy of Sciences of the
United States of America 97(19): 10470.
Tenaillon O, Denamur E, Matic I (2004) Evolutionary significance of stress-induced
mutagenesis in bacteria. Trends Microbiol 12(6): 264-270.
Tenover FC (2006) Mechanisms of antimicrobial resistance in bacteria. Am J Med 119(6
Suppl 1): S3-10; discussion S62-70.
Thaler DS (1994) The evolution of genetic intelligence. Science (New York, NY) 264(5156):
225.
Thompson JR, Pacocha S, Pharino C, Klepac-Ceraj V, Hunt DE et al. (2005) Genotypic Di-
versity Within a Natural Coastal Bacterioplankton Population. Science 307(5713):
1311-1313.
True HL, Berlin I, Lindquist SL (2004) Epigenetic regulation of translation reveals hidden
genetic variation to produce complex traits. Nature 431(7005): 187.
Truglio JJ, Croteau DL, Van Houten B, Kisker C (2006) Prokaryotic nucleotide excision re-
pair: the UvrABC system. Chem Rev 106(2): 233-252.
Tsilibaris V, Maenhaut-Michel G, Mine N, Van Melderen L (2007) What is the benefit to
Escherichia coli of having multiple toxin-antitoxin systems in its genome? J Bacteriol
189(17): 6101-6108.
Twiss E, Coros AM, Tavakoli NP, Derbyshire KM (2005) Transposition is modulated by a
diverse set of host factors in Escherichia coli and is stimulated by nutritional stress.
Mol Microbiol 57(6): 1593-1607.
Tyedmers J, Madariaga ML, Lindquist S (2008) Prion Switching in Response to Environ-
mental Stress. PLoS Biology 6(11): e294.
Ubeda C, Maiques E, Knecht E, Lasa I, Novick RP et al. (2005) Antibiotic-induced SOS re-
sponse promotes horizontal dissemination of pathogenicity island-encoded virulence
factors in staphylococci. Mol Microbiol 56(3): 836-844.
Uptain SM, Lindquist S (2002) Prions as protein-based genetic elements. Annu Rev Micro-
biol 56: 703-741.

van de Putte P, Goosen N (1992) DNA inversions in phages and bacteria. Trends Genet
8(12): 457-462.
van der Woude M, Baumler A (2004) Phase and Antigenic Variation in Bacteria. Clin Mi-
crobiol Rev 17(3): 611.
van der Woude MW (2006) Re-examining the role and random nature of phase variation.
FEMS Microbiol Lett 254(2): 197.
Van Speybroeck L, De Waele D, Van de Vijver G (2002) Theories in early embryology: close
connections between epigenesis, preformationism, and self-organization. Ann N Y
Acad Sci 981: 7-49.
Van Valen L (1973) A new evolutionary law. Evolutionary Theory 1: 1-30.
Veening JW, Hamoen LW, Kuipers OP (2005) Phosphatases modulate the bistable sporula-
tion gene expression pattern in Bacillus subtilis. Mol Microbiol 56(6): 1481-1494.
Veening J-W, Smits WK, Kuipers OP (2008a) Bistability, Epigenetics, and Bet-Hedging in
Bacteria. Annual Review of Microbiology 62(1): 193.
Veening JW, Stewart EJ, Berngruber TW, Taddei F, Kuipers OP et al. (2008b) Bet-hedging
and epigenetic inheritance in bacterial cell development. Proc Natl Acad Sci U S A
105(11): 4393-4398.
Vollmer AC, Kwakye S, Halpern M, Everbach EC (1998) Bacterial stress responses to 1-
megahertz pulsed ultrasound in the presence of microbubbles. Appl Environ Micro-
biol 64(10): 3927-3931.

Waddington CH (1942a) The epigenotype. Endeavour 1: 18-20.
Waddington CH (1942b) Canalization of Development and the Inheritance of Acquired
Characters. Nature 150: 563-565.
Waddington CH (1953) Evolution. Lawrence, Kans. pp. 118-126.
Wade JT, Reppas NB, Church GM, Struhl K (2005) Genomic analysis of LexA binding re-
veals the permissive nature of the Escherichia coli genome and identifies unconven-
tional target sites. Genes & development 19(21): 2630.
Wagner A (2008) Robustness and evolvability: a paradox resolved. Proc Biol Sci 275(1630):
91-100.
Wagner GP, Pavlicev M, Cheverud JM (2007) The road to modularity. Nat Rev Genet 8(12):
921.
Wanner R, Guethlein C, Springer B, Boettger E, Ackermann M (2008) Stabilization of the
genome of the mismatch repair deficient Mycobacterium tuberculosis by
context-dependent codon choice. BMC Genomics 9: 249.
Wardle SJ, O'Carroll M, Derbyshire KM, Haniford DB (2005) The global regulator H-NS
acts directly on the transpososome to promote Tn10 transposition. Genes Dev 19(18):
2224-2235.
Watson JD, Crick FH (1953) Genetical implications of the structure of deoxyribonucleic acid.
Nature 171(4361): 964-967.
Weber H, Polen T, Heuveling J, Wendisch VF, Hengge R (2005) Genome-wide analysis of
the general stress response network in Escherichia coli: sigmaS-dependent genes,
promoters, and sigma factor selectivity. J Bacteriol 187(5): 1591-1603.
Weiner RM, Taylor LE, Henrissat B, Hauser L, Land M et al. (2008) Complete Genome Se-
quence of the Complex Carbohydrate-Degrading Marine Bacterium, Saccharophagus
degradans Strain 2-40T. PLoS Genet 4(5).
Weinreich DM, Delaney NF, Depristo MA, Hartl DL (2006) Darwinian evolution can follow
only very few mutational paths to fitter proteins. Science 312(5770): 111-114.
Weiser JN, Pan N (1998) Adaptation of Haemophilus influenzae to acquired and innate hu-
moral immunity based on phase variation of lipopolysaccharide. Mol Microbiol 30(4):
767-775.
White-Ziegler CA, Villapakkam A, Ronaszeki K, Young S (2000) H-NS controls pap and daa
fimbrial transcription in Escherichia coli in response to multiple environmental cues.
J Bacteriol 182(22): 6391-6400.
Whitfield CR, Wardle SJ, Haniford DB (2009) The global bacterial regulator H-NS promotes
transpososome formation and transposition in the Tn5 system. Nucleic Acids Res
37(2): 309-321.
Wilkins AS, Holliday R (2009) The evolution of meiosis from mitosis. Genetics 181(1): 3-12.
Wisniewski-Dyé F, Vial L (2008) Phase and antigenic variation mediated by genome modifi-
cations. Antonie van Leeuwenhoek 94(4): 515.
Wloch DM, Szafraniec K, Borts RH, Korona R (2001) Direct estimate of the mutation rate
and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics
159(2): 441-452.
Wolf DM, Vazirani VV, Arkin AP (2005) Diversity in times of adversity: probabilistic
strategies in microbial survival games. J Theor Biol 234(2): 227-253.
Wright S (1931) Evolution in Mendelian Populations. Genetics 16(2): 97-159.

Xu H, Davies J, Miao V (2007) Molecular characterization of class 3 integrons from Delftia
spp. Journal of bacteriology 189(17): 6283.

Yang QL, Gotschlich EC (1996) Variation of gonococcal lipooligosaccharide structure is due
to alterations in poly-G tracts in lgt genes encoding glycosyl transferases. The Journal
of experimental medicine 183(1): 327.
Yarmolinsky MB (1995) Programmed cell death in bacterial populations. Science
267(5199): 836-837.
Yildiz FH, Liu XS, Heydorn A, Schoolnik GK (2004) Molecular analysis of rugosity in a Vi-
brio cholerae O1 El Tor phase variant. Mol Microbiol 53(2): 497-515.

Zaccolo M, Gherardi E (1999) The effect of high-frequency random mutagenesis on in vitro
protein evolution: a study on TEM-1 beta-lactamase. J Mol Biol 285(2): 775-783.
Zeyl C, Mizesko M, de Visser JA (2001) Mutational meltdown in laboratory yeast popula-
tions. Evolution 55(5): 909-917.
Zhang JR, Hardham JM, Barbour AG, Norris SJ (1997) Antigenic variation in Lyme disease
borreliae by promiscuous recombination of VMP-like sequence cassettes. Cell 89(2):
275-285.
Zhang JR, Norris SJ (1998) Genetic variation of the Borrelia burgdorferi gene vlsE involves
cassette-specific, segmental gene conversion. Infect Immun 66(8): 3698-3704.
Zhang XS (2008) Increase in quantitative variation after exposure to environmental stresses
and/or introduction of a major mutation: G x E interaction and epistasis or
canalization? Genetics 180(1): 687-695.
Zhu Y, Dai J, Fuerst PG, Voytas DF (2003) Controlling integration specificity of a yeast
retrotransposon. Proc Natl Acad Sci U S A 100(10): 5891-5895.

Evolutivité – Le cas des intégrons
& utilisation de séquences synonymes en évolution dirigée
La stabilité phénotypique est essentielle au succès d’organismes évoluant sous des
conditions constantes. L’environnement est néanmoins soumis à de perpétuelles variations
stochastiques, auxquelles les êtres vivants doivent sans cesse s’adapter. L’évolutivité
caractérise la capacité d’une population à répondre à de telles pressions sélectives par la
génération de modifications phénotypiques héritables. La majorité des mutations étant
délétères, des processus permettant de limiter la production de telles variations aux seules
périodes de stress, ou de la confiner à des loci et phénotypes bien définis, ont été sélectionnés
au cours de l'évolution.
Les intégrons en constituent une illustration particulièrement sophistiquée.
Initialement identifiés comme vecteurs de résistance à de multiples antibiotiques, ces
systèmes génétiques bactériens spécialisés dans l’échange, la collecte et l’expression de gènes
accessoires constituent une importante source de diversité génétique. Ce travail montre que les
intégrons sont directement couplés à une voie majeure de réponse au stress chez les bactéries,
le système SOS. En permettant de générer de la variabilité phénotypique en période de stress
sans affecter le reste du génome, les intégrons constituent ainsi un exemple paradigmatique
d’évolutivité.
Un autre aspect de ce travail démontre que des séquences codantes synonymes – bien
que spécifiant des protéines identiques – peuvent accéder par mutations ponctuelles à des
régions différentes de l’espace phénotypique. Utilisée de manière adéquate, cette propriété
permet d’étendre l’évolutivité d’une protéine quelconque dans le cadre d’applications
biotechnologiques.

Evolvability – The integron case
& the use of synonymous sequences for directed evolution
Phenotypic stability is essential to the success of organisms evolving under steady
conditions. However, the environment is subject to perpetual stochastic variations, to which
living beings must constantly adapt. Evolvability characterizes the ability of a population to
respond to such selective pressures by generating heritable phenotypic changes.
Because most mutations are deleterious, processes that confine the production of variation to
periods of stress, or to specific loci and well-defined phenotypes, have been selected in the
course of evolution.
Integrons constitute a particularly sophisticated illustration of such processes. Initially
identified through their involvement in resistance to multiple antibiotics, these bacterial
genetic systems are specialized in the exchange, stockpiling and expression of accessory genes
and therefore constitute an important source of genetic diversity. This work shows that
integrons are directly coupled with the SOS system, a major bacterial stress response. By
allowing the generation of phenotypic diversity during periods of stress without affecting the
rest of the genome, integrons constitute a paradigmatic example of evolvability.
Another aspect of this work demonstrates that synonymous coding sequences, although
specifying identical proteins, can access different regions of the phenotypic space through
point mutations. When properly exploited, this property can extend the evolvability of any
protein in the context of biotechnological applications.