
Article in press - uncorrected proof

Biol. Chem., Vol. 392, pp. 277–289, April 2011. Copyright © by Walter de Gruyter, Berlin, New York. DOI 10.1515/BC.2011.042

Review

Clustered regularly interspaced short palindromic repeats (CRISPRs): the hallmark of an ingenious antiviral defense mechanism in prokaryotes

Sinan Al-Attar, Edze R. Westra, John van der Oost and Stan J.J. Brouns*
Laboratory of Microbiology, Wageningen University,
Dreijenplein 10, NL-6703 HB Wageningen,
The Netherlands
* Corresponding author
e-mail: Stan.Brouns@wur.nl

Abstract
Many prokaryotes contain a recently discovered defense system against mobile genetic elements. This defense
system contains a unique type of repetitive DNA stretches, termed Clustered Regularly Interspaced Short
Palindromic Repeats (CRISPRs). CRISPRs consist of identical repeated DNA sequences (repeats), interspaced by
highly variable sequences referred to as spacers. The spacers originate from either phages or plasmids and
comprise the prokaryote's immunological memory. CRISPR-associated (cas) genes encode conserved proteins that,
together with CRISPRs, make up the CRISPR/Cas system, which is responsible for defending the prokaryotic cell
against invaders. CRISPR-mediated resistance has been proposed to involve three stages: (i) CRISPR-Adaptation,
in which the invader DNA is encountered by the CRISPR/Cas machinery and an invader-derived short DNA fragment
is incorporated into the CRISPR array; (ii) CRISPR-Expression, in which the CRISPR array is transcribed and the
transcript is processed by Cas proteins into CRISPR RNAs (crRNAs); and (iii) CRISPR-Interference, in which the
invader's nucleic acid is recognized by complementarity to the crRNA and neutralized. An application of the
CRISPR/Cas system is the immunization of industry-relevant prokaryotes (or eukaryotes) against invasion by
mobile genetic elements. In addition, the high variability of the CRISPR spacer content can be exploited for
phylogenetic and evolutionary studies. Despite impressive progress during the last couple of years, the
elucidation of several fundamental details will be a major challenge in future research.
Keywords: anti-phage; Cas proteins; small RNAs; spacers.

Introduction
To persist, thrive and pass on their genetic identity, all life forms invest in highly ingenious systems to
defend their frontiers. Throughout the course of evolution, prokaryotes have succeeded in devising a number of
innate defensive strategies against prokaryotic viruses (phages). An illustrative example of such a system is
adsorption inhibition, by which bacteria hide or modify their receptors to escape recognition by viral
particles. Bacteria also employ other antiviral mechanisms such as the restriction-modification system (RMS).
The modification apparatus of the RMS methylates the bacterial DNA continuously. Methylation protects the
endogenous DNA from nucleolytic cleavage, whereas foreign non-methylated DNA is cleared by the RMS restriction
endonucleases (Wilson, 1991; Hyman and Abedon, 2010; Labrie et al., 2010).
A recent breakthrough in understanding host-virus interactions has been the discovery of a novel type of
microbial defense system. This defense system resides in about half of the bacterial and almost all of the
archaeal genomes currently sequenced (Grissa et al., 2007). This immune system is distinguished by its
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) and the CRISPR-associated (cas) genes
(Jansen et al., 2002). The CRISPR/Cas system (CRISPRs and the Cas proteins) relies on non-translated RNAs to
track and inactivate invasive genetic elements (for example, conjugative plasmids or phages) to protect the
cell's genomic integrity. In contrast to the above-mentioned innate bacterial antiviral mechanisms, the
CRISPR/Cas system is unique in that it is invader-specific, adaptive and heritable.
In this review, we look at the history of the CRISPR discovery and examine the most important findings to date
on CRISPRs. We concentrate on the progress that has been made in the past few years and speculate on future
perspectives of this rapidly expanding field.

History of the CRISPR/Cas research


In 1987, Ishino and colleagues noticed for the first time a mysterious repetitive sequence downstream of the
iap gene on the chromosome of Escherichia coli K12 (Ishino et al., 1987). This sequence encompassed a
repetitive motif of 29-nucleotide (nt) identical direct repeats interspaced by variable 32-nt spacer regions.
Although the biological role served by these repetitive motifs remained obscure, several organisms were found
to possess this feature. Six years later, Groenen and colleagues found 36 base pair (bp) repeats interspersed
by 35- to 41-nt spacers in Mycobacterium tuberculosis and assigned those motifs a name: Direct Variable Repeats
(Groenen et al., 1993). Later, Mojica and coworkers studied the same type of repeats from Haloferax volcanii
and Haloferax mediterranei, referring to them as Tandem Repeats (TREPs) (Mojica et al., 1995). Before a
biological role was determined for these arrays of repeats and spacers, Goyal and colleagues showed in 1997
that the diversity of the spacers in M. tuberculosis could be exploited for a peculiar genotyping technique
called spoligotyping (discussed below) (Goyal et al., 1997). In the year 2000, Mojica et al. investigated the
occurrence of TREPs in a number of bacteria and archaea. They proposed that these motifs constituted a new
family of prokaryotic repeats and coined a new acronym, SRSRs (Short Regularly Spaced Repeats), to reflect the
unique regularity of these repeats (Mojica et al., 2000). Other names were also used for the repeat/spacer
arrays, including spacer interspersed direct repeats (SPIDRs) and long clustered tandem repeats (LCTRs) (She
et al., 2001; Lillestøl et al., 2006). Later, in agreement with Mojica's group, Jansen et al. introduced a
fresh acronym, CRISPRs, bringing an end to the confusing nomenclature used by different groups (Jansen et al.,
2002). This coincided with the discovery of the first core cas genes (cas genes that occur in different Cas
subtypes). Four cas genes (cas1–4) were initially identified; they were found only in CRISPR-positive genomes
and were located exclusively in the direct vicinity of CRISPR arrays. After Jansen's publication, Haft and
colleagues expanded the number of cas genes with 41 new genes, among which two were new core cas genes: cas5
and cas6 (Haft et al., 2005). The CRISPR/Cas systems were also classified into eight distinct subtypes, as will
be discussed later (Haft et al., 2005).
Although the biological function of CRISPRs was still unknown, different groups suggested that CRISPRs were
involved in developmental regulation (Thöny-Meyer and Kaiser, 1993), replicon partitioning during cell
division (Mojica et al., 1995) or DNA repair (Makarova et al., 2002). A profound change in our understanding of
CRISPRs came in 2005, when three research groups independently found that many CRISPR spacers were homologous
to fragments of mobile genetic elements (phages and plasmids), suggesting that they were of extrachromosomal
origin and possibly the memory of a novel immune system (Bolotin et al., 2005; Mojica et al., 2005; Pourcel et
al., 2005). Subsequently, the immune system was proposed to operate using the RNAi principle (Makarova et al.,
2006). The first experimental evidence that linked CRISPRs to acquired immunity was provided in 2007 by
Barrangou et al. (2007). Their observations, supported by bioinformatics predictions (Makarova et al., 2006),
sparked a wave of studies to determine the mechanism of the CRISPR defense system. Important progress was made
by different groups, leading to the consensus that the CRISPR/Cas system is indeed an active prokaryotic immune
system against propagation of viruses and conjugative elements (He and Deem, 2010).

Composition of the CRISPR loci


CRISPR loci (CRISPR arrays including the leader sequence and the associated cas genes) are found solely in
organisms from the archaeal and bacterial domains of life. CRISPR loci exhibit several universal features:
(i) multiple direct repeats with identical or nearly identical, often palindromic sequences, (ii)
non-repetitive similar-sized spacer sequences, (iii) a leader sequence flanking the repeats at one end, (iv)
the absence of functional open reading frames within the repeat arrays and leader sequences, and (v) the
genetic association of the direct repeats with cas genes (Jansen et al., 2002; Makarova et al., 2006). A CRISPR
locus is shown as it is found in E. coli K12 (Figure 1) (Brouns et al., 2008; Pul et al., 2010). Several
features of this illustration are explained in more detail below.

Figure 1 Schematic diagram of the CRISPR I locus of E. coli K12.
The cas genes (cas3, casABCDE, cas1 and cas2) are shown along with the coupled CRISPR array. Within the CRISPR
array three distinctive elements are found: leader sequence (L), repeats (R) and spacers (S). One repeat and
one spacer constitute one CRISPR unit (shown also as DNA sequence). The region colored light green [intergenic
region between casA (ygcL) and cas3 (ygcB) (IGLB)] is believed to contain promoters required for the expression
of the cas genes.

Repeat sequences

The direct repeats are 28–40 base pairs in length and their number per CRISPR array can vary dramatically from
one organism to another (Jansen et al., 2002; Haft et al., 2005). Some CRISPR loci contain only a few direct
repeats whereas others can bear hundreds. Similarly, the number of CRISPR arrays is far from constant in
different organisms or even in different strains of the same species. The highest count of uninterrupted
CRISPR arrays per prokaryotic genome is 20, as was found for Methanocaldococcus jannaschii (Bult et al., 1996;
Lillestøl et al., 2006). On average, an archaeal genome contains approximately five CRISPR arrays, whereas
three CRISPR arrays are found per bacterial genome (Grissa et al., 2007; database accessed in August 2010). A
remarkable feature of the CRISPR arrays is the ability of their transcripts to form RNA secondary structures.
An extensive in silico study performed by Kunin et al. aimed to classify the CRISPR repeats and attempted to
associate them with the proposed Cas subtypes. Twelve repeat types (called Cluster1–12) were defined, of which
half can encode RNAs able to form stable conserved secondary stem-loop structures (Kunin et al., 2007). The
stems of these RNAs appeared to be well conserved, based on observations of compensatory changes that accompany
nucleotide substitutions, particularly in the stem-forming region (Kunin et al., 2007). Different Cas subtypes
(see below) appeared to prefer one or more repeat types (Kunin et al., 2007). Conservation of the folding
patterns (secondary structure) suggests that these repeat-derived RNA sequences allow specific interactions
with certain Cas proteins. This is further supported by the correlation between Cas subtypes and repeat
secondary structures (Kunin et al., 2007).
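The link between a palindromic repeat and a stem-loop can be illustrated with a minimal sketch. The function below counts how many nucleotides at the 5′ end of a repeat can pair with the 3′ end, which is a necessary condition for a simple terminal hairpin. The function name, the minimum loop size and the example repeat are hypothetical; G-U wobble pairs, internal stems and folding energies are ignored, so this is not a substitute for the RNA folding analysis used by Kunin et al. (2007).

```python
# Watson-Crick partners, written for DNA-encoded repeats.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminal_stem_length(repeat: str, min_loop: int = 3) -> int:
    """Count base pairs formed by pairing the 5' end of `repeat` with its 3' end.

    Pairing proceeds inward from both ends and stops at the first mismatch
    or when fewer than `min_loop` unpaired nucleotides would remain.
    """
    repeat = repeat.upper()
    left, right = 0, len(repeat) - 1
    pairs = 0
    while right - left - 1 >= min_loop and PARTNER.get(repeat[left]) == repeat[right]:
        pairs += 1
        left += 1
        right -= 1
    return pairs


# Hypothetical 14-nt repeat: GTTCC pairs with GGAAC around an AAAA loop.
toy_repeat = "GTTCCAAAAGGAAC"
print(terminal_stem_length(toy_repeat))
print(terminal_stem_length("AAAAAAAAAAAAAA"))
```

A repeat type that scores zero here may still fold through an internal stem, which this toy check does not consider.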
Spacer sequences

The spacer sequences are highly variable constituents of the CRISPR arrays, ranging from 26 to 72 bp in length.
Within the same array, the spacers have similar lengths. Two identical spacers are generally not found in the
same CRISPR array, with the exception of spacer duplications in some larger CRISPR arrays (Lillestøl et al.,
2006). Although the consensus is that spacers originate from alien mobile genetic elements, only a small
portion of all spacers map to known extrachromosomal sequences from phages and plasmids (Shah et al., 2009).
However, the number of currently sequenced microbial viruses is exceptionally small compared to the huge
numbers of phages that occur in nature. Most likely, the under-representation of phages in the genome databases
accounts for the missing hits for CRISPR spacers (Edwards and Rohwer, 2005; Mojica et al., 2005; Snyder et al.,
2010).
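The mapping of spacers to phage and plasmid sequences described above can be sketched in a few lines. The example below splits a toy array on an exactly conserved repeat and then looks for each spacer on either strand of a small set of sequences. All sequences and names are invented for illustration, and exact string matching is an assumption; real arrays contain degenerate repeats and real searches tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_spacers(array: str, repeat: str) -> list:
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    parts = array.split(repeat)
    # parts[0] is the leader-side flank and parts[-1] the trailing flank.
    return [part for part in parts[1:-1] if part]


def find_spacer_hits(spacers: list, genomes: dict) -> dict:
    """Map each spacer to the names of genomes containing it on either strand."""
    hits = {}
    for spacer in spacers:
        rc = reverse_complement(spacer)
        hits[spacer] = [name for name, seq in genomes.items()
                        if spacer in seq or rc in seq]
    return hits


# Toy data (hypothetical): leader, then repeat-spacer units, closed by a repeat.
repeat = "GAGTTCCC"
array = "ATATAT" + repeat + "ACGTACGGTA" + repeat + "TTGACCATCA" + repeat
genomes = {
    "phage_A": "CCCC" + "ACGTACGGTA" + "GGGG",
    "phage_B": "AAAA" + reverse_complement("TTGACCATCA") + "TTTT",
    "plasmid_C": "GCGCGCGCGCGC",
}

spacers = extract_spacers(array, repeat)
hits = find_spacer_hits(spacers, genomes)
for spacer, names in hits.items():
    print(spacer, names)
```

A spacer with an empty hit list in such a search is not necessarily of non-viral origin; as noted above, the matching phage may simply be absent from the database.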
Leader sequence

Another feature of the CRISPR locus is the leader sequence. This sequence is located at the 5′-end of the
CRISPR array and is rich in adenine and thymine nucleotides (Jansen et al., 2002; Tang et al., 2002). Leader
sequences contain hundreds of nucleotides but lack coding potential (Lillestøl et al., 2006). Within the same
prokaryotic species, leader sequences are highly similar. This is in contrast to those from distantly related
species (Jansen et al., 2002), where leader sequences show much higher variability. Regions homologous to the
3′-end of the leader sequence were found in pre-crRNA in Pyrococcus furiosus, suggesting that the
transcription of the CRISPR array is initiated in the leader region (Hale et al., 2008). It was suggested by
Jansen et al. that the leader could bear sequences necessary for CRISPR transcription (Jansen et al., 2002),
and indeed it was recently confirmed that the leader sequence of E. coli K12 CRISPR II contains a promoter for
CRISPR transcription (Pul et al., 2010). This promoter was active in vitro as well as in vivo and was able to
form an open transcription initiation complex (Pul et al., 2010). Bioinformatic analysis suggests that CRISPR
loci lacking the leader sequence are unable to incorporate new spacers (Lillestøl et al., 2006) or to carry out
CRISPR-Expression and CRISPR-Interference (Marraffini and Sontheimer, 2008). This further highlights the
crucial role of the leader sequence, at least in CRISPR-Adaptation and CRISPR-Expression. It has also been
hypothesized that, in addition to a promoter function, the leader sequence could provide a platform for the
binding of Cas proteins required for spacer integration (Lillestøl et al., 2006; Marraffini and Sontheimer,
2008).
cas genes

cas genes are exclusively found in genomes that contain CRISPRs and are often located in close proximity to
CRISPR arrays (Jansen et al., 2002). Conserved clustering of genes strongly suggests that the corresponding
proteins have related functions (Ettema et al., 2005). In silico genome analysis established that cas genes are
present as conserved gene clusters together with CRISPRs. This observation allowed for the identification of 45
cas gene families (Haft et al., 2005), which were later further grouped into superfamilies by Makarova et al.
(2006). Haft and coworkers identified eight different Cas subtypes named Ecoli, Ypest, Nmeni, Dvulg, Tneap,
Hmari, Apern and Mtube. The names reflect the organism in which the particular subtype was first characterized
(for example, the Ypest subtype was found in Yersinia pestis) (Haft et al., 2005). Six core cas gene families
(cas1–6) exist among the 45 cas genes. These core genes are not restricted to certain Cas subtypes; only two of
them (cas1 and cas2) are universal and present in all Cas subtypes. Each subtype contains one or more
subtype-specific cas genes, indicated by 'cs' followed by a letter that is indicative of the subtype (for
example, 'e' for the E. coli subtype) and a digit (e.g., cse3) (Haft et al., 2005; van der Oost et al., 2009).
In addition to the eight subtypes mentioned here, a ninth subtype, called the CRISPR RAMP module (cmr1–6), also
occurs in CRISPR-positive genomes (Makarova et al., 2002; Haft et al., 2005; Makarova et al., 2006). The
acronym RAMP stands for Repeat-Associated Mysterious Proteins and indicates a superfamily of proteins
characterized by glycine-rich C-termini. RAMP modules are usually associated with other Cas subtypes rather
than forming a separate CRISPR locus (Haft et al., 2005; van der Oost et al., 2009).
Studies based on sequence analysis predicted that among Cas proteins there are nucleases, helicases, integrases
and polymerases. These predicted activities were originally placed in a working model that describes the
addition or deletion of spacers, processing of the CRISPR transcripts and mediation of other processes in
CRISPR-assisted resistance (Haft et al., 2005; Makarova et al., 2006). Subsequent studies have supported the
involvement of Cas proteins in CRISPR-mediated defense and established some of their functions. All the Cas
proteins identified to date that have been assigned (potential) functions are listed in Table 1. Below, we will
discuss these proteins in more detail.

Table 1 Cas proteins and their proposed function.

Cas protein | Organism | Effector stage | Activity/proposed biochemical function | References
Core Cas proteins
Cas1 | P. aeruginosa | Adaptation^a | Metal-dependent ssDNA/dsDNA endonuclease, generates ~80 bp fragments | Wiedenheft et al., 2009
Cas1 | S. solfataricus | Adaptation^a |  | Han et al., 2009
Cas2 | S. solfataricus | Adaptation^a | ssRNase, cleaves preferentially in U-rich regions | Beloglazova et al., 2008
HD-domain Cas3 | S. solfataricus | Interference^a | Nuclease activity on dsDNA/dsRNA | Han and Krauss, 2009
Cas3 | E. coli | Interference | Predicted HD-nuclease domain and DEXH-helicase domain. HD-domain possibly carries out target DNA cleavage. DEXH-domain possibly releases Cascade from target DNA | Brouns et al., 2008; Makarova et al., 2006
Cas4 |  |  | RecB-like nuclease | Makarova et al., 2006
Cas6 | P. furiosus | Expression | Processing of pre-crRNA | Carte et al., 2008, 2010
Cas7 | S. thermophilus | Adaptation | Required for spacer acquisition | Barrangou et al., 2007; Garneau et al., 2010
Species-specific Cas proteins
Cse1/CasA | E. coli | Expression and interference | Cascade subunit | Brouns et al., 2008
Cse2/CasB | E. coli | Expression and interference | Cascade subunit | Brouns et al., 2008
Cse4/CasC | E. coli | Expression and interference | Cascade subunit | Brouns et al., 2008
Cas5e/CasD | E. coli | Expression and interference | Cascade subunit | Brouns et al., 2008
Cse3/CasE | E. coli | Expression and interference | Cascade subunit, processing pre-crRNA and binding crRNA. Structurally similar to Cas6 and Csy4 | Brouns et al., 2008
Csy4 | P. aeruginosa | Expression | Processing pre-crRNA and binding crRNA. Structurally similar to Cas6 and Cse3 | Haurwitz et al., 2010
Csn1/Cas5 | S. thermophilus | Interference | Required for interference | Barrangou et al., 2007; Garneau et al., 2010

^a Effector stage indicated is hypothetical.

The biological relevance of the CRISPR/Cas system

In silico investigations suggested that the CRISPR/Cas system has a defensive function in which the system
memorizes invaders (phages and conjugative elements) by incorporating new invader-derived DNA sequences,
providing immunity to future infiltrations by the same invader (Makarova et al., 2006). Three main stages were
proposed (van der Oost et al., 2009): (i) CRISPR-Adaptation, where the invader is encountered and an
invader-derived short DNA fragment (pre-spacer) is obtained and incorporated into the CRISPR array, forming a
new spacer; (ii) CRISPR-Expression, where the CRISPR array is transcribed, processed and loaded onto Cas
proteins in preparation for (iii) CRISPR-Interference, during which the pathogen is recognized and neutralized
by the Cas-crRNA machinery. The CRISPR-Adaptation stage can be further divided into two substages: (i)
sampling (recognition of foreign DNA and excision of a DNA sample or pre-spacer) and (ii) integration (during
which the pre-spacer is integrated into the CRISPR array, forming a new spacer). The three main stages are
illustrated in Figure 2.
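The three proposed stages can be summarized as a toy model. The sketch below is only a conceptual illustration under strong assumptions: the class and method names are hypothetical, spacers have a fixed length, the pre-spacer is taken from a caller-supplied position, recognition requires an exact match on either strand, and the repeat-derived parts of the crRNA are omitted. None of this reflects the actual biochemistry, which is far from resolved.

```python
from dataclasses import dataclass, field

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class ToyCrisprLocus:
    repeat: str
    spacers: list = field(default_factory=list)  # newest spacer first (leader-proximal)
    spacer_length: int = 8

    def array_dna(self) -> str:
        """DNA of the array: repeat, then spacer+repeat for every unit."""
        return self.repeat + "".join(s + self.repeat for s in self.spacers)

    def adapt(self, invader_dna: str, start: int = 0) -> None:
        """CRISPR-Adaptation: sample a pre-spacer and integrate it next to the leader."""
        pre_spacer = invader_dna[start:start + self.spacer_length]
        if len(pre_spacer) == self.spacer_length and pre_spacer not in self.spacers:
            self.spacers.insert(0, pre_spacer)

    def express(self) -> list:
        """CRISPR-Expression: one crRNA per spacer (repeat-derived flanks omitted)."""
        return [spacer.replace("T", "U") for spacer in self.spacers]

    def interfere(self, invader_dna: str) -> bool:
        """CRISPR-Interference: True if any crRNA matches the invader on either strand."""
        for crrna in self.express():
            guide = crrna.replace("U", "T")
            if guide in invader_dna or reverse_complement(guide) in invader_dna:
                return True
        return False


locus = ToyCrisprLocus(repeat="GAGTTCCC")
phage = "TTACGGATCCATGCAAGT"       # hypothetical invader sequence
print(locus.interfere(phage))      # naive cell: no matching spacer yet
locus.adapt(phage, start=4)        # acquire a spacer from the invader
print(locus.express())
print(locus.interfere(phage))      # the same invader is now recognized
```

Inserting new spacers at the front of the list mirrors the observation that recently added spacers are found proximal to the leader sequence.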
The stages of the CRISPR defense mechanism are discussed below as if they were separate, unconnected events to
facilitate explanation. One has to keep in mind that this arrangement is hypothetical and that a detailed
understanding of the CRISPR mechanism is far from complete at the present time.

Figure 2 The stages of the CRISPR/Cas system.
(A) CRISPR-Adaptation: a phage attacks an E. coli K12 cell and injects its DNA. The CRISPR/Cas machinery senses
the foreign DNA and Cas1 obtains a short piece of DNA called the pre-spacer (this is the sampling substage). The
pre-spacer is delivered to the CRISPR array and the newly acquired piece of DNA is integrated into the CRISPR
array as a new spacer (integration substage). At the bottom of the scheme, the CRISPR locus is shown similar to
that in Figure 1. (B) CRISPR-Expression: the CRISPR array is transcribed (including a piece of the leader
sequence) in the process of pre-crRNA transcription induced by promoter sequences in the leader. The cas genes
are also transcribed. The pre-crRNA is bound by Cascade complexes (yellow and gray transparent shapes) at every
CRISPR unit (repeat+spacer) and is cleaved by Cascade's CasE subunit (cleavage sites: small black triangles).
The resulting (mature) crRNAs remain bound to the Cascade complexes to guide host defense. (C)
CRISPR-Interference: when a new infection takes place, the Cascade-crRNA complexes and Cas3 (a
nuclease/helicase) recognize and bind the region complementary to the crRNA. The target is cleaved and the
infection is thwarted.

The first stage: CRISPR-Adaptation

The first core cas gene, cas1, is found in all CRISPR loci and is considered a hallmark of the different
CRISPR/Cas systems (Haft et al., 2005; Makarova et al., 2006). Despite the widespread distribution and thus
apparent importance of this gene among the CRISPR loci, its function is still uncertain. In E. coli, the Cas1
protein appeared to be unnecessary for CRISPR-Expression and CRISPR-Interference (Brouns et al., 2008).
Biochemical examination of Cas1 from Pseudomonas aeruginosa established that Cas1 has a sequence-independent
and metal-dependent endonuclease activity with specificity for DNA (Wiedenheft et al., 2009). Together with a
comparative genome analysis that suggested Cas1 to be a nuclease and/or integrase (Makarova et al., 2006), this
makes speculation about the role of Cas1 much less arbitrary. Cas1 might act during the CRISPR-Adaptation stage
and particularly the sampling substage, in which a proto-spacer (a region in the foreign DNA identical to a
spacer) is selected and a pre-spacer is excised. Furthermore, Cas1 might also be involved in the integration of
the new spacer because it contains a putative integrase domain (Makarova et al., 2006). Prolonged incubation of
DNA substrates with Cas1 from P. aeruginosa resulted in a minimal product size of 80 bp (Wiedenheft et al.,
2009). If Cas1 is truly involved in sampling and/or integration of foreign DNA, a crucial step that further
trims this 80 bp fragment to 32 bp (the size of a typical spacer in P. aeruginosa) remains to be uncovered. A
study using proteome chips indicated that Cas1 has affinity for damaged DNA (nucleotide mismatches or presence
of abasic sites) (Chen et al., 2008). The potential of Cas1 to interact with nucleic acids was further
confirmed by Han and colleagues, who studied Cas1 from Sulfolobus solfataricus (Han et al., 2009). However, no
nuclease activity could be detected for this protein; instead, DNA-annealing activity was observed, suggesting
the involvement of Cas1 in DNA rearrangements during spacer insertion. In short, important observations of Cas1
from different organisms suggest that the protein is involved in one or more steps of the CRISPR-Adaptation
stage. Another core Cas protein, Cas4, has been proposed to be a RecB-like nuclease (Makarova et al., 2006). In
several genomes, the cas4 and cas1 genes are either adjacent or fused. This suggests that the products of these
genes could act in concert during the CRISPR-Adaptation stage (van der Oost et al., 2009). Knock-out
experiments of cas7 (another core cas gene) in Streptococcus thermophilus suggested that the protein is
involved in the CRISPR-Adaptation stage, because its absence did not affect CRISPR-Expression and
CRISPR-Interference after infection by phages but prevented the addition of new spacers (Barrangou et al.,
2007). It is not obvious whether Cas7 is functionally equivalent to Cas1 or Cas2, because all three are
suspected to be involved in spacer acquisition/insertion and none of them is required for CRISPR-Expression and
CRISPR-Interference. In S. thermophilus (Barrangou et al., 2007), cas7 and cas1 occur in the same CRISPR locus,
in contrast to cas2. This could indicate that cas2 corresponds to cas7 (Sorek et al., 2008; van der Oost et al.,
2009).
cas2 is widely distributed in CRISPR loci, although to a lesser extent than cas1 (Makarova et al., 2006;
Beloglazova et al., 2008). Cas2 from S. solfataricus was crystallized and investigated biochemically
(Beloglazova et al., 2008). The protein appeared to be a metal-dependent ssRNA endonuclease with a preference
for U-rich regions. The protein bears a peculiar β-α-β protein fold similar to folds observed in many
RNA-binding proteins (Beloglazova et al., 2008). Similar to Cas1, Cas2 was established to be dispensable for
CRISPR-Expression and CRISPR-Interference in E. coli (Brouns et al., 2008). In agreement with this conclusion,
Cas2 appeared not to be capable of specifically processing pre-crRNA (Beloglazova et al., 2008). This argues
against the involvement of Cas2 in pre-crRNA maturation (during the CRISPR-Expression stage) or target
degradation (during the CRISPR-Interference stage). It is not straightforward to reconcile the RNA-specific
nuclease, Cas2, with a biological role during the CRISPR-Adaptation stage, because different CRISPR/Cas systems
(which also contain Cas2) were shown to be DNA-specific (Brouns et al., 2008; Marraffini and Sontheimer, 2008).
From a different approach, in E. coli, Cas2 was shown to accumulate at the cell poles (Tang et al., 2010),
which are the regions in E. coli most susceptible to phage invasion (Edgar et al., 2008). This could indicate
that this ssRNA endonuclease is involved in the CRISPR-Adaptation stage, because we cannot rule out that
CRISPRs also defend against RNA phages.
It is tempting to speculate about additional roles for the CRISPR/Cas system, for example, in (translational)
regulation. This idea is supported by the frequent observation of spacers homologous to genomic sequences (Cui
et al., 2008). A correlation was found between the presence of a CRISPR locus and swarming motility in
phage-infected P. aeruginosa. Lysogenic infection of the bacterium by phage DMS3 appeared to abolish swarming
behavior and biofilm formation only in CRISPR/Cas-positive P. aeruginosa populations containing an anti-DMS3
spacer. Disruption of the CRISPR array or cas genes followed by DMS3 infection prevented the change in swarming
motility (Zegans et al., 2009). This finding suggested that the CRISPR/Cas system could be involved in the
regulation of cellular processes.
A recent study established that Pelobacter carbinolicus
bears a spacer (spacer 1) homologous to a region in its
own histidyl-tRNA synthetase gene (hisS) (Aklujkar and
Lovley, 2010). Expression of spacer 1 in the closely related
Geobacter sulfurreducens (in which its hisS was replaced
with that of P. carbinolicus) was shown to retard the growth
of G. sulfurreducens. This supported the hypothesis that
spacer 1 is active and induces interference with hisS gene
expression. In the same study, genetic examinations revealed
that ancestral genes in P. carbinolicus containing multiple
closely spaced histidine-codons were eliminated, apparently
due to selective pressure by the impaired expression of histidyl-tRNA synthetase. This study suggests that spacers targeting endogenous genes are maintained in the genome and
serve to carry out regulatory roles in gene expression.


Another recent study covering many self-targeting CRISPR spacers challenged the idea that CRISPRs are involved
in regulation by self-targeting spacers (Stern et al., 2010). This study showed that self-targeting spacers
lacked conservation among related strains, a finding suggesting that such spacers lack biological significance.
In addition, the study established that the repeats (the sequence of which is believed to be important for
crRNA maturation) flanking self-targeting spacers were twice as likely to be found altered when compared to
other repeat sequences, implying that individual unwanted spacers can be eliminated (Stern et al., 2010).
Self-targeting spacers might even lead to the inactivation of entire CRISPR loci, an idea based on the
observation that self-targeting spacers are often found at positions of recent addition (proximal to the leader
sequence) (Stern et al., 2010). Similar conclusions were also drawn from E. coli studies (Díez-Villaseñor et
al., 2010). An alternative view on some self-targeting spacers was recently published by Touchon and Rocha
(2010). Instead of being the result of erroneous host spacer integration, these spacers were proposed to target
mobile genetic elements carrying functional CRISPR/Cas systems to prevent their uptake (Touchon and Rocha,
2010).
Protospacer selection and autoimmunity

Intuitively, immune systems must have a way to discriminate between self and foreign entities. The RMS
generally relies on DNA methylation (for example, dam and dcm methylation in E. coli) to discriminate between
foreign (non-methylated) and methylated self-DNA. One must note that two concepts of self/foreign
discrimination should be involved in the CRISPR/Cas system. The first applies during the CRISPR-Adaptation
stage, when the cell is supposed to detect alien DNA and to activate the CRISPR/Cas sampling apparatus.
Secondly, and more importantly, the cell is required to discriminate between its own and foreign DNA during the
CRISPR-Interference stage to prevent autoimmunity. Recent important findings shed light on aspects of the
mechanism employed by the CRISPR/Cas system in proto-spacer selection and prevention of autoimmunity. A
nucleotide motif (AGAA) was found downstream of proto-spacers corresponding to spacers in the CRISPR arrays of
the lactic acid bacterium S. thermophilus (Deveau et al., 2008; Horvath et al., 2008). This motif was detected
three times more often on the phage's coding strand than on the non-coding strand (Deveau et al., 2008). This
finding suggests that proto-spacer selection is not arbitrary, at least in S. thermophilus, that the CRISPR/Cas
system prefers the coding strand over the non-coding one, and that subsequent counter-selection for the motif
on the coding strand has taken place. Phages that mutated the AGAA motif were able to escape the CRISPR/Cas
system.
These motifs were termed Proto-spacer Adjacent Motifs
(PAMs) by Mojica et al. (2009) (Figure 2A) and were found
to coincide with repeat types according to Kunin et al.
(2007). Because repeat types are linked to Cas subtypes, this
implies that different Cas subtypes specifically recognize different PAMs (Mojica et al., 2009). Mutations in the PAM
can prevent CRISPR-mediated defense (Deveau et al., 2008;
Mojica et al., 2009; Semenova et al., 2009; Marraffini and
Sontheimer, 2010).
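The dependence of interference on an intact motif next to the proto-spacer can be sketched as follows. The function reports where a spacer sequence occurs in a genome and is followed by the motif. The function name, the `gap` parameter and all sequences are hypothetical; the motif is assumed to lie directly downstream of the proto-spacer (`gap=0`) purely for simplicity, and only one strand is searched with exact matching.

```python
def targetable_protospacers(genome: str, spacer: str, pam: str = "AGAA", gap: int = 0) -> list:
    """Return start positions where `spacer` occurs and `pam` follows after `gap` nt."""
    positions = []
    start = genome.find(spacer)
    while start != -1:
        pam_start = start + len(spacer) + gap
        if genome[pam_start:pam_start + len(pam)] == pam:
            positions.append(start)
        start = genome.find(spacer, start + 1)
    return positions


spacer = "ACGTACGGTA"                           # hypothetical spacer
wild_type = "CC" + spacer + "AGAA" + "TT"       # proto-spacer with intact motif
escape_mutant = "CC" + spacer + "AGCA" + "TT"   # single substitution in the motif

print(targetable_protospacers(wild_type, spacer))
print(targetable_protospacers(escape_mutant, spacer))
```

In this toy model the escape mutant still carries a perfect proto-spacer, yet it is no longer reported as a target, which mirrors the observation that mutations in the PAM can prevent CRISPR-mediated defense.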
Proto-spacer-flanking regions were recently tested in
Staphylococcus epidermidis by Marraffini and Sontheimer
(2010). The data from this study strongly suggested that the
repeat-derived flanks in the crRNA play a pivotal role in self/
non-self discrimination. CRISPR-mediated resistance
appeared to be triggered by at least two consecutive mismatches, at positions -4 to -2 (relative to the first
nucleotide of the proto-spacer), between the 5'-repeat-derived flank
of the crRNA and the target DNA. A perfect match between
the crRNA, including its repeat-derived flanks, and a target DNA indicates recognition of self-DNA (chromosomal CRISPR), and interference leading to
autoimmunity is prevented (Marraffini and Sontheimer,
2010) (Figure 3).
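The discrimination rule described above can be written down as a short Python sketch. It is a simplification and not the authors' method: the function name `interference_licensed` and the example flanks are hypothetical, and both flanks are written 5'->3' in the same sense, so that identical letters stand for a base pair between crRNA and target.

```python
def interference_licensed(crrna_flank: str, target_flank: str) -> bool:
    """Apply the two-consecutive-mismatch rule to positions -4 to -2.

    Both flanks end at position -1, immediately upstream of the proto-spacer.
    Identical letters are taken to mean "complementary" (a modelling shortcut).
    """
    if len(crrna_flank) < 4 or len(target_flank) < 4:
        raise ValueError("flanks must cover positions -4 to -1")
    crrna_flank, target_flank = crrna_flank.upper(), target_flank.upper()
    # Negative indices map directly onto positions -4, -3 and -2.
    mismatch = [crrna_flank[i] != target_flank[i] for i in (-4, -3, -2)]
    return (mismatch[0] and mismatch[1]) or (mismatch[1] and mismatch[2])


repeat_flank = "ACGAGAAC"  # invented repeat-derived flank of the crRNA
print(interference_licensed(repeat_flank, "ACGAGAAC"))  # chromosomal CRISPR: full match
print(interference_licensed(repeat_flank, "ACGATTAC"))  # invader: mismatches at -4 and -3
```

A target that pairs with the flank at these positions is treated as self and is spared, whereas two adjacent mismatches license interference.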
Figure 3 Autoimmunity is prevented by the CRISPR/Cas machinery during the CRISPR-Interference stage.
For simplicity, no Cas proteins are shown. A crRNA induces CRISPR-Interference only when the flanking repeat-derived regions (green
ribbons) show no or limited complementarity to the target, as is the case when viral DNA (red cylinder) is encountered. When the spacer-derived region (red ribbon) and the flanking repeat-derived regions both show complementarity to the target, the CRISPR/Cas system does
not interfere with the target. The latter indicates the recognition of the own (chromosomal) CRISPR array and self-interference is prevented
(reproduced from Marraffini and Sontheimer, 2010).

Investigating the spacer content of the genomes of more
than 13 crenarchaeal acidothermophiles and the genome of
S. thermophilus suggested that proto-spacers are integrated
into the CRISPR array randomly and irrespective of strand
polarity (Shah et al., 2009; Mills et al., 2010). Similar findings were also made in Escherichia and Salmonella (Touchon and Rocha, 2010). These findings contrast with early
S. thermophilus results, which suggested that proto-spacers
are not randomly selected (Deveau et al., 2008) and that
PAMs mark potential proto-spacers. The presence of PAMs
appears to be a general phenomenon among the CRISPR/
Cas systems (Mojica et al., 2009). However, spacer uptake
could still be a random process in which only spacers corresponding to compatible PAMs are maintained in a population due
to the increased fitness of the host.
Investigation of CRISPRs from S. thermophilus by Barrangou and colleagues provided the first experimental evidence
linking CRISPRs to acquired immunity against bacteriophages (Barrangou et al., 2007). Naturally generated bacteriophage-resistant mutants were obtained after challenging
wild-type S. thermophilus with phages. Analysis of the
CRISPR I locus from these bacteriophage-insensitive
mutants showed that each immune mutant had expanded its
CRISPR array by insertion of one or several new spacers.
The newly acquired spacers interestingly matched genomic
regions from the phage used in the challenge. The artificial
introduction of these naturally acquired spacers in the original host resulted in de novo immunity against that phage
(Barrangou et al., 2007), showing that resistance was due to
the presence of the spacers. Whereas typically only one spacer is added upon a phage challenge, increased resistance is
obtained by the addition of more spacers against a certain
phage (Barrangou et al., 2007; Brouns et al., 2008; Deveau
et al., 2008). Importantly, the expansion
of the CRISPR array occurs in a polarized fashion, as new
spacers are almost always inserted at the first (leader-proximal) position of the
CRISPR array (Pourcel et al., 2005; Lillestøl et al., 2006;
Deveau et al., 2008; Horvath et al., 2008; Tyson and Banfield, 2008; van der Ploeg, 2009; Mills et al., 2010; Touchon
and Rocha, 2010). The addition of new spacers is suggested
to co-occur with the addition of a new repeat sequence, most
likely by the duplication of an adjacent repeat (van der Oost
et al., 2009; Touchon and Rocha, 2010). Spacer deletion was
also observed but appeared to take place sporadically and
seemed to happen at the time of spacer addition (Deveau et
al., 2008; Mills et al., 2010). This spacer deletion activity is
believed to be a mechanism for preventing overinflation of
the CRISPR array (Pourcel et al., 2005; Sorek et al., 2008).
In addition to being added in a polarized
fashion, spacers appear to have a polarized orientation. In
other words, spacers are inserted with their PAMEs (spacer
edges equivalent to proto-spacer edges adjacent to PAMs)
pointing towards the leader, analogous to magnets in a magnetic field (Mojica et al., 2009).
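The polarized expansion of the array can be sketched as a small data structure in Python. This is a toy model under stated assumptions, not a description of the molecular mechanism: the class `CrisprArray`, its method names and all sequences are hypothetical, spacer orientation (PAMEs) is ignored, and the duplication of a repeat is modelled only implicitly by always emitting one more repeat than spacers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrisprArray:
    """Toy CRISPR array; index 0 of `spacers` is the leader-proximal spacer."""
    repeat: str
    spacers: List[str] = field(default_factory=list)

    def acquire(self, new_spacer: str) -> None:
        # New spacers enter at the leader-proximal end, so older spacers
        # are pushed towards the leader-distal end.
        self.spacers.insert(0, new_spacer)

    def repeat_count(self) -> int:
        # Each spacer is flanked by repeats, giving one repeat more than spacers.
        return len(self.spacers) + 1

    def sequence(self, leader: str = "") -> str:
        # leader, then repeat-spacer units, then the terminal repeat.
        return leader + "".join(self.repeat + s for s in self.spacers) + self.repeat


array = CrisprArray(repeat="GTTTTAGA", spacers=["AAAA", "CCCC"])  # invented sequences
array.acquire("GGGG")
print(array.spacers)
print(array.repeat_count())
```

Because insertion always happens at index 0, the list order doubles as a relative record of acquisition, with the most recent spacer first.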
The second stage: CRISPR-Expression

Transcription of CRISPRs was initially reported in Archaeoglobus fulgidus by Tang et al. in 2002 and later in S. solfataricus in 2005 by the same group (Tang et al., 2002, 2005).
It was suggested that CRISPR arrays are transcribed as long
precursor transcripts and are successively cleaved into smaller RNAs. Additionally, different studies have provided evidence in other organisms for the existence of transcripts
spanning the entire CRISPR array or intermediate smaller
transcripts only covering parts of the CRISPR array (Brouns
et al., 2008; Hale et al., 2008; Lillestøl et al., 2009). In 2008,
significant discoveries were made unraveling aspects of the
processing of CRISPR transcripts and the Cas protein
machinery in E. coli K12 (Brouns et al., 2008). A complex
of five Cas proteins (CasA-E), termed Cascade, appeared to
be responsible for the maturation of crRNAs (production of
crRNAs from pre-crRNA). Generation of mature crRNAs
was demonstrated to be required for CRISPR-mediated resistance (Brouns et al., 2008). A member of the Cascade complex, CasE (or Cse3), was shown to be responsible for the
processing of pre-crRNA by introducing cuts inside the
repeat sequences. An inactivating CasE mutation (H20A) did
not affect incorporation of CasE into the Cascade complex
but resulted in non-cleaved pre-crRNA and abolished anti-phage immunity (Brouns et al., 2008). These observations
indicated that CasE is needed for crRNA maturation and that
mature crRNAs are in turn required for immunity. Likewise,
a functional homolog of CasE, Cas6, exhibited specific
endonuclease activity that leads to the maturation of crRNA
in P. furiosus (Carte et al., 2008, 2010). Despite their poor
sequence similarity, both CasE and Cas6 proteins appear to
have similar folds (ferredoxin-like folds) and exhibit similar
activity (Ebihara et al., 2006; Carte et al., 2008; van der Oost
et al., 2009). Recently, a novel Ypest-type CasE/Cas6 homolog, Csy4 from P. aeruginosa, was crystallized and analyzed
(Haurwitz et al., 2010). This protein displays low sequence
similarity with CasE and Cas6 but does exhibit structural and
functional homology with these crRNA-producing proteins.
The processing of pre-crRNAs involves cleavage within
the repeat sequences. As a rule, mature
crRNAs start with the last eight nucleotides of the 5'-end
flanking repeat sequence (Brouns et al., 2008; Carte et al.,
2008; Hale et al., 2009; Haurwitz et al., 2010). This consistent feature is termed the 5'-handle (Brouns et al., 2008) and is
believed to aid in binding of crRNAs by the Cascade(-like)
complex. By contrast, the 3'-end of the crRNAs is not constant (Tang et al., 2002, 2005; Carte et al., 2008; Hale et al.,
2009), which indicates that a second, unknown cleavage
activity is responsible for trimming the 3'-end-flanking
repeat part. To summarize, the CRISPR array is transcribed
into a long transcript, called pre-crRNA, that is believed to span the entire
CRISPR array. The precursor is cleaved
within the repeats to form (transient) immature crRNAs,
which are further processed into mature crRNAs in some
microorganisms.
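The primary cleavage step summarized above can be mimicked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `cleave_pre_crrna`, the 12-nt toy repeat and the spacers are invented, every repeat is assumed to be identical and cut at the same position, and the subsequent, still uncharacterized 3'-end trimming is not modelled.

```python
from typing import List


def cleave_pre_crrna(pre_crrna: str, repeat: str, handle_len: int = 8) -> List[str]:
    """Cut inside every repeat so that the last `handle_len` nt of the repeat
    stay attached to the 5'-end of the downstream spacer (the 5'-handle)."""
    cut_offset = len(repeat) - handle_len
    if cut_offset < 0:
        raise ValueError("handle cannot be longer than the repeat")
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = pre_crrna.find(repeat, pos + len(repeat))
    # Fragments between consecutive cuts: 5'-handle + spacer + 3' repeat remnant.
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


repeat = "CCCCAUAAACCG"  # invented repeat; its last 8 nt form the handle
pre_crrna = "AAA" + repeat + "UUUUUU" + repeat + "GGAGGA" + repeat + "AA"
for fragment in cleave_pre_crrna(pre_crrna, repeat):
    print(fragment)
```

Each fragment starts with the same eight repeat-derived nucleotides, while its 3'-end still carries the remainder of the next repeat, which is the part that would be trimmed further in some organisms.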
As discussed above, new spacers are added at the leader
side of the repeat array. This implies that the older a spacer
is, the more distal it is located with respect to the leader and
vice versa. In E. coli, no correlation was observed between
the amounts of crRNAs and the spacer-leader distance. This
indicates that the distance to the leader is of little importance for the abundance of a given crRNA and further supports the view that the entire CRISPR locus is transcribed and
subsequently processed (Pougach et al., 2010). Conversely,
P. furiosus studies suggest that crRNAs homologous to the
spacers most proximal to the leader sequence (that is, recently acquired spacers) are more abundant in the cell than
crRNAs that correspond to spacers more distal to the leader
sequence (that is, older spacers) (Hale et al., 2008). This
discrepancy between E. coli and P. furiosus is possibly the result of two different systems with different features. It is tempting to speculate about an
energy-saving strategy employed by the cell in which
recently added spacers are more readily available to combat
current invaders, whereas older spacers are expressed to a
lesser extent to maintain immunity against less probable
older attackers.
Expression regulation Until recently, limited
information was available on the transcriptional regulation of
CRISPR loci. DNase I and KMnO4 footprinting showed that
the leader sequence flanking the CRISPR I locus in E. coli
contains an active promoter (Pcrispr1) that
is proposed to be responsible for the transcription of the
CRISPR array (Pougach et al., 2010; Pul et al., 2010). The
transcription of the CRISPR array was found to start
upstream of the first repeat sequence (inside the leader
sequence) and to continue over the entire CRISPR array, as is evident from Northern blot analysis of E. coli and P. furiosus
RNA (Hale et al., 2008; Lillestøl et al., 2009; Pul et al.,
2010). In contrast to S. thermophilus experimental results
(Mills et al., 2010), numerous attempts failed to retrieve naturally generated bacteriophage-insensitive mutants from E.
coli K12 (Pougach et al., 2010). Low endogenous levels of Cas proteins were suggested to be the limiting factor
in CRISPR-mediated resistance in E. coli (Pougach et al.,
2010; Westra et al., 2010). Expression studies in E. coli
revealed a transcriptional inhibitor termed Heat-stable
Nucleoid Structuring (H-NS) protein, which binds the promoter region in the leader sequence and inhibits transcription
of the CRISPR array (Pul et al., 2010). Aside from
Pcrispr1, the CRISPR I locus of E. coli K12 carries two
other promoters (Pcas and anti-Pcas) in the intergenic region
between casA (ygcL) and cas3 (ygcB) (IGLB; Figure
1) (Pul et al., 2010). Pcas
leads to the expression of CasA and presumably the entire
cluster of cas genes (casABCDE12) as a polycistronic
mRNA (Shimada et al., 2009; Westra et al., 2010). These
promoters were found to be repressed by H-NS, which limits
transcription of both the CRISPRs and the cas genes (Pul et al., 2010). In contrast, overexpression of LeuO
activated transcription of CRISPRs and cas genes by
antagonizing H-NS (Westra et al., 2010). The proteins H-NS
and LeuO are two main candidates for the regulation of
CRISPR-mediated defense at a transcriptional level (Pougach et al., 2010; Pul et al., 2010; Westra et al., 2010). It
has also been suggested that other unidentified RNase(s)
contribute to post-transcriptional control of CRISPR transcripts (Pougach et al., 2010; Pul et al., 2010).
Transcriptional profiling of Thermus thermophilus HB8
during ΦYS40 phage infection demonstrated a significant
increase in the transcription of most cas genes and CRISPR
arrays (Agari et al., 2010). The data also suggested that a
number of cas genes were regulated by the cyclic AMP
receptor protein (Shinkai et al., 2007) whereas others were
regulated by unknown factors (Agari et al., 2010). By contrast, transcriptional analysis of E. coli infected with PRD1
phages showed no significant changes in cas gene transcription in response to infection (Poranen et al., 2006). To our knowledge, only a small number of studies have
investigated cas gene transcription in response to phage
infection. Future studies will be required to provide more
information on the CRISPR response to infection.
The third stage: CRISPR-Interference

The presence of a CRISPR spacer homologous to a region
in phage or plasmid DNA is firmly correlated with immunity
against that particular genetic element. However, a thorough
understanding of the mechanism of CRISPR-Interference is
lacking. Based on their E. coli studies, Brouns and coworkers
revealed that Cascade alone was able to process pre-crRNA
but did not suffice to provide immunity. A core Cas protein,
Cas3, was found to be indispensable for immunity to the
phage used (Brouns et al., 2008). Inspection of the cas3 gene
sequence revealed that Cas3 contains nuclease and helicase
domains (Makarova et al., 2006). Cas3 was hypothesized to
be responsible for the inactivation of the target DNA (Brouns
et al., 2008). Cas5 (or Csn1) was also found to be necessary
for immunity in S. thermophilus and, because the protein contains a nuclease domain, it was suggested to be involved in
interference (Barrangou et al., 2007). Recently, Garneau et
al. showed that knock-out of cas5 in S. thermophilus prevented target DNA from being cleaved in vivo, despite the
presence of an anti-target spacer (Garneau et al., 2010). In
the same study it was also shown that the targets were specifically cleaved within the proto-spacer.
It is widely accepted that the CRISPR/Cas system relies
on RNA-guided DNA-interference to eliminate invaders
(Barrangou et al., 2007; Brouns et al., 2008; Marraffini and
Sontheimer, 2008; Diez-Villasenor et al., 2010; Garneau et
al., 2010). Artificial anti-phage CRISPR spacers in E. coli
could provide resistance irrespective of the viral DNA strand
these spacers were derived from (either coding or non-coding), suggesting that DNA is targeted (Brouns et al., 2008).
This hypothesis was further confirmed in S. epidermidis by
Marraffini and Sontheimer employing conjugation efficiency
to screen for resistance (Marraffini and Sontheimer, 2008).
A strain of S. epidermidis containing a functional CRISPR/
Cas system and an anti-nickase spacer was shown to target
a plasmid carrying the nickase gene as expected. However,
when the proto-spacer (within the nickase region) was interrupted by a self-splicing intron, no interference was observed, providing strong evidence that DNA, rather than
RNA, is targeted (Marraffini and Sontheimer, 2008). Yet, these
observations do not preclude that certain types of CRISPR/
Cas systems could operate via an RNA-guided RNA-interference mechanism. This is well exemplified by the RAMP
module protein complex (Cmr1-6) from P. furiosus. The
RAMP module proteins were able to form a complex in
vitro. When armed with a target-specific crRNA, the protein
complex was able to specifically cleave ssRNA targets but
not DNA (Hale et al., 2009). The different CRISPR/Cas systems within one organism could function in a complementary
way, providing the prokaryotic cell with a more comprehensive immune system. In addition, the diversity in gene content, phylogeny and repeat sequence between the Cas
subtypes supports the hypothesis that the different CRISPR/
Cas systems are divergent in function (Kunin et al., 2007).

Variations
Horizontal gene transfer (HGT) is a major contributor to diversity and rapid adaptation in prokaryotes (Brigulla and Wackernagel, 2010). Ironically, the CRISPR loci, which are
essential to combat foreign DNA, could themselves once have
entered cells as foreign DNA in the process of HGT, as suggested by phylogenetic analysis and comparative genomic
studies (Godde and Bickerton, 2006; Tyson and Banfield,
2008; Horvath et al., 2009; Chakraborty et al., 2010; Touchon
and Rocha, 2010). The gain of antibiotic resistance appears to
be negatively affected by the CRISPR/Cas system. A recent
study demonstrated that enterococcal strains that lack complete
CRISPR/Cas systems are more likely to gain resistance genes
for antibiotics (Palmer and Gilmore, 2010). This suggests
that a compromised immune system can be beneficial under
certain kinds of stress. In addition, an interesting in vivo study
in S. thermophilus established that the CRISPR/Cas system can
cause loss of a plasmid (carrying antibiotic resistance) when
the bacterium is grown over several generations in antibiotic-free medium (Garneau et al., 2010). The re-introduction of the
plasmid into those S. thermophilus strains that had lost it
appeared to be inhibited by the CRISPR/Cas system.
Driven by selection, the CRISPR/Cas system must have
evolved and pervaded prokaryotic genomes, testifying to its
value as an immune system to combat the phages, which
immensely outnumber their prokaryotic hosts (Breitbart and Rohwer, 2005). Phages and prokaryotes are involved in never-ending warfare. In response to the phages
that constantly attempt to escape CRISPR-mediated immunity, CRISPR/Cas systems evolve to ensure their functionality (Andersson and Banfield, 2008; Deveau et al., 2008;
Heidelberg et al., 2009; He and Deem, 2010). Phages escape
the CRISPR/Cas system by introducing
changes in the PAMs or nucleotide substitutions
within the proto-spacer (to counteract recognition). The
CRISPR/Cas system is forced to respond by incorporating
new spacers from the viral genome.

Applications
The observation that spacers are added at one end of the
CRISPR array is widely supported. This implies that the
order of the spacers within the CRISPR array can serve as
a timeline of previous genetic aggression and can provide
evolutionary data on phage/host evolution. On a population
level, spacers that are more distal to the leader sequence are
less diversified than those proximal to the leader (He and
Deem, 2010). Variations and similarities in the CRISPR content can thus be exploited for phylogenetic and evolutionary
studies (Cui et al., 2008). Spacer oligonucleotide typing, or
in short spoligotyping, is a term introduced in 1997 by
Goyal and colleagues, who successfully established this
CRISPR-based technique for M. tuberculosis genotyping
(Goyal et al., 1997). Spoligotyping is not only helpful for
studying evolution but it could also be of good use during
pathogen outbreaks to quickly determine the source of the
outbreaks and design appropriate prevention and treatment
strategies (Cui et al., 2008). In addition, spoligotyping could
be extremely useful for studies of historical epidemics using
ancient DNA (for example, DNA obtained from human
remains). Spoligotyping has a number of benefits, including
small targets and low contamination sensitivity, making it a
potentially effective method for investigation of ancient DNA
(Vergnaud et al., 2007). From an industrial perspective,
knowledge of the CRISPR/Cas system helps in generating
industry-relevant phage-resistant bacterial mutants. For
example, the frequent use of S. thermophilus in the manufacturing of dairy products has led to increased phage-related
problems. Generation of phage-resistant bacterial mutants
makes it possible to rotate between strains
(that differ only in spacer content) to minimize the impact
of phage infections (Mills et al., 2010). It should be noted
that spacer acquisition is a natural process performed by
the CRISPR/Cas system. This process is not regarded as
genetic modification, which is particularly important in food-related applications. Taking an entirely different approach,
Snyder and colleagues recently proposed the use of prokaryotic spacer-based microarrays to screen for novel viruses
in extreme thermal environments and to detect transient changes in phage populations (Snyder et al., 2010).
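The idea behind spacer-based typing mentioned above can be illustrated in silico with a short Python sketch. It is an abstraction rather than the laboratory protocol: the function names, the spacer labels and the reference panel are hypothetical, and each isolate is reduced to the presence or absence of spacers from a fixed, ordered panel.

```python
from typing import Iterable, Sequence


def spoligotype(isolate_spacers: Iterable[str], reference_panel: Sequence[str]) -> str:
    """Return a binary pattern: '1' if the panel spacer is present, else '0'."""
    present = set(isolate_spacers)
    return "".join("1" if spacer in present else "0" for spacer in reference_panel)


def pattern_distance(a: str, b: str) -> int:
    """Number of panel positions at which two patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must come from the same reference panel")
    return sum(x != y for x, y in zip(a, b))


panel = ["s1", "s2", "s3", "s4", "s5"]  # invented spacer labels
isolate_a = spoligotype(["s1", "s2", "s3", "s5"], panel)
isolate_b = spoligotype(["s1", "s3", "s5"], panel)
print(isolate_a, isolate_b, pattern_distance(isolate_a, isolate_b))
```

Isolates with identical or nearly identical patterns would be candidates for a shared origin, which is the kind of comparison that makes spacer content useful during outbreak investigations.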

Concluding remarks
It is remarkable that the CRISPR/Cas system went
unnoticed for years, despite its presence in many prokaryotes.
Perhaps the repression of the CRISPR system in some prokaryotic
species (e.g., E. coli K12), sterile lab conditions and the
underrepresentation of CRISPRs in many bacteria relative to
archaea all contributed to concealing the CRISPR/Cas system. The
CRISPR/Cas system was proposed to act in a manner similar
to RNA interference (RNAi) in eukaryotes (Makarova et al.,
2006). The RNAi mechanism indeed shares some similarities
with the CRISPR/Cas system, for instance, the interference principle whereby a small invader-derived RNA is used to search for
unwanted target nucleic acid(s). Whether the CRISPR/Cas system is also involved in internal regulation of expression, as
is the case for RNAi, remains unknown at present. On the level
of amino acid sequence, the CRISPR/Cas machinery does not
exhibit homology to the RNAi proteins (Makarova et al.,
2006). In addition, unlike in eukaryotic RNAi, DNA is the
target of most characterized CRISPR systems (van der Oost
and Brouns, 2009).

How the CRISPR apparatus is triggered upon invasion is
an important question that remains to be answered. The
answer would allow the understanding and the manipulation of the CRISPR/Cas system for possible future applications. Other important questions also remain unanswered,
such as: what is the principle of foreign DNA recognition;
how are new repeats generated; what is the mechanism of
spacer integration; and how does the cell adapt and survive so
fast? This review demonstrates that a lot of valuable data
on the CRISPR/Cas system has been generated in a relatively
short period of time; however, it also demonstrates that a
vast number of questions remain to be answered.

Acknowledgments
Special thanks to Moqadas Sayed for his help with the illustrations
used in this review. The Netherlands Organization for Scientific
Research (NWO) is acknowledged for financial support.

References
Agari, Y., Sakamoto, K., Tamakoshi, M., Oshima, T., Kuramitsu,
S., and Shinkai, A. (2010). Transcription profile of Thermus
thermophilus CRISPR systems after phage infection. J. Mol.
Biol. 395, 270–281.
Aklujkar, M. and Lovley, D.R. (2010). Interference with histidyl-tRNA synthetase by a CRISPR spacer sequence as a factor in
the evolution of Pelobacter carbinolicus. BMC Evol. Biol. 10,
230.
Andersson, A.F. and Banfield, J.F. (2008). Virus population dynamics and acquired virus resistance in natural microbial communities. Science 320, 1047–1050.
Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P.,
Moineau, S., Romero, D.A., and Horvath, P. (2007). CRISPR
provides acquired resistance against viruses in prokaryotes.
Science 315, 1709–1712.
Beloglazova, N., Brown, G., Zimmerman, M.D., Proudfoot, M.,
Makarova, K.S., Kudritska, M., Kochinyan, S., Wang, S.,
Chruszcz, M., Minor, W., et al. (2008). A novel family of
sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol.
Chem. 283, 20361–20371.
Bolotin, A., Quinquis, B., Sorokin, A., and Ehrlich, S.D. (2005).
Clustered regularly interspaced short palindrome repeats
(CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561.
Breitbart, M. and Rohwer, F. (2005). Here a virus, there a virus,
everywhere the same virus? Trends Microbiol. 13, 278–284.
Brigulla, M. and Wackernagel, W. (2010). Molecular aspects of gene
transfer and foreign DNA acquisition in prokaryotes with regard
to safety issues. Appl. Microbiol. Biotechnol. 86, 1027–1041.
Brouns, S.J., Jore, M.M., Lundgren, M., Westra, E.R., Slijkhuis,
R.J., Snijders, A.P., Dickman, M.J., Makarova, K.S., Koonin,
E.V., and van der Oost, J. (2008). Small CRISPR RNAs guide
antiviral defense in prokaryotes. Science 321, 960–964.
Bult, C.J., White, O., Olsen, G.J., Zhou, L., Fleischmann, R.D.,
Sutton, G.G., Blake, J.A., FitzGerald, L.M., Clayton, R.A.,
Gocayne, J.D., et al. (1996). Complete genome sequence of the
methanogenic archaeon, Methanococcus jannaschii. Science
273, 1058–1073.

Carte, J., Wang, R., Li, H., Terns, R.M., and Terns, M.P. (2008).
Cas6 is an endoribonuclease that generates guide RNAs for
invader defense in prokaryotes. Genes Dev. 22, 3489–3496.
Carte, J., Pfister, N.T., Compton, M.M., Terns, R.M., and Terns, M.P.
(2010). Binding and cleavage of CRISPR RNA by Cas6. RNA
16, 2181–2188.
Chakraborty, S., Snijders, A.P., Chakravorty, R., Ahmed, M., Tarek,
A.M., and Hossain, M.A. (2010). Comparative network clustering of direct repeats (DRs) and cas genes confirms the possibility
of the horizontal transfer of CRISPR locus among bacteria. Mol.
Phylogenet. Evol. 56, 878–887.
Chen, C.S., Korobkova, E., Chen, H., Zhu, J., Jian, X., Tao, S.C.,
He, C., and Zhu, H. (2008). A proteome chip approach reveals
new DNA damage recognition activities in Escherichia coli.
Nat. Methods 5, 69–74.
Cui, Y., Li, Y., Gorge, O., Platonov, M.E., Yan, Y., Guo, Z., Pourcel,
C., Dentovskaya, S.V., Balakhonov, S.V., Wang, X., et al. (2008).
Insight into microevolution of Yersinia pestis by clustered regularly interspaced short palindromic repeats. PLoS One 3, e2652.
Deveau, H., Barrangou, R., Garneau, J.E., Labonte, J., Fremaux, C.,
Boyaval, P., Romero, D.A., Horvath, P., and Moineau, S. (2008).
Phage response to CRISPR-encoded resistance in Streptococcus
thermophilus. J. Bacteriol. 190, 1390–1400.
Diez-Villasenor, C., Almendros, C., Garcia-Martinez, J., and Mojica, F.J. (2010). Diversity of CRISPR loci in Escherichia coli.
Microbiology 156, 1351–1361.
Ebihara, A., Yao, M., Masui, R., Tanaka, I., Yokoyama, S., and
Kuramitsu, S. (2006). Crystal structure of hypothetical protein
TTHB192 from Thermus thermophilus HB8 reveals a new protein family with an RNA recognition motif-like domain. Protein
Sci. 15, 1494–1499.
Edgar, R., Rokney, A., Feeney, M., Semsey, S., Kessel, M., Goldberg, M.B., Adhya, S., and Oppenheim, A.B. (2008). Bacteriophage infection is targeted to cellular poles. Mol. Microbiol. 68,
1107–1116.
Edwards, R.A. and Rohwer, F. (2005). Viral metagenomics. Nat. Rev.
Microbiol. 3, 504–510.
Ettema, T.J., de Vos, W.M., and van der Oost, J. (2005). Discovering
novel biology by in silico archaeology. Nat. Rev. Microbiol. 3,
859–869.
Garneau, J.E., Dupuis, M.E., Villion, M., Romero, D.A., Barrangou,
R., Boyaval, P., Fremaux, C., Horvath, P., Magadan, A.H., and
Moineau, S. (2010). The CRISPR/Cas bacterial immune system
cleaves bacteriophage and plasmid DNA. Nature 468, 67–71.
Godde, J.S. and Bickerton, A. (2006). The repetitive DNA elements
called CRISPRs and their associated genes: evidence of horizontal transfer among prokaryotes. J. Mol. Evol. 62, 718–729.
Goyal, M., Saunders, N.A., van Embden, J.D., Young, D.B., and
Shaw, R.J. (1997). Differentiation of Mycobacterium tuberculosis isolates by spoligotyping and IS6110 restriction fragment
length polymorphism. J. Clin. Microbiol. 35, 647–651.
Grissa, I., Vergnaud, G., and Pourcel, C. (2007). The CRISPRdb
database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics 8, 172.
Groenen, P.M., Bunschoten, A.E., van Soolingen, D., and van Embden, J.D. (1993). Nature of DNA polymorphism in the direct
repeat cluster of Mycobacterium tuberculosis; application for
strain differentiation by a novel typing method. Mol. Microbiol.
10, 1057–1065.
Haft, D.H., Selengut, J., Mongodin, E.F., and Nelson, K.E. (2005).
A guild of 45 CRISPR-associated (Cas) protein families and
multiple CRISPR/Cas subtypes exist in prokaryotic genomes.
PLoS Comput. Biol. 1, e60.


Hale, C., Kleppe, K., Terns, R.M., and Terns, M.P. (2008). Prokaryotic silencing (psi)RNAs in Pyrococcus furiosus. RNA 14,
2572–2579.
Hale, C.R., Zhao, P., Olson, S., Duff, M.O., Graveley, B.R., Wells,
L., Terns, R.M., and Terns, M.P. (2009). RNA-guided RNA
cleavage by a CRISPR RNA-Cas protein complex. Cell 139,
945–956.
Han, D. and Krauss, G. (2009). Characterization of the endonuclease
SSO2001 from Sulfolobus solfataricus P2. FEBS Lett. 583,
771–776.
Han, D., Lehmann, K., and Krauss, G. (2009). SSO1450 a CAS1
protein from Sulfolobus solfataricus P2 with high affinity for
RNA and DNA. FEBS Lett. 583, 1928–1932.
Haurwitz, R.E., Jinek, M., Wiedenheft, B., Zhou, K., and Doudna,
J.A. (2010). Sequence- and structure-specific RNA processing
by a CRISPR endonuclease. Science 329, 1355–1358.
He, J. and Deem, M.W. (2010). Heterogeneous diversity of spacers
within CRISPR (clustered regularly interspaced short palindromic repeats). Phys. Rev. Lett. 105, 128102.
Heidelberg, J.F., Nelson, W.C., Schoenfeld, T., and Bhaya, D.
(2009). Germ warfare in a microbial mat community: CRISPRs
provide insights into the co-evolution of host and viral genomes.
PLoS One 4, e4169.
Horvath, P., Romero, D.A., Coute-Monvoisin, A.C., Richards, M.,
Deveau, H., Moineau, S., Boyaval, P., Fremaux, C., and Barrangou, R. (2008). Diversity, activity, and evolution of CRISPR loci
in Streptococcus thermophilus. J. Bacteriol. 190, 1401–1412.
Horvath, P., Coute-Monvoisin, A.C., Romero, D.A., Boyaval, P.,
Fremaux, C., and Barrangou, R. (2009). Comparative analysis
of CRISPR loci in lactic acid bacteria genomes. Int. J. Food
Microbiol. 131, 62–70.
Hyman, P. and Abedon, S.T. (2010). Bacteriophage host range and
bacterial resistance. Adv. Appl. Microbiol. 70, 217–248.
Ishino, Y., Shinagawa, H., Makino, K., Amemura, M., and Nakata,
A. (1987). Nucleotide sequence of the iap gene, responsible for
alkaline phosphatase isozyme conversion in Escherichia coli,
and identification of the gene product. J. Bacteriol. 169,
5429–5433.
Jansen, R., Embden, J.D., Gaastra, W., and Schouls, L.M. (2002).
Identification of genes that are associated with DNA repeats in
prokaryotes. Mol. Microbiol. 43, 1565–1575.
Kunin, V., Sorek, R., and Hugenholtz, P. (2007). Evolutionary conservation of sequence and secondary structures in CRISPR
repeats. Genome Biol. 8, R61.
Labrie, S.J., Samson, J.E., and Moineau, S. (2010). Bacteriophage
resistance mechanisms. Nat. Rev. Microbiol. 8, 317–327.
Lillestøl, R.K., Redder, P., Garrett, R.A., and Brugger, K. (2006). A
putative viral defence mechanism in archaeal cells. Archaea 2,
59–72.
Lillestøl, R.K., Shah, S.A., Brugger, K., Redder, P., Phan, H., Christiansen, J., and Garrett, R.A. (2009). CRISPR families of the
crenarchaeal genus Sulfolobus: bidirectional transcription and
dynamic properties. Mol. Microbiol. 72, 259–272.
Makarova, K.S., Aravind, L., Grishin, N.V., Rogozin, I.B., and Koonin, E.V. (2002). A DNA repair system specific for thermophilic
Archaea and bacteria predicted by genomic context analysis.
Nucleic Acids Res. 30, 482–496.
Makarova, K.S., Grishin, N.V., Shabalina, S.A., Wolf, Y.I., and Koonin, E.V. (2006). A putative RNA-interference-based immune
system in prokaryotes: computational analysis of the predicted
enzymatic machinery, functional analogies with eukaryotic
RNAi, and hypothetical mechanisms of action. Biol. Direct 1, 7.
Marraffini, L.A. and Sontheimer, E.J. (2008). CRISPR interference
limits horizontal gene transfer in staphylococci by targeting
DNA. Science 322, 1843–1845.
Marraffini, L.A. and Sontheimer, E.J. (2010). Self versus non-self
discrimination during CRISPR RNA-directed immunity. Nature
463, 568–571.
Mills, S., Griffin, C., Coffey, A., Meijer, W.C., Hafkamp, B., and
Ross, R.P. (2010). CRISPR analysis of bacteriophage-insensitive
mutants (BIMs) of industrial Streptococcus thermophilus - implications for starter design. J. Appl. Microbiol. 108, 945–955.
Mojica, F.J., Ferrer, C., Juez, G., and Rodriguez-Valera, F. (1995).
Long stretches of short tandem repeats are present in the largest
replicons of the Archaea Haloferax mediterranei and Haloferax
volcanii and could be involved in replicon partitioning. Mol.
Microbiol. 17, 85–93.
Mojica, F.J., Diez-Villasenor, C., Soria, E., and Juez, G. (2000).
Biological significance of a family of regularly spaced repeats
in the genomes of Archaea, Bacteria and mitochondria. Mol.
Microbiol. 36, 244–246.
Mojica, F.J., Diez-Villasenor, C., Garcia-Martinez, J., and Soria, E.
(2005). Intervening sequences of regularly spaced prokaryotic
repeats derive from foreign genetic elements. J. Mol. Evol. 60,
174–182.
Mojica, F.J., Díez-Villaseñor, C., García-Martínez, J., and Almendros, C. (2009). Short motif sequences determine the targets of
the prokaryotic CRISPR defence system. Microbiology 155,
733–740.
Palmer, K.L. and Gilmore, M.S. (2010). Multidrug-resistant Enterococci lack CRISPR-cas. MBio 1, e00227-10.
Poranen, M.M., Ravantti, J.J., Grahn, A.M., Gupta, R., Auvinen, P.,
and Bamford, D.H. (2006). Global changes in cellular gene
expression during bacteriophage PRD1 infection. J. Virol. 80,
8081–8088.
Pougach, K., Semenova, E., Bogdanova, E., Datsenko, K.A., Djordjevic, M., Wanner, B.L., and Severinov, K. (2010). Transcription,
processing and function of CRISPR cassettes in Escherichia
coli. Mol. Microbiol. 77, 1367–1379.
Pourcel, C., Salvignol, G., and Vergnaud, G. (2005). CRISPR elements in Yersinia pestis acquire new repeats by preferential
uptake of bacteriophage DNA, and provide additional tools for
evolutionary studies. Microbiology 151, 653–663.
Pul, U., Wurm, R., Arslan, Z., Geissen, R., Hofmann, N., and Wagner, R. (2010). Identification and characterization of E. coli
CRISPR-cas promoters and their silencing by H-NS. Mol.
Microbiol. 75, 1495–1512.
Semenova, E., Nagornykh, M., Pyatnitskiy, M., Artamonova, I.I.,
and Severinov, K. (2009). Analysis of CRISPR system function
in plant pathogen Xanthomonas oryzae. FEMS Microbiol. Lett.
296, 110–116.
Shah, S.A., Hansen, N.R., and Garrett, R.A. (2009). Distribution of
CRISPR spacer matches in viruses and plasmids of crenarchaeal
acidothermophiles and implications for their inhibitory mechanism. Biochem. Soc. Trans. 37, 23–28.
She, Q., Singh, R.K., Confalonieri, F., Zivanovic, Y., Allard, G.,
Awayez, M.J., Chan-Weiher, C.C., Clausen, I.G., Curtis, B.A.,
De Moors, A., et al. (2001). The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci.
USA 98, 7835–7840.
Shimada, T., Yamamoto, K., and Ishihama, A. (2009). Involvement
of the leucine response transcription factor LeuO in regulation
of the genes for sulfa drug efflux. J. Bacteriol. 191, 4562–4571.
Shinkai, A., Kira, S., Nakagawa, N., Kashihara, A., Kuramitsu, S.,
and Yokoyama, S. (2007). Transcription activation mediated by
a cyclic AMP receptor protein from Thermus thermophilus HB8.
J. Bacteriol. 189, 3891–3901.

Snyder, J.C., Bateson, M.M., Lavin, M., and Young, M.J. (2010).
Use of cellular CRISPR (clusters of regularly interspaced short
palindromic repeats) spacer-based microarrays for detection of
viruses in environmental samples. Appl. Environ. Microbiol. 76,
7251–7258.
Sorek, R., Kunin, V., and Hugenholtz, P. (2008). CRISPR - a widespread system that provides acquired resistance against phages
in bacteria and archaea. Nat. Rev. Microbiol. 6, 181–186.
Stern, A., Keren, L., Wurtzel, O., Amitai, G., and Sorek, R. (2010).
Self-targeting by CRISPR: gene regulation or autoimmunity?
Trends Genet. 26, 335–340.
Tang, T.H., Bachellerie, J.P., Rozhdestvensky, T., Bortolin, M.L.,
Huber, H., Drungowski, M., Elge, T., Brosius, J., and Huttenhofer, A. (2002). Identification of 86 candidates for small non-messenger RNAs from the archaeon Archaeoglobus fulgidus.
Proc. Natl. Acad. Sci. USA 99, 7536–7541.
Tang, T.H., Polacek, N., Zywicki, M., Huber, H., Brugger, K., Garrett, R., Bachellerie, J.P., and Huttenhofer, A. (2005). Identification of novel non-coding RNAs as potential antisense
regulators in the archaeon Sulfolobus solfataricus. Mol. Microbiol. 55, 469–481.
Tang, J., Akerboom, J., Vaziri, A., Looger, L.L., and Shank, C.V.
(2010). Near-isotropic 3D optical nanoscopy with photon-limited
chromophores. Proc. Natl. Acad. Sci. USA 107, 10068–10073.
Thony-Meyer, L. and Kaiser, D. (1993). devRS, an autoregulated
and essential genetic locus for fruiting body development in
Myxococcus xanthus. J. Bacteriol. 175, 7450–7462.
Touchon, M. and Rocha, E.P. (2010). The small, slow and specialized CRISPR and anti-CRISPR of Escherichia and Salmonella.
PLoS One 5, e11126.
Tyson, G.W. and Banfield, J.F. (2008). Rapidly evolving CRISPRs
implicated in acquired resistance of microorganisms to viruses.
Environ. Microbiol. 10, 200–207.
van der Oost, J. and Brouns, S.J. (2009). RNAi: prokaryotes get in
on the act. Cell 139, 863–865.
van der Oost, J., Jore, M.M., Westra, E.R., Lundgren, M., and
Brouns, S.J. (2009). CRISPR-based adaptive and heritable
immunity in prokaryotes. Trends Biochem. Sci. 34, 401–407.
van der Ploeg, J.R. (2009). Analysis of CRISPR in Streptococcus
mutans suggests frequent occurrence of acquired immunity
against infection by M102-like bacteriophages. Microbiology
155, 1966–1976.
Vergnaud, G., Li, Y., Gorge, O., Cui, Y., Song, Y., Zhou, D., Grissa,
I., Dentovskaya, S.V., Platonov, M.E., Rakin, A., et al. (2007).
Analysis of the three Yersinia pestis CRISPR loci provides new
tools for phylogenetic studies and possibly for the investigation
of ancient DNA. Adv. Exp. Med. Biol. 603, 327–338.
Westra, E.R., Pul, U., Heidrich, N., Jore, M.M., Lundgren, M., Stratmann, T., Wurm, R., Raine, A., Mescher, M., Van Heereveld,
L., et al. (2010). H-NS-mediated repression of CRISPR-based
immunity in Escherichia coli K12 can be relieved by the transcription activator LeuO. Mol. Microbiol. 77, 1380–1393.
Wiedenheft, B., Zhou, K., Jinek, M., Coyle, S.M., Ma, W., and
Doudna, J.A. (2009). Structural basis for DNase activity of a
conserved protein implicated in CRISPR-mediated genome
defense. Structure 17, 904–912.
Wilson, G.G. (1991). Organization of restriction-modification systems. Nucleic Acids Res. 19, 2539–2566.
Zegans, M.E., Wagner, J.C., Cady, K.C., Murphy, D.M., Hammond,
J.H., and O'Toole, G.A. (2009). Interaction between bacteriophage DMS3 and host CRISPR region inhibits group behaviors
of Pseudomonas aeruginosa. J. Bacteriol. 191, 210–219.

Received October 27, 2010; accepted December 27, 2010
