
3 Vectors

Key concepts

• Vectors are autonomously replicating DNA molecules that can be used to carry foreign DNA fragments
• DNA fragments cloned into vectors can be isolated in quantities sufficient for most laboratory manipulations
• Vectors are based on naturally occurring DNA sequences that have been modified and combined to serve particular functions
• The choice of vector depends chiefly upon the size of the DNA molecules that must be inserted into it
• Vectors have been developed for a variety of specific purposes, including the production of single-stranded DNA, the high-level expression of protein-encoding genes, and the production of RNA

The vast majority of molecular cloning experiments utilize the bacterium Escherichia coli for the propagation of cloned DNA fragments. Even if the final destination of a cloned DNA fragment is a eukaryotic cell, DNA constructs are invariably produced in E. coli prior to being shuttled into their ultimate host. Cloning is possible in other organisms, but the advantages of E. coli have led to its widespread acceptance as the genetic engineering organism of choice.

Escherichia coli, named for the German physician Theodor Escherich (1857–1911), is a gram-negative, rod-shaped bacterium propelled by long, rapidly rotating flagella (Figure 3.1). It is part of the normal flora of the human mouth and gut, helping to protect the intestinal tract from bacterial infection, aiding digestion and producing small amounts of vitamins B12 and K. The bacterium, which is also found in soil and water, is widely used in laboratory research and is probably the most thoroughly studied life form. As a laboratory organism, E. coli has a number of distinct advantages.

• It is easy to grow in simple, inexpensive growth medium.
• The organism has a rapid doubling time of about 20–30 minutes during log-phase growth.
• Its genetics are well understood.
• Laboratory strains of E. coli are generally safe and contain mutations that do not allow them to escape the laboratory environment.
• It has a fully mapped and sequenced genome.
• Extra-chromosomal copies of DNA (plasmids and bacteriophage DNA) can be exploited to carry foreign DNA fragments.

Figure 3.1. Structure of the bacterium Escherichia coli. A cut-away model of the bacterium showing some of the cellular layers and components, and E. coli cells viewed using an electron microscope (scale bar, 2 µm)

E. coli cells generally reproduce asexually but, in order to increase diversity and share the gene pool, they have mechanisms for the transfer of genetic material from one bacterium to another. As far back as the mid-1940s it was known that bacterial cells were able to exchange genetic material with each other in a semi-sexual manner. The experiments of Lederberg and Tatum clearly demonstrated the transfer of genetic information through bacterial conjugation (Lederberg and Tatum, 1946). The ability to perform this transfer is conferred by a set of genes called F (for fertility). These genes can exist on a circular piece of DNA that replicates independently from the bacterial chromosome, or they can be integrated into the chromosome. A bacterium containing these genes (often referred to as the male bacterium) uses a pilus to attach to a neighbouring bacterium. The two cells are then drawn together, and DNA is transferred from one bacterium to the other. There are three manifestations of the fertility factor.

• F+: also called the F episome. This is a large circular double-stranded DNA molecule (99 159 bp) that carries only the fertility genes. It is maintained in the bacterium as an extra-chromosomal plasmid.
• Hfr: the F element has become integrated into the E. coli genome. When conjugation occurs, the F genes start travelling across the pilus, dragging the rest of the genome behind them. Eventually the pilus breaks, so the entire genome is usually not transferred. The bacterial genome can therefore be measured, in minutes, from the origin of transfer: the time it takes for a particular gene to be transferred from one bacterium to another indicates how far it lies from the origin of transfer.
• F′: this is a large circular double-stranded DNA molecule that contains the fertility genes and a few other genes. These other genes are transferred very efficiently from one bacterium to the next because the transferred DNA is short enough to move across the connection between the two bacterial cells before the pilus breaks.

The ability of the F′ episome to replicate and be maintained as a separate entity from the bacterial chromosome would make it seem an ideal candidate to carry foreign DNA sequences. As we will see later, engineered versions of the F′ episome have indeed been modified to carry foreign DNA fragments. However, the size of the F′ episome precludes easy analysis and manipulation, and its gene transfer properties can make it unstable.

In general, foreign DNA fragments need to be carried in a vector to ensure propagation and replication within a host cell. A vector is probably best described as an autonomously replicating DNA sequence that can be used to carry foreign DNA fragments. All vectors are based on naturally occurring DNA sequences that can be replicated under particular circumstances. Most commonly used vectors are based either upon plasmids or bacteriophage lambda (λ).
In general, vectors can be thought of as a series of discrete modules that provide requirements essential for efficient molecular cloning. A vector that is used predominantly for reproducing the DNA fragment is often referred to as a cloning vector, while if it is used for expressing a gene contained within the cloned DNA, it is called an expression vector. A vector must possess the following characteristics to make it useful for molecular cloning:

• the ability to self-replicate
• a selectable characteristic so that transformed cells may be recognized from untransformed cells.

Table 3.1. The general properties of commonly used vectors

Vector            Main uses                                                    Maximum insert size (kb)   Example
Plasmid           General DNA manipulation; numerous specialized derivatives   10–20                      pBR322, pUC18
λ (insertion)     Construction of cDNA libraries                               10                         λgt11
λ (replacement)   Construction of genomic libraries                            23                         λZAP, EMBL4
Cosmid            Construction of genomic libraries                            44                         pJB8
M13               In vitro mutagenesis; DNA sequencing                         8–9                        M13mp18
Phagemid          General DNA manipulation; in vitro mutagenesis               10–20                      pBluescript
YAC               Construction of genomic libraries                            1000–2000                  pYAC4
PAC               Construction of genomic libraries                            75–90                      pAd10SacBII
BAC               Construction of genomic libraries                            130–150                    pBAC108L
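Since the choice of vector depends chiefly on insert size, the capacities in Table 3.1 can be treated as a simple lookup. The following is a minimal sketch, not a laboratory tool: the dictionary uses the upper bound wherever the table gives a range, and the names MAX_INSERT_KB and candidate_vectors are hypothetical, introduced only for illustration.

```python
# Maximum insert sizes (kb) as listed in Table 3.1; where the table gives a
# range, the upper bound is used. Names here are illustrative assumptions.
MAX_INSERT_KB = {
    "Plasmid": 20,
    "Lambda (insertion)": 10,
    "Lambda (replacement)": 23,
    "Cosmid": 44,
    "M13": 9,
    "Phagemid": 20,
    "YAC": 2000,
    "PAC": 90,
    "BAC": 150,
}


def candidate_vectors(insert_kb):
    """Return the vector types able to hold an insert of insert_kb kilobases,
    ordered from smallest to largest capacity."""
    fits = [(cap, name) for name, cap in MAX_INSERT_KB.items() if cap >= insert_kb]
    return [name for cap, name in sorted(fits)]


if __name__ == "__main__":
    # A 40 kb genomic fragment is too large for plasmids and lambda vectors.
    print(candidate_vectors(40))
```

Capacity is only the first filter; the intended use (library construction, sequencing, expression) narrows the choice further, as the "Main uses" column indicates.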

Additionally, most vectors will contain at least one, and often multiple, restriction enzyme recognition sites so that DNA fragments can be cloned into the vector relatively easily. A huge array of different types of vector is available today, with many being highly specialized and designed to perform a specific function. In this chapter, I will discuss some of the general points of vector design, but will concentrate on vectors that are commonly used in cloning experiments (Table 3.1).

3.1 Plasmids

Plasmids are naturally occurring extra-chromosomal DNA fragments that are stably inherited from one generation to another in the extra-chromosomal state. Plasmids are widely distributed throughout prokaryotes and range in size from approximately 1500 bp to over 300 kbp. Most plasmids exist as closed-circular double-stranded DNA molecules that often confer a particular phenotype onto the bacterial cell in which they are replicated. That is, the plasmid will often carry a gene that encodes resistance to either antibiotics or heavy metals, or that produces DNA restriction and modification enzymes, that the bacterium would not normally possess.

The replication of the plasmid is often coupled to that of the host cell in which it is maintained, with plasmid replication occurring at the same time as the host genome is replicated. Plasmids are often described as being either relaxed or stringent on the basis of the number of copies of the plasmid that are maintained within the cell. Relaxed plasmids (not to be confused with relaxed DNA, Chapter 1) are maintained at multiple copies per cell (10–200), while stringent plasmids are present at a single copy, or a low number of copies (1–2), per cell. At least part of the basis of this difference is the different mechanisms employed by plasmids in order to replicate themselves. In general, relaxed plasmids replicate using host-derived proteins, while stringent plasmids encode protein factors that are necessary for their own replication.
Box 3.1. Naming genes and DNA

The names given to plasmids, genes and other DNA fragments may, at first glance, appear to be nothing more than a jumbled collection of letters and numbers. The process of naming genes and segments of DNA began with geneticists describing genes associated with visible phenotypes, e.g. in mouse coat colour, c and A for the albino and agouti loci, respectively. Other gene names reflect biological function, for example Hbb for the haemoglobin β-chain, and Adh for alcohol dehydrogenase enzymatic activity. Drosophila geneticists have brought the most whimsical approach to gene naming, with names like fushi tarazu (ftz, from the Japanese words meaning segment deficient), spätzle (spz, a type of German noodle), dunce (dnc), forkhead (fkh), hedgehog (hh) and ether-a-go-go (eag).

The assignment of chromosomal locations for genes of unknown function developed soon after the establishment of successful metaphase spreads, chromosome banding methodologies, somatic cell hybrids, isozyme separation and the ability to associate genes and phenotypes with a particular site on the chromosome. It has also become common to use the same name for the gene as for the enzyme or other protein that it encodes. Often, the gene name is italicized whereas the gene product is not, to distinguish between the two. This can, however, be a source of confusion, since there is not necessarily a one-to-one relationship between the two entities. Additionally, nomenclature used for one organism or species may be different in another. This non-systematic approach to gene naming can result not only in the same gene function in different organisms being designated differently, but also in the same gene from the same organism having several different names depending upon which research group is describing it.

Traditionally, recombinant plasmids tend to bear the initials of their creator(s) followed by a number that may indicate the numerical order in which the plasmids were produced, or perhaps has some deeper meaning. For example, the name of the plasmid pBR322 can be dissected into the following components: p, plasmid; BR, named by Paco Bolivar and Ray Rodriguez, who developed the plasmid (Bolivar et al., 1977); and 322, the number of the plasmid within their stock collection. Plasmids constructed within my own laboratory have the nomenclature pRJRXXX, where XXX is a three-digit number indicating the order in which plasmids have been constructed so they can be easily identified in a laboratory plasmid list. Other plasmids are named for specific, but still comparatively obscure, reasons, e.g. the pUC plasmids are named after the University of California, while others have names that more readily indicate their predominant function, e.g. pYAC4. Although many attempts have been made to standardize nomenclature (White, Mattais and Nebert, 1998), historical names tend to be maintained. Indeed, the naming of DNA by its constructor does allow for wide variations that may not otherwise be seen.

Most plasmids in common use today are based upon the replication origin of the naturally occurring E. coli plasmid ColE1, or its very close relative pMB1 (see Box 3.1 for a description of plasmid names). ColE1 is a 6646 bp closed-circular DNA molecule that encodes a bacteriocin, colicin E1 (the product of the cea gene), and a resistance gene that allows the host bacterium to escape the effects of the bacteriocin (the product of the imm gene). Colicin E1 is a transmembrane protein that causes lethal membrane depolarization in bacteria (Konisky and Tokuda, 1979). The resistance gene codes for a protein that interferes with the action of colicin by inhibiting its ability to form a channel through the bacterial membrane (Zhang and Cramer, 1993).
Bacteria harbouring the ColE1 plasmid can be distinguished from their counterparts that do not possess the plasmid by their ability to grow on plates containing colicin E1. However, screening for this type of growth is technically difficult and has long since been abandoned in favour of simpler antibiotic screening methods (see below).

The ColE1 plasmid is replicated in a relaxed fashion using the DNA polymerase provided by the host cell. Unlike the bacterial origin of replication (OriC), however, the replication of ColE1 proceeds in one direction only. The plasmid does not encode proteins that initiate the replication process, but it does code for several non-translated RNA molecules that are involved in the initiation of replication and one protein that plays a role in regulating the RNA molecules. The ColE1 DNA replication origin is in a region of DNA from which two RNA molecules (RNAI and RNAII) are constitutively transcribed from their own promoters (Figure 3.2). RNAII is complementary to the ColE1 origin of replication and binds to it to form a DNA–RNA hybrid (Cesareni et al., 1991). The bound RNAII molecule is cleaved at the origin by the host-encoded enzyme RNase H, and serves as a primer from which DNA replication occurs. Through complementary base pairing, RNAII can bind to one DNA strand only, and hence the replication of the ColE1 plasmid is unidirectional.

Figure 3.2. The ColE1 plasmid and origin of replication. The ColE1 plasmid is a double-stranded closed-circular DNA molecule, 6646 bp in length. It codes for, amongst other things, the bacteriocin colicin E1 (the product of the cea gene) and an immunity protein (the product of the imm gene) that prevents the toxic effects of the bacteriocin in cells harbouring the plasmid. DNA replication occurs in one direction only, as indicated by the direction of the ori arrow. Two non-translated RNA molecules, termed RNAI and RNAII, and the protein product of the rop gene, control the replication process (see the text for details). The positions of these replication control elements relative to the origin (ori) are shown. The arrows on the genes indicate the direction of transcription

Control of the replication process is achieved by another non-translated RNA molecule. RNAI is complementary to the 5′-end of RNAII, and the RNA duplex formed between RNAII and RNAI cannot serve as a replication primer. RNAI is a relatively short-lived species, and consequently the ColE1 plasmid is maintained at a level of about 15–20 copies per cell. The interaction between RNAI and RNAII is stabilized by the ROP protein (Helmer-Citterich et al., 1988).

The mechanism of DNA replication employed by ColE1 has several important consequences. The arrangement of regulators results in plasmid incompatibility; that is, two plasmids with the same origin of replication cannot coexist in the same cell. RNAI of the resident plasmid prevents RNAII of the incoming plasmid from forming a primer, so it is not replicated. This is important for cloning experiments when a mixed population of plasmids, which is often the result of a ligation reaction, is transformed into bacteria. Individual transformants produced in this way will contain a single plasmid and not a mixed population. Plasmids with different replication origins (e.g. ColE1 and p15A) are able to co-exist within the same cell. An additional consequence of the ColE1 replication mechanism is that fresh protein synthesis is not required to initiate DNA replication. In the presence of antibiotics that block protein synthesis (e.g. chloramphenicol), chromosomal DNA replication is halted, but ColE1-based plasmids continue to be replicated and accumulate at high levels (1000–2000 copies per cell) (Clewell, 1972).

3.1.1 pBR322

ColE1, and its very close relative pMB1, have the potential to be useful cloning vectors, but they suffer from a number of disadvantages.
Primarily, the difficulty in identifying recombinant plasmids (those in which the plasmid DNA has been ligated to an insert) meant that they did not gain widespread usage. What was required was a plasmid that could be replicated in the same way as ColE1, but in which recombinants could be easily recognized. The plasmid pBR322 (Figure 3.3) was the first widely used plasmid vector. It is a small plasmid (4363 bp) that was constructed using components from naturally occurring plasmids and other DNA fragments (Bolivar et al., 1977). pBR322 contains the following components.

• Origin of replication. pBR322 carries the ColE1 replication origin and rop gene to ensure a reasonably high plasmid copy number (15–20 copies per cell), which can be increased 200-fold by chloramphenicol amplification.
• Antibiotic resistance genes. pBR322 carries two genes that can be used as selectable markers. The ampicillin resistance gene (termed bla or, more commonly, AMPR) was cloned into the plasmid from the Tn3 transposon, and the tetracycline resistance gene (termed tet or TETR) was cloned from the plasmid pSC101 (Bernardi and Bernardi, 1984).
• Cloning sites. The plasmid carries a number of unique restriction enzyme recognition sites. Some of these are located in one or other of the antibiotic resistance genes. For example, sites for PstI, PvuI and ScaI are found within AMPR, and sites for BamHI and HindIII are located within TETR.

Figure 3.3. The plasmid pBR322. This plasmid contains the ColE1 origin of replication (ori and rop), together with two antibiotic resistance genes. The ampicillin resistance gene (AMPR, or bla, encoding β-lactamase) and the tetracycline resistance gene (TETR) each contain a number of restriction enzyme recognition sites that occur only once within the plasmid sequence

The antibiotic resistance genes in pBR322 allow for the direct selection of recombinants in a process called insertional inactivation. For example, if we want to clone a DNA fragment into the BamHI site of pBR322, then the insert DNA will interrupt the gene responsible for tetracycline resistance, but the gene for ampicillin resistance will not be altered. Transformed cells are first grown on bacterial plates containing ampicillin to kill all the cells that do not contain a plasmid. Those cells that grow on ampicillin are then replica plated onto medium containing both ampicillin and tetracycline. Those cells that grow in the presence of ampicillin, but die under tetracycline selection, contain plasmids that have foreign DNA inserts (Figure 3.4). In other words, the insertion of a foreign DNA fragment into an antibiotic resistance gene inactivates the gene product and leads to antibiotic sensitivity.

Figure 3.4. Insertional inactivation of antibiotic resistance genes in pBR322. If a DNA fragment is inserted into the BamHI restriction enzyme recognition site within the TETR gene, the resulting recombinant plasmid, when transformed into bacteria, will still give rise to ampicillin resistance, but will be unable to promote bacterial growth on plates containing tetracycline. The TETR gene will be functionally inactivated by the presence of the insert DNA

3.1.2 pUC Plasmids

pBR322 was a breakthrough for molecular biology as the first widely used plasmid for molecular cloning, but the double screening procedure required to identify recombinant clones was both time consuming and error prone.
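The two-plate logic of the pBR322 screen (growth on ampicillin, then replica plating onto ampicillin plus tetracycline) can be written down as a small decision function. This is only a sketch of the reasoning for an insert cloned into a site within TETR, such as BamHI; the function name classify_colony and its boolean inputs are hypothetical.

```python
def classify_colony(grows_on_amp, grows_on_amp_tet):
    """Interpret replica-plating results for a fragment cloned into a site
    within the TETR gene of pBR322 (hypothetical helper, for illustration)."""
    if not grows_on_amp:
        # No functional AMPR gene: the cell did not take up a plasmid.
        return "no plasmid"
    if grows_on_amp_tet:
        # TETR is still intact, so nothing was inserted into it.
        return "non-recombinant"
    # Ampicillin resistant but tetracycline sensitive: TETR was inactivated
    # by the insert.
    return "recombinant"


if __name__ == "__main__":
    for amp, amp_tet in [(True, False), (True, True), (False, False)]:
        print(amp, amp_tet, "->", classify_colony(amp, amp_tet))
```

Every colony has to be scored on both plates before it can be classified, which is why the procedure is slow and prone to error.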
In 1982, a new series of plasmids was developed that permitted the identification of cells containing foreign DNA in a single screening step. These are called the pUC plasmids (Vieira and Messing, 1982). They have three important additional features compared with pBR322 (Figure 3.5).

• High copy number: a mutation within the origin of replication produces 500–600 copies of the plasmid per cell without the need for chloramphenicol amplification. The mutation, a G to A change one nucleotide upstream of the initiation site of RNAI, reduces the level of the RNAI transcript and consequently results in an increase in DNA replication using RNAII as the primer.

• Blue–white screening: screening of this type is a special form of insertional inactivation that can be used during the primary selection of transformants, rather than requiring a second round of screening. It utilizes the amino-terminal portion of E. coli β-galactosidase (called the α-peptide) encoded by the vector in a form of intermolecular complementation that restores β-galactosidase activity to a defective enzyme (the ω-peptide) encoded by the host. The E. coli enzyme β-galactosidase is a large polypeptide (monomer = 1173 amino acids; 117 kDa) that is the product of the lacZ gene. The active form of the enzyme is a homo-tetramer (468 kDa). Certain mutations in the 5′ region of lacZ prevent subunit association of the resultant protein (ω-peptide) and the monomers lack enzyme activity (Ullmann, 1992). In some such mutants, subunit assembly (and enzyme activity) can be restored by the presence of a small (50 or so amino acid) amino-terminal fragment of the lacZ product (the α-polypeptide) (Juers et al., 2000). Such lacZ mutants are said to be subject to α-complementation. The product of the lacZΔM15 allele lacks amino acids 11–41 of wild-type β-galactosidase and is subject to α-complementation. Messing and co-workers took advantage of α-complementation in constructing the pUC plasmid series of cloning vectors (Vieira and Messing, 1982). These vectors carry a multiple cloning site (MCS) embedded in the sequence for the α-peptide gene fragment. The MCS does not alter the reading frame of lacZ or destroy the ability of the fragment to α-complement (Figure 3.6). The insertion of other DNA fragments into the MCS will, however, interrupt the coding sequence of the α-peptide, rendering it non-functional. If a chromogenic substrate (XGal, 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) and a β-galactosidase inducer (IPTG) are included in the plates on which the primary transformants are selected, cells carrying non-recombinant molecules will catabolize the colourless substrate to give blue colonies, while recombinants will give white colonies.

• A multiple cloning site or polylinker. This is a synthetic piece of DNA that harbours the sequence of several unique restriction enzyme recognition sites. It was inserted within the portion of the vector encoding the β-galactosidase α-peptide in such a way that it does not affect its expression or function. However, inserting a foreign DNA fragment into any one of the polylinker restriction enzyme recognition sites invariably disrupts the activity of the α-peptide. Thus recombinant colonies remain white but non-recombinants turn blue.

Figure 3.5. The pUC plasmids. pUC18 is a small plasmid that contains a mutated version of the ColE1 replication origin that promotes high-copy-number DNA replication. The plasmids also contain the ampicillin resistance gene (AMPR) and the lacZ′ gene, which encodes the first 63 amino acids of lacZ, the α-peptide. Embedded within the coding sequence of lacZ′ are the recognition sites for a number of restriction enzymes. This multiple cloning site (or polylinker) is used to clone in DNA fragments. The presence of insert DNA will disrupt the function of the lacZ α-peptide and is used for screening. Different pUC plasmids differ in the composition of the multiple cloning site

Figure 3.6. α-complementation and XGal staining. XGal (5-bromo-4-chloro-3-indolyl-β-D-galactoside) is cleaved by a functional β-galactosidase enzyme into galactose and a 5-bromo-4-chloro-indoxyl derivative. The indoxyl spontaneously dimerizes and oxidizes to form an insoluble blue dye (5,5′-dibromo-4,4′-dichloro-indigo; not shown). The β-galactosidase enzyme is the product of the lacZ gene and the active form of the enzyme is a tetramer of identical polypeptides. Certain mutants of lacZ (such as lacZΔM15) produce versions of the protein that do not include the extreme amino-terminal end of the 1173 amino acid polypeptide. Such derivatives, termed the ω-peptide, are unable to form the active tetramer and are not functional as β-galactosidase enzymes. The ω-peptide can be made functional by co-expressing the lacZ α-peptide (amino acids 1–63) in the same cell. The α-peptide promotes tetramer formation and restores enzymatic function

The major cloning advantage of pUC plasmids over pBR322 is that foreign DNA fragments can be cloned into a variety of restriction enzyme sites and recombinants rapidly screened. Additionally, α-complementation requires only a small gene to be carried on the plasmid. The DNA encoding the α-peptide in the pUC series of vectors is less than 400 bp in length. If the entire lacZ open reading frame were needed, over 3500 bp of DNA would need to be maintained on these vectors. In general, the stability of many plasmid vectors decreases as their size increases. This limits the length of DNA that can be cloned into any particular plasmid. For example, the ability of bacterial cells to maintain recombinant pUC plasmids decreases significantly as their size approaches 15 kbp (Table 3.1). Since the α-polypeptide is small, the total size of the vector is minimized, allowing it to carry a correspondingly larger insert.

3.2 Selectable Markers

An essential feature of the plasmids we have discussed so far is the ability to accurately select for cells that have taken up the plasmid. Many such selection systems are currently available. The choice of selectable marker usually rests with the type of cell that is being transformed. Some of the markers will only function against prokaryotes, while others have a broader spectrum of action. Some of the commonly used selectable markers are listed below, together with their mechanism of action.

• Ampicillin: binds to and inhibits a number of enzymes in the bacterial membrane that are involved in the synthesis of the gram-negative cell wall. Therefore, proper cell replication cannot occur in the presence of ampicillin. The ampicillin resistance gene (AMPR or bla) codes for the enzyme β-lactamase, which is secreted into the periplasmic space of the bacterium, where it catalyzes hydrolysis of the β-lactam ring of the ampicillin. Thus, the product of the AMPR gene destroys the antibiotic. Over time the ampicillin in a culture medium or petri dish may be substantially destroyed by β-lactamase. When this occurs, selective pressure to maintain the plasmid is lost and cell populations can arise that lack the plasmid.
• Tetracycline: binds to a protein of the 30S subunit of the ribosome and inhibits ribosomal translocation along the mRNA, thereby interfering with protein translation. The tetracycline resistance gene (TETR) encodes a 399 amino acid outer membrane associated protein of gram-negative cells that prevents the antibiotic from entering the cell. Thus, the drug resistance gene does not destroy the antibiotic. Selective pressure to keep the plasmid containing the drug resistance gene will be maintained throughout the cell culture process.
• Chloramphenicol: binds to the ribosomal 50S subunit and inhibits protein synthesis. The chloramphenicol resistance gene (CMR) codes for chloramphenicol acetyltransferase (CAT). The CAT protein is a tetrameric cytosolic protein that, in the presence of acetyl coenzyme A, catalyzes the formation of hydroxyl acetoxy derivatives of chloramphenicol that are unable to bind to the ribosome. As with ampicillin, the CMR gene product destroys the antibiotic.
• Kanamycin and neomycin: bind to ribosomal components and inhibit protein synthesis. The KANR gene codes for a protein that is secreted into the periplasmic space and interferes with the transport of these antibiotics into the cell. Like tetracycline resistance, the KANR gene does not destroy the antibiotic.
• Bleomycin and zeocin: glycopeptide antibiotics that bind to DNA and inhibit DNA and RNA synthesis. They are active against most bacteria (including E. coli), eukaryotic microorganisms (e.g. yeast), plant cells and animal cells. The Sh ble gene from the bacterium Streptoalloteichus hindustanus encodes a small protein that confers resistance to zeocin by binding to the antibiotic (Gatignol, Durand and Tiraby, 1988).
• Hygromycin B: inhibits translation by interfering with ribosome translocation. The antibiotic is active against both prokaryotes and eukaryotes. The resistance gene (HYGR, encoding hygromycin-B-phosphotransferase) inactivates the antibiotic by phosphorylation.
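The markers listed above differ in whether the resistance gene product inactivates the antibiotic or merely keeps it away from its target, and the text notes for ampicillin that inactivation can erode selective pressure over time. The sketch below records the mechanisms exactly as described in this section. Extending the ampicillin caveat to every marker that inactivates its antibiotic is an assumption made here for illustration, and the names MARKERS and may_lose_selection are hypothetical.

```python
# Resistance mechanisms as described in this section. The boolean records
# whether the gene product inactivates the antibiotic (True) or only
# excludes or sequesters it (False).
MARKERS = {
    "ampicillin": ("AMPR (bla)", "hydrolysis by beta-lactamase", True),
    "tetracycline": ("TETR", "prevents entry into the cell", False),
    "chloramphenicol": ("CMR", "acetylation by CAT", True),
    "kanamycin": ("KANR", "interferes with transport into the cell", False),
    "zeocin": ("Sh ble", "binds the antibiotic", False),
    "hygromycin B": ("HYGR", "phosphorylation", True),
}


def may_lose_selection():
    """Antibiotics that are inactivated by the resistance gene product, so
    that selective pressure could fall as a culture ages (an assumption
    generalized from the ampicillin case in the text)."""
    return sorted(name for name, (_, _, inactivated) in MARKERS.items() if inactivated)


if __name__ == "__main__":
    print(may_lose_selection())
```

A table of this kind is mainly a reminder to check whether an old plate or a long culture still exerts selection before relying on it.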

Plasmid-based vectors are extremely widely used and have been adapted to serve a variety of functions. Many of these will be discussed in later chapters of this book, but here I will list a number of examples to give the reader a flavour of the diversity of plasmid use.

• General cloning: most DNA manipulation performed in the laboratory is carried out in plasmids. The ease of both use and storage of plasmid DNA molecules makes them a popular choice for most recombinant DNA experiments.
• Shuttle vectors: these plasmids contain not only the origin of replication and selectable marker for E. coli, but also functionally similar sequences for maintenance in other hosts. For example, plasmids for the cloning and expression of genes in the yeast Saccharomyces cerevisiae contain replication origins and selectable markers for both E. coli and yeast. Most DNA manipulations will be performed using E. coli as a host, prior to transformation of the final DNA construct into yeast.
• RNA production: many plasmids have been designed so that the foreign DNA fragments cloned into them can be transcribed into RNA. Such plasmids contain the promoter sites for an RNA polymerase, e.g. those from the bacteriophages T3, T7 or SP6, such that RNA can be made in vitro using the purified RNA polymerase and the plasmid DNA. The RNA made by this method is often used as a probe for hybridization in Northern blotting.
• Protein production: many plasmids contain promoter sequences to express the foreign genes that they contain. Often the expression is performed in E. coli but, using the appropriate promoters, protein expression can be driven in almost any organism. High-level protein production can be driven from a strong promoter, while low-level production is driven from weaker promoters. Levels of protein production may also be modulated by altering the copy number of the plasmid.
Given the wide range of plasmids that are availa le to researchers today, systems have een developed to move DNA fragments etween a variety of plasmids with differing functions. For example, if you have cloned a gene, you might want to express the gene product at high levels in E. coli, while also expressing the protein in mammalian tissue culture cells and producing a tagged version of the protein to which monoclonal anti odies are availa le. Systems have therefore een devised for the shuttling of DNA fragments etween 3.2 SELECTABLE MARKERS 125 Figure 3.7. Gene shuffling etween plasmids using recom ination. Genes are trans ferred from the donor plasmid to the acceptor plasmid at the loxP sites using the Cre recom inase. See the text for details vectors without having to use restriction enzymes. One such system is outlined in Figure 3.7. Here, the DNA fragment encoding the target gene is transferred from one plasmid to another y the action of the Cre recom inase. Cre is a 38 kDa recom inase protein from the acteriophage P1 (Stern erg et al., 1981). It mediates recom ination etween or within DNA sequences at specific locations called loxP sites (A remski and Hoess, 1984). These sites consist of two 13 p inverted repeats separated y an 8 p spacer region. The 8 p spacer region in the loxP site has a defined orientation that forces recom ination to occur in a precise direction and orientation. Donor plasmids contain two loxP sites, which flank the target gene. Acceptor plasmid contain a single loxP site and elements to which the target gene will ecome fused. The target gene, once 126 VECTORS 3 transferred, will ecome linked to the specific expression elements for which th e acceptor vector has een designed. Furthermore, if the coding sequence for the gene of interest is in frame with the upstream loxP site in the donor vector, it

will automatically be in frame with all peptides designed in the acceptor vector. An alternative donor and acceptor plasmid system is based upon site-specific recombination reactions mediated by phage λ (Karimi, Inzé and Depicker, 2002). In this case, DNA fragments flanked by recombination sites (att) can be transferred into vectors containing compatible recombination sites (attB × attP or attL × attR) in a reaction mediated by the λ recombination proteins.

The versatility of plasmids has led to their widespread acceptance as the vectors of choice for many gene manipulation experiments. Plasmids do, however, suffer from a number of significant shortcomings. First, the efficiency at which the plasmid is transferred to a bacterial cell is very low. Plasmid DNA molecules must be transformed into competent bacterial cells (see Chapter 2), but this process is inefficient. At best, E. coli cells can be made competent to a level such that 1 × 10^9 transformed cells can be generated per microgram of plasmid DNA. For a typical plasmid, 1 µg represents about 1.5 × 10^11 molecules. This means that the cells are taking up less than 1 per cent of the available DNA molecules. Second, the capacity of plasmids to carry large fragments of foreign DNA is limited (Table 3.1). Most plasmids become unstable if their overall size exceeds about 15 kbp. Plasmids larger than this tend to undergo recombination events, which can result in the reordering or elimination of DNA from them. Other types of vector have thus been developed to overcome these difficulties.

3.3 λ Vectors

In the early 1950s, André Lwoff described an astonishing property of Escherichia coli. When he irradiated certain strains of E. coli cells with a moderate dose of ultraviolet light, the bacteria stopped growing and, after about 90 min, the bacteria lysed and released many viral particles, called λ, into the culture media (Figure 3.8). The viruses, more commonly called bacteriophages or simply phages, are able to infect other E. coli cells that had not previously been infected by λ phages. Not all bacterial cells underwent this lytic phase when irradiated in this way. Most E. coli cells are relatively unaffected by such small ultraviolet doses. However, bacteria that had previously been exposed to phage λ, but had not undergone lysis, showed this remarkable property.

Upon infection by phage λ, E. coli cells will undergo one of two fates. Either cell lysis proceeds and newly synthesized phage particles will be released into the surrounding medium, or, alternatively, the phage can switch into a dormant lifestyle in which the phage DNA becomes integrated into the E. coli chromosome; in this case a lysogenic lifestyle is adopted. The life cycle of phage λ is shown diagrammatically in Figure 3.9. It is the lysogenic bacteria that show rapid lysis upon ultraviolet irradiation.

Figure 3.8. The structure of bacteriophage λ. An electron micrograph of λ phages that have been released upon bacterial lysis, and a diagrammatic representation of the overall structure of λ (head containing the viral genome, and tail; scale bar 100 nm). The EM image is courtesy of Professor Ross Inman (University of Wisconsin) and is reproduced with permission. Other excellent EM images of λ can be found on Professor Inman's web site (http://www.biochem.wisc.edu/inman/empics/)

In the laboratory, λ phage growth and replication is monitored on petri dishes (Figure 3.10). The phage is mixed with E. coli cells in a soft agar solution called top agar. The mixture is poured onto the surface of a nutrient agar plate and incubated to allow bacterial

growth. λ growth will cause the death (lysis) of the E. coli cells surrounding the site of initial infection. Such sites are observed on the plates as somewhat turbid plaques in the bacterial lawn. λ DNA can be purified from the phage particles contained within these plaques.

The genetics and molecular biology of bacteriophage λ have been extensively studied; for further information on λ, readers are directed to the excellent text by Mark Ptashne (Ptashne, 1992). The DNA contained within the phage is a linear double-stranded molecule 48 502 bp in length. The extreme 5′- and 3′-ends of the genome each have 12 bases that are single-stranded, called the cohesive or cos ends. These sequences are complementary and can anneal with each other to form a circular double-stranded DNA molecule (Figure 3.11). Functionally related genes of λ are generally clustered together on the genome, except for the two positive regulatory genes N and Q. Genes on the left-hand side of the conventional genome map code for head and tail proteins of the phage particle. These are followed by genes whose protein products are concerned with recombination and the processes of lysogeny, in which the circularized phage chromosome is inserted into the host chromosome and is stably replicated as a prophage. To the right of the map are genes concerned with transcriptional regulation and prophage immunity to superinfection (N, cro, cI), followed by the genes for DNA synthesis, late function regulation (Q) and host cell lysis.

Our extensive knowledge of phage λ and the ways in which the lytic and lysogenic life cycles are regulated have made λ an ideal vector to carry foreign DNA fragments. The major advantage of λ-based vectors over plasmids is the efficiency at which the phage can infect E. coli cells. As we have already discussed, the transformation of plasmid DNA into bacteria is not an efficient process, whereas λ infection is a very efficient way to introduce DNA into a bacterial cell.
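As a small illustration of the cos-end complementarity described above, the following Python sketch checks whether two 12-base single-stranded overhangs could anneal with each other. The two overhang sequences are an assumption, read from the sequence labels printed on Figure 3.11, and the function and variable names are hypothetical ones chosen only for this example.

```python
# Illustrative sketch: test whether two single-stranded overhangs can base-pair.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_anneal(overhang_a: str, overhang_b: str) -> bool:
    """True if two overhangs (both written 5'->3') are fully complementary."""
    return overhang_a.upper() == reverse_complement(overhang_b)


# Assumed overhang sequences (taken from the labels on Figure 3.11), both 5'->3'.
LEFT_COS_OVERHANG = "GGGCGGCGACCT"
RIGHT_COS_OVERHANG = "AGGTCGCCGCCC"

if __name__ == "__main__":
    # Report the overhang length and whether the two ends can pair.
    print(len(LEFT_COS_OVERHANG),
          can_anneal(LEFT_COS_OVERHANG, RIGHT_COS_OVERHANG))
```

Because the two ends pair along their full length, the linear genome can close into a circle once inside the cell; the same check applied to two copies of the same end fails, which is why annealing requires one end of each type.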
To understand how λ can be exploited as a vector, it is important to have a basic knowledge of the phage itself. λ phage infection and lysis occurs in a number of defined steps. Infection occurs as a result of the adsorption of the phage particle to the bacterial cell by binding to the maltose receptor. The λ genomic DNA is injected into the cell and almost immediately circularizes. At this point it can enter one of two pathways.

Lysogenic pathway. The phage DNA becomes integrated into the bacterial genome (via site-specific recombination between attP and the bacterial genomic attB site) and is replicated along with the bacterial DNA. The prophage DNA remains integrated until it is induced to enter the lytic pathway.

Lytic pathway. Large-scale production of bacteriophage particles (proteins and DNA) occurs that eventually leads to the lysis of the cell.

The decision as to whether lysis or lysogeny occurs is the result of the activity of the cII protein. Active cII is required for the transcription of the cI repressor and for some of the genes required for phage DNA integration into the E. coli chromosome. Active cII results in the adoption of the lysogenic pathway, while inactive cII results in the lytic pathway being followed. The cII protein is relatively unstable and is susceptible to cleavage and destruction by bacterial proteases. Environmental conditions influence the activities of these proteases. When grown in rich medium, for example, the proteases are generally active, such that cII is degraded and lysis occurs. Under conditions of E. coli starvation, the proteases are less functional and, consequently, λ will more frequently lysogenize. This behaviour makes sense, for in starved cells there will be less of the components necessary to make new phage particles.

Figure 3.9. λ life cycle. Upon infection, bacteriophage λ attaches to the surface of a bacterial cell, and its DNA enters the bacterium. Almost immediately, the DNA circularizes. The DNA can then enter either the lysogenic or the lytic pathway. During lysogeny, the λ DNA integrates into the E. coli chromosome and is replicated, along with the host DNA, such that the prophage is passed on to subsequent generations. In the lytic phase, the DNA does not integrate, but is immediately replicated and transcribed to produce new phage particles. Eventually, bacterial cell lysis occurs and the newly formed phage are released into the surrounding medium. The lysogenic prophage may be induced into the lytic cycle by, for example, treatment with UV light. In this case the λ DNA loops out of the E. coli genome and the lytic pathway is initiated.

Figure 3.10. λ plaques. λ phage is grown in the laboratory on a lawn of bacterial cells. The bacteria and the phage particles are mixed with liquid, but cool, top agar. The mixture is then poured onto an already set agar plate where the top agar is allowed to solidify. The plate is then incubated for 12–16 h at 37 °C. λ plaques form as turbid circles in the bacterial lawn.

Figure 3.11. The circular and linear forms of the λ genome. λ DNA exists in a linear form in the bacteriophage and in a circular form upon entering the bacterium. The switch from the linear to the circular form occurs through complementation of the overhanging DNA ends at the cos sites. Many of the genes required for the integration of λ into the host chromosome, or for new phage replication and assembly, are grouped together on the chromosome. Some of these genes, or sets of genes, are shown. A region of the genome that is not required for lytic growth is indicated.
[Map labels: λ DNA, 48 502 bp; cos, attP, cIII, N, cI, cro, cII, Q; head, tail, recombination, DNA replication and lysis gene clusters; region non-essential for lytic growth; cos-end sequences printed as CG GCCCCGCCGCTGGA and GGGCGGCGACCTCG GC.]

The lytic pathway is characterized by a series of transcriptional events that produce different sets of proteins that are required for replication of the phage DNA and the production of new phage particles.

Early transcription. Transcription of the N and cro genes occurs. This transcription is subject to repression by the product of the cI gene, and in a lysogen this repression is the basis of immunity to superinfection.

Delayed early transcription. The N protein product binds to the bacterial RNA polymerase and promotes transcription of the phage genes involved in DNA replication.

Replication. Early replication proceeds from a single origin of replication site. Later replication proceeds via a rolling-circle mechanism to produce long concatamers of the phage DNA that are connected to each other at the cos sites.

Late transcription. The protein product of the cro gene builds up to a critical level and then stops early transcription. The product of the Q gene activates transcription, resulting in the production of the proteins required for the head and tail of the mature phage particle, and those required for bacterial cell lysis.

Finally, phage assembly occurs when a unit length of λ DNA is placed into the assembled head by cleavage of the concatameric DNA at the cos sites. The tail is added and the mature phage particle is completed. Upon cell lysis, approximately 100 newly synthesized phage particles are released from a single infected bacterial cell.

Wild-type λ DNA contains few unique restriction enzyme recognition sites into which foreign DNA fragments could be cloned, and is consequently not wholly suitable as a vector to carry such sequences.
Additionally, the packaging of DNA into the phage is size limited. Efficient packaging will only occur with DNA fragments representing between 78 and 105 per cent of the wild-type genome size (37–51 kbp). These limits pose severe restrictions upon the amount of DNA that can be cloned into the phage genome.

Two important developments, however, suggested that λ might be suitable as a cloning vector. Firstly, it was determined that the gene products required for recombination could be removed from the genome and the lytic life cycle could still be completed and plaques would form. The remaining DNA, often referred to as the left-hand and right-hand arms of the λ genome, is capable of providing all necessary functions for the lytic pathway to occur. Secondly, naturally occurring restriction enzyme recognition sites could be eliminated without loss of gene function, which permitted the development of vectors with unique sites for the insertion of foreign DNA. λ vectors could thus be constructed that lacked the genes required for recombination, and therefore could only enter the lytic cycle, but were capable of carrying much larger foreign DNA inserts. Two basic types of λ vector have been developed:

insertional vector: DNA is inserted into a specific restriction enzyme recognition site;

replacement vector: foreign DNA replaces a piece of DNA (stuffer fragment) of the vector.

The advantage of replacement vectors is that they are capable of carrying larger DNA inserts. For example, EMBL4 is a 42 kbp vector that contains 14 kbp of stuffer DNA between the left-hand and right-hand arms of λ. The ligation of just the arms would generate a 28 kbp genome. This is too small to be packaged into a λ particle. The insertion of foreign DNA between the two arms will, however, enable the genome to attain a suitable packaging size. The packaging size limit means that EMBL4 is capable of holding foreign DNA fragments up to approximately 23 kbp in size (Figure 3.12). The size limitations of λ packaging thus provide a mechanism to ensure that foreign DNA has been inserted in between the λ arms to form a recombinant. Several other basic strategies have been devised to identify λ phage recombinants.

Figure 3.12. λ insertion and replacement vectors. All vectors have regions non-essential for lytic growth removed to increase the amount of DNA that will be packaged into the mature phage. Two insertion vectors are shown. λgt10 contains a unique EcoRI restriction enzyme recognition site in the cI gene. Recombinants will form clear rather than turbid plaques. λZAPII contains a multiple cloning site (MCS) in the lacZ′ gene and recombinants are identified using blue–white screening. Recombinants in replacement vectors are the only phages that will grow; if the two ends are ligated in the absence of insert DNA, the DNA is too small to be packaged.
[Sizes shown on the figure: wild-type λ DNA, 48,502 bp; λ vectors, 30–43 kbp; λgt10, 43,430 bp, accepts ~8 kbp; λZAPII, 40,820 bp, accepts ~10 kbp; EMBL4, 42,360 bp, 14 kbp stuffer, accepts ~23 kbp; λWES.λB, accepts 4–17 kbp.]

Inactivation of the cI gene. Several phage vectors (e.g. λgt10) have unique restriction enzyme recognition sites contained within the cI gene. Phages in which the cI gene has been disrupted by foreign DNA insertion have an altered morphology, in which the plaques produced appear clear as opposed to turbid. Screening of this type is technically difficult and requires a good deal of skill on the part of the observer.

Blue–white screening. λ phage vectors (e.g. λZAP) have been constructed to contain the lacZ′ gene expressing the α-fragment of β-galactosidase. Screening for recombinant phages can then be performed in E. coli cells expressing the ω-fragment of β-galactosidase in a similar way to screening for recombinants in pUC-based plasmids.

Figure 3.13. In vitro packaging of λ phage particles. Two different λ lysogens are used to produce the various components required for the packaging of λ particles. One of these lysogens (BHB2688) has a defective E gene, which results in no heads being produced. The other (BHB2690) has a defective D gene, resulting in a defect in DNA packaging. Mixing cell lysates of the two will result in an extract that is able to package concatamerized λ DNA. The multimerized DNA (37–51 kbp between cos sites) will be cleaved at the cos sites and packaged into a mature phage particle.
[Diagram labels: λ lysogen BHB2688 (mutant protein E; no protein E, no heads) supplies tails, protein D and other assembly proteins; λ lysogen BHB2690 (mutant protein D; no protein D, no DNA packaging) supplies heads, tails, protein E and other assembly proteins; mix and add concatemerized λ DNA to give mature λ phage.]

Once a recombinant λ genome has been constructed, the problem arises of how to get the DNA into a viral particle so that it can be replicated in E. coli cells (Figure 3.13). Normal in vivo packaging of λ DNA involves first making pre-heads, structures composed of the major capsid protein encoded by gene E. A unit length of λ DNA is then inserted into the pre-head, with the unit length

being prepared by cleavage of concatamerized λ genomes at neighbouring cos sites. A minor capsid protein, D, is then inserted in the pre-heads to complete head maturation, and the products of other genes serve as assembly proteins, ensuring joining of the completed tails to the completed heads. Packaging of recombinant λ genomes can occur in vitro by utilizing two E. coli strains that bear λ lysogens containing different defects in the packaging pathway. A defect in producing protein E, resulting from a mutation introduced into gene E, prevents pre-heads being formed in strain BHB2688. A mutation in gene D prevents maturation of the pre-heads, with enclosed DNA, into complete heads in strain BHB2690. The components of the BHB2688/BHB2690 mixed lysate, however, complement each other's deficiencies and provide all the products for correct packaging (Figure 3.13). Consequently, recombinant λ genomes can be constructed in vitro and packaged into mature phage particles before being propagated and replicated in E. coli cells.

3.4 Cosmid Vectors

The only DNA requirements for in vitro packaging into λ phage are the presence of two cos sites that are separated by 37–51 kbp of intervening sequence. Cosmids were developed in light of this observation, and are simply plasmids that contain a λ phage cos site (Collins and Brüning, 1978). Figure 3.14 shows the overall architecture of a cosmid vector and a cloning scheme for the insertion of foreign DNA. As plasmids, cosmids contain an origin of replication and a selectable marker. Cosmids also possess a unique restriction enzyme recognition site into which DNA fragments can be ligated. After the packaging reaction has occurred, the newly formed λ particles are used to infect E. coli cells. The DNA is injected into the bacterium like normal λ DNA and circularizes through complementation of the cos ends. The lack of other λ sequences means, however, that infection will not proceed beyond this stage. The circularized DNA will, however, be maintained in the E. coli cell as a plasmid. Therefore selection of transformants is made on the basis of antibiotic resistance, and bacterial colonies (rather than plaques) will form that contain the recombinant cosmid. Since λ phage particles can accept between 37 and 51 kbp of DNA, and most cosmids are about 5 kbp in size, between 32 and 47 kbp of DNA can be cloned into these vectors. This represents considerably more than could be cloned into a λ vector itself.

Figure 3.14. Cloning using a cosmid vector. The overall architecture of a cosmid vector, pJB8, is shown, together with a scheme for the insertion of foreign DNA into a cosmid. Since the cosmid lacks other λ genes, when the DNA is inserted into the E. coli cell it is maintained as a plasmid and selected for on the basis of antibiotic resistance.
[Diagram labels: pJB8 (5.4 kbp; cos, AMPR, ori, BamHI); cut with BamHI; ligate with insert DNA; in vitro packaging; infect E. coli with λ particles; Petri dish containing agar with ampicillin; colony containing circular recombinant cosmid.]

Cosmids, like plasmids, are very stable, but the insertion of large DNA fragments can mean that recombinant cosmids are difficult to maintain in a bacterial cell. Repeat DNA sequences are common in eukaryotic DNA, and DNA rearrangements can occur via recombination of the repeats present on the DNA inserted into the cosmid.

The major difficulty in working with cosmids is, however, the production of linear, ligated DNA fragments in which the cosmid and insert are concatamerized together. Two basic problems exist.

Ligation reactions of cosmid and insert DNA, like those shown in Figure 3.14, will generate circular DNA molecules that are unable to participate in the in vitro packaging reaction.

More than one insert DNA molecule can be ligated between each cosmid DNA fragment. This could give a false impression of the DNA organization of the insert.

These difficulties can be overcome by cutting the cosmid with two different restriction enzymes to generate left-hand and right-hand ends that cannot religate to each other (Ish-Horowicz and Burke, 1981). Suitable phosphatase treatment of the insert DNA ensures that multiple inserts cannot be ligated to the cosmid DNA (see Chapter 2).

3.5 M13 Vectors

M13, and its very close relatives f1 and fd, are filamentous E. coli bacteriophages. M13 is a male-specific lysogenic phage with a circular single-stranded DNA genome 6407 bp in length (Figure 3.15). M13 phage particles have dimensions of about 900 nm × 9 nm and contain a single-stranded circular DNA molecule (designated as the + strand). M13 infects bacteria that harbour the F pilus. The phage particle adsorbs via one end to the F pilus, and the single-stranded phage DNA enters the bacterium (Figure 3.16).
Very rapidly, the single-stranded DNA is converted into double-stranded (replicative form, RF) DNA by the synthesis of a complementary DNA strand (the − strand) using bacterial DNA polymerase. The RF form of the phage genome is rapidly multiplied until about 100 RF molecules are present within the bacterium. Transcription of the viral genes occurs to produce proteins required for the assembly of new viral particles. The production of a virally encoded single-stranded binding protein (the protein product of gene 2) eventually forces asymmetric replication of the RF DNA. This results in only one viral DNA strand being synthesized (the + strand). These single-stranded DNA molecules are assembled into new viral particles, and are released from the cell without cell lysis occurring. Up to 1000 phage particles can be released into the medium per cell per generation. M13 phage infection does not result in bacterial cell death and, consequently, M13 infections appear as turbid plaques. The E. coli cells around the site of infection have not been killed, but they grow more slowly due to the burden placed upon them by producing phage particles.

Figure 3.15. The genomes of wild-type M13 and an engineered derivative, M13mp18. The wild-type M13 genome encodes 10 open reading frames that are all transcribed in the clockwise direction. Replication of the genome initiates bi-directionally from a specific sequence between genes 2 and 4. M13mp18 additionally bears the lacZ′ gene for blue–white screening of recombinants. Embedded within this gene is a multiple cloning site providing a number of unique restriction enzyme recognition sites to aid the cloning of foreign DNA fragments into the vector. The maps shown here represent the double-stranded form, or replicative form (RF), of the vector that exists within the E. coli host. Viral particles contain only a single strand of the DNA.
[Map labels: M13 wild-type, 6407 bp, genes 1–10; M13mp18, 7250 bp, genes 1–10, f1 ori, lacZ, multiple cloning site.]
Multiple cloning site sequence:
5′-atgaccatgattacGAATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGGcact-3′
3′-tactggtactaatgCTTAAGCTCGAGCCATGGGCCCCTAGGAGATCTCAGCTGGACGTCCGTACGTTCGAACCgtga-5′
Restriction sites, in order: EcoRI, SacI, KpnI, SmaI, BamHI, XbaI, SalI, PstI, SphI, HindIII.
Encoded lacZ peptide: (M) T M I T N S S S V P G D P L E S T C R H A S L A

The M13 origin of replication (called the f1 ori) contains two overlapping, but distinct, DNA sequences that act to control the synthesis of DNA. These sites, the f1 initiator and the f1 terminator, signal the beginning and end of DNA replication. The initiator is recognized by the protein product of gene 2, which nicks the + strand in the RF DNA. The nick indicates the position at which unidirectional rolling-circle DNA replication will commence.
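The multiple cloning site printed with Figure 3.15 can be checked with a short Python sketch that searches the top strand for each enzyme's recognition sequence. The recognition sequences listed are the standard ones for the ten enzymes labelled on the figure; the function and variable names are hypothetical, and "unique" here means unique within the short fragment shown, not within the whole vector.

```python
# Illustrative sketch: locate restriction enzyme recognition sites in a sequence.
MCS_TOP_STRAND = (
    "atgaccatgattacGAATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGGcact"
)

# Standard recognition sequences for the enzymes labelled on the figure.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC", "SacI": "GAGCTC", "KpnI": "GGTACC", "SmaI": "CCCGGG",
    "BamHI": "GGATCC", "XbaI": "TCTAGA", "SalI": "GTCGAC", "PstI": "CTGCAG",
    "SphI": "GCATGC", "HindIII": "AAGCTT",
}


def find_sites(sequence, sites):
    """Map each enzyme to the 0-based start positions of its site (overlaps allowed)."""
    seq = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def unique_cutters_in_order(sequence, sites):
    """Return enzymes whose site occurs exactly once, ordered by position."""
    hits = find_sites(sequence, sites)
    unique = [(pos[0], enzyme) for enzyme, pos in hits.items() if len(pos) == 1]
    return [enzyme for _, enzyme in sorted(unique)]


if __name__ == "__main__":
    for name in unique_cutters_in_order(MCS_TOP_STRAND, RECOGNITION_SITES):
        print(name, find_sites(MCS_TOP_STRAND, RECOGNITION_SITES)[name])
```

Several of the sites overlap by one or more bases (for example SmaI and BamHI), which is why the search advances one base at a time rather than skipping past each match.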
The newly formed + strand is cleaved at the terminator sequence, again by the protein product of gene 2. Following cleavage, the two ends of the + strand are ligated to form the single-stranded genome.

[Figure 3.16 diagram labels: M13 infection; viral DNA strand enters bacterium; conversion to double-stranded RF form; transcription of viral proteins; rolling-circle replication to form new strands; phage assembly; phage release. Components shown: bacterium, F pilus, single-stranded M13 phage DNA, protein coat.]

The switch between the double-stranded RF form and the single-stranded + form of the M13 viral genome made it an ideal candidate for exploitation as a vector. As we will see in later chapters, the single-stranded DNA produced in the phage particle has led to great advances in mutagenesis in vitro (Chapter 7) and DNA sequencing (Chapter 8). Unlike λ, M13 does not have a non-essential region that can be deleted prior to the insertion of foreign DNA. However, there is an intergenic region between the origin of replication and gene 2 (Figure 3.15) into which foreign DNA fragments may be inserted. M13 vectors were developed in the late 1970s when the lacZ′ gene (encoding the α-peptide of β-galactosidase) was inserted into the M13 genome (Messing et al., 1977). Subsequently, the same polylinker and α-peptide fragments as the pUC plasmid series were engineered into M13 and naturally occurring restriction enzyme recognition sites were eliminated (Yanisch-Perron, Vieira and Messing, 1985; Norrander, Kempe and Messing, 1983). The RF form of M13 vectors can be isolated by standard plasmid DNA preparation procedures and foreign DNA can be inserted into them as if they were conventional plasmids. The specific use of M13 vectors is as an aid to the formation of single-stranded DNA. Once a foreign DNA fragment has been cloned into M13, large amounts of the single-stranded form can be easily isolated from the mature phage that are extruded from infected E. coli cells. The main difficulty with vectors of this type is that they tend to be unstable when DNA fragments larger than a few kilobases are inserted into them (Zinder and Boeke, 1982).

3.6 Phagemids

Phagemids are plasmids that contain the f1 phage origin of replication for the production of single-stranded DNA.
Phagemids are generally small plasmids, so they have the ability to accept larger DNA inserts than M13-based vectors. Phagemids were originally developed in the early 1980s, when it was found that the f1 origin of replication could be cloned into pBR322 to drive the production of single-stranded DNA (Dotto and Horiuchi, 1981; Dotto, Enea and Zinder, 1981). The f1 replication origin alone was not sufficient to direct single-stranded DNA production, but if a bacterium carrying a phagemid was superinfected with a functional wild-type M13 or f1 helper phage, then the production of single-stranded phagemid DNA would occur. The phagemid single-stranded DNA would be packaged into viral particles and secreted into the surrounding medium in the same way that M13 phage particles are produced. Additionally, it was found that cloning the f1 origin in the reverse orientation would lead to the production of the opposite strand of DNA (Dente, Cesareni and Cortese, 1983). Thus, single-stranded DNA representing either strand of a cloned fragment could be produced after cloning into a suitable phagemid vector. Phagemids have the advantage that, in the absence of helper phage, double-stranded DNA can be isolated as a normal plasmid. Moreover, because the vectors do not need to carry additional phage genes, their small size gives them an increased capacity for carrying larger foreign DNA fragments.

Figure 3.16. The M13 life cycle. The single-stranded M13 genome is encased by coat proteins. Bacterial infection occurs when the phage particle attaches to the E. coli pilus and the single DNA strand is injected into the host. The DNA is immediately converted to a double-stranded form and is replicated and transcribed to produce viral proteins. The build-up of viral protein 2 eventually forces asymmetric DNA replication to produce single DNA strands. These are packaged into new viral particles, which are secreted from the bacteria without cell lysis occurring.

Other phagemids have been developed that take advantage of various aspects of plasmids, λ phage and M13 phage. We have already seen that the f1 replication origin is composed of an initiator and a terminator. In the wild-type M13 phage genome these sequences overlap with each other, such that replication initiates and then terminates after the full circular genome has been replicated. The initiator and terminator elements may be separated from each other to provide starting and ending points for DNA replication on a linear DNA molecule. A λ insertional vector, λZAP, was constructed such that the left-hand and right-hand arms were connected via the DNA sequence of a phagemid beginning with the f1 initiator and ending with the f1 terminator (Short et al., 1988). This vector (shown in Figure 3.17) has the ability to function as a λ phage for, for example, the construction of a cDNA library. However, the foreign DNA can be excised from the phage in the form of a plasmid after superinfection with a wild-type M13-based phage. λZAP contains all the λ DNA sequences required for lytic growth, and in between these is the DNA sequence needed for plasmid replication and selection (the ColE1 ori and AMPR). Additionally, λZAP contains the lacZ′ gene and multiple cloning site sequence in a similar fashion to the pUC plasmids. The plasmid sequences in the vector begin with the f1 initiator and end with the f1 terminator.
Vectors bearing foreign DNA can be selected by blue–white screening of λ plaques, and the insert DNA can be isolated in the form of a plasmid when bacteria harbouring the phage are superinfected with an f1 helper phage. Proteins produced by the helper phage will result in DNA replication between the f1 initiator and terminator. The single-stranded DNA will circularize and will be packaged as an M13-like phage and secreted from the cell. The introduction of the M13 phage particles into an F′ E. coli strain and selection on ampicillin will result in the formation of colonies containing the recombinant plasmid, which can then be isolated as double-stranded plasmids.

Figure 3.17. The in vivo excision of phagemid DNA from a λ phage vector. λZAPII is a sophisticated λ phage vector containing the elements of the λ and M13 phages as well as the sequences required for stable phagemid production. The DNA sequence for the entire pBluescript phagemid is contained within the vector between the f1 initiator and terminator. Foreign DNA inserted into the multiple cloning site (MCS) of λZAPII can be recovered in the form of a phagemid. Bacteria harbouring the λ phage are superinfected with an M13-based phage to drive DNA replication of sequences between the f1 initiator and terminator. The M13 phages produced using this DNA can be used to infect F′ E. coli cells and double-stranded plasmid DNA isolated.
[Diagram labels: pBluescript (AMPR, ColE1 ori, f1 ori with terminator and initiator, lacZ, MCS with SacI, NotI, XbaI, SpeI, EcoRI and XhoI sites); λZAPII left arm and right arm with cos sites, flanking the phagemid DNA; steps: construct DNA library, isolate positive clone, excise plasmid containing insert by co-infection with f1 helper phage.]

3.7 Artificial Chromosomes

The major limitation of most of the vectors that we have discussed so far is the size limit of the DNA that can be cloned into them. Natural eukaryotic chromosomes consist of hundreds or thousands of genes, together with DNA elements required for chromosomal stability and function such as telomeres and centromeres. Telomeres, which consist of DNA and protein, are located at the ends of chromosomes and protect them from damage. Centromeres are segments of highly repetitive DNA that are essential for the proper control of chromosome distribution during cell division. A logical extension of vector design to clone very large DNA fragments is, therefore, to reconstruct an autonomously replicating chromosome into which DNA fragments may be cloned.
Cloning in this way is conceptually similar to cloning in λ phage, with the reconstruction of a replication-competent DNA molecule, except that the scale of the foreign DNA that can be cloned is much greater.

3.7.1 YACs

Yeast artificial chromosome (YAC) vectors allow the cloning, within yeast cells, of fragments of foreign genomic DNA that can approach 500 kbp in size. These vectors contain several elements of typical yeast chromosomes, including the following.

A yeast centromere (CEN4). The yeast centromere is specified by a 125 bp DNA segment. The consensus sequence consists of three elements: a 78–86 bp region with more than 90 per cent AT residues, flanked by a conserved sequence on one side and a short consensus sequence on the other (reviewed by Clarke (1990)).
Yeast autonomously replicating sequence (ARS1). Yeast ARS elements are essentially origins of replication that function in yeast cells autonomously from the replication of yeast chromosomal replication origins.
Yeast telomeres (TEL). Telomeres are the specific sequences (5′-TGTGGGTGTGGTG-3′) that are present at the ends of chromosomes in multiple copies and are necessary for replication and chromosome maintenance.
Genes for YAC selection in yeast. The vector has a functional copy of URA3, a gene involved in uracil biosynthesis, and TRP1, a gene involved in

tryptophan biosynthesis, that allow selection of yeast cells that have taken up the vector. The YAC is transformed into a host yeast cell that is defective in these biosynthetic pathways, and transformants are identified by their ability to complement the nutritional defect.
Bacterial replication origin and a bacterial selectable marker. In order to propagate the YAC vector in bacterial cells, prior to insertion of genomic DNA, YAC vectors usually contain the ColE1 ori and the ampicillin resistance gene for growth and analysis in E. coli.

Figure 3.18. Cloning of very large DNA fragments into a YAC vector. See the text for details.

The cloning of DNA fragments into a YAC is shown diagrammatically in Figure 3.18. The YAC is cleaved using restriction enzymes to generate two arms that each have a telomere sequence at the end. One of the arms contains an autonomous replication sequence (ARS1), a centromere (CEN4) and a selectable marker (TRP1). The other arm contains a second selectable marker (URA3). Large DNA fragments (>100 kbp) are then ligated between the two arms (Anand, Villasante and Tyler-Smith, 1989). The insertion of foreign DNA into the cloning site inactivates the suppressor tRNA gene SUP4, expressing tRNATyr, in the vector DNA. In an ade2-ochre host yeast cell, the expression of SUP4 results in the formation of white colonies, while those in which it

has been insertionally inactivated will give rise to red yeast colonies (Burke, Carle and Olson, 1987). Yeast cells that carry a mutation in the ADE2 gene (which codes for the enzyme phosphoribosylaminoimidazole carboxylase) have a block in the adenine biosynthetic pathway, causing an intermediate to accumulate in the vacuole. This intermediate gives the cell a red colour. The recombinant YACs are therefore transformed into a yeast strain that has defects in its chromosomal copies of the ura3, trp1 and ade2 genes. Transformants are identified as those red colonies that grow on media lacking both uracil and tryptophan. This ensures that the cell has received an artificial chromosome with both telomeres (because of complementation of the two nutritional mutations) and that the artificial chromosome contains insert DNA (because the cell is red).

There are difficulties associated with working with YACs. Some of these are listed below.

Very large DNA molecules are fragile and prone to breakage, leading to problems of rearrangement.
It is estimated that between 10 and 60 per cent of clones in YAC genomic libraries are chimaeric, i.e. regions from different parts of the genome become joined in a single YAC clone (Green et al., 1991).
Clones tend to be unstable, with their foreign DNA inserts often being deleted. Naturally occurring repetitive DNA sequences are rare in the yeast genome, and the insertion of such sequences from, say, human DNA inserts appears to increase the recombination frequency within the YAC. This may make the YAC unstable. Interestingly, however, larger YAC vectors are more stable in yeast than shorter ones, which consequently favours cloning of large stretches of DNA (Smith, Smyth and Moir, 1990).
There is a high rate of loss of the entire YAC during mitotic growth.
It is difficult to separate the YAC from the other host chromosomes because of their similar size. Separation requires sophisticated pulsed field gel electrophoresis (PFGE).
The yield of DNA is not high when the YAC is isolated from yeast cells.

3.7.2 PACs

To overcome some of the problems associated with using cosmid or YAC systems, a method for cloning and packaging DNA fragments using a bacteriophage P1 system has been developed that offers the ability to clone large genomic DNA fragments of between 70 and 95 kbp in size. P1 bacteriophage has a much larger genome than λ phage (in the range of 110–115 kbp), and vectors have been designed with the essential replication components of P1 incorporated into a plasmid (Ioannou et al., 1994). Upon infecting E. coli, bacteriophage P1 may either express its lytic functions, producing 100–200 new bacteriophage particles and lysing the infected bacterium, or repress its lytic functions, in which case the bacteriophage genome is maintained as a large, stable, low-copy plasmid. P1 phage has two replication origins: one to control lytic DNA replication, and the other to maintain the plasmid during non-lytic growth. During the lytic cycle, new phage DNA is produced and cleaved at a pac site prior to insertion into phage particles. The cloning of foreign DNA fragments into a P1 vector, or P1 artificial chromosome (PAC), is shown in Figure 3.19. The PAC vector is digested with the restriction enzymes ScaI and BamHI to generate two vector arms: a short and a long arm. Genomic DNA is partially digested with MboI (recognition sequence 5′-GATC-3′, yielding BamHI-compatible sticky ends) and size selected on a sucrose gradient. Fragments between 70 and 95 kbp in length are isolated and ligated in between the vector arms to generate a series of linear molecules. If ligation occurs between two short arms, the resulting molecule will contain neither the P1 replication origins nor the KANR gene, and will be non-viable. If both arms are long there will be no pac site, and no packaging into the phage heads will occur.
The only viable recombinant will consist of the insert sequence flanked by both a short and a long arm. Phage P1 uses a head-full packaging strategy and can accommodate a total DNA length of approximately 110–115 kbp. This means that any inserts longer than 95–100 kbp will result in the truncation of the packaged DNA before both loxP sites are inserted into the phage, and the molecule will be unable to circularize upon transfection into the host. Once injected into the cre+ E. coli host cell, the Cre protein circularizes the DNA at the loxP sites, and the DNA then replicates using the plasmid origin of replication. The original vector BamHI restriction enzyme site, into which the foreign DNA was inserted, is located within the bacterial sacB gene (encoding levansucrase). The expression of this gene is toxic to E. coli cells growing on sucrose. Thus sucrose growth provides a mechanism of positive selection for those PACs containing inserts. Propagation of E. coli cells harbouring the recombinant PAC on media containing sucrose permits growth of colonies with DNA inserts.
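The three ligation outcomes and the head-full size limit described above can be summarized in a small decision function. This is only an illustrative sketch: the function name, the string labels and the default 95 kbp ceiling (the lower end of the 95–100 kbp limit quoted in the text) are assumptions made for the example and do not come from any published protocol.

```python
# Illustrative model of the PAC ligation products described in the text.
# Names, return strings and the default size ceiling are assumptions.

SHORT_ARM = "short"  # per the text: the arm that supplies the pac site
LONG_ARM = "long"    # per the text: the arm with the P1 replicons and KANR


def classify_pac_ligation(arm_a, arm_b, insert_kbp, max_insert_kbp=95.0):
    """Predict the fate of a linear arm-insert-arm ligation product."""
    for arm in (arm_a, arm_b):
        if arm not in (SHORT_ARM, LONG_ARM):
            raise ValueError(f"unknown arm type: {arm!r}")
    if arm_a == arm_b == SHORT_ARM:
        # Two short arms: no replication origins and no kanamycin resistance.
        return "non-viable: no P1 replicons or KANR"
    if arm_a == arm_b == LONG_ARM:
        # Two long arms: nothing for the packaging machinery to cleave.
        return "not packaged: no pac site"
    if insert_kbp > max_insert_kbp:
        # Head-full packaging stops before the second loxP site enters.
        return "truncated: second loxP site missing, cannot circularize"
    return "viable"


if __name__ == "__main__":
    examples = [
        (SHORT_ARM, SHORT_ARM, 80),
        (LONG_ARM, LONG_ARM, 80),
        (SHORT_ARM, LONG_ARM, 120),
        (SHORT_ARM, LONG_ARM, 80),
    ]
    for arm_a, arm_b, size in examples:
        print(arm_a, arm_b, size, "->", classify_pac_ligation(arm_a, arm_b, size))
```

Only the last example, one short arm and one long arm around an 80 kbp insert, is classified as viable, which mirrors the reasoning in the text.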

Figure 3.19. Cloning into a PAC vector. The PAC vector contains the P1 bacteriophage plasmid and lytic replicons, together with a pac cleavage site to allow DNA assembly into phage particles. Additionally, the vector contains the pUC origin of replication (ori) and ampicillin resistance gene for the propagation and selection of the vector itself. These sequences are lost in the recombinant PAC. Large DNA inserts (70–95 kbp) are cloned into the sacB gene (whose function can be selected against) and packaged in vitro into P1 phage particles. Transfection of the P1 phage particles into an E. coli cell harbouring a copy of the Cre recombinase will result in circularization of the recombinant PAC at the loxP sites. The circular form is then maintained at low copy number using the P1 plasmid replicon.

The recombinant PACs are maintained as plasmids within the E. coli using kanamycin resistance as a selection marker. Plasmid copy number can be increased more than 25-fold by isopropyl-β-D-thiogalactopyranoside (IPTG) induction of a lac promoter controlled high-copy P1 lytic replicon that is present within the recombinant PAC (Pierce et al., 1992). The recombinant DNA molecules are then isolated as plasmids using traditional methods. Using the P1 DNA packaging system, genomic DNA fragments of 70 to 95 kbp can be readily cloned and manipulated. Improvements in vector design have allowed the production of PACs that can accommodate 130–150 kbp inserts. The major advantages of the P1 DNA packaging method over other genomic cloning methods are that large DNA fragments may be inserted into the vectors, that no rearrangement or deletion of methylated DNA occurs (because restriction-minus host strains are used) and that recombinant DNA is easily recovered as plasmids for further screening and manipulation.

3.7.3 BACs

Bacterial artificial chromosomes (BACs) are engineered versions of F′ plasmids (Shizuya et al., 1992).
BACs are capable of carrying approximately 200 kbp of inserted DNA sequence, and the F-factor origin of replication (oriS) maintains their level at approximately one copy per cell. In addition to oriS, BACs contain four F-factor genes required for replication and maintenance of copy number: repE, parA, parB and parC. The overall architecture of a typical BAC is shown in Figure 3.20. In addition to the F-factor genes, pBeloBAC11 also contains a selectable antibiotic resistance marker (CAMR) and the lacZ′ gene harbouring a multiple cloning site for the blue–white screening of BACs containing inserts (Kim et al., 1996). Additionally, the BAC contains a cos site (cosN) and a loxP site. These sites are used for specific cleavage of the insert-containing BAC during restriction mapping. The cosN site can be cleaved using λ terminase (Rackwitz et al., 1985), while the loxP site can be cleaved by the Cre protein in the presence of an oligonucleotide corresponding to the loxP sequence (Abremski, Hoess and Sternberg, 1983). Additional BACs have been constructed that contain the recognition sites for extremely rare-cutting restriction enzymes. For example, I-SceI is an intron encoded restriction enzyme from the mitochondria of the yeast Saccharomyces cerevisiae (Monteilhet et al., 1990). Its large recognition sequence (5′-TAGGGATAACAGGGTAAT-3′) does not occur even once in the human genome, and is consequently very useful for linearizing the vector without cleaving the insert DNA fragments.

Figure 3.20. The structure of a BAC vector (pBeloBAC11, 7.4 kbp). See the text for details.

The DNA inserted into a BAC appears to be very stable. It can survive intact for many hundreds of generations in E. coli cells, and appears to be less prone to rearrangements and deletions when maintained in a recombination defective E. coli host cell. The main drawback of using BAC vectors is that they are present in only one or two copies per cell. This can complicate isolation and screening.

3.7.4 HACs

Human artificial chromosomes (HACs) have been constructed that can survive for extended periods in tissue culture cells (Harrington et al., 1997) (Figure 3.21). As we have already seen, three elements are required for the stability of linear chromosomes: centromeres, telomeres and an origin of replication. The human telomere repeat sequence (5′-TTAGGG-3′) is well known, but it is distinct from its yeast equivalent and the two are not interchangeable. A YAC will not function as a chromosome in human cells.

Figure 3.21. A human artificial chromosome in tissue culture cells. A synthetic α satellite-containing microchromosome formed by transfection of α satellite DNA and telomeric sequences into tissue culture cells. A clonal line was isolated and shown to contain a HAC (denoted by the arrow) derived from transfected DNA and not from truncation of endogenous chromosomes. Reprinted, with permission, from Willard (2000). Copyright (2000) American Association for the Advancement of Science.

To aid in the isolation of human centromere and replication origin sequences, HACs have been constructed that can be transfected into and maintained within human cells (Henning et al., 1999).
This approach identified multiple repeats of a 171 bp DNA sequence (called an α satellite repeat) contained within a 3 million base pair DNA fragment of the human X chromosome that functions as a centromere (Schueler et al., 2001). The repetitive nature of these sequences makes them difficult to study and makes the identification of the centromere itself extremely hard.

The human genome contains multiple origins of replication. The average human chromosome contains approximately 150 × 10^6 bp of DNA, while DNA polymerase functions maximally at about 3000 replicated bases per minute. Were replication to begin at a single site, each chromosome would take over a month to be replicated, rather than the hour it actually takes. Multiple replication origins mean that there are many places on the eukaryotic chromosome where replication can begin and that the process of complete replication proceeds at a more rapid rate. The sequence of the human replication origin is very degenerate (Vashee et al., 2001), but the development of HACs should allow more precise mapping of these regions.

HACs have great potential as tools for both basic research and medical therapy. Artificial chromosomes may ultimately lead to gene therapy vectors (see Chapter 13) with some advantages over existing viral based vectors. They exist as extrachromosomal elements and so would not result in insertional mutagenesis. They should have no size constraint on the amount of DNA that they could carry. By virtue of differences between centromere behaviour in mitosis and meiosis, they might be designed not to function in the germ line. These possibilities remain for the future and will depend on having a much greater understanding of chromosome function than is currently available.
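The replication-time estimate quoted in this section (a chromosome of about 150 × 10^6 bp copied at about 3000 bases per minute) can be checked with a short calculation. The sketch below uses only those two figures from the text; the helper name and the simplifying assumption that each origin copies an equal share of the chromosome in parallel are introduced purely for illustration.

```python
import math

CHROMOSOME_BP = 150e6         # average human chromosome (figure from the text)
POLYMERASE_BP_PER_MIN = 3000  # maximal polymerase rate (figure from the text)


def replication_minutes(length_bp, rate_bp_per_min, n_origins=1):
    """Minutes needed if n_origins equal segments are copied in parallel.

    Simplifying assumption for this sketch: one replication unit per
    origin, all firing at once and each copying an equal share.
    """
    if n_origins < 1:
        raise ValueError("n_origins must be at least 1")
    return length_bp / (rate_bp_per_min * n_origins)


single_origin_min = replication_minutes(CHROMOSOME_BP, POLYMERASE_BP_PER_MIN)
single_origin_days = single_origin_min / (60 * 24)

# Smallest origin count that brings the time down to one hour in this model.
origins_for_one_hour = math.ceil(single_origin_min / 60)

print(f"single origin: {single_origin_min:.0f} min (~{single_origin_days:.1f} days)")
print(f"origins needed for one hour (model only): {origins_for_one_hour}")
```

With a single origin the model gives 50 000 minutes, or roughly 35 days, which agrees with the statement that replication from one site would take over a month. The origin count printed by the second line is a property of this simplified model only and should not be read as a measured number of human replication origins.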
